Some methodologists complain that researchers do not check assumptions before running a test. But in some cases checking does not help much. When there are no clear standards for how to test assumptions, or for how robust a test is against violations of them, results can still be misinterpreted.
In the case of the independent t-test, it is often assumed that the test is robust against violation of the normality assumption, at least when both groups have a sample size of at least 30. But it is becoming clear that even small deviations from normality may lead to very low power (Wilcox, 2011). So when groups do differ, the t-test often fails to detect it. This is especially true for mixed normal distributions, which have a relatively large variance. On the left side of Figure 1, two normal distributions are compared and the independent t-test has a power of .9; on the right side, two mixed normal distributions with variance 10.9 are compared, and here the power is only .28. In this case a researcher might conclude from a plot that the assumption of normality is not violated. The t-test might then fail to reject the null hypothesis of equal means, and the researcher would conclude that the groups do not differ. What is missed is that the distribution is actually a mixed normal distribution, and that the difference went undetected because of low power.
Figure 1. Left: two normal distributions whose means differ by 1; the power of the independent t-test is .9. Right: two mixed normal distributions with variance 10.9 whose means differ by 1; the power of the independent t-test is now .28.
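The loss of power can be checked with a small simulation. The sketch below is illustrative only and rests on several assumptions that are not stated in the text: the mixed normal is taken to be 90% N(0, 1) plus 10% N(0, 10²), which does have variance 0.9 · 1 + 0.1 · 100 = 10.9; each group has n = 25 observations; the test is two-sided at α = .05, using the tabulated critical value 2.0106 for 48 degrees of freedom; and the function names are hypothetical. It is not expected to reproduce the powers in Figure 1 exactly, only the contrast between the two cases.

```python
import math
import random

def mixed_normal(rng, eps=0.1, wide_sd=10.0):
    """Draw from (1 - eps) * N(0, 1) + eps * N(0, wide_sd**2) (assumed mixture)."""
    sd = wide_sd if rng.random() < eps else 1.0
    return rng.gauss(0.0, sd)

def standard_normal(rng):
    """Draw from N(0, 1)."""
    return rng.gauss(0.0, 1.0)

def pooled_t(x, y):
    """Independent-samples t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    pooled_var = (ssx + ssy) / (nx + ny - 2)
    return (mx - my) / math.sqrt(pooled_var * (1 / nx + 1 / ny))

def estimate_power(draw, n=25, shift=1.0, reps=2000, t_crit=2.0106, seed=1):
    """Share of simulated studies in which |t| exceeds the critical value.

    t_crit is the two-sided 5% critical value for df = 2 * n - 2 = 48;
    it must be changed by hand if n is changed.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x = [draw(rng) + shift for _ in range(n)]  # group whose mean is shifted by 1
        y = [draw(rng) for _ in range(n)]
        if abs(pooled_t(x, y)) > t_crit:
            rejections += 1
    return rejections / reps

mixture_variance = 0.9 * 1.0 + 0.1 * 10.0 ** 2
power_normal = estimate_power(standard_normal)
power_mixed = estimate_power(mixed_normal)
print(mixture_variance, power_normal, power_mixed)
```

Under these assumptions the estimated power should be high for the normal case and much lower for the mixed normal case, even though the two mixed distributions differ by the same amount.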
The main problem with the mixed normal distribution is its heavy tails: there are relatively many outliers. One solution is to use a 20% trimmed mean, which means that the highest twenty percent of the values are discarded, as are the lowest twenty percent. Yuen’s method uses this trimmed mean in combination with the Winsorized variance (Wilcox, 2011). This is the variance obtained when the highest twenty percent of the values are replaced by the highest remaining value, and the lowest twenty percent are replaced by the lowest remaining value. Compared with the standard t-test, Yuen’s method performs better in terms of power, control of type I errors, and accuracy of confidence intervals.
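The trimming and Winsorizing steps are easy to get wrong by hand, so a minimal sketch may help. The function names are hypothetical, and the test statistic and degrees of freedom follow the usual textbook form of Yuen’s method as I understand it; treat this as an illustration rather than a reference implementation.

```python
import math

def trimmed_mean(values, trim=0.2):
    """Mean after discarding the lowest and highest `trim` share of the values."""
    xs = sorted(values)
    g = math.floor(trim * len(xs))  # number of values cut from each tail
    kept = xs[g:len(xs) - g]
    return sum(kept) / len(kept)

def winsorized_variance(values, trim=0.2):
    """Sample variance after pulling each tail in to the nearest remaining value."""
    xs = sorted(values)
    n = len(xs)
    g = math.floor(trim * n)
    low, high = xs[g], xs[n - g - 1]
    wins = [min(max(v, low), high) for v in xs]
    m = sum(wins) / n
    return sum((v - m) ** 2 for v in wins) / (n - 1)

def yuen_statistic(x, y, trim=0.2):
    """Return (t, df) for Yuen's comparison of two trimmed means."""
    def squared_se(values):
        n = len(values)
        h = n - 2 * math.floor(trim * n)  # number of values left after trimming
        d = (n - 1) * winsorized_variance(values, trim) / (h * (h - 1))
        return d, h
    dx, hx = squared_se(x)
    dy, hy = squared_se(y)
    t = (trimmed_mean(x, trim) - trimmed_mean(y, trim)) / math.sqrt(dx + dy)
    df = (dx + dy) ** 2 / (dx ** 2 / (hx - 1) + dy ** 2 / (hy - 1))
    return t, df

x = [2, 4, 5, 6, 7, 8, 9, 10, 11, 50]  # made-up data with one large outlier
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(trimmed_mean(x))                 # 7.5, versus an ordinary mean of 11.2
print(yuen_statistic(x, y))
```

In the made-up data, the single value of 50 pulls the ordinary mean of x up to 11.2, while the 20% trimmed mean stays at 7.5. The resulting t would then be compared with a t distribution with the returned degrees of freedom.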
Another assumption whose violation often has large consequences is homoscedasticity, which means that both groups have equal variances. Levene’s test or an F-test is often used to assess whether this assumption is violated, but both tests are themselves susceptible to violation of the normality assumption. It is therefore recommended to use a variant of the independent t-test that does not assume equal variances, for example Welch’s test. Zimmerman (2004) even suggests that Welch’s test should be used in all cases, even when the distributions are homoscedastic. With Welch’s test the probability of a type I error can be controlled better, and the power is higher in most situations.
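For readers who want to see what Welch’s test changes, the following minimal sketch computes its statistic and degrees of freedom. The function name and the example data are made up for illustration; the formulas are the standard unpooled standard error and the Welch-Satterthwaite approximation for the degrees of freedom.

```python
import math

def welch_statistic(x, y):
    """Return (t, df) for Welch's test, which does not pool the two variances."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    se2x, se2y = vx / nx, vy / ny  # squared standard error of each group mean
    t = (mx - my) / math.sqrt(se2x + se2y)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (se2x + se2y) ** 2 / (se2x ** 2 / (nx - 1) + se2y ** 2 / (ny - 1))
    return t, df

small_spread = [1, 2, 3, 4, 5]             # made-up data, variance 2.5
large_spread = [2, 4, 6, 8, 10, 12, 14]    # made-up data, much larger variance
print(welch_statistic(small_spread, large_spread))
```

When the group sizes and variances are equal, the degrees of freedom equal those of the ordinary t-test (n1 + n2 - 2); as the variances diverge they shrink, which makes the test more conservative exactly when the equal-variance assumption is least tenable.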
Wilcox, R. (2011). Modern Statistics for the Social and Behavioral Sciences. New York: CRC Press.