Performance of Seven Statistics for Mean Difference Testing Between Two Populations Under Combined Assumption Violations


Montri Sangthong Praphat Klubnual


The objective of this research was to compare the performance of seven test statistics for the difference in means between two populations when the data violated the tests' assumptions. The simulation conditions covered five distributions, with variances and sample sizes each either equal or unequal across the two groups. The results showed that when the populations followed a log-normal, gamma, or Poisson distribution with equal variances, the Welch Based on Rank test (WBR test) was the most effective. When the populations followed a log-normal, gamma, exponential, Poisson, or uniform distribution with unequal variances, the Welch t test performed distinctly better than the other test statistics.
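The kind of simulation summarised above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not give their design, so the sample sizes (10 and 20), the log-normal shape parameters (0.5 and 1.0), the number of replications, and all function names here are assumptions chosen for illustration. The sketch estimates the Type I error rate of the Student t test and the Welch t test when both log-normal populations share the same mean but differ in variance. It uses only the Python standard library and obtains the p-value by numerically integrating the t density.

```python
import math
import random
import statistics


def t_two_sided_p(t, df, steps=400):
    """Two-sided p-value for a t statistic, via Simpson's rule on the t density."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # integral of the density over [0, |t|]
    return max(0.0, 1.0 - 2.0 * area)


def two_sample_p(x, y, equal_var):
    """p-value of the Student t test (equal_var=True) or the Welch t test."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    diff = statistics.fmean(x) - statistics.fmean(y)
    if equal_var:
        df = n1 + n2 - 2
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    else:
        a, b = v1 / n1, v2 / n2
        se = math.sqrt(a + b)
        # Welch-Satterthwaite approximation to the degrees of freedom
        df = (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n2 - 1))
    return t_two_sided_p(diff / se, df)


def type_i_error(n1, n2, sigma1, sigma2, equal_var,
                 reps=2000, alpha=0.05, seed=1):
    """Share of replications rejecting H0 although the population means are equal."""
    rng = random.Random(seed)
    mu1 = 0.0
    # A log-normal mean is exp(mu + sigma^2 / 2); this shift equalises the means.
    mu2 = mu1 + (sigma1 ** 2 - sigma2 ** 2) / 2
    rejections = 0
    for _ in range(reps):
        x = [rng.lognormvariate(mu1, sigma1) for _ in range(n1)]
        y = [rng.lognormvariate(mu2, sigma2) for _ in range(n2)]
        if two_sample_p(x, y, equal_var) < alpha:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    # Hypothetical condition: unequal sample sizes and unequal variances.
    for label, equal_var in (("Student t", True), ("Welch t", False)):
        rate = type_i_error(10, 20, 0.5, 1.0, equal_var=equal_var)
        print(f"{label}: estimated Type I error = {rate:.3f}")
```

An estimated rate close to the nominal level of 0.05 indicates that a test controls Type I error under the simulated condition. Repeating the run over a grid of distributions, variance ratios, and sample sizes would give a comparison of the kind the abstract describes.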

Keywords: Two-sample location test, Parametric test, Non-parametric test, Non-parametric bootstrap test, t test, Welch t test, Welch Based on Rank test, Brunner-Munzel test, Yuen-Welch test, Exact Wilcoxon signed-rank test



Research Articles


How to Cite
SANGTHONG, Montri; KLUBNUAL, Praphat. Performance of Seven Statistics for Mean Difference Testing Between Two Populations Under Combined Assumption Violations. Naresuan University Journal: Science and Technology (NUJST), [S.l.], v. 29, n. 4, p. 112-126, May 2021. ISSN 2539-553X.