Theory of Probability and Mathematical Statistics
Test for mean matrix in GMANOVA model under heteroscedasticity and non-normality for high-dimensional data
Takayuki Yamada, Tetsuto Himeno, Annika Tillander and Tatjana Pavlenko
Abstract: This paper develops a unified testing methodology for high-dimensional generalized multivariate analysis of variance (GMANOVA) models. We derive a test of the bilateral linear hypothesis on the mean matrix in a general scenario where the dimension of the observed vector may exceed the sample size, the design may be unbalanced, the population distribution may be non-normal, and the underlying group covariance matrices may be unequal. The suggested methodology is suitable for many inferential problems, such as the one-way MANOVA test and the test of a multivariate linear hypothesis on the mean in the polynomial growth curve model. As a key component of our test procedure, we propose a bias-corrected estimator of the Frobenius norm of the mean matrix. We derive null and non-null asymptotic distributions of the test statistic under a general high-dimensional asymptotic framework that allows the dimensionality to arbitrarily exceed the sample size of a group. The accuracy of the proposed test in finite samples is investigated through simulations conducted for several high-dimensional scenarios and various underlying population distributions in combination with different within-group covariance structures. For a practical demonstration, we consider a daily Canadian temperature dataset that exhibits group structure and conclude that the interaction of latitude and longitude has no effect on the prediction of temperature.
Keywords: Asymptotic distribution, bilateral linear hypothesis on mean matrix, bias correction approach, (N,p)-asymptotic
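The need for the bias correction mentioned in the abstract can be illustrated in the simplest special case, a single mean vector. The sketch below is not the authors' estimator for the GMANOVA mean matrix. It only uses the standard identity E||x̄||² = ||μ||² + tr(Σ)/n, so that subtracting tr(S)/n, with S the sample covariance matrix, removes the bias. The function names, the centered exponential distribution, and the values of n and p are assumptions chosen for illustration.

```python
import random


def naive_sq_norm(rows):
    """Plug-in estimate ||xbar||^2 of ||mu||^2 (biased upward by tr(Sigma)/n)."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    return sum(m * m for m in mean)


def bias_corrected_sq_norm(rows):
    """Unbiased estimate of ||mu||^2: ||xbar||^2 minus tr(S)/n."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    # tr(S): sum of the p coordinate-wise sample variances (divisor n - 1).
    trace_s = sum((r[j] - mean[j]) ** 2 for r in rows for j in range(p)) / (n - 1)
    return sum(m * m for m in mean) - trace_s / n


def simulate(n=10, p=200, reps=200, seed=0):
    """Average both estimators over repeated samples with true mean zero.

    Assumed setup: independent centered exponential coordinates
    (non-normal, unit variance), with p much larger than n.
    """
    rng = random.Random(seed)
    naive_total = 0.0
    corrected_total = 0.0
    for _ in range(reps):
        rows = [[rng.expovariate(1.0) - 1.0 for _ in range(p)] for _ in range(n)]
        naive_total += naive_sq_norm(rows)
        corrected_total += bias_corrected_sq_norm(rows)
    return naive_total / reps, corrected_total / reps


if __name__ == "__main__":
    naive_avg, corrected_avg = simulate()
    print("average naive estimate:    ", naive_avg)
    print("average corrected estimate:", corrected_avg)
```

In this setup the true squared norm is zero, while the plug-in estimate centers near tr(Σ)/n = p/n, which grows with the dimension. This is why a correction of this kind matters when p exceeds n.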