Abstract
Rolling window analysis is a popular tool in time series research. However, conducting hypothesis tests on all rolling windows simultaneously introduces a multiple testing problem. In the literature, bootstrapping the maximum of all statistics from rolling windows is the most commonly used, if not the only, method to address this issue. This paper seeks to provide a simpler and faster alternative to bootstrap methods by adapting p-value combination techniques that are popular in genome-wide association studies to the context of mean tests in a time series rolling window analysis. Some p-value combination methods in genetics require knowledge of the correlation structure of test statistics, which can typically be obtained from external sources. However, such information is often unavailable for time series datasets. To address this challenge, we employ the autoregressive sieve approach, which allows for the computation of correlation structures based on estimated autoregressive coefficients. We present finite sample simulations to illustrate the performance of p-value combination methods in a rolling window setting and offer recommendations for practitioners and future researchers in this area.
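To make the two ingredients named in the abstract concrete, the following minimal sketch shows (i) how a correlation between overlapping rolling-window means could be derived from autoregressive coefficients and (ii) two standard p-value combination rules. It is an illustration under stated assumptions, not the authors' procedure: it assumes a stationary AR(1) process with a known coefficient `phi` (in practice an AR(p) model would be fitted and its estimated coefficients used), and the function names `ar1_autocov`, `window_mean_corr`, `bonferroni`, and `simes` are hypothetical.

```python
def ar1_autocov(phi, h, sigma2=1.0):
    """Autocovariance at lag h of a stationary AR(1) process (|phi| < 1)."""
    return sigma2 * phi ** abs(h) / (1.0 - phi ** 2)


def window_mean_corr(phi, width, shift):
    """Correlation between the sample means of two rolling windows of equal
    width whose starting points are `shift` observations apart."""
    def cov(d):
        # Covariance of two window means = average of all pairwise
        # autocovariances between the observations in the two windows.
        total = sum(
            ar1_autocov(phi, d + i - j)
            for i in range(width)
            for j in range(width)
        )
        return total / width ** 2

    return cov(shift) / cov(0)


def bonferroni(pvals):
    """Bonferroni combined p-value: m times the smallest p-value, capped at 1."""
    return min(1.0, len(pvals) * min(pvals))


def simes(pvals):
    """Simes combined p-value: min over k of m * p_(k) / k."""
    m = len(pvals)
    return min(m * p / (k + 1) for k, p in enumerate(sorted(pvals)))


if __name__ == "__main__":
    # Adjacent windows overlap heavily, so their means are strongly correlated.
    print(window_mean_corr(phi=0.5, width=20, shift=1))
    # Hypothetical p-values from three rolling windows.
    pvals = [0.01, 0.04, 0.03]
    print(bonferroni(pvals), simes(pvals))
```

The correlation function depends only on the autoregressive coefficient, the window width, and the distance between windows, which is why no external correlation source is needed once the autoregressive model has been estimated. Correlation-aware combination methods would take such values as input; Bonferroni and Simes are shown only because they need no correlation input at all.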
Acknowledgements
Rho was supported by NSF-CPS-1739422 and NIH-R15GM135806.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest. No datasets were generated or analyzed in this study beyond those created for the simulation. The authors thank the two anonymous referees and the Editor for their constructive comments, which significantly improved the quality of this paper.
About this article
Cite this article
Wang, S., Rho, Y. Multiple testing correction for mean tests in time series rolling window analysis with an application of GWAS methods. Stat Methods Appl (2025). https://doi.org/10.1007/s10260-025-00789-x
DOI: https://doi.org/10.1007/s10260-025-00789-x