Testing conditional mean through regression model sequence using Yanai’s generalized coefficient of determination

Author

Listed:
  • Ueki, Masao

Abstract

In high-dimensional data analysis, such as in genomics, univariate regression is often repeated for each variable to screen for useful variables. However, signals that are detectable only jointly with other variables may be overlooked. Because the saturated model using all variables may not work in high-dimensional data, group-wise analysis of a group pre-defined from prior knowledge is often used instead, but its power is limited if that knowledge is insufficient. A flexible test procedure for the conditional mean is therefore proposed, applicable to a variety of model sequences that bridge low- and high-complexity models, as in penalized regression. The test is based on the model that maximizes a generalization of Yanai’s generalized coefficient of determination, exploiting the tendency for the dimensionality to be large under the null hypothesis. The test does not require complicated computation of the null distribution, which enables large-scale testing applications. Numerical studies demonstrated that the proposed test, applied to the lasso and elastic net, had high power regardless of the simulation scenario. Applied to a group-wise analysis of real genome-wide association study data from the Alzheimer’s Disease Neuroimaging Initiative, the proposal gave a stronger association signal than existing methods.
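
The motivating point of the abstract, that repeated univariate regression can overlook signals detectable only jointly, can be illustrated with a small simulation. The sketch below is not the proposed test and does not use Yanai's coefficient; it only compares the R^2 of each univariate regression with the R^2 of a joint least-squares fit. The data-generating setup (two highly correlated predictors with opposite-sign effects), the sample size, and the helper names `marginal_r2` and `joint_r2` are assumptions made for illustration.

```python
import numpy as np


def marginal_r2(X, y):
    """R^2 of each univariate regression of y on one column of X.

    For a simple regression with intercept this equals the squared
    Pearson correlation between y and that column.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = (Xc.T @ yc) ** 2
    den = (Xc ** 2).sum(axis=0) * (yc ** 2).sum()
    return num / den


def joint_r2(X, y):
    """R^2 of the least-squares fit of y on all columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    yc = y - y.mean()
    return 1.0 - (resid @ resid) / (yc @ yc)


# Hypothetical data: x1 and x2 share a common component z, and y depends
# on their difference, so each predictor alone explains very little of y.
rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal(n)
x1 = z + 0.1 * rng.standard_normal(n)
x2 = z + 0.1 * rng.standard_normal(n)
X = np.column_stack([x1, x2])
y = x1 - x2 + 0.1 * rng.standard_normal(n)

r2_marginal = marginal_r2(X, y)
r2_joint = joint_r2(X, y)

print("univariate R^2 per variable:", np.round(r2_marginal, 3))
print("joint R^2 using both variables:", round(float(r2_joint), 3))
```

Under these assumptions each univariate R^2 is close to zero while the joint R^2 is large, so a screen based only on univariate regressions would tend to discard both variables.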

Suggested Citation

  • Ueki, Masao, 2021. "Testing conditional mean through regression model sequence using Yanai’s generalized coefficient of determination," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
  • Handle: RePEc:eee:csdana:v:158:y:2021:i:c:s0167947321000025
    DOI: 10.1016/j.csda.2021.107168

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947321000025
    Download Restriction: Full text for ScienceDirect subscribers only.

    Citations

    Cited by:

    1. Tayfun Uyanık & Yunus Yalman & Özcan Kalenderli & Yasin Arslanoğlu & Yacine Terriche & Chun-Lien Su & Josep M. Guerrero, 2022. "Data-Driven Approach for Estimating Power and Fuel Consumption of Ship: A Case of Container Vessel," Mathematics, MDPI, vol. 10(22), pages 1-21, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Masao Ueki, 2024. "Data-Adaptive Multivariate Test for Genomic Studies Using Fused Lasso," Mathematics, MDPI, vol. 12(10), pages 1-16, May.
    2. Yinjun Chen & Hao Ming & Hu Yang, 2024. "Efficient variable selection for high-dimensional multiplicative models: a novel LPRE-based approach," Statistical Papers, Springer, vol. 65(6), pages 3713-3737, August.
    3. Sacha Epskamp & Mijke Rhemtulla & Denny Borsboom, 2017. "Generalized Network Psychometrics: Combining Network and Latent Variable Models," Psychometrika, Springer;The Psychometric Society, vol. 82(4), pages 904-927, December.
    4. Qingliang Fan & Yaqian Wu, 2020. "Endogenous Treatment Effect Estimation with some Invalid and Irrelevant Instruments," Papers 2006.14998, arXiv.org.
    5. David Degras, 2021. "Sparse group fused lasso for model segmentation: a hybrid approach," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(3), pages 625-671, September.
    6. Yunquan Song & Zitong Li & Minglu Fang, 2022. "Robust Variable Selection Based on Penalized Composite Quantile Regression for High-Dimensional Single-Index Models," Mathematics, MDPI, vol. 10(12), pages 1-17, June.
    7. Anshul Verma & Orazio Angelini & Tiziana Di Matteo, 2019. "A new set of cluster driven composite development indicators," Papers 1911.11226, arXiv.org, revised Mar 2020.
    8. Dai, Linlin & Chen, Kani & Sun, Zhihua & Liu, Zhenqiu & Li, Gang, 2018. "Broken adaptive ridge regression and its asymptotic properties," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 334-351.
    9. Ruggieri, Eric & Lawrence, Charles E., 2012. "On efficient calculations for Bayesian variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1319-1332.
    10. Paweł Teisseyre & Robert A. Kłopotek & Jan Mielniczuk, 2016. "Random Subspace Method for high-dimensional regression with the R package regRSM," Computational Statistics, Springer, vol. 31(3), pages 943-972, September.
    11. Daniel Felix Ahelegbey & Monica Billio & Roberto Casarin, 2016. "Sparse Graphical Vector Autoregression: A Bayesian Approach," Annals of Economics and Statistics, GENES, issue 123-124, pages 333-361.
    12. Alain Hecq & Luca Margaritella & Stephan Smeekes, 2023. "Granger Causality Testing in High-Dimensional VARs: A Post-Double-Selection Procedure," Journal of Financial Econometrics, Oxford University Press, vol. 21(3), pages 915-958.
    13. Davood Hajinezhad & Qingjiang Shi, 2018. "Alternating direction method of multipliers for a class of nonconvex bilinear optimization: convergence analysis and applications," Journal of Global Optimization, Springer, vol. 70(1), pages 261-288, January.
    14. Kawano, Shuichi & Fujisawa, Hironori & Takada, Toyoyuki & Shiroishi, Toshihiko, 2018. "Sparse principal component regression for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 124(C), pages 180-196.
    15. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2020. "lassopack: Model selection and prediction with regularized regression in Stata," Stata Journal, StataCorp LP, vol. 20(1), pages 176-235, March.
    16. She, Yiyuan, 2012. "An iterative algorithm for fitting nonconvex penalized generalized linear models with grouped predictors," Computational Statistics & Data Analysis, Elsevier, vol. 56(10), pages 2976-2990.
    17. Shiqiang Jin & Gyuhyeong Goh, 2021. "Bayesian selection of best subsets via hybrid search," Computational Statistics, Springer, vol. 36(3), pages 1991-2007, September.
    18. Sebri, Maamar & Dachraoui, Hajer, 2021. "Natural resources and income inequality: A meta-analytic review," Resources Policy, Elsevier, vol. 74(C).
    19. Zhihua Sun & Yi Liu & Kani Chen & Gang Li, 2022. "Broken adaptive ridge regression for right-censored survival data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 74(1), pages 69-91, February.
    20. Ardia, David & Bluteau, Keven & Boudt, Kris, 2019. "Questioning the news about economic growth: Sparse forecasting using thousands of news-based sentiment values," International Journal of Forecasting, Elsevier, vol. 35(4), pages 1370-1386.
