
High-dimensional integrative analysis with homogeneity and sparsity recovery

Authors

  • Yang, Xinfeng
  • Yan, Xiaodong
  • Huang, Jian

Abstract

This paper studies the integrative analysis of multiple units in the context of high-dimensional linear regression. We consider the case where a fraction of the covariates have different effects on the responses across units; for example, some coefficients are the same for all units, while others have grouping structures. We propose a least squares approach combined with two penalties: a difference penalty on the difference between any two units’ coefficients of the same covariate, which identifies the latent grouping structure, and a common sparsity penalty, which detects important covariates. Without requiring prior knowledge of the grouping structure of each variable across the data units or of the sparsity structure within the variables, the proposed doubly penalized procedure automatically identifies the covariates with heterogeneous effects and those with homogeneous effects, recovers the sparsity and the grouping structures of the heterogeneous covariates, and provides estimates of all regression coefficients simultaneously. We implement the procedure with the alternating direction method of multipliers (ADMM) algorithm, making efficient use of the storage and reading of the datasets, and demonstrate its convergence. We show that the proposed estimator enjoys the oracle property. Simulation studies demonstrate the good finite-sample performance of the new method, and a real data example is provided for illustration.
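
To make the structure of the criterion described in the abstract concrete, the following is a minimal sketch of a doubly penalized least squares objective: a squared-error loss per unit, a pairwise difference penalty across units for each covariate, and a sparsity penalty on the coefficients. The abstract does not state the penalty functions, so the minimax concave penalty (MCP), the elementwise form of the sparsity term, and all names (`mcp`, `objective`, `lam1`, `lam2`, `gamma`) are assumptions made for illustration, not the authors' implementation. The sketch only evaluates the objective; it does not perform the ADMM optimization.

```python
import numpy as np

def mcp(t, lam, gamma=3.0):
    """Minimax concave penalty at |t| (an assumed penalty choice)."""
    a = np.abs(t)
    return np.where(a <= gamma * lam,
                    lam * a - a**2 / (2 * gamma),
                    gamma * lam**2 / 2)

def objective(X_list, y_list, B, lam1, lam2, gamma=3.0):
    """Evaluate a doubly penalized least squares criterion.

    X_list, y_list: per-unit design matrices and responses (hypothetical layout).
    B: array of shape (M, p); row m holds the coefficients of unit m.
    lam1: tuning parameter of the pairwise difference penalty.
    lam2: tuning parameter of the sparsity penalty.
    """
    M, _ = B.shape
    # Least squares loss summed over the M units.
    loss = sum(0.5 * np.sum((y - X @ b) ** 2)
               for X, y, b in zip(X_list, y_list, B))
    # Difference penalty: every pair of units (m < m'), for each covariate.
    iu, ju = np.triu_indices(M, k=1)
    fusion = mcp(B[iu] - B[ju], lam1, gamma).sum()
    # Sparsity penalty, applied elementwise here (an assumption).
    sparsity = mcp(B, lam2, gamma).sum()
    return float(loss + fusion + sparsity)
```

In this sketch, a covariate whose coefficients are equal across all units contributes nothing to the difference term, and a covariate whose coefficients are all zero contributes nothing to either penalty, which is the mechanism by which homogeneity and sparsity can be encouraged at the same time.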

Suggested Citation

  • Yang, Xinfeng & Yan, Xiaodong & Huang, Jian, 2019. "High-dimensional integrative analysis with homogeneity and sparsity recovery," Journal of Multivariate Analysis, Elsevier, vol. 174(C).
  • Handle: RePEc:eee:jmvana:v:174:y:2019:i:c:s0047259x18305165
    DOI: 10.1016/j.jmva.2019.06.007

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X18305165
    Download Restriction: Full text for ScienceDirect subscribers only

    Citations

    Cited by:

    1. Zhang, Xiaochen & Zhang, Qingzhao & Ma, Shuangge & Fang, Kuangnan, 2022. "Subgroup analysis for high-dimensional functional regression," Journal of Multivariate Analysis, Elsevier, vol. 192(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wang, Wuyi & Su, Liangjun, 2021. "Identifying latent group structures in nonlinear panels," Journal of Econometrics, Elsevier, vol. 220(2), pages 272-295.
    2. Zhang, Yingying & Wang, Huixia Judy & Zhu, Zhongyi, 2019. "Quantile-regression-based clustering for panel data," Journal of Econometrics, Elsevier, vol. 213(1), pages 54-67.
    3. Okui, Ryo & Wang, Wendun, 2021. "Heterogeneous structural breaks in panel data models," Journal of Econometrics, Elsevier, vol. 220(2), pages 447-473.
    4. Zhang, Xiaochen & Zhang, Qingzhao & Ma, Shuangge & Fang, Kuangnan, 2022. "Subgroup analysis for high-dimensional functional regression," Journal of Multivariate Analysis, Elsevier, vol. 192(C).
    5. Baihua He & Tingyan Zhong & Jian Huang & Yanyan Liu & Qingzhao Zhang & Shuangge Ma, 2021. "Histopathological imaging‐based cancer heterogeneity analysis via penalized fusion with model averaging," Biometrics, The International Biometric Society, vol. 77(4), pages 1397-1408, December.
    6. Mehrabani, Ali, 2023. "Estimation and identification of latent group structures in panel data," Journal of Econometrics, Elsevier, vol. 235(2), pages 1464-1482.
    7. Nibbering, D. & Paap, R., 2019. "Panel Forecasting with Asymmetric Grouping," Econometric Institute Research Papers EI-2019-30, Erasmus University Rotterdam, Erasmus School of Economics (ESE), Econometric Institute.
    8. Lixiong Yang, 2023. "Variable selection in threshold model with a covariate-dependent threshold," Empirical Economics, Springer, vol. 65(1), pages 189-202, July.
    9. Custodio João, Igor & Lucas, André & Schaumburg, Julia & Schwaab, Bernd, 2023. "Dynamic clustering of multivariate panel data," Journal of Econometrics, Elsevier, vol. 237(2).
    10. Miao, Ke & Su, Liangjun & Wang, Wendun, 2020. "Panel threshold regressions with latent group structures," Journal of Econometrics, Elsevier, vol. 214(2), pages 451-481.
    11. Simone Bertoli & Jesus Fernández-Huertas Moraga, 2012. "Visa Policies, Networks and the Cliff at the Border," Working Papers 2012-12, FEDEA.
    12. Denis Chetverikov & Elena Manresa, 2022. "Spectral and post-spectral estimators for grouped panel data models," Papers 2212.13324, arXiv.org, revised Dec 2022.
    13. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    14. Shao, Lihui & Wu, Jiaqi & Zhang, Weiping & Chen, Yu, 2024. "Integrated subgroup identification from multi-source data," Computational Statistics & Data Analysis, Elsevier, vol. 193(C).
    15. Thibaut Lamadon & Elena Manresa & Stephane Bonhomme, 2016. "Discretizing Unobserved Heterogeneity," 2016 Meeting Papers 1536, Society for Economic Dynamics.
    16. Wang, Wei & Xiao, Zhijie & Ren, Yanyan & Yan, Xiaodong, 2023. "A bi-integrative analysis of two-dimensional heterogeneous panel data models," Economics Letters, Elsevier, vol. 230(C).
    17. Claudia Pigini & Alessandro Pionati & Francesco Valentini, 2023. "Specification testing with grouped fixed effects," Papers 2310.01950, arXiv.org.
    18. Liu, Ruiqi & Shang, Zuofeng & Zhang, Yonghui & Zhou, Qiankun, 2020. "Identification and estimation in panel models with overspecified number of groups," Journal of Econometrics, Elsevier, vol. 215(2), pages 574-590.
    19. Shi, Zhentao & Huang, Jingyi, 2023. "Forward-selected panel data approach for program evaluation," Journal of Econometrics, Elsevier, vol. 234(2), pages 512-535.
    20. Jorge A. Rivero, 2023. "Unobserved Grouped Heteroskedasticity and Fixed Effects," Papers 2310.14068, arXiv.org, revised Oct 2023.
