
Scalable interpretable learning for multi-response error-in-variables regression

Author

Listed:
  • Wu, Jie
  • Zheng, Zemin
  • Li, Yang
  • Zhang, Yi

Abstract

Corrupted data sets containing noisy or missing observations are prevalent in various contemporary applications such as economics, finance and bioinformatics. Despite the recent methodological and algorithmic advances in high-dimensional multi-response regression, how to achieve scalable and interpretable estimation under contaminated covariates is unclear. In this paper, we develop a new methodology called convex conditioned sequential sparse learning (COSS) for error-in-variables multi-response regression under both additive measurement errors and random missing data. It combines the strengths of the recently developed sequential sparse factor regression and the nearest positive semi-definite matrix projection, thus enjoying stepwise convexity and scalability in large-scale association analyses. Comprehensive theoretical guarantees are provided and we demonstrate the effectiveness of the proposed methodology through numerical studies.
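To make the "nearest positive semi-definite matrix projection" step concrete, here is a minimal sketch in Python with NumPy. It is not the authors' COSS implementation. It assumes an additive error model W = X + A with a known error covariance (`sigma_a`), forms the usual bias-corrected Gram surrogate W'W/n - sigma_a, and projects it onto the positive semi-definite cone in the Frobenius norm by eigenvalue clipping. The abstract does not say which norm the paper uses, so the Frobenius choice, the function names, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def corrected_gram_additive(W, sigma_a):
    """Surrogate for X'X / n when W = X + A and Cov(A) = sigma_a is known.

    Assumption: rows of A are i.i.d., mean zero, independent of X.
    """
    n = W.shape[0]
    return W.T @ W / n - sigma_a


def project_psd(S, eps=0.0):
    """Frobenius-norm projection of a symmetric matrix onto {M : M >= eps * I}.

    Eigenvalues below eps are clipped up to eps; eigenvectors are kept.
    """
    S = (S + S.T) / 2.0                  # symmetrize against round-off
    vals, vecs = np.linalg.eigh(S)
    vals = np.clip(vals, eps, None)
    return (vecs * vals) @ vecs.T        # V diag(vals) V'


# Simulated high-dimensional example (p > n), purely illustrative.
rng = np.random.default_rng(0)
n, p = 50, 80
X = rng.standard_normal((n, p))
sigma_a = 0.25 * np.eye(p)               # hypothetical known error covariance
W = X + 0.5 * rng.standard_normal((n, p))

S = corrected_gram_additive(W, sigma_a)
S_psd = project_psd(S)

min_eig_before = np.linalg.eigvalsh(S).min()
min_eig_after = np.linalg.eigvalsh(S_psd).min()
```

Because p exceeds n here, W'W/n is rank deficient, so subtracting `sigma_a` leaves the surrogate indefinite and a least-squares objective built on it would be non-convex. After the projection the matrix is positive semi-definite, which is what restores convexity at each step of a sequential procedure of this kind.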

Suggested Citation

  • Wu, Jie & Zheng, Zemin & Li, Yang & Zhang, Yi, 2020. "Scalable interpretable learning for multi-response error-in-variables regression," Journal of Multivariate Analysis, Elsevier, vol. 179(C).
  • Handle: RePEc:eee:jmvana:v:179:y:2020:i:c:s0047259x20302256
    DOI: 10.1016/j.jmva.2020.104644

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X20302256
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jingxuan Luo & Lili Yue & Gaorong Li, 2023. "Overview of High-Dimensional Measurement Error Regression Models," Mathematics, MDPI, vol. 11(14), pages 1-22, July.
    2. Zheng, Zemin & Li, Yang & Yu, Chongxiu & Li, Gaorong, 2018. "Balanced estimation for high-dimensional measurement error models," Computational Statistics & Data Analysis, Elsevier, vol. 126(C), pages 78-91.
    3. Xia Chen & Liyue Mao, 2020. "Penalized empirical likelihood for partially linear errors-in-variables models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 104(4), pages 597-623, December.
    4. Zemin Zheng & Jie Zhang & Yang Li, 2022. "L0-Regularized Learning for High-Dimensional Additive Hazards Regression," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2762-2775, September.
    5. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    6. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    7. Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers CWP56/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    8. Lai, Peng & Wang, Qihua & Zhou, Xiao-Hua, 2014. "Variable selection and semiparametric efficient estimation for the heteroscedastic partially linear single-index model," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 241-256.
    9. Zhang, Jun & Feng, Zhenghui & Peng, Heng, 2018. "Estimation and hypothesis test for partial linear multiplicative models," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 87-103.
    10. He, Yong & Zhang, Liang & Ji, Jiadong & Zhang, Xinsheng, 2019. "Robust feature screening for elliptical copula regression model," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 568-582.
    11. Ruiqi Liu & Ben Boukai & Zuofeng Shang, 2019. "Statistical Inference on Partially Linear Panel Model under Unobserved Linearity," Papers 1911.08830, arXiv.org.
    12. Zhu, Xuening & Huang, Danyang & Pan, Rui & Wang, Hansheng, 2020. "Multivariate spatial autoregressive model for large scale social networks," Journal of Econometrics, Elsevier, vol. 215(2), pages 591-606.
    13. Sai Li & T. Tony Cai & Hongzhe Li, 2022. "Transfer learning for high‐dimensional linear regression: Prediction, estimation and minimax optimality," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(1), pages 149-173, February.
    14. Elhorst, J. Paul & Emili, Silvia, 2022. "A spatial econometric multivariate model of Okun's law," Regional Science and Urban Economics, Elsevier, vol. 93(C).
    15. Georgios Sermpinis & Serafeim Tsoukas & Ping Zhang, 2019. "What influences a bank's decision to go public?," International Journal of Finance & Economics, John Wiley & Sons, Ltd., vol. 24(4), pages 1464-1485, October.
    16. Jun Zhang & Zhenghui Feng & Peirong Xu & Hua Liang, 2017. "Generalized varying coefficient partially linear measurement errors models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(1), pages 97-120, February.
    17. Song, Yunquan & Liang, Xijun & Zhu, Yanji & Lin, Lu, 2021. "Robust variable selection with exponential squared loss for the spatial autoregressive model," Computational Statistics & Data Analysis, Elsevier, vol. 155(C).
    18. Cui, Jingyu & Yi, Grace Y., 2024. "Variable selection in multivariate regression models with measurement error in covariates," Journal of Multivariate Analysis, Elsevier, vol. 202(C).
    19. Sermpinis, Georgios & Tsoukas, Serafeim & Zhang, Ping, 2018. "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, Elsevier, vol. 48(C), pages 19-35.
    20. Zhao, Peixin & Xue, Liugen, 2010. "Variable selection for semiparametric varying coefficient partially linear errors-in-variables models," Journal of Multivariate Analysis, Elsevier, vol. 101(8), pages 1872-1883, September.
