Printed from https://ideas.repec.org/a/eee/csdana/v181y2023ics0167947323000026.html

Densely connected sub-Gaussian linear structural equation model learning via ℓ1- and ℓ2-regularized regressions

Author

Listed:
  • Choi, Semin
  • Kim, Yesool
  • Park, Gunwoong

Abstract

This paper develops a new algorithm for learning densely connected sub-Gaussian linear structural equation models (SEMs) in high-dimensional settings, where the number of nodes grows with the sample size. The proposed algorithm consists of two main steps: (i) estimating the component-wise ordering via ℓ2-regularized regression, and (ii) estimating the presence of each edge via ℓ1-regularized regression. Hence, the proposed algorithm can recover a graph with large maximum degree under a comparatively mild constraint on the maximum indegree. It is also proven that a sample size of n = Ω(p) is sufficient for the proposed algorithm to recover a sub-Gaussian linear SEM provided that d = O(p / log p), where p is the number of nodes and d is the maximum indegree. In addition, the computational complexity is polynomial, O(np² max(n, p)). Therefore, the proposed algorithm is statistically consistent and computationally feasible for learning a densely connected sub-Gaussian linear SEM with large maximum degree. Numerical experiments verify that the proposed algorithm is consistent and performs better than the state-of-the-art high-dimensional linear SEM learning algorithms HGSM, LISTEN, and TD in both sparse and dense graph settings. A real-data analysis also demonstrates that the proposed algorithm is well suited to estimating Seoul public bike usage patterns in 2019.
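
The abstract only outlines the two-step recipe, so the following Python sketch is merely one way such an ordering-then-edges procedure could look; it is not the authors' implementation. The function names (estimate_ordering, estimate_edges), the penalty levels lam_ridge and lam_lasso, the hard threshold tol, and the smallest-residual-variance ordering criterion (which presumes roughly comparable error variances, in the spirit of the equal-variance references listed below) are all illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import Ridge, Lasso

    def estimate_ordering(X, lam_ridge=1.0):
        """Step (i): greedily build a topological ordering of the nodes.

        At each step, every remaining variable is regressed on the variables
        already placed in the ordering via ridge (l2-regularized) regression,
        and the variable with the smallest residual variance is appended.
        This criterion presumes roughly comparable error variances and is an
        illustrative stand-in for the paper's ordering estimator.
        """
        n, p = X.shape
        ordering, remaining = [], list(range(p))
        while remaining:
            resid_var = {}
            for j in remaining:
                if ordering:
                    fit = Ridge(alpha=lam_ridge).fit(X[:, ordering], X[:, j])
                    resid = X[:, j] - fit.predict(X[:, ordering])
                else:
                    resid = X[:, j] - X[:, j].mean()
                resid_var[j] = resid.var()
            nxt = min(resid_var, key=resid_var.get)   # smallest residual variance
            ordering.append(nxt)
            remaining.remove(nxt)
        return ordering

    def estimate_edges(X, ordering, lam_lasso=0.1, tol=1e-3):
        """Step (ii): given the ordering, select each node's parents by lasso
        (l1-regularized) regression on its predecessors; coefficients whose
        magnitude exceeds the threshold tol are kept as edges."""
        p = X.shape[1]
        adj = np.zeros((p, p), dtype=bool)    # adj[i, j] = True means edge i -> j
        for k, j in enumerate(ordering):
            preds = ordering[:k]              # nodes preceding j in the ordering
            if not preds:
                continue
            fit = Lasso(alpha=lam_lasso).fit(X[:, preds], X[:, j])
            for i, coef in zip(preds, fit.coef_):
                if abs(coef) > tol:
                    adj[i, j] = True
        return adj

On a data matrix X of shape (n, p), estimate_edges(X, estimate_ordering(X)) would return a boolean adjacency matrix whose (i, j) entries mark estimated edges i → j; in practice the penalty levels would be chosen by cross-validation or a theory-driven rule rather than the defaults assumed here.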

Suggested Citation

  • Choi, Semin & Kim, Yesool & Park, Gunwoong, 2023. "Densely connected sub-Gaussian linear structural equation model learning via ℓ1- and ℓ2-regularized regressions," Computational Statistics & Data Analysis, Elsevier, vol. 181(C).
  • Handle: RePEc:eee:csdana:v:181:y:2023:i:c:s0167947323000026
    DOI: 10.1016/j.csda.2023.107691

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947323000026
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2023.107691?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item.

    As access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Jianqing Fan & Shaojun Guo & Ning Hao, 2012. "Variance estimation using refitted cross‐validation in ultrahigh dimensional regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 74(1), pages 37-65, January.
    2. Ali Shojaie & George Michailidis, 2010. "Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs," Biometrika, Biometrika Trust, vol. 97(3), pages 519-538.
    3. X Liu & S Zheng & X Feng, 2020. "Estimation of error variance via ridge regression," Biometrika, Biometrika Trust, vol. 107(2), pages 481-488.
    4. Wenyu Chen & Mathias Drton & Y Samuel Wang, 2019. "On causal discovery with an equal-variance assumption," Biometrika, Biometrika Trust, vol. 106(4), pages 973-980.
    5. J. Peters & P. Bühlmann, 2014. "Identifiability of Gaussian structural equation models with equal error variances," Biometrika, Biometrika Trust, vol. 101(1), pages 219-228.
    6. Park, Gunwoong & Kim, Yesool, 2021. "Learning high-dimensional Gaussian linear structural equation models with heterogeneous error variances," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    7. Davide Altomare & Guido Consonni & Luca La Rocca, 2013. "Objective Bayesian Search of Gaussian Directed Acyclic Graphical Models for Ordered Variables with Non-Local Priors," Biometrics, The International Biometric Society, vol. 69(2), pages 478-487, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fangting Zhou & Kejun He & Yang Ni, 2023. "Individualized causal discovery with latent trajectory embedded Bayesian networks," Biometrics, The International Biometric Society, vol. 79(4), pages 3191-3202, December.
    2. Yang Ni & Francesco C. Stingo & Veerabhadran Baladandayuthapani, 2015. "Bayesian nonlinear model selection for gene regulatory networks," Biometrics, The International Biometric Society, vol. 71(3), pages 585-595, September.
    3. Nikolaos Petrakis & Stefano Peluso & Dimitris Fouskakis & Guido Consonni, 2020. "Objective methods for graphical structural learning," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 74(3), pages 420-438, August.
    4. Zhang, Hongmei & Huang, Xianzheng & Han, Shengtong & Rezwan, Faisal I. & Karmaus, Wilfried & Arshad, Hasan & Holloway, John W., 2021. "Gaussian Bayesian network comparisons with graph ordering unknown," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    5. Xin Wang & Lingchen Kong & Liqun Wang, 2022. "Estimation of Error Variance in Regularized Regression Models via Adaptive Lasso," Mathematics, MDPI, vol. 10(11), pages 1-19, June.
    6. Sayanti Guha Majumdar & Anil Rai & Dwijesh Chandra Mishra, 2023. "Estimation of Error Variance in Genomic Selection for Ultrahigh Dimensional Data," Agriculture, MDPI, vol. 13(4), pages 1-16, April.
    7. Xiao Guo & Hai Zhang, 2020. "Sparse directed acyclic graphs incorporating the covariates," Statistical Papers, Springer, vol. 61(5), pages 2119-2148, October.
    8. Park, Gunwoong & Kim, Yesool, 2021. "Learning high-dimensional Gaussian linear structural equation models with heterogeneous error variances," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    9. Zemin Zheng & Jie Zhang & Yang Li, 2022. "L 0 -Regularized Learning for High-Dimensional Additive Hazards Regression," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2762-2775, September.
    10. Jingxin Zhao & Heng Peng & Tao Huang, 2018. "Variance estimation for semiparametric regression models by local averaging," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 27(2), pages 453-476, June.
    11. Xu Wang & JinRong Wang & Michal Fečkan, 2020. "BP Neural Network Calculus in Economic Growth Modelling of the Group of Seven," Mathematics, MDPI, vol. 8(1), pages 1-11, January.
    12. Zhou, Jia & Li, Yang & Zheng, Zemin & Li, Daoji, 2022. "Reproducible learning in large-scale graphical models," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    13. Wang, Luheng & Chen, Zhao & Wang, Christina Dan & Li, Runze, 2020. "Ultrahigh dimensional precision matrix estimation via refitted cross validation," Journal of Econometrics, Elsevier, vol. 215(1), pages 118-130.
    14. Federico Castelletti & Guido Consonni, 2021. "Bayesian inference of causal effects from observational data in Gaussian graphical models," Biometrics, The International Biometric Society, vol. 77(1), pages 136-149, March.
    15. Fangting Zhou & Kejun He & Kunbo Wang & Yanxun Xu & Yang Ni, 2023. "Functional Bayesian networks for discovering causality from multivariate functional data," Biometrics, The International Biometric Society, vol. 79(4), pages 3279-3293, December.
    16. Guido Consonni & Roberta Paroli, 2017. "Objective Bayesian Comparison of Constrained Analysis of Variance Models," Psychometrika, Springer;The Psychometric Society, vol. 82(3), pages 589-609, September.
    17. Hu, Jianhua & Liu, Xiaoqian & Liu, Xu & Xia, Ningning, 2022. "Some aspects of response variable selection and estimation in multivariate linear regression," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    18. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    19. Ma, Yingying & Guo, Shaojun & Wang, Hansheng, 2023. "Sparse spatio-temporal autoregressions by profiling and bagging," Journal of Econometrics, Elsevier, vol. 232(1), pages 132-147.
    20. Lucas Janson & Rina Foygel Barber & Emmanuel Candès, 2017. "EigenPrism: inference for high dimensional signal-to-noise ratios," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(4), pages 1037-1065, September.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:csdana:v:181:y:2023:i:c:s0167947323000026. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/csda.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.