
Densely connected sub-Gaussian linear structural equation model learning via ℓ1- and ℓ2-regularized regressions

Author

Listed:
  • Choi, Semin
  • Kim, Yesool
  • Park, Gunwoong

Abstract

This paper develops a new algorithm for learning densely connected sub-Gaussian linear structural equation models (SEMs) in high-dimensional settings, where the number of nodes grows with the sample size. The proposed algorithm consists of two main steps: (i) component-wise ordering estimation using ℓ2-regularized regression, and (ii) edge estimation using ℓ1-regularized regression. Hence, the proposed algorithm can recover a graph with large degree provided that the maximum indegree is small. It is also proven that a sample size of n = Ω(p) is sufficient for the proposed algorithm to recover a sub-Gaussian linear SEM provided that d = O(p log p), where p is the number of nodes and d is the maximum indegree. In addition, the computational complexity is polynomial, O(np² max(n, p)). Therefore, the proposed algorithm is statistically consistent and computationally feasible for learning a densely connected sub-Gaussian linear SEM with large maximum degree. Numerical experiments verify that the proposed algorithm is consistent and performs better than the state-of-the-art high-dimensional linear SEM learning algorithms HGSM, LISTEN, and TD in both sparse and dense graph settings. A real-data analysis also demonstrates that the proposed algorithm is well suited to estimating Seoul public bike usage patterns in 2019.
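For intuition, below is a minimal Python sketch of the two-step recipe the abstract describes: a greedy, component-wise ordering step based on ℓ2-regularized (ridge) residual variances, followed by ℓ1-regularized (lasso) regressions of each node on its predecessors to place edges. The selection criterion, tuning choices (RidgeCV/LassoCV), and the coefficient threshold are illustrative assumptions, not the authors' estimator or its theoretically justified thresholds.

```python
# Minimal sketch (not the authors' estimator): the paper's component-wise ordering
# criterion, regularization schedules, and thresholds are replaced here by simple
# plug-in choices (RidgeCV residual variances, LassoCV supports).
import numpy as np
from sklearn.linear_model import RidgeCV, LassoCV


def estimate_ordering_ridge(X, alphas=(0.01, 0.1, 1.0, 10.0)):
    """Step (i): greedily order nodes by the ridge-regression residual variance
    given the nodes already ordered (an equal-variance-style surrogate)."""
    p = X.shape[1]
    remaining, ordering = list(range(p)), []
    while remaining:
        best_j, best_var = None, np.inf
        for j in remaining:
            if not ordering:
                # No predecessors yet: fall back to the marginal variance.
                resid_var = X[:, j].var()
            else:
                ridge = RidgeCV(alphas=alphas).fit(X[:, ordering], X[:, j])
                resid_var = (X[:, j] - ridge.predict(X[:, ordering])).var()
            if resid_var < best_var:
                best_j, best_var = j, resid_var
        ordering.append(best_j)
        remaining.remove(best_j)
    return ordering


def estimate_edges_lasso(X, ordering, coef_threshold=0.05):
    """Step (ii): regress each node on its predecessors with the lasso and keep
    edges whose coefficients exceed an (arbitrary, illustrative) threshold."""
    p = X.shape[1]
    adj = np.zeros((p, p), dtype=int)  # adj[i, j] = 1 encodes an edge i -> j
    for k in range(1, p):
        j, preds = ordering[k], ordering[:k]
        lasso = LassoCV(cv=5).fit(X[:, preds], X[:, j])
        for i, coef in zip(preds, lasso.coef_):
            if abs(coef) > coef_threshold:
                adj[i, j] = 1
    return adj


# Toy usage: a 3-node chain 0 -> 1 -> 2 with sub-Gaussian (here Gaussian) noise.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + rng.normal(size=n)
x2 = -0.6 * x1 + rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
order = estimate_ordering_ridge(X)
print("estimated ordering:", order)
print("estimated adjacency (i -> j):\n", estimate_edges_lasso(X, order))
```

The sketch only mirrors the overall two-step structure; in the paper, the ordering step relies on ridge-based error-variance estimation and the edge step on ℓ1-regularized regression with guarantees under the stated sample-size and maximum-indegree conditions.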

Suggested Citation

  • Choi, Semin & Kim, Yesool & Park, Gunwoong, 2023. "Densely connected sub-Gaussian linear structural equation model learning via ℓ1- and ℓ2-regularized regressions," Computational Statistics & Data Analysis, Elsevier, vol. 181(C).
  • Handle: RePEc:eee:csdana:v:181:y:2023:i:c:s0167947323000026
    DOI: 10.1016/j.csda.2023.107691

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947323000026
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2023.107691?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Jianqing Fan & Shaojun Guo & Ning Hao, 2012. "Variance estimation using refitted cross‐validation in ultrahigh dimensional regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 74(1), pages 37-65, January.
    2. Ali Shojaie & George Michailidis, 2010. "Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs," Biometrika, Biometrika Trust, vol. 97(3), pages 519-538.
    3. X Liu & S Zheng & X Feng, 2020. "Estimation of error variance via ridge regression," Biometrika, Biometrika Trust, vol. 107(2), pages 481-488.
    4. Wenyu Chen & Mathias Drton & Y Samuel Wang, 2019. "On causal discovery with an equal-variance assumption," Biometrika, Biometrika Trust, vol. 106(4), pages 973-980.
    5. J. Peters & P. Bühlmann, 2014. "Identifiability of Gaussian structural equation models with equal error variances," Biometrika, Biometrika Trust, vol. 101(1), pages 219-228.
    6. Park, Gunwoong & Kim, Yesool, 2021. "Learning high-dimensional Gaussian linear structural equation models with heterogeneous error variances," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    7. Davide Altomare & Guido Consonni & Luca La Rocca, 2013. "Objective Bayesian Search of Gaussian Directed Acyclic Graphical Models for Ordered Variables with Non-Local Priors," Biometrics, The International Biometric Society, vol. 69(2), pages 478-487, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fangting Zhou & Kejun He & Yang Ni, 2023. "Individualized causal discovery with latent trajectory embedded Bayesian networks," Biometrics, The International Biometric Society, vol. 79(4), pages 3191-3202, December.
    2. Xiao Guo & Hai Zhang, 2020. "Sparse directed acyclic graphs incorporating the covariates," Statistical Papers, Springer, vol. 61(5), pages 2119-2148, October.
    3. Park, Gunwoong & Kim, Yesool, 2021. "Learning high-dimensional Gaussian linear structural equation models with heterogeneous error variances," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    4. Yang Ni & Francesco C. Stingo & Veerabhadran Baladandayuthapani, 2015. "Bayesian nonlinear model selection for gene regulatory networks," Biometrics, The International Biometric Society, vol. 71(3), pages 585-595, September.
    5. Nikolaos Petrakis & Stefano Peluso & Dimitris Fouskakis & Guido Consonni, 2020. "Objective methods for graphical structural learning," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 74(3), pages 420-438, August.
    6. Zhang, Hongmei & Huang, Xianzheng & Han, Shengtong & Rezwan, Faisal I. & Karmaus, Wilfried & Arshad, Hasan & Holloway, John W., 2021. "Gaussian Bayesian network comparisons with graph ordering unknown," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    7. Xin Wang & Lingchen Kong & Liqun Wang, 2022. "Estimation of Error Variance in Regularized Regression Models via Adaptive Lasso," Mathematics, MDPI, vol. 10(11), pages 1-19, June.
    8. Sayanti Guha Majumdar & Anil Rai & Dwijesh Chandra Mishra, 2023. "Estimation of Error Variance in Genomic Selection for Ultrahigh Dimensional Data," Agriculture, MDPI, vol. 13(4), pages 1-16, April.
    9. Zemin Zheng & Jie Zhang & Yang Li, 2022. "L0-Regularized Learning for High-Dimensional Additive Hazards Regression," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2762-2775, September.
    10. Tommaso Proietti, 2016. "On the Selection of Common Factors for Macroeconomic Forecasting," Advances in Econometrics, in: Dynamic Factor Models, volume 35, pages 593-628, Emerald Group Publishing Limited.
    11. Davide Altomare & Guido Consonni & Luca La Rocca, 2011. "Objective Bayesian Search of Gaussian DAG Models with Non-local Priors," Quaderni di Dipartimento 140, University of Pavia, Department of Economics and Quantitative Methods.
    12. Jingxin Zhao & Heng Peng & Tao Huang, 2018. "Variance estimation for semiparametric regression models by local averaging," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 27(2), pages 453-476, June.
    13. Jianqing Fan & Quefeng Li & Yuyan Wang, 2017. "Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(1), pages 247-265, January.
    14. Chen, Zhao & Cheng, Vivian Xinyi & Liu, Xu, 2024. "Reprint: Hypothesis testing on high dimensional quantile regression," Journal of Econometrics, Elsevier, vol. 239(2).
    15. Zhang, Tonglin & Lin, Ge, 2021. "Generalized k-means in GLMs with applications to the outbreak of COVID-19 in the United States," Computational Statistics & Data Analysis, Elsevier, vol. 159(C).
    16. D García Rasines & G A Young, 2023. "Splitting strategies for post-selection inference," Biometrika, Biometrika Trust, vol. 110(3), pages 597-614.
    17. Nilotpal Sanyal & Marco A. R. Ferreira, 2017. "Bayesian Wavelet Analysis Using Nonlocal Priors with an Application to fMRI Analysis," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 79(2), pages 361-388, November.
    18. Xu Wang & JinRong Wang & Michal Fečkan, 2020. "BP Neural Network Calculus in Economic Growth Modelling of the Group of Seven," Mathematics, MDPI, vol. 8(1), pages 1-11, January.
    19. Zhou, Jia & Li, Yang & Zheng, Zemin & Li, Daoji, 2022. "Reproducible learning in large-scale graphical models," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    20. Wang, Luheng & Chen, Zhao & Wang, Christina Dan & Li, Runze, 2020. "Ultrahigh dimensional precision matrix estimation via refitted cross validation," Journal of Econometrics, Elsevier, vol. 215(1), pages 118-130.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.