Printed from https://ideas.repec.org/p/arx/papers/2502.09740.html

High-dimensional censored MIDAS logistic regression for corporate survival forecasting

Authors
  • Wei Miao
  • Jad Beyhum
  • Jonas Striaukas
  • Ingrid Van Keilegom

Abstract

This paper addresses the challenge of forecasting corporate distress, a problem marked by three key statistical hurdles: (i) right censoring, (ii) high-dimensional predictors, and (iii) mixed-frequency data. To overcome these complexities, we introduce a novel high-dimensional censored MIDAS (Mixed Data Sampling) logistic regression. Our approach handles censoring through inverse probability weighting and achieves accurate estimation with numerous mixed-frequency predictors by employing a sparse-group penalty. We establish finite-sample bounds for the estimation error, accounting for censoring, the MIDAS approximation error, and heavy tails. The superior performance of the method is demonstrated through Monte Carlo simulations. Finally, we present an extensive application of our methodology to predict the financial distress of Chinese-listed firms. Our novel procedure is implemented in the R package 'Survivalml'.
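The two ingredients named in the abstract, inverse probability weighting for right censoring and a sparse-group penalty, can be illustrated with a small NumPy sketch. It is not the paper's estimator and not the API of the 'Survivalml' package. The function names, the Kaplan-Meier estimate of the censoring distribution, the outcome coding (1 = event by the horizon), and the `alpha` mixing parameter are all assumptions made for illustration; the paper's exact weights and penalty may differ.

```python
import numpy as np

def censoring_km(times, events):
    """Kaplan-Meier estimate of the censoring survival function G(u) = P(C > u).

    Returns the sorted distinct censoring times and an array `surv` where
    surv[k] is G after the first k censoring times (surv[0] = 1).
    """
    cens = events == 0  # censoring plays the role of the "event" here
    uniq = np.unique(times[cens])
    at_risk = np.array([(times >= u).sum() for u in uniq], dtype=float)
    n_cens = np.array([(cens & (times == u)).sum() for u in uniq], dtype=float)
    surv = np.concatenate(([1.0], np.cumprod(1.0 - n_cens / at_risk)))
    return uniq, surv

def ipcw_weights(times, events, horizon):
    """Weights and binary outcome y = 1{event by horizon} (hypothetical helper).

    Units censored at or before the horizon get weight 0; the remaining units
    are up-weighted by the inverse probability of being uncensored.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    uniq, surv = censoring_km(times, events)
    event_by_h = (events == 1) & (times <= horizon)
    alive_after_h = times > horizon
    w = np.zeros_like(times)
    # Left limit G(T-) for observed events, G(horizon) for units still at risk.
    w[event_by_h] = 1.0 / surv[np.searchsorted(uniq, times[event_by_h], side="left")]
    w[alive_after_h] = 1.0 / surv[np.searchsorted(uniq, horizon, side="right")]
    return w, event_by_h.astype(float)

def weighted_logistic_loss(beta, X, y, w):
    """Weighted negative log-likelihood of a logistic model (no intercept)."""
    z = X @ beta
    return np.mean(w * (np.logaddexp(0.0, z) - y * z))

def sparse_group_penalty(beta, groups, alpha):
    """One common sparse-group form: alpha * L1 + (1 - alpha) * sum of group L2 norms."""
    beta = np.asarray(beta, dtype=float)
    l1 = np.abs(beta).sum()
    l2 = sum(np.linalg.norm(beta[g]) for g in groups)
    return alpha * l1 + (1.0 - alpha) * l2

# Toy data: six firms, follow-up times, and event indicators (0 = censored).
times = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
events = np.array([1, 0, 1, 0, 1, 1])
w, y = ipcw_weights(times, events, horizon=4.5)
```

In this toy sample the firm censored at time 3 gets weight zero, while every firm observed after that censoring time is up-weighted to compensate. A penalized fit would then minimize `weighted_logistic_loss(...) + lam * sparse_group_penalty(...)` over `beta`, with each group collecting the lag-polynomial coefficients of one predictor.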

Suggested Citation

  • Wei Miao & Jad Beyhum & Jonas Striaukas & Ingrid Van Keilegom, 2025. "High-dimensional censored MIDAS logistic regression for corporate survival forecasting," Papers 2502.09740, arXiv.org.
  • Handle: RePEc:arx:papers:2502.09740

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2502.09740
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tian, Shaonan & Yu, Yan & Guo, Hui, 2015. "Variable selection and corporate bankruptcy forecasts," Journal of Banking & Finance, Elsevier, vol. 52(C), pages 89-100.
    2. Zhou, Fanyin & Fu, Lijun & Li, Zhiyong & Xu, Jiawei, 2022. "The recurrence of financial distress: A survival analysis," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1100-1115.
    3. Goldmann, Leonie & Crook, Jonathan & Calabrese, Raffaella, 2024. "A new ordinal mixed-data sampling model with an application to corporate credit rating levels," European Journal of Operational Research, Elsevier, vol. 314(3), pages 1111-1126.
    4. Dong, Manh Cuong & Tian, Shaonan & Chen, Cathy W.S., 2018. "Predicting failure risk using financial ratios: Quantile hazard model approach," The North American Journal of Economics and Finance, Elsevier, vol. 44(C), pages 204-220.
    5. Giordani, Paolo & Jacobson, Tor & Schedvin, Erik von & Villani, Mattias, 2014. "Taking the Twists into Account: Predicting Firm Bankruptcy Risk with Splines of Financial Ratios," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 49(4), pages 1071-1099, August.
    6. Koresh Galil & Neta Gilat, 2019. "Predicting Default More Accurately: To Proxy or Not to Proxy for Default?," International Review of Finance, International Review of Finance Ltd., vol. 19(4), pages 731-758, December.
    7. Huang, Hsing-Hua & Lee, Han-Hsing, 2013. "Product market competition and credit risk," Journal of Banking & Finance, Elsevier, vol. 37(2), pages 324-340.
    8. Ruey-Ching Hwang & Huimin Chung & Jiun-Yi Ku, 2013. "Predicting Recurrent Financial Distresses with Autocorrelation Structure: An Empirical Analysis from an Emerging Market," Journal of Financial Services Research, Springer;Western Finance Association, vol. 43(3), pages 321-341, June.
    9. Giesecke, Kay & Longstaff, Francis A. & Schaefer, Stephen & Strebulaev, Ilya, 2011. "Corporate bond default risk: A 150-year perspective," Journal of Financial Economics, Elsevier, vol. 102(2), pages 233-250.
    10. Jiang, Cuixia & Xiong, Wei & Xu, Qifa & Liu, Yezheng, 2021. "Predicting default of listed companies in mainland China via U-MIDAS Logit model with group lasso penalty," Finance Research Letters, Elsevier, vol. 38(C).
    11. Lyandres, Evgeny & Zhdanov, Alexei, 2013. "Investment opportunities and bankruptcy prediction," Journal of Financial Markets, Elsevier, vol. 16(3), pages 439-476.
    12. Sigrist, Fabio & Leuenberger, Nicola, 2023. "Machine learning for corporate default risk: Multi-period prediction, frailty correlation, loan portfolios, and tail probabilities," European Journal of Operational Research, Elsevier, vol. 305(3), pages 1390-1406.
    13. Ilyes Abid & Farid Mkaouar & Olfa Kaabia, 2018. "Dynamic analysis of the forecasting bankruptcy under presence of unobserved heterogeneity," Annals of Operations Research, Springer, vol. 262(2), pages 241-256, March.
    14. Andrii Babii & Eric Ghysels & Jonas Striaukas, 2022. "Machine Learning Time Series Regressions With an Application to Nowcasting," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(3), pages 1094-1106, June.
    15. Wikil Kwak & Yong Shi & Gang Kou, 2012. "Bankruptcy prediction for Korean firms after the 1997 financial crisis: using a multiple criteria linear programming data mining approach," Review of Quantitative Finance and Accounting, Springer, vol. 38(4), pages 441-453, May.
    16. Wenlang Zhang & Gaofeng Han & Steven Chan, 2014. "How Strong are the Linkages between Real Estate and Other Sectors in China?," Working Papers 112014, Hong Kong Institute for Monetary Research.
    17. Anand Deo & Sandeep Juneja, 2021. "Credit Risk: Simple Closed-Form Approximate Maximum Likelihood Estimator," Operations Research, INFORMS, vol. 69(2), pages 361-379, March.
    18. David A. Hensher & Stewart Jones & William H. Greene, 2007. "An Error Component Logit Analysis of Corporate Bankruptcy and Insolvency Risk in Australia," The Economic Record, The Economic Society of Australia, vol. 83(260), pages 86-103, March.
    19. Zhang, Xuan & Ouyang, Ruolan & Liu, Ding & Xu, Liao, 2020. "Determinants of corporate default risk in China: The role of financial constraints," Economic Modelling, Elsevier, vol. 92(C), pages 87-98.
    20. Babii, Andrii & Ball, Ryan T. & Ghysels, Eric & Striaukas, Jonas, 2023. "Machine learning panel data regressions with heavy-tailed dependent data: Theory and application," Journal of Econometrics, Elsevier, vol. 237(2).
