Convex and Nonconvex Risk-Based Linear Regression at Scale

Author

Can Wu
(School of Mathematical Sciences, South China Normal University, Guangzhou, 510631, China; Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Hong Kong)
Ying Cui
(Department of Industrial and Systems Engineering, University of Minnesota, Minneapolis, Minnesota 55455)
Donghui Li
(School of Mathematical Sciences, South China Normal University, Guangzhou, 510631, China)
Defeng Sun
(Department of Applied Mathematics, The Hong Kong Polytechnic University, Hung Hom, Hong Kong)

Abstract

The value at risk (VaR) and the conditional value at risk (CVaR) are two popular risk measures to hedge against the uncertainty of data. In this paper, we provide a computational toolbox for solving high-dimensional sparse linear regression problems under either VaR or CVaR measures, the former being nonconvex and the latter convex. Unlike the empirical risk (neutral) minimization models in which the overall losses are decomposable across data, the aforementioned risk-sensitive models have nonseparable objective functions so that typical first-order algorithms are not easy to scale. We address this scaling issue by adopting a semismooth Newton-based proximal augmented Lagrangian method for the convex CVaR linear regression problem. The matrix structures of the Newton systems are carefully explored to reduce the computational cost per iteration. The method is further embedded in a majorization–minimization algorithm as a subroutine to tackle the nonconvex VaR-based regression problem. We also discuss an adaptive sieving strategy to iteratively guess and adjust the effective problem dimension, which is particularly useful when a solution path associated with a sequence of tuning parameters is needed. Extensive numerical experiments on both synthetic and real data demonstrate the effectiveness of our proposed methods. In particular, they are about 53 times faster than the commercial package Gurobi for the CVaR-based sparse linear regression with 4,265,669 features and 16,087 observations.
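
As a rough illustration of why the CVaR objective is nonseparable across data, the sketch below evaluates the empirical CVaR of a set of losses through the standard Rockafellar-Uryasev minimization formula. It is not the authors' toolbox or their algorithm. Taking the loss to be the absolute residual, along with the function names, the toy data, and the all-zero coefficient vector, is an assumption made only for this example.

```python
def cvar_rockafellar_uryasev(losses, alpha):
    """Empirical CVaR at level alpha via
    min_t { t + sum_i max(l_i - t, 0) / ((1 - alpha) * n) }.
    The objective is convex and piecewise linear in t, so a minimizer
    can be found among the sample losses themselves."""
    n = len(losses)
    scale = 1.0 / ((1.0 - alpha) * n)

    def objective(t):
        return t + scale * sum(max(loss - t, 0.0) for loss in losses)

    return min(objective(t) for t in losses)


def absolute_residuals(X, y, beta):
    """Per-observation loss |y_i - x_i^T beta| (an assumed loss choice)."""
    return [abs(y_i - sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta)))
            for x_i, y_i in zip(X, y)]


# Hypothetical toy data: 4 observations, 2 features.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [1.0, 2.0, 3.0, 4.0]
beta = [0.0, 0.0]  # all-zero coefficients, so the losses equal |y_i|

losses = absolute_residuals(X, y, beta)
cvar_half = cvar_rockafellar_uryasev(losses, 0.5)

# When (1 - alpha) * n is an integer k, CVaR is the mean of the k largest losses.
tail_mean = sum(sorted(losses)[-2:]) / 2
print(cvar_half, tail_mean)  # 3.5 3.5
```

Which observations fall in the tail depends on all of the residuals at once, so the objective cannot be written as a sum of independent per-sample terms in the coefficients alone. This coupling is the scaling obstacle for first-order methods that the abstract describes.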

Suggested Citation

Can Wu & Ying Cui & Donghui Li & Defeng Sun, 2023. "Convex and Nonconvex Risk-Based Linear Regression at Scale," INFORMS Journal on Computing, INFORMS, vol. 35(4), pages 797-816, July.

Handle: RePEc:inm:orijoc:v:35:y:2023:i:4:p:797-816
DOI: 10.1287/ijoc.2023.1282

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Chen, Huangyue & Kong, Lingchen & Shang, Pan & Pan, Shanshan, 2020. "Safe feature screening rules for the regularized Huber regression," Applied Mathematics and Computation, Elsevier, vol. 386(C).
Guang Cheng & Hao Zhang & Zuofeng Shang, 2015. "Sparse and efficient estimation for partial spline models with increasing dimension," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 67(1), pages 93-127, February.
Hu Yang & Ning Li & Jing Yang, 2020. "A robust and efficient estimation and variable selection method for partially linear models with large-dimensional covariates," Statistical Papers, Springer, vol. 61(5), pages 1911-1937, October.
Junlong Zhao & Chao Liu & Lu Niu & Chenlei Leng, 2019. "Multiple influential point detection in high dimensional regression spaces," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 385-408, April.
Gabriel E Hoffman & Benjamin A Logsdon & Jason G Mezey, 2013. "PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data," PLOS Computational Biology, Public Library of Science, vol. 9(6), pages 1-19, June.
Kean Ming Tan & Lan Wang & Wen‐Xin Zhou, 2022. "High‐dimensional quantile regression: Convolution smoothing and concave regularization," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(1), pages 205-233, February.
Weiyan Mu & Shifeng Xiong, 2014. "Some notes on robust sure independence screening," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(10), pages 2092-2102, October.
Zeng, Yaohui & Yang, Tianbao & Breheny, Patrick, 2021. "Hybrid safe–strong rules for efficient optimization in lasso-type problems," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
Aneiros, Germán & Novo, Silvia & Vieu, Philippe, 2022. "Variable selection in functional regression models: A review," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
N. Neykov & P. Filzmoser & P. Neytchev, 2014. "Ultrahigh dimensional variable selection through the penalized maximum trimmed likelihood estimator," Statistical Papers, Springer, vol. 55(1), pages 187-207, February.
Qiang Li & Liming Wang, 2020. "Robust change point detection method via adaptive LAD-LASSO," Statistical Papers, Springer, vol. 61(1), pages 109-121, February.
Muhammad Amin & Lixin Song & Milton Abdul Thorlie & Xiaoguang Wang, 2015. "SCAD-penalized quantile regression for high-dimensional data analysis and variable selection," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 69(3), pages 212-235, August.
Yi Chu & Lu Lin, 2020. "Conditional SIRS for nonparametric and semiparametric models by marginal empirical likelihood," Statistical Papers, Springer, vol. 61(4), pages 1589-1606, August.
Guo, Yi & Berman, Mark & Gao, Junbin, 2014. "Group subset selection for linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 75(C), pages 39-52.
Nahapetyan Yervand, 2019. "The benefits of the Velvet Revolution in Armenia: Estimation of the short-term economic gains using deep neural networks," Central European Economic Journal, Sciendo, vol. 6(53), pages 286-303, January.
Shuichi Kawano, 2014. "Selection of tuning parameters in bridge regression models via Bayesian information criterion," Statistical Papers, Springer, vol. 55(4), pages 1207-1223, November.
Yang, Xuzhi & Wang, Tengyao, 2024. "Multiple-output composite quantile regression through an optimal transport lens," LSE Research Online Documents on Economics 125589, London School of Economics and Political Science, LSE Library.
Jean-Pierre Crouzeix & Abdelhak Hassouni & Eladio Ocaña, 2023. "A Short Note on the Twice Differentiability of the Marginal Function of a Convex Function," Journal of Optimization Theory and Applications, Springer, vol. 198(2), pages 857-867, August.
Sauvenier, Mathieu & Van Bellegem, Sébastien, 2023. "Direction Identification and Minimax Estimation by Generalized Eigenvalue Problem in High Dimensional Sparse Regression," LIDAM Discussion Papers CORE 2023005, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
Zhu Wang, 2022. "MM for penalized estimation," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 54-75, March.

More about this item

Keywords

risk measures; (conditional) value-at-risk; sparsity; semismooth Newton; augmented Lagrangian; nonconvexity;

