
Efficient methods for estimating constrained parameters with applications to regularized (lasso) logistic regression

Authors
  • Tian, Guo-Liang
  • Tang, Man-Lai
  • Fang, Hong-Bin
  • Tan, Ming

Abstract

Fitting logistic regression models is challenging when their parameters are restricted. In this article, we first develop a quadratic lower-bound (QLB) algorithm for optimization with box or linear inequality constraints and derive the fastest QLB algorithm corresponding to the smallest global majorization matrix. The proposed QLB algorithm is particularly suited to problems to which the EM-type algorithms are not applicable (e.g., logistic, multinomial logistic, and Cox's proportional hazards models) while it retains the same EM ascent property and thus assures the monotonic convergence. Secondly, we generalize the QLB algorithm to penalized problems in which the penalty functions may not be totally differentiable. The proposed method thus provides an alternative algorithm for estimation in lasso logistic regression, where the convergence of the existing lasso algorithm is not generally ensured. Finally, by relaxing the ascent requirement, convergence speed can be further accelerated. We introduce a pseudo-Newton method that retains the simplicity of the QLB algorithm and the fast convergence of the Newton method. Theoretical justification and numerical examples show that the pseudo-Newton method is up to 71 (in terms of CPU time) or 107 (in terms of number of iterations) times faster than the fastest QLB algorithm and thus makes bootstrap variance estimation feasible. Simulations and comparisons are performed and three real examples (Down syndrome data, kyphosis data, and colon microarray data) are analyzed to illustrate the proposed methods.
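
The quadratic lower-bound idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it covers only the plain, unconstrained logistic case, using the classical bound of Böhning and Lindsay (1988), and it omits the box and linear inequality constraints, the lasso penalty, and the pseudo-Newton acceleration. The function names `log_likelihood` and `qlb_logistic`, the tolerance, and the simulated data are assumptions made for illustration.

```python
import numpy as np


def log_likelihood(beta, X, y):
    """Logistic log-likelihood; log(1 + exp(eta)) is computed stably."""
    eta = X @ beta
    return float(y @ eta - np.sum(np.logaddexp(0.0, eta)))


def qlb_logistic(X, y, tol=1e-8, max_iter=10000):
    """Unconstrained quadratic lower-bound iteration (illustrative sketch).

    The negative Hessian is X' W X with W = diag(p * (1 - p)) <= 1/4,
    so the fixed matrix B = X' X / 4 dominates it for every beta.
    Replacing the Hessian by B gives the update
        beta <- beta + B^{-1} * gradient,
    which cannot decrease the log-likelihood.
    """
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    B = X.T @ X / 4.0  # computed once; does not depend on beta
    history = [log_likelihood(beta, X, y)]
    for _ in range(max_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - prob)
        step = np.linalg.solve(B, grad)
        beta = beta + step
        history.append(log_likelihood(beta, X, y))
        if np.max(np.abs(step)) < tol:
            break
    return beta, history


if __name__ == "__main__":
    # Simulated data, for demonstration only.
    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
    true_beta = np.array([0.5, -1.0, 0.8])
    y = (rng.random(200) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)
    beta_hat, history = qlb_logistic(X, y)
    print("estimate:", beta_hat)
    print("iterations:", len(history) - 1)
```

Because `B` is fixed, each iteration costs one gradient evaluation and one linear solve with the same matrix. That simplicity comes at the price of a linear convergence rate, which is the trade-off the accelerated method described in the abstract aims to improve.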

Suggested Citation

  • Tian, Guo-Liang & Tang, Man-Lai & Fang, Hong-Bin & Tan, Ming, 2008. "Efficient methods for estimating constrained parameters with applications to regularized (lasso) logistic regression," Computational Statistics & Data Analysis, Elsevier, vol. 52(7), pages 3528-3542, March.
  • Handle: RePEc:eee:csdana:v:52:y:2008:i:7:p:3528-3542

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(07)00440-9
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Dankmar Böhning & Bruce Lindsay, 1988. "Monotonicity of quadratic-approximation algorithms," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 40(4), pages 641-663, December.
    2. Dankmar Böhning, 1992. "Multinomial logistic regression algorithm," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 44(1), pages 197-200, March.
    3. Kim, Yongdai & Kwon, Sunghoon & Heun Song, Seuck, 2006. "Multiclass sparse logistic regression for classification of multiple cancer types using gene expression data," Computational Statistics & Data Analysis, Elsevier, vol. 51(3), pages 1643-1655, December.

    Citations

    Cited by:

    1. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    2. Pierre Alquier & Vincent Cottet & Guillaume Lecué, 2017. "Estimation bounds and sharp oracle inequalities of regularized procedures with Lipschitz loss functions," Working Papers 2017-30, Center for Research in Economics and Statistics.
    3. Zhang, Chun-Xia & Xu, Shuang & Zhang, Jiang-She, 2019. "A novel variational Bayesian method for variable selection in logistic regression models," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 1-19.
    4. Colubi, Ana & González-Rodríguez, Gil & Domínguez-Cuesta, María José & Jiménez-Sánchez, Montserrat, 2008. "Favorability functions based on kernel density estimation for logistic models: A case study," Computational Statistics & Data Analysis, Elsevier, vol. 52(9), pages 4533-4543, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tian, Guo-Liang & Tang, Man-Lai & Liu, Chunling, 2012. "Accelerating the quadratic lower-bound algorithm via optimizing the shrinkage parameter," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 255-265.
    2. Bohning, Dankmar, 1999. "The lower bound method in probit regression," Computational Statistics & Data Analysis, Elsevier, vol. 30(1), pages 13-17, March.
    3. Jonathan James, 2012. "A tractable estimator for general mixed multinomial logit models," Working Papers (Old Series) 1219, Federal Reserve Bank of Cleveland.
    4. Lee, Sangin & Kwon, Sunghoon & Kim, Yongdai, 2016. "A modified local quadratic approximation algorithm for penalized optimization problems," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 275-286.
    5. Vincent, Martin & Hansen, Niels Richard, 2014. "Sparse group lasso and high dimensional multinomial classification," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 771-786.
    6. Liu, Shen & Maharaj, Elizabeth Ann & Inder, Brett, 2014. "Polarization of forecast densities: A new approach to time series classification," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 345-361.
    7. Wang, Fa, 2017. "Maximum likelihood estimation and inference for high dimensional nonlinear factor models with application to factor-augmented regressions," MPRA Paper 93484, University Library of Munich, Germany, revised 19 May 2019.
    8. de Leeuw, Jan, 2006. "Principal component analysis of binary data by iterated singular value decomposition," Computational Statistics & Data Analysis, Elsevier, vol. 50(1), pages 21-39, January.
    9. Roussille, Nina & Scuderi, Benjamin, 2023. "Bidding for Talent: A Test of Conduct in a High-Wage Labor Market," IZA Discussion Papers 16352, Institute of Labor Economics (IZA).
    10. Kenneth Lange & Hua Zhou, 2022. "A Legacy of EM Algorithms," International Statistical Review, International Statistical Institute, vol. 90(S1), pages 52-66, December.
    11. Wang, Fa, 2022. "Maximum likelihood estimation and inference for high dimensional generalized factor models with application to factor-augmented regressions," Journal of Econometrics, Elsevier, vol. 229(1), pages 180-200.
    12. Colubi, Ana & González-Rodríguez, Gil & Domínguez-Cuesta, María José & Jiménez-Sánchez, Montserrat, 2008. "Favorability functions based on kernel density estimation for logistic models: A case study," Computational Statistics & Data Analysis, Elsevier, vol. 52(9), pages 4533-4543, May.
    13. Utkarsh J. Dang & Michael P.B. Gallaugher & Ryan P. Browne & Paul D. McNicholas, 2023. "Model-Based Clustering and Classification Using Mixtures of Multivariate Skewed Power Exponential Distributions," Journal of Classification, Springer;The Classification Society, vol. 40(1), pages 145-167, April.
    14. Douzal-Chouakria, Ahlame & Diallo, Alpha & Giroud, Françoise, 2009. "Adaptive clustering for time series: Application for identifying cell cycle expressed genes," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1414-1426, February.
    15. Dankmar Böhning, 1992. "Multinomial logistic regression algorithm," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 44(1), pages 197-200, March.
    16. Liu, Wenchen & Tang, Yincai & Wu, Xianyi, 2020. "Separating variables to accelerate non-convex regularized optimization," Computational Statistics & Data Analysis, Elsevier, vol. 147(C).
    17. Bansal, Prateek & Daziano, Ricardo A & Guerra, Erick, 2018. "Minorization-Maximization (MM) algorithms for semiparametric logit models: Bottlenecks, extensions, and comparisons," Transportation Research Part B: Methodological, Elsevier, vol. 115(C), pages 17-40.
    18. Liu, Shen & Maharaj, Elizabeth Ann, 2013. "A hypothesis test using bias-adjusted AR estimators for classifying time series in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 60(C), pages 32-49.
    19. de Leeuw, Jan & Lange, Kenneth, 2009. "Sharp quadratic majorization in one dimension," Computational Statistics & Data Analysis, Elsevier, vol. 53(7), pages 2471-2484, May.
    20. Dalmau, Oscar & Alarcón, Teresa E. & González, Graciela, 2015. "Kernel multilogit algorithm for multiclass classification," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 199-206.
