
Nonsmooth Implicit Differentiation for Machine Learning and Optimization

Authors
  • Bolte, Jérôme
  • Pauwels, Edouard
  • Silveti-Falls, Antonio
  • Le, Tam

Abstract

In view of training increasingly complex learning architectures, we establish a nonsmooth implicit function theorem with an operational calculus. Our result applies to most practical problems (i.e., definable problems) provided that a nonsmooth form of the classical invertibility condition is fulfilled. This approach allows for formal subdifferentiation: for instance, replacing derivatives by Clarke Jacobians in the usual differentiation formulas is fully justified for a wide class of nonsmooth problems. Moreover, this calculus is entirely compatible with algorithmic differentiation (e.g., backpropagation). We provide several applications such as training deep equilibrium networks, training neural nets with conic optimization layers, or hyperparameter tuning for nonsmooth Lasso-type models. To show the sharpness of our assumptions, we present numerical experiments showcasing the extremely pathological gradient dynamics one can encounter when applying implicit algorithmic differentiation without any hypothesis.
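
As a rough illustration of the "formal subdifferentiation" idea described in the abstract, the following minimal sketch applies the usual implicit differentiation formula to a scalar nonsmooth fixed-point equation, with the derivative of ReLU replaced by an element of its Clarke subdifferential. The toy problem z = relu(a*z + theta), the function names, and the choice of the element 0 at the kink are all assumptions made for illustration; none of this is taken from the paper. In this toy case, the analogue of the invertibility condition is that 1 - s*a is nonzero for every s in [0, 1], which holds whenever |a| < 1.

```python
def relu(u):
    return max(u, 0.0)


def relu_clarke_element(u):
    # One element of the Clarke subdifferential of relu at u:
    # {0} for u < 0, {1} for u > 0, and the interval [0, 1] at u == 0.
    # At the kink we arbitrarily pick 0 (an assumption of this sketch).
    return 1.0 if u > 0 else 0.0


def solve_fixed_point(a, theta, iters=200):
    # Iterate z <- relu(a*z + theta); the map is a contraction when |a| < 1.
    z = 0.0
    for _ in range(iters):
        z = relu(a * z + theta)
    return z


def implicit_derivative(a, theta):
    # Formal implicit differentiation of F(z, theta) = z - relu(a*z + theta) = 0.
    # With s an element of the Clarke subdifferential of relu:
    #   dF/dz = 1 - s*a,  dF/dtheta = -s,  so  dz/dtheta = s / (1 - s*a).
    z = solve_fixed_point(a, theta)
    s = relu_clarke_element(a * z + theta)
    return s / (1.0 - s * a)


def finite_difference(a, theta, h=1e-6):
    # Central finite difference of the fixed point, used only as a sanity check.
    return (solve_fixed_point(a, theta + h) - solve_fixed_point(a, theta - h)) / (2 * h)


if __name__ == "__main__":
    a = 0.5
    for theta in (1.0, -1.0):
        print(theta, implicit_derivative(a, theta), finite_difference(a, theta))
```

Away from the kink (theta different from 0 here), the implicit formula and the finite difference should agree closely. At the kink itself the Clarke subdifferential is a whole interval, so the value returned depends on the element selected.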

Suggested Citation

  • Bolte, Jérôme & Pauwels, Edouard & Silveti-Falls, Antonio & Le, Tam, 2022. "Nonsmooth Implicit Differentiation for Machine Learning and Optimization," TSE Working Papers 22-1314, Toulouse School of Economics (TSE).
  • Handle: RePEc:tse:wpaper:126767

    Download full text from publisher

    File URL: https://www.tse-fr.eu/sites/default/files/TSE/documents/doc/wp/2022/wp_tse_1314.pdf
    File Function: Full Text
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bolte, Jérôme & Le, Tam & Pauwels, Edouard & Silveti-Falls, Antonio, 2022. "Nonsmooth Implicit Differentiation for Machine Learning and Optimization," TSE Working Papers 22-1314, Toulouse School of Economics (TSE).
    2. Bolte, Jérôme & Glaudin, Lilian & Pauwels, Edouard & Serrurier, Matthieu, 2021. "A Hölderian backtracking method for min-max and min-min problems," TSE Working Papers 21-1243, Toulouse School of Economics (TSE).
    3. Le Thi Khanh Hien & Duy Nhat Phan & Nicolas Gillis, 2022. "Inertial alternating direction method of multipliers for non-convex non-smooth optimization," Computational Optimization and Applications, Springer, vol. 83(1), pages 247-285, September.
    4. Francesco Rinaldi & Damiano Zeffiro, 2023. "Avoiding bad steps in Frank-Wolfe variants," Computational Optimization and Applications, Springer, vol. 84(1), pages 225-264, January.
    5. Kely D. V. Villacorta & Paulo R. Oliveira & Antoine Soubeyran, 2014. "A Trust-Region Method for Unconstrained Multiobjective Problems with Applications in Satisficing Processes," Journal of Optimization Theory and Applications, Springer, vol. 160(3), pages 865-889, March.
    6. Bo Jiang & Tianyi Lin & Shiqian Ma & Shuzhong Zhang, 2019. "Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis," Computational Optimization and Applications, Springer, vol. 72(1), pages 115-157, January.
    7. Zehui Jia & Jieru Huang & Xingju Cai, 2021. "Proximal-like incremental aggregated gradient method with Bregman distance in weakly convex optimization problems," Journal of Global Optimization, Springer, vol. 80(4), pages 841-864, August.
    8. Glaydston Carvalho Bento & João Xavier Cruz Neto & Antoine Soubeyran & Valdinês Leite Sousa Júnior, 2016. "Dual Descent Methods as Tension Reduction Systems," Journal of Optimization Theory and Applications, Springer, vol. 171(1), pages 209-227, October.
    9. Masoud Ahookhosh & Le Thi Khanh Hien & Nicolas Gillis & Panagiotis Patrinos, 2021. "A Block Inertial Bregman Proximal Algorithm for Nonsmooth Nonconvex Problems with Application to Symmetric Nonnegative Matrix Tri-Factorization," Journal of Optimization Theory and Applications, Springer, vol. 190(1), pages 234-258, July.
    10. Alexander Y. Kruger & Nguyen H. Thao, 2015. "Quantitative Characterizations of Regularity Properties of Collections of Sets," Journal of Optimization Theory and Applications, Springer, vol. 164(1), pages 41-67, January.
    11. Jing Zhao & Qiao-Li Dong & Michael Th. Rassias & Fenghui Wang, 2022. "Two-step inertial Bregman alternating minimization algorithm for nonconvex and nonsmooth problems," Journal of Global Optimization, Springer, vol. 84(4), pages 941-966, December.
    12. Fornasier, Massimo & Maly, Johannes & Naumova, Valeriya, 2021. "Robust recovery of low-rank matrices with non-orthogonal sparse decomposition from incomplete measurements," Applied Mathematics and Computation, Elsevier, vol. 392(C).
    13. Emanuel Laude & Peter Ochs & Daniel Cremers, 2020. "Bregman Proximal Mappings and Bregman–Moreau Envelopes Under Relative Prox-Regularity," Journal of Optimization Theory and Applications, Springer, vol. 184(3), pages 724-761, March.
    14. W. Ackooij & S. Demassey & P. Javal & H. Morais & W. Oliveira & B. Swaminathan, 2021. "A bundle method for nonsmooth DC programming with application to chance-constrained problems," Computational Optimization and Applications, Springer, vol. 78(2), pages 451-490, March.
    15. Nguyen Hieu Thao, 2018. "A convergent relaxation of the Douglas–Rachford algorithm," Computational Optimization and Applications, Springer, vol. 70(3), pages 841-863, July.
    16. Jérôme Bolte & Edouard Pauwels, 2016. "Majorization-Minimization Procedures and Convergence of SQP Methods for Semi-Algebraic and Tame Programs," Mathematics of Operations Research, INFORMS, vol. 41(2), pages 442-465, May.
    17. D. Russell Luke & Shoham Sabach & Marc Teboulle & Kobi Zatlawey, 2017. "A simple globally convergent algorithm for the nonsmooth nonconvex single source localization problem," Journal of Global Optimization, Springer, vol. 69(4), pages 889-909, December.
    18. Bian, Fengmiao & Zhang, Xiaoqun, 2021. "A parameterized Douglas–Rachford splitting algorithm for nonconvex optimization," Applied Mathematics and Computation, Elsevier, vol. 410(C).
    19. Amos Uderzo, 2023. "Conditions for the stability of ideal efficient solutions in parametric vector optimization via set-valued inclusions," Journal of Global Optimization, Springer, vol. 85(4), pages 917-940, April.
    20. J. X. Cruz Neto & P. R. Oliveira & P. A. Soares & A. Soubeyran, 2014. "Proximal Point Method on Finslerian Manifolds and the “Effort–Accuracy” Trade-off," Journal of Optimization Theory and Applications, Springer, vol. 162(3), pages 873-891, September.
