Printed from https://ideas.repec.org/p/tse/wpaper/123630.html

An Inertial Newton Algorithm for Deep Learning

Authors

  • Bolte, Jérôme
  • Castera, Camille
  • Pauwels, Edouard
  • Févotte, Cédric

Abstract

We devise a learning algorithm for possibly nonsmooth deep neural networks featuring inertia and Newtonian directional intelligence only by means of a backpropagation oracle. Our algorithm, called INDIAN, has an appealing mechanical interpretation, making the role of its two hyperparameters transparent. An elementary phase space lifting allows both for its implementation and its theoretical study under very general assumptions. We handle in particular a stochastic version of our method (which encompasses usual mini-batch approaches) for nonsmooth activation functions (such as ReLU). Our algorithm shows high efficiency and reaches state of the art on image classification problems.
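The abstract gives no formulas, so what follows is only a minimal sketch of the kind of two-variable ("phase space") update it alludes to, written from recollection of the full paper rather than from this record, and it should be checked against the paper. It assumes the underlying dynamics are θ'' + α θ' + β ∇²J(θ) θ' + ∇J(θ) = 0, lifted to a first-order system in (θ, ψ) so that only gradients are needed. The function name `indian_sketch`, the initialization of ψ, the constant step size `gamma`, and the toy quadratic loss are all illustrative assumptions. The method described in the abstract is stochastic and uses mini-batch (sub)gradients from backpropagation, where this sketch uses an exact gradient.

```python
from typing import Callable, List

Vector = List[float]


def indian_sketch(
    grad: Callable[[Vector], Vector],
    theta0: Vector,
    alpha: float = 0.5,   # assumed: viscous damping coefficient
    beta: float = 0.1,    # assumed: Newton-like (Hessian-driven) damping coefficient
    gamma: float = 0.1,   # assumed: constant step size (hypothetical choice)
    steps: int = 2000,
) -> Vector:
    """Explicit-Euler sketch of an inertial Newton-type update (assumed form)."""
    theta = list(theta0)
    # Assumed initialization: it cancels the drift term below at step 0,
    # so the first iteration is a plain gradient step of size gamma * beta.
    psi = [(1.0 - alpha * beta) * t for t in theta]
    for _ in range(steps):
        g = grad(theta)  # stands in for a backpropagation oracle
        # Drift shared by both variables of the lifted first-order system.
        drift = [(1.0 / beta - alpha) * t - p / beta for t, p in zip(theta, psi)]
        # Both updates use the values from the start of the iteration.
        theta = [t + gamma * (d - beta * gi) for t, d, gi in zip(theta, drift, g)]
        psi = [p + gamma * d for p, d in zip(psi, drift)]
    return theta


def toy_grad(theta: Vector) -> Vector:
    """Gradient of the toy loss J(theta) = 0.5 * sum((theta_i - 3)^2)."""
    return [t - 3.0 for t in theta]


result = indian_sketch(toy_grad, [0.0, 10.0])
```

Under these assumptions, `alpha` plays the role of viscous friction and `beta` scales a damping term driven by the Hessian, which is one way to read the "mechanical interpretation" of the two hyperparameters. No second-order derivative is ever computed, since the Hessian term is absorbed into the auxiliary variable `psi`.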

Suggested Citation

  • Bolte, Jérôme & Castera, Camille & Pauwels, Edouard & Févotte, Cédric, 2019. "An Inertial Newton Algorithm for Deep Learning," TSE Working Papers 19-1043, Toulouse School of Economics (TSE).
  • Handle: RePEc:tse:wpaper:123630

    Download full text from publisher

    File URL: https://www.tse-fr.eu/sites/default/files/TSE/documents/doc/wp/2019/wp_tse_1043.pdf
    File Function: Full Text
    Download Restriction: no

    References listed on IDEAS

    1. Hédy Attouch & Jérôme Bolte & Patrick Redont & Antoine Soubeyran, 2010. "Proximal Alternating Minimization and Projection Methods for Nonconvex Problems: An Approach Based on the Kurdyka-Łojasiewicz Inequality," Mathematics of Operations Research, INFORMS, vol. 35(2), pages 438-457, May.

    Citations

    Cited by:

    1. Samir Adly & Hedy Attouch & Van Nam Vo, 2023. "Convergence of Inertial Dynamics Driven by Sums of Potential and Nonpotential Operators with Implicit Newton-Like Damping," Journal of Optimization Theory and Applications, Springer, vol. 198(1), pages 290-331, July.
    2. Bolte, Jérôme & Le, Tam & Pauwels, Edouard & Silveti-Falls, Antonio, 2022. "Nonsmooth Implicit Differentiation for Machine Learning and Optimization," TSE Working Papers 22-1314, Toulouse School of Economics (TSE).
    3. Claire Boyer & Antoine Godichon-Baggioni, 2023. "On the asymptotic rate of convergence of Stochastic Newton algorithms and their Weighted Averaged versions," Computational Optimization and Applications, Springer, vol. 84(3), pages 921-972, April.
    4. Emilie Chouzenoux & Jean-Baptiste Fest, 2022. "SABRINA: A Stochastic Subspace Majorization-Minimization Algorithm," Journal of Optimization Theory and Applications, Springer, vol. 195(3), pages 919-952, December.
    5. Bolte, Jérôme & Pauwels, Edouard, 2019. "Conservative set valued fields, automatic differentiation, stochastic gradient methods and deep learning," TSE Working Papers 19-1044, Toulouse School of Economics (TSE).
    6. Bolte, Jérôme & Glaudin, Lilian & Pauwels, Edouard & Serrurier, Matthieu, 2021. "A Hölderian backtracking method for min-max and min-min problems," TSE Working Papers 21-1243, Toulouse School of Economics (TSE).
    7. Bolte, Jérôme & Pauwels, Edouard, 2021. "A mathematical model for automatic differentiation in machine learning," TSE Working Papers 21-1184, Toulouse School of Economics (TSE).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Le Thi Khanh Hien & Duy Nhat Phan & Nicolas Gillis, 2022. "Inertial alternating direction method of multipliers for non-convex non-smooth optimization," Computational Optimization and Applications, Springer, vol. 83(1), pages 247-285, September.
    2. Francesco Rinaldi & Damiano Zeffiro, 2023. "Avoiding bad steps in Frank-Wolfe variants," Computational Optimization and Applications, Springer, vol. 84(1), pages 225-264, January.
    3. Kely D. V. Villacorta & Paulo R. Oliveira & Antoine Soubeyran, 2014. "A Trust-Region Method for Unconstrained Multiobjective Problems with Applications in Satisficing Processes," Journal of Optimization Theory and Applications, Springer, vol. 160(3), pages 865-889, March.
    4. Bo Jiang & Tianyi Lin & Shiqian Ma & Shuzhong Zhang, 2019. "Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis," Computational Optimization and Applications, Springer, vol. 72(1), pages 115-157, January.
    5. Zehui Jia & Jieru Huang & Xingju Cai, 2021. "Proximal-like incremental aggregated gradient method with Bregman distance in weakly convex optimization problems," Journal of Global Optimization, Springer, vol. 80(4), pages 841-864, August.
    6. Glaydston Carvalho Bento & João Xavier Cruz Neto & Antoine Soubeyran & Valdinês Leite Sousa Júnior, 2016. "Dual Descent Methods as Tension Reduction Systems," Journal of Optimization Theory and Applications, Springer, vol. 171(1), pages 209-227, October.
    7. Bolte, Jérôme & Le, Tam & Pauwels, Edouard & Silveti-Falls, Antonio, 2022. "Nonsmooth Implicit Differentiation for Machine Learning and Optimization," TSE Working Papers 22-1314, Toulouse School of Economics (TSE).
    8. Masoud Ahookhosh & Le Thi Khanh Hien & Nicolas Gillis & Panagiotis Patrinos, 2021. "A Block Inertial Bregman Proximal Algorithm for Nonsmooth Nonconvex Problems with Application to Symmetric Nonnegative Matrix Tri-Factorization," Journal of Optimization Theory and Applications, Springer, vol. 190(1), pages 234-258, July.
    9. Alexander Y. Kruger & Nguyen H. Thao, 2015. "Quantitative Characterizations of Regularity Properties of Collections of Sets," Journal of Optimization Theory and Applications, Springer, vol. 164(1), pages 41-67, January.
    10. Jing Zhao & Qiao-Li Dong & Michael Th. Rassias & Fenghui Wang, 2022. "Two-step inertial Bregman alternating minimization algorithm for nonconvex and nonsmooth problems," Journal of Global Optimization, Springer, vol. 84(4), pages 941-966, December.
    11. Fornasier, Massimo & Maly, Johannes & Naumova, Valeriya, 2021. "Robust recovery of low-rank matrices with non-orthogonal sparse decomposition from incomplete measurements," Applied Mathematics and Computation, Elsevier, vol. 392(C).
    12. Emanuel Laude & Peter Ochs & Daniel Cremers, 2020. "Bregman Proximal Mappings and Bregman–Moreau Envelopes Under Relative Prox-Regularity," Journal of Optimization Theory and Applications, Springer, vol. 184(3), pages 724-761, March.
    13. W. Ackooij & S. Demassey & P. Javal & H. Morais & W. Oliveira & B. Swaminathan, 2021. "A bundle method for nonsmooth DC programming with application to chance-constrained problems," Computational Optimization and Applications, Springer, vol. 78(2), pages 451-490, March.
    14. Nguyen Hieu Thao, 2018. "A convergent relaxation of the Douglas–Rachford algorithm," Computational Optimization and Applications, Springer, vol. 70(3), pages 841-863, July.
    15. Jérôme Bolte & Edouard Pauwels, 2016. "Majorization-Minimization Procedures and Convergence of SQP Methods for Semi-Algebraic and Tame Programs," Mathematics of Operations Research, INFORMS, vol. 41(2), pages 442-465, May.
    16. D. Russell Luke & Shoham Sabach & Marc Teboulle & Kobi Zatlawey, 2017. "A simple globally convergent algorithm for the nonsmooth nonconvex single source localization problem," Journal of Global Optimization, Springer, vol. 69(4), pages 889-909, December.
    17. Bian, Fengmiao & Zhang, Xiaoqun, 2021. "A parameterized Douglas–Rachford splitting algorithm for nonconvex optimization," Applied Mathematics and Computation, Elsevier, vol. 410(C).
    18. J. X. Cruz Neto & P. R. Oliveira & P. A. Soares & A. Soubeyran, 2014. "Proximal Point Method on Finslerian Manifolds and the “Effort–Accuracy” Trade-off," Journal of Optimization Theory and Applications, Springer, vol. 162(3), pages 873-891, September.
    19. Liu, Jingjing & Ma, Ruijie & Zeng, Xiaoyang & Liu, Wanquan & Wang, Mingyu & Chen, Hui, 2021. "An efficient non-convex total variation approach for image deblurring and denoising," Applied Mathematics and Computation, Elsevier, vol. 397(C).
    20. Tianxiang Liu & Ting Kei Pong, 2017. "Further properties of the forward–backward envelope with applications to difference-of-convex programming," Computational Optimization and Applications, Springer, vol. 67(3), pages 489-520, July.
