IDEAS home Printed from https://ideas.repec.org/a/spr/topjnl/v32y2024i3d10.1007_s11750-024-00683-x.html

Tuning parameters of deep neural network training algorithms pays off: a computational study

Authors

  • Corrado Coppola (Sapienza University of Rome)
  • Lorenzo Papa (Sapienza University of Rome)
  • Marco Boresta (Consiglio Nazionale delle Ricerche)
  • Irene Amerini (Sapienza University of Rome)
  • Laura Palagi (Sapienza University of Rome)

Abstract

This paper investigates how optimization algorithms affect the training of deep neural networks, with particular attention to the interaction between the optimizer and generalization performance. In particular, we analyze the behavior of state-of-the-art optimization algorithms in relation to their hyperparameter settings, in order to assess how robust they are to the choice of starting point, which may lead them to different local solutions. We conduct extensive computational experiments using nine open-source optimization algorithms to train deep Convolutional Neural Network architectures on a multi-class image classification task. Specifically, we consider several architectures, varying the number of layers and the number of neurons per layer, to evaluate the impact of different widths and depths on computational optimization performance. We show that the optimizers often return different local solutions, and we highlight the strong correlation between the quality of the solution found and the generalization capability of the trained network. We also discuss the role of hyperparameter tuning and show how a tuned hyperparameter setting can be reused for the same task on different problems, achieving better efficiency and generalization performance than the default setting.
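
The two phenomena named in the abstract, dependence on the starting point and sensitivity to hyperparameters, can be illustrated on a toy problem. The sketch below is not taken from the paper and does not reproduce its experiments: the one-dimensional function `f`, the plain gradient descent routine, and the two step sizes are all hypothetical choices made only for illustration. The function has two local minima of different quality, so different starting points can end in different local solutions, and the step size changes how many iterations are needed.

```python
# Toy illustration only (assumed setup, not the paper's method or data).

def f(x):
    # Hypothetical nonconvex objective with two local minima of different quality.
    return x**4 - 3 * x**2 + x

def grad_f(x):
    # Analytic derivative of f.
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr, tol=1e-6, max_iter=10_000):
    """Plain gradient descent from x0 with step size lr.

    Returns the final point and the number of iterations performed.
    """
    x = x0
    for k in range(max_iter):
        g = grad_f(x)
        if abs(g) < tol:  # stop once the gradient is (nearly) zero
            return x, k
        x -= lr * g
    return x, max_iter

if __name__ == "__main__":
    # Two starting points and two step sizes (both chosen arbitrarily).
    for x0 in (-2.0, 2.0):
        for lr in (0.01, 0.05):
            x, iters = gradient_descent(x0, lr)
            print(f"x0={x0:+.1f} lr={lr:.2f} -> x={x:+.4f} "
                  f"f(x)={f(x):+.4f} iters={iters}")
```

In this toy setting, the run started on the left reaches the lower of the two minima, the run started on the right stops at the shallower one, and the larger of the two step sizes needs fewer iterations. Deep network training is far higher-dimensional and stochastic, so this is only an analogy for the behavior the study measures.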

Suggested Citation

  • Corrado Coppola & Lorenzo Papa & Marco Boresta & Irene Amerini & Laura Palagi, 2024. "Tuning parameters of deep neural network training algorithms pays off: a computational study," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(3), pages 579-620, October.
  • Handle: RePEc:spr:topjnl:v:32:y:2024:i:3:d:10.1007_s11750-024-00683-x
    DOI: 10.1007/s11750-024-00683-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11750-024-00683-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations



    Cited by:

    1. Emilio Carrizosa & Dolores Romero Morales, 2024. "Guest editorial to the Special Issue on Machine Learning and Mathematical Optimization in TOP-Transactions in Operations Research," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(3), pages 351-353, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Astorino, Annabella & Avolio, Matteo & Fuduli, Antonio, 2022. "A maximum-margin multisphere approach for binary Multiple Instance Learning," European Journal of Operational Research, Elsevier, vol. 299(2), pages 642-652.
    2. Carrizosa, Emilio & Ramírez-Ayerbe, Jasone & Romero Morales, Dolores, 2024. "Mathematical optimization modelling for group counterfactual explanations," European Journal of Operational Research, Elsevier, vol. 319(2), pages 399-412.
    3. Andreas Dellnitz & Andreas Kleine & Madjid Tavana, 2024. "An integrated data envelopment analysis and regression tree method for new product price estimation," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 46(4), pages 1189-1211, December.
    4. Martín Barragán, Belén, 2016. "A Partial parametric path algorithm for multiclass classification," DES - Working Papers. Statistics and Econometrics. WS 22390, Universidad Carlos III de Madrid. Departamento de Estadística.
    5. Emilio Carrizosa & Vanesa Guerrero & Dolores Romero Morales, 2023. "On mathematical optimization for clustering categories in contingency tables," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 407-429, June.
    6. Miguel Angel Ortíz-Barrios & Dayana Milena Coba-Blanco & Juan-José Alfaro-Saíz & Daniela Stand-González, 2021. "Process Improvement Approaches for Increasing the Response of Emergency Departments against the COVID-19 Pandemic: A Systematic Review," IJERPH, MDPI, vol. 18(16), pages 1-31, August.
    7. Dimitris Bertsimas & Cheol Woo Kim, 2023. "A Prescriptive Machine Learning Approach to Mixed-Integer Convex Optimization," INFORMS Journal on Computing, INFORMS, vol. 35(6), pages 1225-1241, November.
    8. Ikeda, Shunnosuke & Nishimura, Naoki & Sukegawa, Noriyoshi & Takano, Yuichi, 2023. "Prescriptive price optimization using optimal regression trees," Operations Research Perspectives, Elsevier, vol. 11(C).
    9. Benati, Stefano & Ponce, Diego & Puerto, Justo & Rodríguez-Chía, Antonio M., 2022. "A branch-and-price procedure for clustering data that are graph connected," European Journal of Operational Research, Elsevier, vol. 297(3), pages 817-830.
    10. Davila-Pena, Laura & García-Jurado, Ignacio & Casas-Méndez, Balbina, 2022. "Assessment of the influence of features on a classification problem: An application to COVID-19 patients," European Journal of Operational Research, Elsevier, vol. 299(2), pages 631-641.
    11. Akhtar, Pervaiz & Ghouri, Arsalan Mujahid & Ashraf, Aniqa & Lim, Jia Jia & Khan, Naveed R & Ma, Shuang, 2024. "Smart product platforming powered by AI and generative AI: Personalization for the circular economy," International Journal of Production Economics, Elsevier, vol. 273(C).
    12. Ifaei, Pouya & Nazari-Heris, Morteza & Tayerani Charmchi, Amir Saman & Asadi, Somayeh & Yoo, ChangKyoo, 2023. "Sustainable energies and machine learning: An organized review of recent applications and challenges," Energy, Elsevier, vol. 266(C).
    13. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Morales, Dolores Romero, 2022. "On sparse optimal regression trees," European Journal of Operational Research, Elsevier, vol. 299(3), pages 1045-1054.
    14. Roberto Asín Achá & Dorit S. Hochbaum & Quico Spaen, 2020. "HNCcorr: combinatorial optimization for neuron identification," Annals of Operations Research, Springer, vol. 289(1), pages 5-32, June.
    15. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    16. Laura Palagi & Ruggiero Seccia, 2020. "Block layer decomposition schemes for training deep neural networks," Journal of Global Optimization, Springer, vol. 77(1), pages 97-124, May.
    17. Teddy Lazebnik & Tzach Fleischer & Amit Yaniv-Rosenfeld, 2023. "Benchmarking Biologically-Inspired Automatic Machine Learning for Economic Tasks," Sustainability, MDPI, vol. 15(14), pages 1-9, July.
    18. Wang, Mingsheng & Huang, Yong, 2024. "A digital Technology–Cultural resource strategy to drive innovation in cultural industries: A dynamic analysis based on machine learning," Technology in Society, Elsevier, vol. 77(C).
    19. Fu, Kun & Chen, Meiqian & Li, Qinghai, 2024. "Decontamination performance of metallic radionuclides in irradiated graphite via a fluidized bed reactor," Energy, Elsevier, vol. 305(C).
    20. Marc Gürtler & Marvin Zöllner, 2023. "Heterogeneities among credit risk parameter distributions: the modality defines the best estimation method," OR Spectrum: Quantitative Approaches in Management, Springer;Gesellschaft für Operations Research e.V., vol. 45(1), pages 251-287, March.
