
Combining Data Envelopment Analysis and Machine Learning

Author

Listed:
  • Nadia M. Guerrero

    (Center of Operations Research (CIO), Miguel Hernandez University of Elche (UMH), 03202 Elche, Spain)

  • Juan Aparicio

    (Center of Operations Research (CIO), Miguel Hernandez University of Elche (UMH), 03202 Elche, Spain)

  • Daniel Valero-Carreras

    (Center of Operations Research (CIO), Miguel Hernandez University of Elche (UMH), 03202 Elche, Spain)

Abstract

Data Envelopment Analysis (DEA) is one of the most widely used non-parametric techniques for technical efficiency assessment. DEA is exclusively concerned with minimizing the empirical error while satisfying some shape constraints (convexity and free disposability). Unfortunately, by construction, DEA is a descriptive methodology that is not concerned with preventing overfitting. In this paper, we introduce a new methodology that allows for estimating polyhedral technologies following the Structural Risk Minimization (SRM) principle. This technique is called Data Envelopment Analysis-based Machines (DEAM). Given that the new method controls the generalization error of the model, the corresponding estimate of the technology does not suffer from overfitting. Moreover, the notion of ε-insensitivity is also introduced, generating a new and more robust definition of technical efficiency. Additionally, we show that DEAM can be seen as a machine learning-type extension of DEA, satisfying the same microeconomic postulates except for minimal extrapolation. Finally, the performance of DEAM is evaluated through simulations. We conclude that the frontier estimator derived from DEAM is better than that associated with DEA. The bias and mean squared error obtained for DEAM are smaller in all the scenarios analyzed, regardless of the number of variables and DMUs.
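
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of standard output-oriented DEA under variable returns to scale (the convex, free-disposal technology mentioned above). It is not the DEAM method of the paper, whose formulation is not given in this record. The function name `dea_output_scores`, the toy data, and the use of SciPy's `linprog` are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import linprog


def dea_output_scores(X, Y):
    """Output-oriented DEA scores under variable returns to scale.

    X: (n, m) array of inputs, Y: (n, s) array of outputs, one row per DMU.
    Returns phi for each DMU; phi == 1 means the DMU lies on the frontier,
    phi > 1 is the factor by which its outputs could be expanded.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for k in range(n):
        # Decision vector z = [phi, lambda_1, ..., lambda_n].
        c = np.zeros(n + 1)
        c[0] = -1.0  # linprog minimizes, so maximize phi by minimizing -phi
        # Input constraints: sum_j lambda_j * x_j <= x_k
        A_in = np.hstack([np.zeros((m, 1)), X.T])
        # Output constraints: phi * y_k - sum_j lambda_j * y_j <= 0
        A_out = np.hstack([Y[k].reshape(s, 1), -Y.T])
        A_ub = np.vstack([A_in, A_out])
        b_ub = np.concatenate([X[k], np.zeros(s)])
        # Convexity (variable returns to scale): sum_j lambda_j = 1
        A_eq = np.concatenate(([0.0], np.ones(n))).reshape(1, -1)
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores[k] = res.x[0]
    return scores


# Hypothetical toy data: four DMUs, one input and one output each.
X = [[1.0], [2.0], [4.0], [3.0]]
Y = [[1.0], [3.0], [4.0], [2.0]]
scores = dea_output_scores(X, Y)
```

In this toy example the first three DMUs span the frontier and score 1, while the fourth (input 3, output 2) scores 1.75, because the frontier at input 3 reaches output 3.5. Since the frontier passes exactly through the best observed units, any noise in them is absorbed into the estimate, which is the overfitting concern the abstract raises.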

Suggested Citation

  • Nadia M. Guerrero & Juan Aparicio & Daniel Valero-Carreras, 2022. "Combining Data Envelopment Analysis and Machine Learning," Mathematics, MDPI, vol. 10(6), pages 1-22, March.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:6:p:909-:d:769571

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/6/909/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/6/909/
    Download Restriction: no

    References listed on IDEAS

    1. Léopold Simar & Paul Wilson, 2000. "A general methodology for bootstrapping in non-parametric frontier models," Journal of Applied Statistics, Taylor & Francis Journals, vol. 27(6), pages 779-802.
    2. Léopold Simar & Paul Wilson, 2000. "Statistical Inference in Nonparametric Frontier Models: The State of the Art," Journal of Productivity Analysis, Springer, vol. 13(1), pages 49-78, January.
    3. Afriat, Sidney N, 1972. "Efficiency Estimation of Production Function," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 13(3), pages 568-598, October.
    4. Charles, Vincent & Aparicio, Juan & Zhu, Joe, 2019. "The curse of dimensionality of decision-making units: A simple approach to increase the discriminatory power of data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 279(3), pages 929-940.
    5. Valero-Carreras, Daniel & Aparicio, Juan & Guerrero, Nadia M., 2021. "Support vector frontiers: A new approach for estimating production functions through support vector machines," Omega, Elsevier, vol. 104(C).
    6. Charnes, A. & Cooper, W. W. & Rhodes, E., 1978. "Measuring the efficiency of decision making units," European Journal of Operational Research, Elsevier, vol. 2(6), pages 429-444, November.
    7. W. Briec & J. B. Lesourd, 1999. "Metric Distance Function and Profit: Some Duality Results," Journal of Optimization Theory and Applications, Springer, vol. 101(1), pages 15-33, April.
    8. Olesen, O.B. & Ruggiero, J., 2022. "The hinging hyperplanes: An alternative nonparametric representation of a production function," European Journal of Operational Research, Elsevier, vol. 296(1), pages 254-266.
    9. Léopold Simar & Paul W. Wilson, 1998. "Sensitivity Analysis of Efficiency Scores: How to Bootstrap in Nonparametric Frontier Models," Management Science, INFORMS, vol. 44(1), pages 49-61, January.
    10. Alireza Amirteimoori & Biresh K. Sahoo & Vincent Charles & Saber Mehdizadeh, 2022. "Stochastic Network Data Envelopment Analysis," International Series in Operations Research & Management Science, in: Stochastic Benchmarking, chapter 0, pages 77-117, Springer.
    11. Kuosmanen, Timo & Johnson, Andrew, 2017. "Modeling joint production of multiple outputs in StoNED: Directional distance function approach," European Journal of Operational Research, Elsevier, vol. 262(2), pages 792-801.
    12. Alireza Amirteimoori & Biresh K. Sahoo & Vincent Charles & Saber Mehdizadeh, 2022. "Stochastic Benchmarking," International Series in Operations Research and Management Science, Springer, number 978-3-030-89869-4, December.
    13. Timo Kuosmanen & Andrew L. Johnson, 2010. "Data Envelopment Analysis as Nonparametric Least-Squares Regression," Operations Research, INFORMS, vol. 58(1), pages 149-160, February.
    14. Alireza Amirteimoori & Biresh K. Sahoo & Vincent Charles & Saber Mehdizadeh, 2022. "Stochastic Data Envelopment Analysis," International Series in Operations Research & Management Science, in: Stochastic Benchmarking, chapter 0, pages 55-76, Springer.
    15. Gabriel Villa & Sebastián Lozano & Sandra Redondo, 2021. "Data Envelopment Analysis Approach to Energy-Saving Projects Selection in an Energy Service Company," Mathematics, MDPI, vol. 9(2), pages 1-15, January.
    16. R. D. Banker & A. Charnes & W. W. Cooper, 1984. "Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis," Management Science, INFORMS, vol. 30(9), pages 1078-1092, September.
    17. Biresh K. Sahoo & Hilda Saleh & Morteza Shafiee & Kaoru Tone & Joe Zhu, 2021. "An Alternative Approach to Dealing with the Composition Approach for Series Network Production Processes," Asia-Pacific Journal of Operational Research (APJOR), World Scientific Publishing Co. Pte. Ltd., vol. 38(06), pages 1-27, December.
    18. William W. Cooper & Lawrence M. Seiford & Kaoru Tone, 2007. "Data Envelopment Analysis," Springer Books, Springer, edition 0, number 978-0-387-45283-8, June.
    19. Rajiv D. Banker, 1993. "Maximum Likelihood, Consistency and Data Envelopment Analysis: A Statistical Foundation," Management Science, INFORMS, vol. 39(10), pages 1265-1273, October.

    Citations



    Cited by:

    1. Raul Moragues & Juan Aparicio & Miriam Esteve, 2023. "Ranking the Importance of Variables in a Nonparametric Frontier Analysis Using Unsupervised Machine Learning Techniques," Mathematics, MDPI, vol. 11(11), pages 1-24, June.
    2. Georgios Tsaples & Jason Papathanasiou & Andreas C. Georgiou, 2022. "An Exploratory DEA and Machine Learning Framework for the Evaluation and Analysis of Sustainability Composite Indicators in the EU," Mathematics, MDPI, vol. 10(13), pages 1-27, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Esteve, Miriam & Aparicio, Juan & Rodriguez-Sala, Jesus J. & Zhu, Joe, 2023. "Random Forests and the measurement of super-efficiency in the context of Free Disposal Hull," European Journal of Operational Research, Elsevier, vol. 304(2), pages 729-744.
    2. Valero-Carreras, Daniel & Aparicio, Juan & Guerrero, Nadia M., 2021. "Support vector frontiers: A new approach for estimating production functions through support vector machines," Omega, Elsevier, vol. 104(C).
    3. España, Victor J. & Aparicio, Juan & Barber, Xavier & Esteve, Miriam, 2024. "Estimating production functions through additive models based on regression splines," European Journal of Operational Research, Elsevier, vol. 312(2), pages 684-699.
    4. Raul Moragues & Juan Aparicio & Miriam Esteve, 2023. "Ranking the Importance of Variables in a Nonparametric Frontier Analysis Using Unsupervised Machine Learning Techniques," Mathematics, MDPI, vol. 11(11), pages 1-24, June.
    5. Valentin Zelenyuk, 2019. "Data Envelopment Analysis and Business Analytics: The Big Data Challenges and Some Solutions," CEPA Working Papers Series WP072019, School of Economics, University of Queensland, Australia.
    6. Moragues, Raul & Aparicio, Juan & Esteve, Miriam, 2023. "An unsupervised learning-based generalization of Data Envelopment Analysis," Operations Research Perspectives, Elsevier, vol. 11(C).
    7. Franz R. Hahn, 2007. "Determinants of Bank Efficiency in Europe. Assessing Bank Performance Across Markets," WIFO Studies, WIFO, number 31499.
    8. Zervopoulos, Panagiotis & Emrouznejad, Ali & Sklavos, Sokratis, 2019. "A Bayesian approach for correcting bias of data envelopment analysis estimators," MPRA Paper 91886, University Library of Munich, Germany.
    9. Léopold Simar & Paul W. Wilson, 2015. "Statistical Approaches for Non-parametric Frontier Models: A Guided Tour," International Statistical Review, International Statistical Institute, vol. 83(1), pages 77-110, April.
    10. Chien-Ming Chen & Magali A. Delmas, 2012. "Measuring Eco-Inefficiency: A New Frontier Approach," Operations Research, INFORMS, vol. 60(5), pages 1064-1079, October.
    11. Luis R. Murillo‐Zamorano, 2004. "Economic Efficiency and Frontier Techniques," Journal of Economic Surveys, Wiley Blackwell, vol. 18(1), pages 33-77, February.
    12. Halkos, George & Tzeremes, Nickolaos, 2010. "Measuring the effect of virtual mergers on banks’ efficiency levels:A non parametric analysis," MPRA Paper 23696, University Library of Munich, Germany.
    13. Gounopoulos, Dimitrios & Kallias, Konstantinos & Newton, David & Tzeremes, Nickolaos, 2016. "Political connections and IPO underpricing: An efficiency problem," MPRA Paper 69427, University Library of Munich, Germany.
    14. Keshvari, Abolfazl & Kuosmanen, Timo, 2013. "Stochastic non-convex envelopment of data: Applying isotonic regression to frontier estimation," European Journal of Operational Research, Elsevier, vol. 231(2), pages 481-491.
    15. Hung, Shiu-Wan & Lu, Wen-Min & Wang, Tung-Pao, 2010. "Benchmarking the operating efficiency of Asia container ports," European Journal of Operational Research, Elsevier, vol. 203(3), pages 706-713, June.
    16. Thilakaweera, Bolanda Hewa & Harvie, Charles & Arjomandi, Amir, 2016. "Branch expansion and banking efficiency in Sri Lanka’s post‐conflict era," Journal of Asian Economics, Elsevier, vol. 47(C), pages 45-57.
    17. Quaranta, Anna Grazia & Raffoni, Anna & Visani, Franco, 2018. "A multidimensional approach to measuring bank branch efficiency," European Journal of Operational Research, Elsevier, vol. 266(2), pages 746-760.
    18. Fragoudaki, Alexandra & Giokas, Dimitris, 2016. "Airport performance in a tourism receiving country: Evidence from Greece," Journal of Air Transport Management, Elsevier, vol. 52(C), pages 80-89.
    19. Ioannis E. Tsolas, 2023. "Efficiency Measurement of Lignite-Fired Power Plants in Greece Using a DEA-Bootstrap Approach," Sustainability, MDPI, vol. 15(4), pages 1-10, February.
    20. Sinuany-Stern, Zilla, 2023. "Foundations of operations research: From linear programming to data envelopment analysis," European Journal of Operational Research, Elsevier, vol. 306(3), pages 1069-1080.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.