Overfitting Reduction in Convex Regression

Authors

  • Zhiqiang Liao
  • Sheng Dai
  • Eunji Lim
  • Timo Kuosmanen

Abstract

Convex regression is a method for estimating a convex function from a data set. This method has played an important role in operations research, economics, machine learning, and many other areas. However, it has been empirically observed that convex regression produces inconsistent estimates of convex functions and extremely large subgradients near the boundary as the sample size increases. In this paper, we provide theoretical evidence of this overfitting behavior. To eliminate this behavior, we propose two new estimators that place a bound on the subgradients of the convex function. We further show that the proposed estimators reduce overfitting by proving that they converge to the underlying true convex function and that their subgradients converge to the gradient of the underlying function, both uniformly over the domain with probability one as the sample size increases to infinity. An application to Finnish electricity distribution firms confirms that the proposed methods have better predictive power than existing methods.
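
For orientation, the least-squares convex regression problem that the abstract refers to can be sketched as follows. This is the standard textbook formulation and not necessarily the exact one used in the paper. The abstract says only that a bound is placed on the subgradients, so the last constraint, including the choice of norm and the constant $L$, is an assumption made here for illustration. The symbols $\theta_i$ (fitted values), $\xi_i$ (subgradients at the data points), and $L$ are likewise illustrative notation.

```latex
% Data: (x_i, y_i), i = 1, ..., n, with x_i in R^d and y_i in R.
% Decision variables: fitted values theta_i and subgradients xi_i.
\begin{align*}
  \min_{\theta_1,\dots,\theta_n,\ \xi_1,\dots,\xi_n} \quad
    & \sum_{i=1}^{n} (y_i - \theta_i)^2 \\
  \text{subject to} \quad
    & \theta_j \ \ge\ \theta_i + \xi_i^{\top}(x_j - x_i),
      && i, j = 1,\dots,n, \\
    & \lVert \xi_i \rVert_{\infty} \ \le\ L,
      && i = 1,\dots,n
      \quad \text{(assumed form of the subgradient bound)}.
\end{align*}
% A convex estimate on the whole domain is then the upper envelope
% of the fitted supporting hyperplanes:
\[
  \hat f(x) \;=\; \max_{1 \le i \le n}
    \bigl\{ \hat\theta_i + \hat\xi_i^{\top}(x - x_i) \bigr\}.
\]
```

Without the last constraint, the problem places no limit on the magnitude of $\xi_i$ at points near the boundary of the domain, which is where the abstract reports extremely large subgradients.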

Suggested Citation

  • Zhiqiang Liao & Sheng Dai & Eunji Lim & Timo Kuosmanen, 2024. "Overfitting Reduction in Convex Regression," Papers 2404.09528, arXiv.org, revised Oct 2024.
  • Handle: RePEc:arx:papers:2404.09528

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2404.09528
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dai, Sheng & Kuosmanen, Timo & Zhou, Xun, 2023. "Generalized quantile and expectile properties for shape constrained nonparametric estimation," European Journal of Operational Research, Elsevier, vol. 310(2), pages 914-927.
    2. Liao, Zhiqiang & Dai, Sheng & Kuosmanen, Timo, 2024. "Convex support vector regression," European Journal of Operational Research, Elsevier, vol. 313(3), pages 858-870.
    3. Lee, Chia-Yen & Wang, Ke, 2019. "Nash marginal abatement cost estimation of air pollutant emissions using the stochastic semi-nonparametric frontier," European Journal of Operational Research, Elsevier, vol. 273(1), pages 390-400.
    4. José Luis Preciado Arreola & Daisuke Yagi & Andrew L. Johnson, 2020. "Insights from machine learning for evaluating production function estimators on manufacturing survey data," Journal of Productivity Analysis, Springer, vol. 53(2), pages 181-225, April.
    5. Jose M. Cordero & Cristina Polo & Daniel Santín, 2020. "Assessment of new methods for incorporating contextual variables into efficiency measures: a Monte Carlo simulation," Operational Research, Springer, vol. 20(4), pages 2245-2265, December.
    6. Dai, Sheng, 2023. "Variable selection in convex quantile regression: L1-norm or L0-norm regularization?," European Journal of Operational Research, Elsevier, vol. 305(1), pages 338-355.
    7. Tsionas, Mike, 2022. "Efficiency estimation using probabilistic regression trees with an application to Chilean manufacturing industries," International Journal of Production Economics, Elsevier, vol. 249(C).
    8. Ruitu Xu & Yifei Min & Tianhao Wang & Zhaoran Wang & Michael I. Jordan & Zhuoran Yang, 2023. "Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning," Papers 2303.04833, arXiv.org.
    9. Eunji Lim, 2021. "Consistency of Penalized Convex Regression," International Journal of Statistics and Probability, Canadian Center of Science and Education, vol. 10(1), pages 1-69, January.
    10. Tsionas, Mike G. & Izzeldin, Marwan, 2018. "Smooth approximations to monotone concave functions in production analysis: An alternative to nonparametric concave least squares," European Journal of Operational Research, Elsevier, vol. 271(3), pages 797-807.
    11. Cristina Polo & Julián Ramajo & Alejandro Ricci‐Risquete, 2021. "A stochastic semi‐non‐parametric analysis of regional efficiency in the European Union," Regional Science Policy & Practice, Wiley Blackwell, vol. 13(1), pages 7-24, February.
    12. Timo Kuosmanen & Sheng Dai, 2023. "Modeling economies of scope in joint production: Convex regression of input distance function," Papers 2311.11637, arXiv.org.
    13. K. Hervé Dakpo & Yann Desjeux & Laure Latruffe, 2023. "Cost of abating excess nitrogen on wheat plots in France: An assessment with multi‐technology modelling," Journal of Agricultural Economics, Wiley Blackwell, vol. 74(3), pages 800-815, September.
    14. Maziotis, Alexandros & Sala-Garrido, Ramon & Mocholi-Arce, Manuel & Molinos-Senante, Maria, 2023. "Cost and quality of service performance in the Chilean water industry: A comparison of stochastic approaches," Structural Change and Economic Dynamics, Elsevier, vol. 67(C), pages 211-219.
    15. Feng, Oliver Y. & Chen, Yining & Han, Qiyang & Carroll, Raymond J & Samworth, Richard J., 2022. "Nonparametric, tuning-free estimation of S-shaped functions," LSE Research Online Documents on Economics 111889, London School of Economics and Political Science, LSE Library.
    16. repec:diw:diwwpp:dp1526 is not listed on IDEAS
    17. Zhiqiang Liao, 2024. "Variable selection in convex nonparametric least squares via structured Lasso: An application to the Swedish electricity distribution networks," Papers 2409.01911, arXiv.org, revised Nov 2024.
    18. Pang Du & Christopher F. Parmeter & Jeffrey S. Racine, 2012. "Nonparametric Kernel Regression with Multiple Predictors and Multiple Shape Constraints," Department of Economics Working Papers 2012-08, McMaster University.
    19. Chia-Yen Lee, 2017. "Directional marginal productivity: a foundation of meta-data envelopment analysis," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 68(5), pages 544-555, May.
    20. Layer, Kevin & Johnson, Andrew L. & Sickles, Robin C. & Ferrier, Gary D., 2020. "Direction selection in stochastic directional distance functions," European Journal of Operational Research, Elsevier, vol. 280(1), pages 351-364.
    21. Eunji Lim & Kihwan Kim, 2020. "Estimating Smooth and Convex Functions," International Journal of Statistics and Probability, Canadian Center of Science and Education, vol. 9(5), pages 1-40, September.
