Printed from https://ideas.repec.org/p/aim/wpaimx/2336.html

Explicit solutions for the asymptotically-optimal bandwidth in cross validation

Authors

  • Karim M Abadir
  • Michel Lubrano

Abstract

We show that least squares cross-validation (CV) methods share a common structure which has an explicit asymptotic solution, when the chosen kernel is asymptotically separable in bandwidth and data. For density estimation with a multivariate Student t(ν) kernel, the CV criterion becomes asymptotically equivalent to a polynomial of only three terms. Our bandwidth formulae are simple and non-iterative (leading to very fast computations), their integrated squared-error dominates traditional CV implementations, they alleviate the notorious sample variability of CV, and overcome its breakdown in the case of repeated observations. We illustrate with univariate and bivariate applications, of density estimation and nonparametric regressions, to a large dataset of Michigan State University academic wages and experience.
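For orientation, the traditional iterative baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of least squares CV for a univariate density estimate, found by grid search. It assumes a Gaussian kernel rather than the Student t(ν) kernel of the paper, and it does not reproduce the paper's explicit bandwidth formulae. The function names `lscv_score` and `lscv_bandwidth`, the simulated sample, and the bandwidth grid are all hypothetical choices made for this sketch.

```python
import numpy as np


def lscv_score(x, h):
    """Least squares CV criterion for a univariate Gaussian-kernel density estimate.

    CV(h) = integral of fhat^2  -  (2/n) * sum_i fhat_{-i}(x_i),
    where fhat_{-i} is the leave-one-out estimate.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    u = (x[:, None] - x[None, :]) / h  # all pairwise scaled differences

    # Integral of the squared estimate: the convolution of two standard
    # Gaussian kernels is the N(0, 2) density.
    conv = np.exp(-0.25 * u**2) / np.sqrt(4.0 * np.pi)
    term1 = conv.sum() / (n**2 * h)

    # Leave-one-out term: Gaussian kernel summed over pairs i != j.
    kern = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    off_diag = kern.sum() - np.trace(kern)
    term2 = 2.0 * off_diag / (n * (n - 1) * h)

    return term1 - term2


def lscv_bandwidth(x, grid):
    """Return the grid point minimising the CV criterion, plus all scores."""
    scores = np.array([lscv_score(x, h) for h in grid])
    return grid[int(np.argmin(scores))], scores


# Illustrative data only: a simulated standard normal sample.
rng = np.random.default_rng(0)
sample = rng.standard_normal(200)
grid = np.linspace(0.05, 1.5, 60)  # hypothetical search grid
h_cv, scores = lscv_bandwidth(sample, grid)
```

Each evaluation of the criterion costs O(n²) and the search repeats it for every candidate bandwidth, which is the cost that a non-iterative formula avoids. With repeated observations, the pairs with zero difference can drive this criterion towards minus infinity as h shrinks, which is the breakdown the abstract refers to.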

Suggested Citation

  • Karim M Abadir & Michel Lubrano, 2023. "Explicit solutions for the asymptotically-optimal bandwidth in cross validation," AMSE Working Papers 2336, Aix-Marseille School of Economics, France.
  • Handle: RePEc:aim:wpaimx:2336

    Download full text from publisher

    File URL: https://www.amse-aixmarseille.fr/sites/default/files/working_papers/wp_2023_-_nr_36.pdf
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gregory Connor & Lisa R. Goldberg & Robert A. Korajczyk, 2010. "Portfolio Risk Analysis," Economics Books, Princeton University Press, edition 1, number 9224.
    2. Chaohua Dong & Jiti Gao & Oliver Linton & Bin Peng, 2020. "On Time Trend of COVID-19: A Panel Data Study," Monash Econometrics and Business Statistics Working Papers 22/20, Monash University, Department of Econometrics and Business Statistics.
    3. Walter Sosa-Escudero & Sergio Petralia, 2011. "Anatomy of Distributive Changes in Argentina," Chapters, in: Werner Baer & David Fleischer (ed.), The Economies of Argentina and Brazil, chapter 10, Edward Elgar Publishing.
    4. Luis Alvarez & Cristine Pinto & Vladimir Ponczek, 2022. "Homophily in preferences or meetings? Identifying and estimating an iterative network formation model," Papers 2201.06694, arXiv.org, revised Mar 2024.
    5. repec:asg:wpaper:1006 is not listed on IDEAS
    6. Eduardo Fé & Bruce Hollingsworth, 2016. "Short- and long-run estimates of the local effects of retirement on health," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 179(4), pages 1051-1067, October.
    7. Ruiz-Castillo, Javier, 2012. "From the “European Paradox” to a European Drama in citation impact," UC3M Working papers. Economics we1211, Universidad Carlos III de Madrid. Departamento de Economía.
    8. Tobias Adrian & Richard K. Crump & Erik Vogt, 2019. "Nonlinearity and Flight‐to‐Safety in the Risk‐Return Trade‐Off for Stocks and Bonds," Journal of Finance, American Finance Association, vol. 74(4), pages 1931-1973, August.
    9. Camelia Minoiu & Sanjay Reddy, 2014. "Kernel density estimation on grouped data: the case of poverty assessment," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 12(2), pages 163-189, June.
    10. Wenger, Kai & Leschinski, Christian & Sibbertsen, Philipp, 2018. "A simple test on structural change in long-memory time series," Economics Letters, Elsevier, vol. 163(C), pages 90-94.
    11. Kai Wenger & Christian Leschinski & Philipp Sibbertsen, 2019. "Change-in-mean tests in long-memory time series: a review of recent developments," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 103(2), pages 237-256, June.
    12. repec:hal:spmain:info:hdl:2441/4hgajj9cf48dladkd9pn9jcj4p is not listed on IDEAS
    13. Manabu Asai & Michael McAleer, 2017. "A fractionally integrated Wishart stochastic volatility model," Econometric Reviews, Taylor & Francis Journals, vol. 36(1-3), pages 42-59, March.
    14. Cornelius Christian & Lukas Hensel & Christopher Roth, 2019. "Income Shocks and Suicides: Causal Evidence From Indonesia," The Review of Economics and Statistics, MIT Press, vol. 101(5), pages 905-920, December.
    15. Quoc-Anh Do & Kieu-Trang Nguyen & Anh N. Tran, 2017. "One Mandarin Benefits the Whole Clan: Hometown Favoritism in an Authoritarian Regime," American Economic Journal: Applied Economics, American Economic Association, vol. 9(4), pages 1-29, October.
    16. Campante, Filipe R. & Do, Quoc-Anh & Guimaraes, Bernardo, 2012. "Isolated Capital Cities and Misgovernance: Theory and Evidence," Working Paper Series rwp12-058, Harvard University, John F. Kennedy School of Government.
    17. Hugo Bodory & Martin Huber & Michael Lechner, 2022. "The finite sample performance of instrumental variable-based estimators of the Local Average Treatment Effect when controlling for covariates," Papers 2212.07379, arXiv.org.
    18. Manuel Hernandez & Maximo Torero, 2014. "Parametric versus nonparametric methods in risk scoring: an application to microcredit," Empirical Economics, Springer, vol. 46(3), pages 1057-1079, May.
    19. Xiaohong Chen & Zhipeng Liao & Yixiao Sun, 2012. "Sieve Inference on Semi-nonparametric Time Series Models," Cowles Foundation Discussion Papers 1849, Cowles Foundation for Research in Economics, Yale University.
    20. Filipe Campante & Quoc-Anh Do & Bernardo Guimaraes, 2014. "Capital Cities, Conflict, and Misgovernance: Theory and Evidence," Sciences Po Economics Discussion Papers 2014-13, Sciences Po Departement of Economics.
    21. Damian Kozbur, 2013. "Inference in additively separable models with a high-dimensional set of conditioning variables," ECON - Working Papers 284, Department of Economics - University of Zurich, revised Apr 2018.
    22. Filipe R. Campante & Quoc-Anh Do & Bernardo Guimaraes, 2019. "Capital Cities, Conflict, and Misgovernance," American Economic Journal: Applied Economics, American Economic Association, vol. 11(3), pages 298-337, July.

    More about this item

    Keywords

    Bandwidth Choice; Cross Validation; Explicit Analytical Solution; Nonparametric Density Estimation; Academic Wages;

    JEL classification:

    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • J31 - Labor and Demographic Economics - - Wages, Compensation, and Labor Costs - - - Wage Level and Structure; Wage Differentials
