
Benchmarking local classification methods

Author

Listed:
  • Bernd Bischl
  • Julia Schiffner
  • Claus Weihs

Abstract

In recent years, an increasing number of so-called local classification methods have been developed in the fields of statistics and machine learning. Local approaches to classification are not new, but they have lately become popular. Well-known examples are the k nearest neighbors method and classification trees. However, most publications on this topic use the term “local” without explaining its particular meaning. Little is known about the properties of local methods and the types of classification problems for which they may be beneficial. We explain the basic principles and introduce the most important variants of local methods. To our knowledge, very few extensive studies in the literature compare several types of local and global methods across many data sets. To assess their performance, we conduct a benchmark study on real-world and synthetic tasks. We cluster the data sets and the learning algorithms under consideration with regard to the obtained performance structures and try to relate our theoretical considerations and intuitions to these results. We also address some general issues of benchmark studies and cover some pitfalls, extensions and improvements. Copyright Springer-Verlag Berlin Heidelberg 2013

Suggested Citation

  • Bernd Bischl & Julia Schiffner & Claus Weihs, 2013. "Benchmarking local classification methods," Computational Statistics, Springer, vol. 28(6), pages 2599-2619, December.
  • Handle: RePEc:spr:compst:v:28:y:2013:i:6:p:2599-2619
    DOI: 10.1007/s00180-013-0420-y

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1007/s00180-013-0420-y
    Download Restriction: Access to full text is restricted to subscribers.


    References listed on IDEAS

    1. Binder, Harald & Schumacher, Martin, 2008. "Adapting Prediction Error Estimates for Biased Complexity Selection in High-Dimensional Bootstrap Samples," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 7(1), pages 1-28, March.
    2. Zhang, Chun-Xia & Zhang, Jiang-She, 2008. "A local boosting algorithm for solving classification problems," Computational Statistics & Data Analysis, Elsevier, vol. 52(4), pages 1928-1941, January.
    3. Hand, D.J. & Vinciotti, V., 2003. "Local Versus Global Models for Classification Problems: Fitting Models Where it Matters," The American Statistician, American Statistical Association, vol. 57, pages 124-131, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Stefanie Hieke & Axel Benner & Richard F Schlenk & Martin Schumacher & Lars Bullinger & Harald Binder, 2016. "Identifying Prognostic SNPs in Clinical Cohorts: Complementing Univariate Analyses by Resampling and Multivariable Modeling," PLOS ONE, Public Library of Science, vol. 11(5), pages 1-18, May.
    2. Rokach, Lior, 2009. "Taxonomy for characterizing ensemble methods in classification tasks: A review and annotated bibliography," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 4046-4072, October.
    3. Christine Porzelius & Martin Schumacher & Harald Binder, 2011. "The benefit of data-based model complexity selection via prediction error curves in time-to-event data," Computational Statistics, Springer, vol. 26(2), pages 293-302, June.
    4. Travis J. Berge, 2015. "Predicting Recessions with Leading Indicators: Model Averaging and Selection over the Business Cycle," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 34(6), pages 455-471, September.
    5. Rybizki, Lydia, 2014. "Learning cost sensitive binary classification rules accounting for uncertain and unequal misclassification costs," FAU Discussion Papers in Economics 01/2014, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics.
    6. Mogensen, Ulla B. & Ishwaran, Hemant & Gerds, Thomas A., 2012. "Evaluating Random Forests for Survival Analysis Using Prediction Error Curves," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 50(i11).
    7. Xinyu Zhang & Alan T. K. Wan & Sherry Z. Zhou, 2011. "Focused Information Criteria, Model Selection, and Model Averaging in a Tobit Model With a Nonzero Threshold," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 30(1), pages 132-142, June.
    8. Werner Ehm & Tilmann Gneiting & Alexander Jordan & Fabian Krüger, 2016. "Of quantiles and expectiles: consistent scoring functions, Choquet representations and forecast rankings," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(3), pages 505-562, June.
    9. Jordà, Òscar & Taylor, Alan M., 2012. "The carry trade and fundamentals: Nothing to fear but FEER itself," Journal of International Economics, Elsevier, vol. 88(1), pages 74-90.
    10. Chun-Xia Zhang & Guan-Wei Wang & Jun-Min Liu, 2015. "RandGA: injecting randomness into parallel genetic algorithm for variable selection," Journal of Applied Statistics, Taylor & Francis Journals, vol. 42(3), pages 630-647, March.
    11. Hand, David J., 2009. "Mining the past to determine the future: Problems and possibilities," International Journal of Forecasting, Elsevier, vol. 25(3), pages 441-451, July.
    12. Lore Zumeta-Olaskoaga & Maximilian Weigert & Jon Larruskain & Eder Bikandi & Igor Setuain & Josean Lekue & Helmut Küchenhoff & Dae-Jin Lee, 2023. "Prediction of sports injuries in football: a recurrent time-to-event approach using regularized Cox models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 107(1), pages 101-126, March.
    13. Michael T. Owyang & Jeremy Piger & Howard J. Wall, 2015. "Forecasting National Recessions Using State‐Level Data," Journal of Money, Credit and Banking, Blackwell Publishing, vol. 47(5), pages 847-866, August.
    14. Czogiel, Irina & Luebke, Karsten & Zentgraf, Marc & Weihs, Claus, 2006. "Localized Linear Discriminant Analysis," Technical Reports 2006,10, Technische Universität Dortmund, Sonderforschungsbereich 475: Komplexitätsreduktion in multivariaten Datenstrukturen.
    15. Zhang, Chun-Xia & Zhang, Jiang-She & Zhang, Gai-Ying, 2009. "Using Boosting to prune Double-Bagging ensembles," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1218-1231, February.
    16. Edgar C. Merkle & Mark Steyvers, 2013. "Choosing a Strictly Proper Scoring Rule," Decision Analysis, INFORMS, vol. 10(4), pages 292-304, December.
    17. Jasdeep S. Banga & B. Wade Brorsen, 2019. "Profitability of alternative methods of combining the signals from technical trading systems," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 26(1), pages 32-45, January.
    18. Chun-Xia Zhang & Guan-Wei Wang & Jiang-She Zhang, 2012. "An empirical bias-variance analysis of DECORATE ensemble method at different training sample sizes," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(4), pages 829-850, September.
    19. Travis J. Berge, 2014. "Forecasting Disconnected Exchange Rates," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 29(5), pages 713-735, August.
    20. Hofer, Vera & Krempl, Georg, 2013. "Drift mining in data: A framework for addressing drift in classification," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 377-391.
