
Distant diversity in dynamic class prediction

Author

Listed:
  • Şenay Yaşar Sağlam

    (Ministry of Business, Innovation, and Employment)

  • W. Nick Street

    (University of Iowa)

Abstract

Instead of using the same ensemble for all data instances, recent studies have focused on dynamic ensembles, in which a new ensemble is chosen from a pool of classifiers for each new data instance. Classifier agreement in the region where a new data instance resides has been considered a major factor in dynamic ensembles. We postulate that the classifiers chosen for a dynamic ensemble should behave similarly in the region in which the new instance resides, but differently outside of this area. In other words, we hypothesize that high local accuracy, combined with high diversity in other regions, is desirable. To verify this hypothesis, we propose two approaches. The first finds the k-nearest data instances to the new instance, which define a neighborhood, and simultaneously maximizes local accuracy and distant diversity, the latter based on data instances outside the neighborhood. The second uses an alternative definition of the neighborhood: all data instances are in the neighborhood, but the importance of each data instance for accuracy and diversity depends on its distance to the new instance. We demonstrate through several experiments that distance-based diversity and accuracy outperform all benchmark methods.
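The two neighborhood definitions in the abstract can be illustrated with a small sketch. This is not the authors' formulation, which this record does not give. It assumes Euclidean distance, pairwise disagreement as the diversity measure, a 1/(1+d) decay for the distance-based weights, and "far" weights defined as one minus the "near" weights. All function names and the toy data are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_accuracy(pred, labels, weights):
    """Weighted fraction of instances a classifier labels correctly."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w for p, y, w in zip(pred, labels, weights) if p == y) / total

def weighted_disagreement(pred_a, pred_b, weights):
    """Weighted fraction of instances on which two classifiers differ
    (assumed diversity measure)."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w for a, b, w in zip(pred_a, pred_b, weights) if a != b) / total

def knn_weights(X, query, k):
    """Approach 1 (hard neighborhood): the k nearest instances get near
    weight 1; every other instance gets far weight 1."""
    order = sorted(range(len(X)), key=lambda i: euclidean(X[i], query))
    nearest = set(order[:k])
    near = [1.0 if i in nearest else 0.0 for i in range(len(X))]
    far = [1.0 - w for w in near]
    return near, far

def distance_weights(X, query):
    """Approach 2 (soft neighborhood): every instance counts, with a near
    weight that decays with distance (assumed form: 1 / (1 + d))."""
    near = [1.0 / (1.0 + euclidean(x, query)) for x in X]
    far = [1.0 - w for w in near]
    return near, far

def score_pair(preds, labels, i, j, near, far):
    """Return (mean local accuracy, distant diversity) for classifiers i, j."""
    acc = 0.5 * (weighted_accuracy(preds[i], labels, near)
                 + weighted_accuracy(preds[j], labels, near))
    div = weighted_disagreement(preds[i], preds[j], far)
    return acc, div

# Toy validation set: two well-separated clusters, one class each.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [0, 0, 0, 1, 1, 1]
preds = [
    [0, 0, 0, 1, 1, 1],  # classifier 0: correct everywhere
    [0, 0, 0, 0, 0, 0],  # classifier 1: correct near the query, wrong far away
    [1, 1, 0, 1, 1, 1],  # classifier 2: weak near the query
]
query = (0.2, 0.2)

near, far = knn_weights(X, query, k=3)
hard_01 = score_pair(preds, labels, 0, 1, near, far)  # (1.0, 1.0)
hard_02 = score_pair(preds, labels, 0, 2, near, far)

soft_near, soft_far = distance_weights(X, query)
soft_01 = score_pair(preds, labels, 0, 1, soft_near, soft_far)
```

Under these assumptions, classifiers 0 and 1 agree and are accurate around the query but disagree on every distant instance, so the pair scores highest on both criteria. How the two scores are combined into a single selection objective is left open here, since the abstract only states that they are maximized simultaneously.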

Suggested Citation

  • Şenay Yaşar Sağlam & W. Nick Street, 2018. "Distant diversity in dynamic class prediction," Annals of Operations Research, Springer, vol. 263(1), pages 5-19, April.
  • Handle: RePEc:spr:annopr:v:263:y:2018:i:1:d:10.1007_s10479-016-2328-8
    DOI: 10.1007/s10479-016-2328-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-016-2328-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Citations

    Cited by:

    1. Sergey Kovalev & Isabelle Chalamon & Fabio J. Petani, 2023. "Maximizing single attribute diversity in group selection," Annals of Operations Research, Springer, vol. 320(1), pages 535-540, January.
    2. Pablo Aparicio-Ruiz & Elena Barbadilla-Martín & José Guadix & Pablo Cortés, 2021. "KNN and adaptive comfort applied in decision making for HVAC systems," Annals of Operations Research, Springer, vol. 303(1), pages 217-231, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alaleh Razmjoo & Petros Xanthopoulos & Qipeng Phil Zheng, 2019. "Feature importance ranking for classification in mixed online environments," Annals of Operations Research, Springer, vol. 276(1), pages 315-330, May.
    2. Konstantin Kogan & Avi Herbon & Beatrice Venturi, 2020. "Direct marketing of an event under hazards of customer saturation and forgetting," Annals of Operations Research, Springer, vol. 295(1), pages 207-227, December.
    3. Sylvie Tchumtchoua & Dipak Dey, 2012. "Modeling Associations Among Multivariate Longitudinal Categorical Variables in Survey Data: A Semiparametric Bayesian Approach," Psychometrika, Springer;The Psychometric Society, vol. 77(4), pages 670-692, October.
    4. Filipa Fernandes & Charalampos Stasinakis & Zivile Zekaite, 2019. "Forecasting government bond spreads with heuristic models: evidence from the Eurozone periphery," Annals of Operations Research, Springer, vol. 282(1), pages 87-118, November.
    5. Yiting Xing & Ling Li & Zhuming Bi & Marzena Wilamowska‐Korsak & Li Zhang, 2013. "Operations Research (OR) in Service Industries: A Comprehensive Review," Systems Research and Behavioral Science, Wiley Blackwell, vol. 30(3), pages 300-353, May.
    6. Andrew Kusiak & Xiupeng Wei, 2014. "Prediction of methane production in wastewater treatment facility: a data-mining approach," Annals of Operations Research, Springer, vol. 216(1), pages 71-81, May.
    7. Kazim Topuz & Hasmet Uner & Asil Oztekin & Mehmet Bayram Yildirim, 2018. "Predicting pediatric clinic no-shows: a decision analytic framework using elastic net and Bayesian belief network," Annals of Operations Research, Springer, vol. 263(1), pages 479-499, April.
