
A Semisupervised Concept Drift Adaptation via Prototype-Based Manifold Regularization Approach with Knowledge Transfer

Author

Listed:
  • Muhammad Zafran Muhammad Zaly Shah

    (Faculty of Computing, Universiti Teknologi Malaysia, Iskandar Puteri 81310, Malaysia)

  • Anazida Zainal

    (Faculty of Computing, Universiti Teknologi Malaysia, Iskandar Puteri 81310, Malaysia)

  • Taiseer Abdalla Elfadil Eisa

    (Department of Information Systems-Girls Section, King Khalid University, Mahayil 62529, Saudi Arabia)

  • Hashim Albasheer

    (Department of Information Systems, College of Computer Science, King Khalid University, Abha 61421, Saudi Arabia)

  • Fuad A. Ghaleb

    (Faculty of Computing, Universiti Teknologi Malaysia, Iskandar Puteri 81310, Malaysia)

Abstract

Data stream mining deals with processing large amounts of data in nonstationary environments, where the relationship between the data and the labels often changes. Such dynamic relationships make it difficult to design a data stream processing algorithm that is both computationally efficient and adaptable to the nonstationarity of the environment. To provide this adaptability, concept drift detectors are attached to the algorithm; they detect changes in the environment by monitoring error rates, so that the algorithm can adapt to the environment’s current state. Unfortunately, current approaches to adapting to environmental changes assume that the data stream is fully labeled. This assumption is flawed because labeling every instance is impractical given the rapid arrival and volume of the data. To address this issue, this study proposes to detect concept drift by anticipating a possible change in the true label in the high-confidence prediction region. This study also proposes an ensemble-based concept drift adaptation approach that transfers reliable classifiers to the new concept. The significance of the proposed approach compared to current baselines is that it does not use a performance measure as the drift signal or assume a change in data distribution when concept drift occurs. As a result, the proposed approach can detect concept drift when labeled data are scarce, even when the data distribution remains static. Based on the results, the proposed approach can detect concept drifts comparably to fully supervised data stream mining approaches and performs well on mixed-severity concept drift datasets.
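The drift signal described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only shows, under simplifying assumptions, how the few labels that do arrive can be checked against high-confidence predictions, so that a label change is noticed even when the inputs themselves do not move. The names PrototypeClassifier and HighConfidenceDriftDetector, the softmax-over-distances confidence, and the threshold values are all hypothetical choices made for this illustration.

```python
import math
from collections import deque


class PrototypeClassifier:
    """Nearest-prototype classifier: one running-mean prototype per class."""

    def __init__(self):
        self.prototypes = {}  # label -> (mean vector, sample count)

    def learn(self, x, y):
        mean, n = self.prototypes.get(y, ([0.0] * len(x), 0))
        n += 1
        # Incremental mean update for the prototype of class y.
        mean = [m + (xi - m) / n for m, xi in zip(mean, x)]
        self.prototypes[y] = (mean, n)

    def predict(self, x):
        """Return (label, confidence); confidence is a softmax over -distance."""
        dists = {y: math.dist(x, mean) for y, (mean, _) in self.prototypes.items()}
        weights = {y: math.exp(-d) for y, d in dists.items()}
        label = min(dists, key=dists.get)
        return label, weights[label] / sum(weights.values())


class HighConfidenceDriftDetector:
    """Flags drift when labels keep contradicting high-confidence predictions.

    Assumed parameters (illustrative only): `confidence` is the cut-off for
    the high-confidence region, `window` is how many recent checks are kept,
    and `max_conflicts` is how many contradictions trigger a drift flag.
    """

    def __init__(self, confidence=0.8, window=20, max_conflicts=5):
        self.confidence = confidence
        self.max_conflicts = max_conflicts
        self.recent = deque(maxlen=window)

    def update(self, predicted, conf, true_label):
        """Feed one labeled sample; return True when drift is flagged."""
        if conf < self.confidence:
            return False  # samples outside the high-confidence region are ignored
        self.recent.append(predicted != true_label)
        if sum(self.recent) >= self.max_conflicts:
            self.recent.clear()
            return True
        return False


def run_demo(swap_labels=True):
    """Return the stream index at which drift is flagged, or None."""
    clf = PrototypeClassifier()
    for _ in range(10):
        clf.learn([0.0, 0.0], 0)
        clf.learn([5.0, 5.0], 1)
    detector = HighConfidenceDriftDetector()
    # The inputs stay at the same two locations; only the labels may swap.
    a, b = (1, 0) if swap_labels else (0, 1)
    stream = [([0.0, 0.0], a), ([5.0, 5.0], b)] * 5
    for t, (x, y) in enumerate(stream):
        predicted, conf = clf.predict(x)
        if detector.update(predicted, conf, y):
            return t
    return None


if __name__ == "__main__":
    print(run_demo(swap_labels=True))
    print(run_demo(swap_labels=False))
```

In this toy stream the inputs never change location, so a detector that watches only the input distribution would see nothing, whereas the label check inside the high-confidence region raises a flag once enough contradictions accumulate. The sketch omits the ensemble-based adaptation step and the manifold regularization named in the title.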

Suggested Citation

  • Muhammad Zafran Muhammad Zaly Shah & Anazida Zainal & Taiseer Abdalla Elfadil Eisa & Hashim Albasheer & Fuad A. Ghaleb, 2023. "A Semisupervised Concept Drift Adaptation via Prototype-Based Manifold Regularization Approach with Knowledge Transfer," Mathematics, MDPI, vol. 11(2), pages 1-30, January.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:2:p:355-:d:1030302

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/2/355/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/2/355/
    Download Restriction: no

    References listed on IDEAS

    1. Paul Fearnhead & Guillem Rigaill, 2019. "Changepoint Detection in the Presence of Outliers," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(525), pages 169-183, January.
    2. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W. & Lessmann, Stefan, 2020. "Incorporating textual information in customer churn prediction models based on a convolutional neural network," International Journal of Forecasting, Elsevier, vol. 36(4), pages 1563-1578.
    3. Li, Jing & Stones, Rebecca J. & Wang, Gang & Liu, Xiaoguang & Li, Zhongwei & Xu, Ming, 2017. "Hard drive failure prediction using Decision Trees," Reliability Engineering and System Safety, Elsevier, vol. 164(C), pages 55-65.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Abedin, Mohammad Zoynul & Hajek, Petr & Sharif, Taimur & Satu, Md. Shahriare & Khan, Md. Imran, 2023. "Modelling bank customer behaviour using feature engineering and classification techniques," Research in International Business and Finance, Elsevier, vol. 65(C).
    2. Louis Geiler & Séverine Affeldt & Mohamed Nadif, 2022. "A survey on machine learning methods for churn prediction," Post-Print hal-03824873, HAL.
    3. Christopher Gerling & Stefan Lessmann, 2023. "Multimodal Document Analytics for Banking Process Automation," Papers 2307.11845, arXiv.org, revised Nov 2023.
    4. Kang-Ping Lu & Shao-Tung Chang, 2021. "Robust Algorithms for Change-Point Regressions Using the t-Distribution," Mathematics, MDPI, vol. 9(19), pages 1-28, September.
    5. Liu, Zhenkun & Zhang, Ying & Abedin, Mohammad Zoynul & Wang, Jianzhou & Yang, Hufang & Gao, Yuyang & Chen, Yinghao, 2024. "Profit-driven fusion framework based on bagging and boosting classifiers for potential purchaser prediction," Journal of Retailing and Consumer Services, Elsevier, vol. 79(C).
    6. Chen, Yan & Zhang, Lei & Zhao, Yulu & Xu, Bing, 2022. "Implementation of penalized survival models in churn prediction of vehicle insurance," Journal of Business Research, Elsevier, vol. 153(C), pages 162-171.
    7. Lu Shaochuan, 2023. "Scalable Bayesian Multiple Changepoint Detection via Auxiliary Uniformisation," International Statistical Review, International Statistical Institute, vol. 91(1), pages 88-113, April.
    8. Ricardo C. Pedroso & Rosangela H. Loschi & Fernando Andrés Quintana, 2023. "Multipartition model for multiple change point identification," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(2), pages 759-783, June.
    9. Liu, Zhenkun & Jiang, Ping & De Bock, Koen W. & Wang, Jianzhou & Zhang, Lifang & Niu, Xinsong, 2024. "Extreme gradient boosting trees with efficient Bayesian optimization for profit-driven customer churn prediction," Technological Forecasting and Social Change, Elsevier, vol. 198(C).
    10. Kang-Ping Lu & Shao-Tung Chang, 2023. "An Advanced Segmentation Approach to Piecewise Regression Models," Mathematics, MDPI, vol. 11(24), pages 1-23, December.
    11. Trevor Harris & Bo Li & J. Derek Tucker, 2022. "Scalable multiple changepoint detection for functional data sequences," Environmetrics, John Wiley & Sons, Ltd., vol. 33(2), March.
    12. Lewlisa Saha & Hrudaya Kumar Tripathy & Tarek Gaber & Hatem El-Gohary & El-Sayed M. El-kenawy, 2023. "Deep Churn Prediction Method for Telecommunication Industry," Sustainability, MDPI, vol. 15(5), pages 1-21, March.
    13. Cho, Haeran & Kirch, Claudia, 2024. "Data segmentation algorithms: Univariate mean change and beyond," Econometrics and Statistics, Elsevier, vol. 30(C), pages 76-95.
    14. Lee, Sangyeol & Meintanis, Simos G. & Pretorius, Charl, 2022. "Monitoring procedures for strict stationarity based on the multivariate characteristic function," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    15. Haoran Lu & Dianpeng Wang, 2024. "Grouped Change-Points Detection and Estimation in Panel Data," Mathematics, MDPI, vol. 12(5), pages 1-20, March.
    16. Borchert, Philipp & Coussement, Kristof & De Caigny, Arno & De Weerdt, Jochen, 2023. "Extending business failure prediction models with textual website content using deep learning," European Journal of Operational Research, Elsevier, vol. 306(1), pages 348-357.
    17. K. Coussement & K. W. Bock & S. Geuens, 2022. "A decision-analytic framework for interpretable recommendation systems with multiple input data sources: a case study for a European e-tailer," Annals of Operations Research, Springer, vol. 315(2), pages 671-694, August.
    18. David Hason Rudd & Huan Huo & Md. Rafiqul Islam & Guandong Xu, 2023. "Churn Prediction via Multimodal Fusion Learning: Integrating Customer Financial Literacy, Voice, and Behavioral Data [Prédiction du churn par apprentissage fusionné multimodal : intégration de la l," Post-Print hal-04320145, HAL.
    19. Lamrhari, Soumaya & Ghazi, Hamid El & Oubrich, Mourad & Faker, Abdellatif El, 2022. "A social CRM analytic framework for improving customer retention, acquisition, and conversion," Technological Forecasting and Social Change, Elsevier, vol. 174(C).
    20. Lazar, Emese & Wang, Shixuan & Xue, Xiaohan, 2023. "Loss function-based change point detection in risk measures," European Journal of Operational Research, Elsevier, vol. 310(1), pages 415-431.
