
10 Challenging Problems In Data Mining Research

Authors
  • QIANG YANG

    (Department of Computer Science, Hong Kong University of Science and Technology, Clearwater Bay, Kowloon, Hong Kong, China)

  • XINDONG WU

    (Department of Computer Science, University of Vermont, 33 Colchester Avenue, Burlington, Vermont 05405, USA)

Abstract

In October 2005, we took an initiative to identify 10 challenging problems in data mining research, by consulting some of the most active researchers in data mining and machine learning for their opinions on what are considered important and worthy topics for future research in data mining. We hope their insights will inspire new research efforts, and give young researchers (including PhD students) a high-level guideline as to where the hot problems are located in data mining. Due to the limited amount of time, we were only able to send out our survey requests to the organizers of the IEEE ICDM and ACM KDD conferences, and we received an overwhelming response. We are very grateful for the contributions provided by these researchers despite their busy schedules. This short article serves to summarize the 10 most challenging problems of the 14 responses we have received from this survey. The order of the listing does not reflect their level of importance.

Suggested Citation

  • Qiang Yang & Xindong Wu, 2006. "10 Challenging Problems In Data Mining Research," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 5(04), pages 597-604.
  • Handle: RePEc:wsi:ijitdm:v:05:y:2006:i:04:n:s0219622006002258
    DOI: 10.1142/S0219622006002258

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219622006002258
    Download Restriction: Access to full text is restricted to subscribers


    Citations



    Cited by:

    1. Harshita Patel & Dharmendra Singh Rajput & G Thippa Reddy & Celestine Iwendi & Ali Kashif Bashir & Ohyun Jo, 2020. "A review on classification of imbalanced data for wireless sensor networks," International Journal of Distributed Sensor Networks, vol. 16(4), pages 15501477209, April.
    2. Ionuţ ŢĂRANU, 2016. "Data mining in healthcare: decision making and precision," Database Systems Journal, Academy of Economic Studies - Bucharest, Romania, vol. 6(4), pages 33-40, May.
    3. Qi Liu & Gengzhong Feng & Nengmin Wang & Giri Kumar Tayi, 2018. "A multi-objective model for discovering high-quality knowledge based on data quality and prior knowledge," Information Systems Frontiers, Springer, vol. 20(2), pages 401-416, April.
    4. Hady Suryono & Heri Kuswanto & Nur Iriawan, 2022. "Two-Phase Stratified Random Forest for Paddy Growth Phase Classification: A Case of Imbalanced Data," Sustainability, MDPI, vol. 14(22), pages 1-13, November.
    5. Pancheng Wang & Shasha Li & Haifang Zhou & Jintao Tang & Ting Wang, 2019. "Cited text spans identification with an improved balanced ensemble model," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(3), pages 1111-1145, September.
    6. Yan Li & Manoj Thomas & Kweku-Muata Osei-Bryson & Jason Levy, 2016. "Problem Formulation in Knowledge Discovery via Data Analytics (KDDA) for Environmental Risk Management," IJERPH, MDPI, vol. 13(12), pages 1-17, December.
    7. Neda Abdelhamid & Arun Padmavathy & David Peebles & Fadi Thabtah & Daymond Goulder-Horobin, 2020. "Data Imbalance in Autism Pre-Diagnosis Classification Systems: An Experimental Study," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 19(01), pages 1-16, March.
    8. Li, Hailin, 2017. "Distance measure with improved lower bound for multivariate time series," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 468(C), pages 622-637.
    9. Liao, Jui-Jung & Shih, Ching-Hui & Chen, Tai-Feng & Hsu, Ming-Fu, 2014. "An ensemble-based model for two-class imbalanced financial problem," Economic Modelling, Elsevier, vol. 37(C), pages 175-183.
    10. Keng-Hoong Ng & Chin-Kuan Ho & Somnuk Phon-Amnuaisuk, 2012. "A Hybrid Distance Measure for Clustering Expressed Sequence Tags Originating from the Same Gene Family," PLOS ONE, Public Library of Science, vol. 7(10), pages 1-14, October.
    11. DE CNUDDE, Sofie & MARTENS, David & EVGENIOU, Theodoros & PROVOST, Foster, 2017. "A benchmarking study of classification techniques for behavioral data," Working Papers 2017005, University of Antwerp, Faculty of Business and Economics.
    12. Vilém Novák & Soheyla Mirshahi, 2021. "On the Similarity and Dependence of Time Series," Mathematics, MDPI, vol. 9(5), pages 1-14, March.
    13. Riesgo García, María Victoria & Krzemień, Alicja & Manzanedo del Campo, Miguel Ángel & Escanciano García-Miranda, Carmen & Sánchez Lasheras, Fernando, 2018. "Rare earth elements price forecasting by means of transgenic time series developed with ARIMA models," Resources Policy, Elsevier, vol. 59(C), pages 95-102.
