
Prediction of Stock Market Index Movement by Ten Data Mining Techniques

Authors

  • Phichhang Ou
  • Hengshan Wang

Abstract

The ability to accurately predict the direction of a stock or index price is crucial for market dealers and investors seeking to maximize their profits. Data mining techniques have been shown to forecast stock price movement with high accuracy. Instead of relying on a single method, traders now need to use various forecasting techniques to obtain multiple signals and more information about the future of the markets. In this paper, ten data mining techniques are discussed and applied to predict the price movement of the Hang Seng index of the Hong Kong stock market. The approaches are linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), K-nearest neighbor classification, Naïve Bayes based on kernel estimation, the logit model, tree-based classification, neural networks, Bayesian classification with Gaussian processes, support vector machines (SVM), and least squares support vector machines (LS-SVM). Experimental results show that SVM and LS-SVM outperform the other models. Specifically, SVM is better than LS-SVM for in-sample prediction, but LS-SVM is in turn better than SVM for out-of-sample forecasts in terms of the hit rate and error rate criteria.
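
The hit rate and error rate criteria mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definitions (hit rate is the fraction of correctly predicted up/down moves, error rate is its complement); the abstract does not state the paper's exact formulas. The function names, the closing levels, and the predicted labels are hypothetical and are not taken from the paper or from real Hang Seng data.

```python
from typing import List, Sequence


def direction_labels(prices: Sequence[float]) -> List[int]:
    """Label each step +1 if the level rose versus the previous one, else -1."""
    return [1 if curr > prev else -1 for prev, curr in zip(prices, prices[1:])]


def hit_rate(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of steps whose predicted direction matches the actual one."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def error_rate(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Complement of the hit rate (assumed definition)."""
    return 1.0 - hit_rate(actual, predicted)


# Hypothetical closing levels, not real index data.
closes = [100.0, 101.5, 101.0, 102.2, 103.0, 102.4]
actual = direction_labels(closes)

# Made-up output of some classifier, for illustration only.
predicted = [1, 1, 1, 1, -1]

print("hit rate:", hit_rate(actual, predicted))
print("error rate:", round(error_rate(actual, predicted), 4))
```

In a comparison like the one the abstract describes, the same two numbers would be computed once on the in-sample data and once on the held-out data for each classifier.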

Suggested Citation

  • Phichhang Ou & Hengshan Wang, 2009. "Prediction of Stock Market Index Movement by Ten Data Mining Techniques," Modern Applied Science, Canadian Center of Science and Education, vol. 3(12), pages 1-28, December.
  • Handle: RePEc:ibn:masjnl:v:3:y:2009:i:12:p:28

    Download full text from publisher

    File URL: https://ccsenet.org/journal/index.php/mas/article/download/4586/3925
    Download Restriction: no

    File URL: https://ccsenet.org/journal/index.php/mas/article/view/4586
    Download Restriction: no

    References listed on IDEAS

    1. Yangru Wu & Hua Zhang, 1997. "Forward premiums as unbiased predictors of future currency depreciation: a non-parametric analysis," Journal of International Money and Finance, Elsevier, vol. 16(4), pages 609-623, August.
    2. Karatzoglou, Alexandros & Meyer, David & Hornik, Kurt, 2006. "Support Vector Machines in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 15(i09).
    3. Tay, Francis E. H. & Cao, Lijuan, 2001. "Application of support vector machines in financial time series forecasting," Omega, Elsevier, vol. 29(4), pages 309-317, August.

    Citations

    Cited by:

    1. Jasleen Kaur & Khushdeep Dharni, 2022. "Application and performance of data mining techniques in stock market: A review," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 29(4), pages 219-241, October.
    2. Görkem Ataman & Serpil Kahraman, 2022. "Comparing Decision Trees and Association Rules for Stock Market Expectations in BIST100 and BIST30," Scientific Annals of Economics and Business (continues Analele Stiintifice), Alexandru Ioan Cuza University, Faculty of Economics and Business Administration, vol. 69(3), pages 459-475, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fethi, Meryem Duygun & Pasiouras, Fotios, 2010. "Assessing bank efficiency and performance with operational research and artificial intelligence techniques: A survey," European Journal of Operational Research, Elsevier, vol. 204(2), pages 189-198, July.
    2. Deng, S. & Yeh, Tsung-Han, 2011. "Using least squares support vector machines for the airframe structures manufacturing cost estimation," International Journal of Production Economics, Elsevier, vol. 131(2), pages 701-708, June.
    3. Yanqin Bai & Xin Yan, 2016. "Conic Relaxations for Semi-supervised Support Vector Machines," Journal of Optimization Theory and Applications, Springer, vol. 169(1), pages 299-313, April.
    4. Paolo Sorino & Maria Gabriella Caruso & Giovanni Misciagna & Caterina Bonfiglio & Angelo Campanella & Antonella Mirizzi & Isabella Franco & Antonella Bianco & Claudia Buongiorno & Rosalba Liuzzi & Ann, 2020. "Selecting the best machine learning algorithm to support the diagnosis of Non-Alcoholic Fatty Liver Disease: A meta learner study," PLOS ONE, Public Library of Science, vol. 15(10), pages 1-15, October.
    5. Benítez-Peña, Sandra & Blanquero, Rafael & Carrizosa, Emilio & Ramírez-Cobo, Pepa, 2024. "Cost-sensitive probabilistic predictions for support vector machines," European Journal of Operational Research, Elsevier, vol. 314(1), pages 268-279.
    6. Helder Sebastião & Pedro Godinho & Sjur Westgaard, 2020. "Using Machine Learning to Profit on the Risk Premium of the Nordic Electricity Futures," Scientific Annals of Economics and Business (continues Analele Stiintifice), Alexandru Ioan Cuza University, Faculty of Economics and Business Administration, vol. 67(si), pages 1-17, December.
    7. Hong-Yu Lin & Kuentai Chen, 2015. "The Trend of Average Unit Price in Taipei City," Research in World Economy, Research in World Economy, Sciedu Press, vol. 6(1), pages 133-142, March.
    8. Na Tang & Maoxiang Yuan & Zhijun Chen & Jian Ma & Rui Sun & Yide Yang & Quanyuan He & Xiaowei Guo & Shixiong Hu & Junhua Zhou, 2023. "Machine Learning Prediction Model of Tuberculosis Incidence Based on Meteorological Factors and Air Pollutants," IJERPH, MDPI, vol. 20(5), pages 1-17, February.
    9. Noemi Nava & Tiziana Di Matteo & Tomaso Aste, 2018. "Financial Time Series Forecasting Using Empirical Mode Decomposition and Support Vector Regression," Risks, MDPI, vol. 6(1), pages 1-21, February.
    10. Wei-Chiang Hong & Yucheng Dong & Chien-Yuan Lai & Li-Yueh Chen & Shih-Yung Wei, 2011. "SVR with Hybrid Chaotic Immune Algorithm for Seasonal Load Demand Forecasting," Energies, MDPI, vol. 4(6), pages 1-18, June.
    11. Marius Lux & Wolfgang Karl Härdle & Stefan Lessmann, 2020. "Data driven value-at-risk forecasting using a SVR-GARCH-KDE hybrid," Computational Statistics, Springer, vol. 35(3), pages 947-981, September.
    12. Georgi Nalbantov & Philip Hans Franses & Patrick Groenen & Jan Bioch, 2010. "Estimating the Market Share Attraction Model using Support Vector Regressions," Econometric Reviews, Taylor & Francis Journals, vol. 29(5-6), pages 688-716.
    13. Moro Russ A. & Härdle Wolfgang K. & Schäfer Dorothea, 2017. "Company rating with support vector machines," Statistics & Risk Modeling, De Gruyter, vol. 34(1-2), pages 55-67, June.
    14. T. Law & J. Shawe-Taylor, 2017. "Practical Bayesian support vector regression for financial time series prediction and market condition change detection," Quantitative Finance, Taylor & Francis Journals, vol. 17(9), pages 1403-1416, September.
    15. Cang, Shuang & Yu, Hongnian, 2014. "A combination selection algorithm on forecasting," European Journal of Operational Research, Elsevier, vol. 234(1), pages 127-139.
    16. Hyejung Chung & Kyung-shik Shin, 2018. "Genetic Algorithm-Optimized Long Short-Term Memory Network for Stock Market Prediction," Sustainability, MDPI, vol. 10(10), pages 1-18, October.
    17. Ślepaczuk Robert & Zenkova Maryna, 2018. "Robustness of Support Vector Machines in Algorithmic Trading on Cryptocurrency Market," Central European Economic Journal, Sciendo, vol. 5(52), pages 186-205, January.
    18. Nava, Noemi & Di Matteo, Tiziana & Aste, Tomaso, 2018. "Financial time series forecasting using empirical mode decomposition and support vector regression," LSE Research Online Documents on Economics 91028, London School of Economics and Political Science, LSE Library.
    19. N. Loukeris & I. Eleftheriadis & E. Livanis, 2016. "The Portfolio Heuristic Optimisation System (PHOS)," Computational Economics, Springer;Society for Computational Economics, vol. 48(4), pages 627-648, December.
    20. Heni Boubaker & Giorgio Canarella & Rangan Gupta & Stephen M. Miller, 2023. "A Hybrid ARFIMA Wavelet Artificial Neural Network Model for DJIA Index Forecasting," Computational Economics, Springer;Society for Computational Economics, vol. 62(4), pages 1801-1843, December.

    More about this item

    JEL classification:

    • R00 - Urban, Rural, Regional, Real Estate, and Transportation Economics - - General - - - General
    • Z0 - Other Special Topics - - General
