IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v12y2024i9p1336-d1384579.html

Ensemble Approach Using k-Partitioned Isolation Forests for the Detection of Stock Market Manipulation

Authors
  • Hugo Núñez Delafuente

    (Doctorado en Sistemas de Ingeniería, Faculty of Engineering, Universidad de Talca, Curicó 3340000, Chile)

  • César A. Astudillo

    (Department of Computer Science, Faculty of Engineering, Universidad de Talca, Curicó 3340000, Chile)

  • David Díaz

    (Departamento de Administración, Facultad de Economía y Negocios, Universidad de Chile, Santiago 8330111, Chile)

Abstract

Stock market manipulation, defined as any attempt to artificially influence stock prices, causes financial losses and erodes investor trust. Supervised learning models, the prevailing approach to detecting it, show promise but are limited by the scarcity of labeled data and by their inability to recognize manipulation tactics beyond those explicitly labeled. This study addresses these gaps with an unsupervised detection framework that identifies suspicious hourly manipulation blocks, which removes the dependence on labeled data and improves adaptability to emerging manipulation strategies. The methodology creates features that reflect the behavior of stocks across various time windows and then segments the dataset into k subsets. Potential manipulation instances are identified by a voting ensemble of k isolation forest models, chosen for their efficiency in pinpointing anomalies and for their linear computational complexity, both of which are critical for analyzing large datasets. Evaluated on eight real stocks known to have undergone manipulation, the approach identified up to 89% of manipulated blocks, significantly outperforming previous methods that do not use a voting ensemble. This result surpasses the detection rates reported in prior studies and underscores the robustness and adaptability of the unsupervised model in uncovering varied manipulation schemes. The contribution is a scalable and efficient unsupervised learning strategy for stock manipulation detection that advances on traditional supervised methods and supports more resilient financial markets.
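
The partition-and-vote idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the k subsets are formed, how each forest's output becomes a vote, or how votes are combined. The sketch assumes a random partition, assumes each forest flags the top `contamination` fraction of all rows by anomaly score, and assumes a strict majority vote. The isolation forest is a simplified pure-Python version, the data are synthetic, and every name (`vote_counts`, `flag_suspicious`, `make_demo_rows`) is hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649

def c_factor(n):
    """Expected path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n

def build_tree(rows, depth, max_depth, rng):
    """Isolate rows with random axis-aligned splits until the depth limit."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    dim = rng.randrange(len(rows[0]))
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:
        return ("leaf", len(rows))
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            build_tree(left, depth + 1, max_depth, rng),
            build_tree(right, depth + 1, max_depth, rng))

def path_length(tree, x, depth=0):
    """Depth at which x lands, plus the adjustment for unsplit leaf points."""
    if tree[0] == "leaf":
        return depth + c_factor(tree[1])
    _, dim, split, left, right = tree
    return path_length(left if x[dim] < split else right, x, depth + 1)

class IsolationForest:
    """Simplified isolation forest (illustrative, not a library class)."""

    def __init__(self, n_trees=50, sample_size=64, seed=0):
        self.n_trees, self.sample_size, self.seed = n_trees, sample_size, seed

    def fit(self, rows):
        rng = random.Random(self.seed)
        self.m = min(self.sample_size, len(rows))
        max_depth = math.ceil(math.log2(self.m))
        self.trees = [build_tree(rng.sample(rows, self.m), 0, max_depth, rng)
                      for _ in range(self.n_trees)]
        return self

    def score(self, x):
        """Anomaly score in (0, 1]; higher means x is easier to isolate."""
        mean_path = sum(path_length(t, x) for t in self.trees) / self.n_trees
        return 2.0 ** (-mean_path / c_factor(self.m))

def vote_counts(rows, k=4, contamination=0.1, seed=0):
    """Votes per row from k forests, each fitted on one random partition."""
    if len(rows) < 2 * k:
        raise ValueError("need at least two rows per partition")
    rng = random.Random(seed)
    order = list(range(len(rows)))
    rng.shuffle(order)
    n_flag = max(1, round(contamination * len(rows)))
    votes = [0] * len(rows)
    for j in range(k):
        part = [rows[i] for i in order[j::k]]      # j-th disjoint subset
        forest = IsolationForest(seed=seed + j).fit(part)
        scores = [forest.score(x) for x in rows]   # score every row
        cutoff = sorted(scores, reverse=True)[n_flag - 1]
        for i, s in enumerate(scores):
            votes[i] += s >= cutoff                # assumed voting rule
    return votes

def flag_suspicious(rows, k=4, contamination=0.1, seed=0):
    """Assumed combination rule: flag rows that a strict majority votes for."""
    return [v > k / 2 for v in vote_counts(rows, k, contamination, seed)]

def make_demo_rows(n=200, seed=0):
    """Synthetic 2-D feature rows: Gaussian bulk, one typical, one extreme."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    rows.append([0.0, 0.0])    # typical row
    rows.append([10.0, 10.0])  # extreme row
    return rows
```

Calling `flag_suspicious(make_demo_rows())` returns one boolean per row. In a real setting, each row would hold the time-window features of one hourly block, and `k`, `contamination`, and the voting threshold would need tuning against known manipulation cases.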

Suggested Citation

  • Hugo Núñez Delafuente & César A. Astudillo & David Díaz, 2024. "Ensemble Approach Using k-Partitioned Isolation Forests for the Detection of Stock Market Manipulation," Mathematics, MDPI, vol. 12(9), pages 1-18, April.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:9:p:1336-:d:1384579

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/9/1336/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/9/1336/
    Download Restriction: no


