
Dimensionality reduction techniques to support insider trading detection

Authors
  • Adele Ravagnani
  • Fabrizio Lillo
  • Paola Deriu
  • Piero Mazzarisi
  • Francesca Medda
  • Antonio Russo

Abstract

Identifying market abuse is an extremely complicated activity that requires the analysis of large and complex datasets. We propose an unsupervised machine learning method for contextual anomaly detection that supports market surveillance aimed at identifying potential insider trading. The method belongs to the reconstruction-based paradigm and employs principal component analysis and autoencoders as dimensionality reduction techniques. Its only input is the trading position of each investor active on the asset for which there is a price-sensitive event (PSE). After the reconstruction errors of the trading profiles are determined, several conditions are imposed to identify investors whose behavior could indicate insider trading related to the PSE. As a case study, we apply the method to investor-resolved data on Italian stocks around takeover bids.
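
The reconstruction-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the data layout, the number of components, or the conditions imposed on the errors. The sketch assumes a matrix whose rows are investors and whose columns are positions at successive time steps, uses synthetic data, and replaces the paper's conditions with a simple quantile threshold. The names `pca_reconstruction_errors` and `flag_anomalies` are hypothetical.

```python
import numpy as np

def pca_reconstruction_errors(X, n_components):
    """Squared reconstruction error of each row after projecting the
    centered data onto the top `n_components` principal directions."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                  # shape (n_features, n_components)
    X_hat = Xc @ V @ V.T + mean              # low-rank reconstruction
    return np.sum((X - X_hat) ** 2, axis=1)

def flag_anomalies(errors, quantile=0.95):
    """Assumed rule: flag rows whose error exceeds the given quantile."""
    return errors > np.quantile(errors, quantile)

# Synthetic trading profiles (an assumption, not data from the paper):
# 200 investors, 20 time steps, positions close to a 2-dimensional subspace.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 20))
coeffs = rng.normal(size=(200, 2))
X = coeffs @ basis + 0.01 * rng.normal(size=(200, 20))

# Investor 0 builds an unusual position in the last 5 steps before the event.
X[0, -5:] += 5.0

errors = pca_reconstruction_errors(X, n_components=2)
flags = flag_anomalies(errors)
print("largest reconstruction error at investor", int(np.argmax(errors)))
print("number of flagged investors:", int(flags.sum()))
```

A profile that the low-dimensional representation reconstructs poorly is only a candidate for review. In the paper, further conditions are applied to the reconstruction errors before an investor is considered suspicious.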

Suggested Citation

  • Adele Ravagnani & Fabrizio Lillo & Paola Deriu & Piero Mazzarisi & Francesca Medda & Antonio Russo, 2024. "Dimensionality reduction techniques to support insider trading detection," Papers 2403.00707, arXiv.org, revised May 2024.
  • Handle: RePEc:arx:papers:2403.00707

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2403.00707
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Piero Mazzarisi & Adele Ravagnani & Paola Deriu & Fabrizio Lillo & Francesca Medda & Antonio Russo, 2022. "A machine learning approach to support decision in insider trading detection," Papers 2212.05912, arXiv.org.
    2. Merola, Giovanni Maria & Chen, Gemai, 2019. "Projection sparse principal component analysis: An efficient least squares method," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 366-382.
    3. Benaych-Georges, Florent & Nadakuditi, Raj Rao, 2012. "The singular values and vectors of low rank perturbations of large rectangular random matrices," Journal of Multivariate Analysis, Elsevier, vol. 111(C), pages 120-135.
    4. Tuncer, Yalcin & Tanik, Murat M. & Allison, David B., 2008. "An overview of statistical decomposition techniques applied to complex systems," Computational Statistics & Data Analysis, Elsevier, vol. 52(5), pages 2292-2310, January.
    5. Anshul Verma & Orazio Angelini & Tiziana Di Matteo, 2019. "A new set of cluster driven composite development indicators," Papers 1911.11226, arXiv.org, revised Mar 2020.
    6. Dray, Stephane, 2008. "On the number of principal components: A test of dimensionality based on measurements of similarity between matrices," Computational Statistics & Data Analysis, Elsevier, vol. 52(4), pages 2228-2237, January.
    7. Landgraf, Andrew J. & Lee, Yoonkyung, 2020. "Dimensionality reduction for binary data through the projection of natural parameters," Journal of Multivariate Analysis, Elsevier, vol. 180(C).
    8. Beramendi, Pablo & Stegmueller, Daniel, 2016. "The Political Geography Of The Eurocrisis," CAGE Online Working Paper Series 278, Competitive Advantage in the Global Economy (CAGE).
    9. Bai, Jushan & Ng, Serena, 2019. "Rank regularized estimation of approximate factor models," Journal of Econometrics, Elsevier, vol. 212(1), pages 78-96.
    10. Shen, Haipeng & Huang, Jianhua Z., 2008. "Sparse principal component analysis via regularized low rank matrix approximation," Journal of Multivariate Analysis, Elsevier, vol. 99(6), pages 1015-1034, July.
    11. Zhang, Lingsong & Lu, Shu & Marron, J.S., 2015. "Nested nonnegative cone analysis," Computational Statistics & Data Analysis, Elsevier, vol. 88(C), pages 100-110.
    12. Sergio Camiz & Valério D. Pillar, 2018. "Identifying the Informational/Signal Dimension in Principal Component Analysis," Mathematics, MDPI, vol. 6(11), pages 1-16, November.
    13. Mitzi Cubilla-Montilla & Ana Belén Nieto-Librero & M. Purificación Galindo-Villardón & Carlos A. Torres-Cubilla, 2021. "Sparse HJ Biplot: A New Methodology via Elastic Net," Mathematics, MDPI, vol. 9(11), pages 1-15, June.
    14. Juan Carlos Chávez & Felipe J. Fonseca & Manuel Gómez-Zaldívar, 2017. "Resoluciones de disputas comerciales y desempeño económico regional en México. (Commercial Disputes Resolution and Regional Economic Performance in Mexico)," Ensayos Revista de Economia, Universidad Autonoma de Nuevo Leon, Facultad de Economia, vol. 0(1), pages 79-93, May.
    15. Chen, Ray-Bing & Chen, Ying & Härdle, Wolfgang K., 2014. "TVICA—Time varying independent component analysis and its application to financial data," Computational Statistics & Data Analysis, Elsevier, vol. 74(C), pages 95-109.
    16. Yan Yu Chen & Chun-Cheih Chao & Fu-Chen Liu & Po-Chen Hsu & Hsueh-Fen Chen & Shih-Chi Peng & Yung-Jen Chuang & Chung-Yu Lan & Wen-Ping Hsieh & David Shan Hill Wong, 2013. "Dynamic Transcript Profiling of Candida albicans Infection in Zebrafish: A Pathogen-Host Interaction Study," PLOS ONE, Public Library of Science, vol. 8(9), pages 1-16, September.
    17. Han, Rui-Qi & Li, Ming-Xia & Chen, Wei & Zhou, Wei-Xing & Stanley, H. Eugene, 2019. "Structural properties of statistically validated empirical information networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 523(C), pages 747-756.
    18. Plat, Richard, 2009. "Stochastic portfolio specific mortality and the quantification of mortality basis risk," Insurance: Mathematics and Economics, Elsevier, vol. 45(1), pages 123-132, August.
    19. Kondylis, Athanassios & Whittaker, Joe, 2008. "Spectral preconditioning of Krylov spaces: Combining PLS and PC regression," Computational Statistics & Data Analysis, Elsevier, vol. 52(5), pages 2588-2603, January.
    20. Simplice A. Asongu & Nicholas M. Odhiambo, 2019. "Governance, capital flight and industrialisation in Africa," Journal of Economic Structures, Springer;Pan-Pacific Association of Input-Output Studies (PAPAIOS), vol. 8(1), pages 1-22, December.
