Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i8p882-d537378.html

A New Ensemble Method for Detecting Anomalies in Gene Expression Matrices

Author

Listed:
  • Laura Selicato

    (Department of Mathematics, University of Bari Aldo Moro, 70125 Bari, Italy
    Member of GNCS, Istituto Nazionale di Alta Matematica, P.le Aldo Moro 5, 00185 Roma, Italy
    These authors contributed equally to this work.)

  • Flavia Esposito

    (Department of Mathematics, University of Bari Aldo Moro, 70125 Bari, Italy
    Member of GNCS, Istituto Nazionale di Alta Matematica, P.le Aldo Moro 5, 00185 Roma, Italy
    These authors contributed equally to this work.)

  • Grazia Gargano

    (Department of Mathematics, University of Bari Aldo Moro, 70125 Bari, Italy)

  • Maria Carmela Vegliante

    (Hematology and Cell Therapy Unit, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy)

  • Giuseppina Opinto

    (Hematology and Cell Therapy Unit, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy)

  • Gian Maria Zaccaria

    (Hematology and Cell Therapy Unit, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy)

  • Sabino Ciavarella

    (Hematology and Cell Therapy Unit, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy)

  • Attilio Guarini

    (Hematology and Cell Therapy Unit, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy)

  • Nicoletta Del Buono

    (Department of Mathematics, University of Bari Aldo Moro, 70125 Bari, Italy
    Member of GNCS, Istituto Nazionale di Alta Matematica, P.le Aldo Moro 5, 00185 Roma, Italy)

Abstract

A recurring problem in the analysis of real data is the presence of anomalies. Anomalous cases can spoil the resulting analysis, yet they can also contain valuable information; in both cases, the ability to detect them is very important. In the biomedical field, correct identification of outliers could support the development of new biological hypotheses that would not otherwise be considered when looking at experimental biological data. In this work, we address the problem of detecting outliers in gene expression data, focusing on microarray analysis. We propose an ensemble approach for detecting anomalies in gene expression matrices based on Hierarchical Clustering and Robust Principal Component Analysis, which allows us to derive a novel pseudo-mathematical classification of anomalies.
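The abstract names the two ingredients of the ensemble but not how they are combined, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes samples are stored as rows, uses average-linkage clustering from SciPy, a basic principal component pursuit loop as the Robust PCA step, and flags a sample only when both detectors agree. The function names, the thresholds, and the AND rule are all illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def hc_outliers(X, n_clusters=2, min_size=2):
    """Flag samples (rows) that end up in clusters smaller than min_size."""
    Z = linkage(X, method="average", metric="euclidean")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    sizes = np.bincount(labels)          # sizes[k] = number of samples in cluster k
    return sizes[labels] < min_size


def robust_pca(M, n_iter=500, tol=1e-7):
    """Split M into low-rank L plus sparse S (principal component pursuit, ADMM)."""
    lam = 1.0 / np.sqrt(max(M.shape))            # common default weight for the sparse term
    mu = M.size / (4.0 * np.abs(M).sum())        # common default penalty parameter
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    for _ in range(n_iter):
        # Low-rank update: soft-threshold the singular values.
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: soft-threshold the entries.
        R = M - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        # Dual update on the constraint M = L + S.
        resid = M - L - S
        Y = Y + mu * resid
        if np.linalg.norm(resid) <= tol * np.linalg.norm(M):
            break
    return L, S


def rpca_outliers(X, z=3.0):
    """Flag samples whose sparse-component row norm is a robust z-score outlier."""
    _, S = robust_pca(X)
    score = np.linalg.norm(S, axis=1)
    med = np.median(score)
    mad = np.median(np.abs(score - med)) + 1e-12  # guard against a zero MAD
    return (score - med) / (1.4826 * mad) > z


def ensemble_outliers(X, n_clusters=2, min_size=2, z=3.0):
    """Assumed combination rule: a sample is anomalous if both detectors flag it."""
    return hc_outliers(X, n_clusters, min_size) & rpca_outliers(X, z)
```

Calling `ensemble_outliers(X)` on a samples-by-genes array returns a boolean mask over the samples. Requiring agreement between the two detectors is a conservative choice made for this sketch; a union or a voting scheme would be equally plausible readings of "ensemble".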

Suggested Citation

  • Laura Selicato & Flavia Esposito & Grazia Gargano & Maria Carmela Vegliante & Giuseppina Opinto & Gian Maria Zaccaria & Sabino Ciavarella & Attilio Guarini & Nicoletta Del Buono, 2021. "A New Ensemble Method for Detecting Anomalies in Gene Expression Matrices," Mathematics, MDPI, vol. 9(8), pages 1-26, April.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:8:p:882-:d:537378

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/8/882/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/8/882/
    Download Restriction: no

    References listed on IDEAS

    1. Laura Pasqualucci & David Dominguez-Sola & Annalisa Chiarenza & Giulia Fabbri & Adina Grunn & Vladimir Trifonov & Lawryn H. Kasper & Stephanie Lerach & Hongyan Tang & Jing Ma & Davide Rossi & Amy Chad, 2011. "Inactivating mutations of acetyltransferase genes in B-cell lymphoma," Nature, Nature, vol. 471(7337), pages 189-195, March.
    2. Shieh Albert D & Hung Yeung Sam, 2009. "Detecting Outlier Samples in Microarray Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-26, February.

    Citations

    Cited by:

    1. Sareer Ul Amin & Mohib Ullah & Muhammad Sajjad & Faouzi Alaya Cheikh & Mohammad Hijji & Abdulrahman Hijji & Khan Muhammad, 2022. "EADN: An Efficient Deep Learning Model for Anomaly Detection in Videos," Mathematics, MDPI, vol. 10(9), pages 1-15, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guozhong Jiang & Zhizhong Wang & Zhenguo Cheng & Weiwei Wang & Shuangshuang Lu & Zifang Zhang & Chinedu A. Anene & Faraz Khan & Yue Chen & Emma Bailey & Huisha Xu & Yunshu Dong & Peinan Chen & Zhongxi, 2024. "The integrated molecular and histological analysis defines subtypes of esophageal squamous cell carcinoma," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    2. Longxia Xu & Hongwen Xuan & Wei He & Liang Zhang & Mengying Huang & Kuai Li & Hong Wen & Han Xu & Xiaobing Shi, 2023. "TAZ2 truncation confers overactivation of p300 and cellular vulnerability to HDAC inhibition," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    3. Junlong Zhao & Chao Liu & Lu Niu & Chenlei Leng, 2019. "Multiple influential point detection in high dimensional regression spaces," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 385-408, April.
    4. Josefine Radke & Naveed Ishaque & Randi Koll & Zuguang Gu & Elisa Schumann & Lina Sieverling & Sebastian Uhrig & Daniel Hübschmann & Umut H. Toprak & Cristina López & Xavier Pastor Hostench & Simone B, 2022. "The genomic and transcriptional landscape of primary central nervous system lymphoma," Nature Communications, Nature, vol. 13(1), pages 1-20, December.
    5. Misuzu Habazaki & Shinsuke Mizumoto & Hidetoshi Kajino & Tomoya Kujirai & Hitoshi Kurumizaka & Shigehiro A. Kawashima & Kenzo Yamatsugu & Motomu Kanai, 2023. "A chemical catalyst enabling histone acylation with endogenous acyl-CoA," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    6. Asuman Turkmen & Nedret Billor, 2013. "Partial least squares classification for high dimensional data using the PCOUT algorithm," Computational Statistics, Springer, vol. 28(2), pages 771-788, April.
