
A hybrid gene selection algorithm based on interaction information for microarray-based cancer classification

Author

Listed:
  • Songyot Nakariyakul

Abstract

We address gene selection and machine learning methods for cancer classification using microarray gene expression data. Due to the high dimensionality of microarray data, traditional gene selection algorithms are filter-based, focusing on intrinsic properties of the data such as distance, dependency, and correlation. These methods are fast but select far too many genes to use for the classification task. In this work, we present a new hybrid filter-wrapper gene subset selection algorithm that improves on our prior algorithm. The proposed method uses interaction information to rank the candidate genes to add to a gene subset. It then conditionally adds one gene at a time to the current subset and checks whether the resulting subset significantly improves classification performance. Only significant genes are selected, and the candidate gene list is updated every time a gene is added to the subset, which makes the selection procedure dynamic. Experimental results on ten public cancer microarray data sets show that our method consistently outperforms prior gene selection algorithms in classification accuracy while requiring only a small number of selected genes.
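
The rank-then-verify loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation, and everything specific in it is an assumption made for illustration: the scoring rule (mutual information with the class plus interaction information with the genes already selected), the stand-in wrapper (resubstitution accuracy of a lookup-table classifier, where a real wrapper would more likely use a cross-validated classifier), the fixed `min_gain` threshold in place of a significance test, and the toy data. Interaction information is computed as I(X;Y;Z) = I(X;Y|Z) - I(X;Y), one common sign convention, under which positive values indicate synergy.

```python
from collections import Counter
from math import log2

def entropy(*cols):
    """Joint Shannon entropy (bits) of one or more discrete columns."""
    joint = Counter(zip(*cols))
    n = sum(joint.values())
    return -sum(c / n * log2(c / n) for c in joint.values())

def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def interaction_info(x, y, z):
    """I(X;Y;Z) = I(X;Y|Z) - I(X;Y), expanded into joint entropies."""
    return (entropy(x, y) + entropy(x, z) + entropy(y, z)
            - entropy(x) - entropy(y) - entropy(z) - entropy(x, y, z))

def table_accuracy(genes, labels, subset):
    """Assumed stand-in wrapper: predict the majority class of each
    expression pattern and report accuracy on the same samples."""
    groups = {}
    for i, y in enumerate(labels):
        key = tuple(genes[g][i] for g in subset)
        groups.setdefault(key, Counter())[y] += 1
    return sum(max(c.values()) for c in groups.values()) / len(labels)

def select_genes(genes, labels, min_gain=0.05):
    """Greedy filter-wrapper loop: rank, try to add one gene, re-rank."""
    selected = []
    best_acc = table_accuracy(genes, labels, selected)  # majority-class rate
    candidates = set(genes)
    while candidates:
        def score(g):
            # Filter step (assumed form): relevance plus synergy with
            # the genes already in the subset.
            relevance = mutual_info(genes[g], labels)
            synergy = sum(interaction_info(genes[g], genes[s], labels)
                          for s in selected)
            return relevance + synergy
        ranked = sorted(candidates, key=lambda g: (-score(g), g))
        for g in ranked:
            # Wrapper step: keep the gene only if accuracy rises enough.
            acc = table_accuracy(genes, labels, selected + [g])
            if acc - best_acc >= min_gain:
                selected.append(g)
                candidates.remove(g)
                best_acc = acc
                break  # subset changed, so the ranking is recomputed
        else:
            break  # no remaining candidate gives a sufficient gain
    return selected, best_acc

# Toy discretized data (hypothetical): the class is 2*g1 + g2, g3 is noise.
genes = {
    "g1": [0, 0, 0, 0, 1, 1, 1, 1],
    "g2": [0, 0, 1, 1, 0, 0, 1, 1],
    "g3": [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = [0, 0, 1, 1, 2, 2, 3, 3]
selected, acc = select_genes(genes, labels)
print(selected, acc)  # ['g1', 'g2'] 1.0
```

On this toy data the loop keeps `g1` and `g2` and rejects the noise gene `g3`, because adding it does not raise accuracy. A purely greedy loop of this kind cannot pick up genes that are informative only jointly (for example a class equal to the XOR of two genes) when the subset is still empty, since neither gene alone passes the wrapper check.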

Suggested Citation

  • Songyot Nakariyakul, 2019. "A hybrid gene selection algorithm based on interaction information for microarray-based cancer classification," PLOS ONE, Public Library of Science, vol. 14(2), pages 1-17, February.
  • Handle: RePEc:plo:pone00:0212333
    DOI: 10.1371/journal.pone.0212333

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0212333
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0212333&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. William McGill, 1954. "Multivariate information transmission," Psychometrika, Springer;The Psychometric Society, vol. 19(2), pages 97-116, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. repec:hig:wpaper:98sti2019 is not listed on IDEAS
    2. Petersen, Alexander M. & Rotolo, Daniele & Leydesdorff, Loet, 2016. "A triple helix model of medical innovation: Supply, demand, and technological capabilities in terms of Medical Subject Headings," Research Policy, Elsevier, vol. 45(3), pages 666-681.
    3. Park, Han Woo & Leydesdorff, Loet, 2010. "Longitudinal trends in networks of university-industry-government relations in South Korea: The role of programmatic incentives," Research Policy, Elsevier, vol. 39(5), pages 640-649, June.
    4. Louis Verny & Nadir Sella & Séverine Affeldt & Param Priya Singh & Hervé Isambert, 2017. "Learning causal networks with latent variables from multivariate information in genomic data," PLOS Computational Biology, Public Library of Science, vol. 13(10), pages 1-25, October.
    5. Xiaojun Hu & Xian Li & Ronald Rousseau, 2021. "Mathematical reflections on Triple Helix calculations," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(10), pages 8581-8587, October.
    6. Inga A. Ivanova & Loet Leydesdorff, 2014. "A simulation model of the Triple Helix of university–industry–government relations and the decomposition of the redundancy," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(3), pages 927-948, June.
    7. Loet Leydesdorff & Han Woo Park & Balazs Lengyel, 2014. "A routine for measuring synergy in university–industry–government relations: mutual information as a Triple-Helix and Quadruple-Helix indicator," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(1), pages 27-35, April.
    8. Mariusz Kubkowski & Jan Mielniczuk, 2021. "Asymptotic Distributions of Empirical Interaction Information," Methodology and Computing in Applied Probability, Springer, vol. 23(1), pages 291-315, March.
    9. Loet Leydesdorff, 2011. "“Structuration” by intellectual organization: the configuration of knowledge in relations among structural components in networks of science," Scientometrics, Springer;Akadémiai Kiadó, vol. 88(2), pages 499-520, August.
    10. Lengyel, Balázs & Leydesdorff, Loet, 2015. "The Effects of FDI on Innovation Systems in Hungarian Regions: Where is the Synergy Generated?," MPRA Paper 73945, University Library of Munich, Germany.
    11. Irad Ben-Gal & Marcelo Bacher & Morris Amara & Erez Shmueli, 2023. "A Nonparametric Subspace Analysis Approach with Application to Anomaly Detection Ensembles," INFORMS Journal on Data Science, INFORMS, vol. 2(2), pages 99-115, October.
    12. Han Woo Park, 2014. "Mapping election campaigns through negative entropy: Triple and Quadruple Helix approach to South Korea’s 2012 presidential election," Scientometrics, Springer;Akadémiai Kiadó, vol. 99(1), pages 187-197, April.
    13. Loet Leydesdorff & Igone Porto-Gomez, 2019. "Measuring the expected synergy in Spanish regional and national systems of innovation," The Journal of Technology Transfer, Springer, vol. 44(1), pages 189-209, February.
    14. Strand, Øivind & Leydesdorff, Loet, 2013. "Where is synergy indicated in the Norwegian innovation system? Triple-Helix relations among technology, organization, and geography," Technological Forecasting and Social Change, Elsevier, vol. 80(3), pages 471-484.
    15. Leydesdorff, Loet & Fritsch, Michael, 2006. "Measuring the knowledge base of regional innovation systems in Germany in terms of a Triple Helix dynamics," Research Policy, Elsevier, vol. 35(10), pages 1538-1553, December.
    16. Petras Rupšys, 2019. "Understanding the Evolution of Tree Size Diversity within the Multivariate Nonsymmetrical Diffusion Process and Information Measures," Mathematics, MDPI, vol. 7(8), pages 1-22, August.
    17. Dennis Knepp & Doris Entwisle, 1969. "Testing significance of differences between two chi-squares," Psychometrika, Springer;The Psychometric Society, vol. 34(3), pages 331-333, September.
    18. Inga Ivanova & Oivind Strand & Loet Leydesdorff, 2019. "The Synergy and Cycle Values in Regional Innovation Systems: The Case of Norway," Foresight and STI Governance (Foresight-Russia till No. 3/2015), National Research University Higher School of Economics, vol. 13(1), pages 48-61.
    19. Frank Huettner & Tamer Boyaci & Yalcin Akcay, 2016. "Consumer choice under limited attention when alternatives have different information costs," ESMT Research Working Papers ESMT-16-04_R2, ESMT European School of Management and Technology, revised 28 Feb 2018.
    20. Inga Ivanova, 2022. "The relation between complexity and synergy in the case of China: different ways of predicting GDP growth in a complex and adaptive system," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(1), pages 195-215, February.
    21. Frank Gelens & Juho Äijälä & Louis Roberts & Misako Komatsu & Cem Uran & Michael A. Jensen & Kai J. Miller & Robin A. A. Ince & Max Garagnani & Martin Vinck & Andres Canales-Johnson, 2024. "Distributed representations of prediction error signals across the cortical hierarchy are synergistic," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
