
Combining Position Weight Matrices and Document-Term Matrix for Efficient Extraction of Associations of Methylated Genes and Diseases from Free Text

Author

Listed:
  • Arwa Bin Raies
  • Hicham Mansour
  • Roberto Incitti
  • Vladimir B Bajic

Abstract

Background: In a number of diseases, certain genes are reported to be strongly methylated and can therefore serve as diagnostic markers in many cases. Scientific literature in digital form is an important source of information about methylated genes implicated in particular diseases, but the large volume of electronic text makes it impractical to search for this information manually.

Methodology: We developed a novel text mining methodology based on a new concept of position weight matrices (PWMs) for text representation and feature generation. We applied PWMs in conjunction with the document-term matrix to extract, with high accuracy, associations between methylated genes and diseases from free text. The performance results are based on a large, manually classified dataset. Additionally, we developed a web tool, DEMGD, which automates extraction of these associations from free text. DEMGD presents the extracted associations in summary tables and full reports, and tags the text with evidence for genes, diseases and methylation words. The methodology developed in this study can be applied to similar association extraction problems from free text.

Conclusion: The new methodology allows for efficient identification of associations between concepts. Our method, applied to methylated genes in different diseases, is implemented as a web tool, DEMGD, which is freely available at http://www.cbrc.kaust.edu.sa/demgd/. The data is available for online browsing and download.
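To make the two representations named in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how their PWMs are built, so the sketch borrows the usual sequence-analysis idea (log-odds of a symbol at a position versus its background frequency) and applies it to words at fixed offsets around an anchor term. The toy sentences, the whitespace tokenizer, the window size, the pseudocount and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for this toy corpus.
    return text.lower().split()


def document_term_matrix(docs):
    """Return (vocab, rows) where rows[i][j] counts vocab[j] in docs[i]."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def position_weight_matrix(docs, anchor, window=2, pseudocount=1.0):
    """Log2-odds of each term at each offset around `anchor`.

    Hypothetical text analogue of a sequence PWM: pwm[offset][term] compares
    the smoothed frequency of `term` at `offset` from the anchor word with
    the term's background frequency over the whole corpus.
    """
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    offsets = [o for o in range(-window, window + 1) if o != 0]
    counts = {o: Counter() for o in offsets}
    background = Counter()
    for doc in docs:
        toks = tokenize(doc)
        background.update(toks)
        for i, tok in enumerate(toks):
            if tok != anchor:
                continue
            for o in offsets:
                j = i + o
                if 0 <= j < len(toks):
                    counts[o][toks[j]] += 1
    total_bg = sum(background.values())
    pwm = {}
    for o in offsets:
        total = sum(counts[o].values()) + pseudocount * len(vocab)
        pwm[o] = {}
        for term in vocab:
            p = (counts[o][term] + pseudocount) / total
            q = background[term] / total_bg
            pwm[o][term] = math.log2(p / q)
    return pwm


# Invented example sentences, not data from the study.
docs = [
    "brca1 is methylated in breast cancer",
    "mlh1 is methylated in colon cancer",
    "tp53 is mutated in lung cancer",
]
vocab, dtm = document_term_matrix(docs)
pwm = position_weight_matrix(docs, anchor="methylated", window=2)
```

In this sketch the document-term matrix records which words occur in each sentence regardless of order, while the position-specific scores capture where words tend to sit relative to the anchor. Combining both kinds of features is the general idea the abstract describes; how the two are actually combined in DEMGD is not stated here.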

Suggested Citation

  • Arwa Bin Raies & Hicham Mansour & Roberto Incitti & Vladimir B Bajic, 2013. "Combining Position Weight Matrices and Document-Term Matrix for Efficient Extraction of Associations of Methylated Genes and Diseases from Free Text," PLOS ONE, Public Library of Science, vol. 8(10), pages 1-1, October.
  • Handle: RePEc:plo:pone00:0077848
    DOI: 10.1371/journal.pone.0077848

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0077848
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0077848&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Toyokawa, Satoshi & Uddin, Monica & Koenen, Karestan C. & Galea, Sandro, 2012. "How does the social environment ‘get into the mind’? Epigenetics at the intersection of social and psychiatric epidemiology," Social Science & Medicine, Elsevier, vol. 74(1), pages 67-74.
    2. Veruscka Leso & Ilaria Vetrani & Ilaria Della Volpe & Caterina Nocera & Ivo Iavicoli, 2019. "Welding Fume Exposure and Epigenetic Alterations: A Systematic Review," IJERPH, MDPI, vol. 16(10), pages 1-17, May.
    3. Michelle Kelly-Irving & Laurence Mabile & Pascale Grosclaude & Thierry Lang & Cyrille Delpierre, 2013. "The embodiment of adverse childhood experiences and cancer development: potential biological mechanisms and pathways across the life course," International Journal of Public Health, Springer;Swiss School of Public Health (SSPH+), vol. 58(1), pages 3-11, February.
    4. David V. McLeod & Geoff Wild & Francisco Úbeda, 2021. "Epigenetic memories and the evolution of infectious diseases," Nature Communications, Nature, vol. 12(1), pages 1-13, December.
    5. Jie Liu & Xuehua Zhong, 2024. "Epiallelic variation of non-coding RNA genes and their phenotypic consequences," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    6. Sherrie Lessans & Susan G. Dorsey, 2013. "The Role for Epigenetic Modifications in Pain and Analgesia Response," Nursing Research and Practice, Hindawi, vol. 2013, pages 1-6, October.
    7. William A. Toscano & Kristen P. Oehlke, 2005. "Systems Biology: New Approaches to Old Environmental Health Problems," IJERPH, MDPI, vol. 2(1), pages 1-6, April.
    8. Bo Zhang & Wei Zhu & Ping Yang & Tao Liu & Mei Jiang & Zhi-Ni He & Shi-Xin Zhang & Wei-Qing Chen & Wen Chen, 2011. "Cigarette Smoking and p16INK4α Gene Promoter Hypermethylation in Non-Small Cell Lung Carcinoma Patients: A Meta-Analysis," PLOS ONE, Public Library of Science, vol. 6(12), pages 1-9, December.
    9. Steffan D Bos & Christian M Page & Bettina K Andreassen & Emon Elboudwarej & Marte W Gustavsen & Farren Briggs & Hong Quach & Ingvild S Leikfoss & Anja Bjølgerud & Tone Berge & Hanne F Harbo & Lisa F , 2015. "Genome-Wide DNA Methylation Profiles Indicate CD8+ T Cell Hypermethylation in Multiple Sclerosis," PLOS ONE, Public Library of Science, vol. 10(3), pages 1-16, March.
    10. Carlos Olmeda-Gómez & Carlos Romá-Mateo & Maria-Antonia Ovalle-Perandones, 2019. "Overview of trends in global epigenetic research (2009–2017)," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1545-1574, June.
    11. Xuefeng Wang & Shuo Zhang & Yao Wu & Xuemei Yang, 2021. "Revealing potential drug-disease-gene association patterns for precision medicine," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 3723-3748, May.
    12. Ko Sato & Amarjeet Kumar & Keisuke Hamada & Chikako Okada & Asako Oguni & Ayumi Machiyama & Shun Sakuraba & Tomohiro Nishizawa & Osamu Nureki & Hidetoshi Kono & Kazuhiro Ogata & Toru Sengoku, 2021. "Structural basis of the regulation of the normal and oncogenic methylation of nucleosomal histone H3 Lys36 by NSD2," Nature Communications, Nature, vol. 12(1), pages 1-10, December.
    13. Kaiqiong Zhao & Karim Oualkacha & Lajmi Lakhal‐Chaieb & Aurélie Labbe & Kathleen Klein & Antonio Ciampi & Marie Hudson & Inés Colmegna & Tomi Pastinen & Tieyuan Zhang & Denise Daley & Celia M.T. Green, 2021. "A novel statistical method for modeling covariate effects in bisulfite sequencing derived measures of DNA methylation," Biometrics, The International Biometric Society, vol. 77(2), pages 424-438, June.
