
Ontology-Based Healthcare Named Entity Recognition from Twitter Messages Using a Recurrent Neural Network Approach

Author

Listed:
  • Erdenebileg Batbaatar

    (College of Electrical and Computer Engineering, Chungbuk National University, Cheongju 28644, Korea)

  • Keun Ho Ryu

    (Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City 700000, Vietnam;
    Database and Bioinformatics Laboratory, Department of Computer Science, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju 28644, Korea)

Abstract

Named Entity Recognition (NER) in the healthcare domain involves identifying and categorizing diseases, drugs, and symptoms for biosurveillance, extracting their related properties and activities, and identifying adverse drug events that appear in text. These tasks are important challenges in healthcare. Analyzing user messages on social media networks such as Twitter can provide opportunities to detect and manage public health events. Twitter offers a broad range of short messages that contain useful material for information extraction. In this paper, we present a Health-Related Named Entity Recognition (HNER) task that uses a healthcare-domain ontology to recognize health-related entities in large numbers of Twitter user messages. For this task, we employ a deep learning architecture based on a recurrent neural network (RNN) with little feature engineering. To achieve our goal, we collected a large number of Twitter messages containing health-related information and detected biomedical entities from the Unified Medical Language System (UMLS). A bidirectional long short-term memory (BiLSTM) model learned rich context information, and a convolutional neural network (CNN) produced character-level features. A conditional random field (CRF) model predicted the sequence of labels corresponding to a sequence of inputs, and the Viterbi algorithm was used to detect health-related entities in Twitter messages. We provide comprehensive results that give valuable insights into identifying medical entities in Twitter for various applications. The BiLSTM-CRF model achieved a precision of 93.99%, recall of 73.31%, and F1-score of 81.77% for disease or syndrome HNER; a precision of 90.83%, recall of 81.98%, and F1-score of 87.52% for sign or symptom HNER; and a precision of 94.85%, recall of 73.47%, and F1-score of 84.51% for pharmacologic substance named entities. The ontology-based manual annotation results show that high-quality annotation is possible despite the complexity of medical terminology and the lack of context in tweets.
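
The abstract describes a CRF layer whose label sequence is decoded with the Viterbi algorithm. The following is a minimal Python sketch of that decoding step only, not the authors' implementation. The BIO tag names, the hand-written emission scores (standing in for BiLSTM outputs), the transition scores, and the helper names `viterbi_decode` and `extract_entities` are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical BIO label set for a single entity type (an assumption, not from the paper).
LABELS = ["O", "B-DISEASE", "I-DISEASE"]

# Illustrative start and transition scores; a trained CRF would learn these.
# TRANSITIONS[i][j] is the score of moving from label i to label j.
START = [0.0, 0.0, -10.0]           # a sequence should not start with I-DISEASE
TRANSITIONS = [
    [0.5, 0.0, -10.0],              # from O: O -> I-DISEASE is heavily penalized
    [0.0, -1.0, 1.0],               # from B-DISEASE
    [0.0, -1.0, 0.5],               # from I-DISEASE
]


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]],
                   start: List[float],
                   labels: List[str]) -> List[str]:
    """Return the highest-scoring label sequence for per-token emission scores."""
    if not emissions:
        return []
    n = len(labels)
    # scores[k]: best score of any path that ends in label k at the current token.
    scores = [start[k] + emissions[0][k] for k in range(n)]
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_scores, pointers = [], []
        for j in range(n):
            # Best previous label for reaching label j at token t.
            best_i = max(range(n), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final label to recover the path.
    path = [max(range(n), key=lambda k: scores[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [labels[k] for k in path]


def extract_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO tags into (entity text, entity type) pairs."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type = ""
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)
        else:
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], ""
    if current:
        entities.append((" ".join(current), current_type))
    return entities


# Toy tweet with made-up emission scores, one row per token, one column per label.
TOKENS = ["i", "have", "chronic", "migraine", "today"]
EMISSIONS = [
    [2.0, 0.1, 0.0],
    [2.0, 0.1, 0.0],
    [1.0, 0.9, 0.2],   # taken alone, "chronic" slightly prefers O
    [0.1, 0.5, 2.0],   # "migraine" strongly prefers I-DISEASE
    [2.0, 0.0, 0.1],
]

tags = viterbi_decode(EMISSIONS, TRANSITIONS, START, LABELS)
print(tags)                            # ['O', 'O', 'B-DISEASE', 'I-DISEASE', 'O']
print(extract_entities(TOKENS, tags))  # [('chronic migraine', 'DISEASE')]
```

In this toy input, choosing the best label for each token independently would yield O followed by I-DISEASE, which is not a valid BIO sequence. Because the transition scores are part of the path score, the decoder instead labels "chronic" as B-DISEASE, which illustrates why a CRF layer is commonly placed on top of per-token scores.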

Suggested Citation

  • Erdenebileg Batbaatar & Keun Ho Ryu, 2019. "Ontology-Based Healthcare Named Entity Recognition from Twitter Messages Using a Recurrent Neural Network Approach," IJERPH, MDPI, vol. 16(19), pages 1-19, September.
  • Handle: RePEc:gam:jijerp:v:16:y:2019:i:19:p:3628-:d:271313

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/16/19/3628/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/16/19/3628/
    Download Restriction: no

    References listed on IDEAS

    1. Amir Karami & London S. Bennett & Xiaoyun He, 2018. "Mining Public Opinion about Economic Issues: Twitter and the U.S. Presidential Election," International Journal of Strategic Decision Sciences (IJSDS), IGI Global, vol. 9(1), pages 18-28, January.
    2. Meijing Li & Tsendsuren Munkhdalai & Xiuming Yu & Keun Ho Ryu, 2015. "A Novel Approach for Protein-Named Entity Recognition and Protein-Protein Interaction Extraction," Mathematical Problems in Engineering, Hindawi, vol. 2015, pages 1-10, October.

    Citations



    Cited by:

    1. Khishigsuren Davagdorj & Ling Wang & Meijing Li & Van-Huy Pham & Keun Ho Ryu & Nipon Theera-Umpon, 2022. "Discovering Thematically Coherent Biomedical Documents Using Contextualized Bidirectional Encoder Representations from Transformers-Based Clustering," IJERPH, MDPI, vol. 19(10), pages 1-21, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Rondan-Cataluña, F. Javier & Peral-Peral, Begoña & Ramírez-Correa, Patricio E., 2023. "Measuring public opinion of education apps," Technological Forecasting and Social Change, Elsevier, vol. 188(C).
    2. Amir Karami & Melek Yildiz Spinel & C. Nicole White & Kayla Ford & Suzanne Swan, 2021. "A Systematic Literature Review of Sexual Harassment Studies with Text Mining," Sustainability, MDPI, vol. 13(12), pages 1-24, June.
    3. Amir Karami & Morgan Lundy & Frank Webb & Gabrielle Turner-McGrievy & Brooke W. McKeever & Robert McKeever, 2021. "Identifying and Analyzing Health-Related Themes in Disinformation Shared by Conservative and Liberal Russian Trolls on Twitter," IJERPH, MDPI, vol. 18(4), pages 1-16, February.
    4. Emiliano del Gobbo & Sara Fontanella & Annalina Sarra & Lara Fontanella, 2021. "Emerging Topics in Brexit Debate on Twitter Around the Deadlines," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 156(2), pages 669-688, August.
