Printed from https://ideas.repec.org/a/plo/pone00/0227930.html

SAO2Vec: Development of an algorithm for embedding the subject–action–object (SAO) structure using Doc2Vec

Authors

  • Sunhye Kim
  • Inchae Park
  • Byungun Yoon

Abstract

In natural-language processing, the subject–action–object (SAO) structure is used to convert unstructured textual data into structured textual data comprising subjects, actions, and objects. This structure is suitable for analyzing the key elements of technology, as well as the relationships between these elements. However, analysis using the existing SAO structure requires a substantial number of manual processes because this structure does not represent the context of the sentences. Thus, we introduce the concept of SAO2Vec, in which SAO is used to embed the vectors of sentences and documents, for use in text mining in the analysis of technical documents. First, the technical documents of interest are collected, and SAO structures are extracted from them. Then, sentence vectors are extracted through the Doc2Vec algorithm and are updated using word vectors in the SAO structure. Finally, SAO vectors are drawn using an updated sentence vector with the same SAO structure. In addition, document vectors are derived from the document’s SAO vectors. The results of an experiment in the Internet of things field indicate that the SAO2Vec method produces 3.1% better accuracy than the Doc2Vec method and 115.0% better accuracy than SAO frequency alone. This proves that the proposed SAO2Vec algorithm can be used to improve grouping and similarity analysis by including both the meanings and the contexts of technical elements.
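The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes sentence vectors (for example from Doc2Vec) and word vectors are already available as plain lists, and it assumes a simple weighted blend for the "update" step, which the abstract does not specify. The names `update_sentence_vector`, `sao_vectors`, `document_vectors`, the mixing weight `alpha`, and all toy values are hypothetical.

```python
from collections import defaultdict


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def update_sentence_vector(sentence_vec, sao, word_vecs, alpha=0.5):
    """Blend a sentence vector with the mean word vector of its SAO terms.

    Assumption: a linear blend with weight `alpha`; the abstract only says
    that sentence vectors are updated using word vectors in the SAO structure.
    """
    sao_mean = mean([word_vecs[word] for word in sao])
    return [alpha * s + (1 - alpha) * w for s, w in zip(sentence_vec, sao_mean)]


def sao_vectors(sentences, word_vecs, alpha=0.5):
    """Average the updated vectors of all sentences sharing the same SAO triple.

    `sentences` is a list of (doc_id, sao_triple, sentence_vec) tuples.
    """
    grouped = defaultdict(list)
    for _doc_id, sao, vec in sentences:
        grouped[sao].append(update_sentence_vector(vec, sao, word_vecs, alpha))
    return {sao: mean(vecs) for sao, vecs in grouped.items()}


def document_vectors(sentences, sao_vecs):
    """Derive each document vector as the mean of the SAO vectors it contains."""
    per_doc = defaultdict(list)
    for doc_id, sao, _vec in sentences:
        per_doc[doc_id].append(sao_vecs[sao])
    return {doc_id: mean(vecs) for doc_id, vecs in per_doc.items()}


# Toy two-dimensional inputs (made up for illustration).
word_vecs = {
    "sensor": [1.0, 0.0], "measure": [0.0, 1.0], "temperature": [1.0, 1.0],
    "gateway": [0.0, 0.0], "send": [2.0, 0.0], "data": [0.0, 2.0],
}
sentences = [
    ("d1", ("sensor", "measure", "temperature"), [0.0, 0.0]),
    ("d2", ("sensor", "measure", "temperature"), [2.0, 2.0]),
    ("d2", ("gateway", "send", "data"), [3.0, 1.0]),
]

sao_vecs = sao_vectors(sentences, word_vecs)
doc_vecs = document_vectors(sentences, sao_vecs)
```

In this sketch, two sentences with the same SAO triple contribute to a single shared SAO vector, so documents that use the same subject, action, and object in different contexts end up closer together than raw SAO frequency counts alone would place them.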

Suggested Citation

  • Sunhye Kim & Inchae Park & Byungun Yoon, 2020. "SAO2Vec: Development of an algorithm for embedding the subject–action–object (SAO) structure using Doc2Vec," PLOS ONE, Public Library of Science, vol. 15(2), pages 1-26, February.
  • Handle: RePEc:plo:pone00:0227930
    DOI: 10.1371/journal.pone.0227930

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0227930
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0227930&type=printable
    Download Restriction: no


    Citations

    Cited by:

    1. Wang, Jinfeng & Zhang, Zhixin & Feng, Lijie & Lin, Kuo-Yi & Liu, Peng, 2023. "Development of technology opportunity analysis based on technology landscape by extending technology elements with BERT and TRIZ," Technological Forecasting and Social Change, Elsevier, vol. 191(C).
    2. Jeon, Daeseong & Ahn, Joon Mo & Kim, Juram & Lee, Changyong, 2022. "A doc2vec and local outlier factor approach to measuring the novelty of patents," Technological Forecasting and Social Change, Elsevier, vol. 174(C).
    3. Liu, Zhenfeng & Feng, Jian & Uden, Lorna, 2023. "Technology opportunity analysis using hierarchical semantic networks and dual link prediction," Technovation, Elsevier, vol. 128(C).
    4. Roh, Taeyeoun & Yoon, Byungun, 2023. "Discovering technology and science innovation opportunity based on sentence generation algorithm," Journal of Informetrics, Elsevier, vol. 17(2).
    5. Jeon, Eunji & Yoon, Naeun & Sohn, So Young, 2023. "Exploring new digital therapeutics technologies for psychiatric disorders using BERTopic and PatentSBERTa," Technological Forecasting and Social Change, Elsevier, vol. 186(PA).
    6. Alina Irina Popescu, 2020. "Long-Term City Innovation Trajectories and Quality of Urban Life," Sustainability, MDPI, vol. 12(24), pages 1-19, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xingwen Chen & Li Zhu & Chao Liu & Chunhua Chen & Jun Liu & Dongxia Huo, 2023. "Workplace Diversity in the Asia-Pacific Region: A Review of Literature and Directions for Future Research," Asia Pacific Journal of Management, Springer, vol. 40(3), pages 1021-1045, September.
    2. Souzanchi Kashani, Ebrahim & Roshani, Saeed, 2019. "Evolution of innovation system literature: Intellectual bases and emerging trends," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 68-80.
    3. Xuefeng Wang & Shuo Zhang & Yuqin Liu, 2022. "ITGInsight–discovering and visualizing research fronts in the scientific literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6509-6531, November.
    4. Xingming Ma & Lifeng Zhang & Jingqiu Wang & Yanping Luo, 2019. "Knowledge Domain and Emerging Trends on Echinococcosis Research: A Scientometric Analysis," IJERPH, MDPI, vol. 16(5), pages 1-15, March.
    5. Wei Wang & Dechao Ma & Fengzhi Wu & Mengxin Sun & Shuangqing Xu & Qiuyue Hua & Ziyuan Sun, 2023. "Exploring the Knowledge Structure and Hotspot Evolution of Greenwashing: A Visual Analysis Based on Bibliometrics," Sustainability, MDPI, vol. 15(3), pages 1-35, January.
    6. Shiwangi Singh & Sanjay Dhir, 2019. "Structured review using TCCM and bibliometric analysis of international cause-related marketing, social marketing, and innovation of the firm," International Review on Public and Nonprofit Marketing, Springer;International Association of Public and Non-Profit Marketing, vol. 16(2), pages 335-347, December.
    7. Zhenhua Chen & Laurie A. Schintler, 2023. "Rediscovering regional science: Positioning the field's evolving location in science and society," Journal of Regional Science, Wiley Blackwell, vol. 63(3), pages 617-642, June.
    8. Chen, Xiaoyan & Liu, Yisheng, 2020. "Visualization analysis of high-speed railway research based on CiteSpace," Transport Policy, Elsevier, vol. 85(C), pages 1-17.
    9. Muhammad Ashraf Fauzi, 2023. "Social media in disaster management: review of the literature and future trends through bibliometric analysis," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 118(2), pages 953-975, September.
    10. Chengliang Liu & Qinchang Gui, 2016. "Mapping intellectual structures and dynamics of transport geography research: a scientometric overview from 1982 to 2014," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(1), pages 159-184, October.
    11. Ruth Zárate Rueda & Yolima Ivonne Beltrán Villamizar & Luis Eduardo Becerra Ardila, 2023. "A Retrospective Approach to Pro-Environmental Behavior from Environmental Education: An Alternative from Sustainable Development," Sustainability, MDPI, vol. 15(6), pages 1-19, March.
    12. Xiaochen Zhang & Xiaoyu Yao & Lanxin Hui & Fuchuan Song & Fei Hu, 2021. "A Bibliometric Narrative Review on Modern Navigation Aids for People with Visual Impairment," Sustainability, MDPI, vol. 13(16), pages 1-23, August.
    13. Hongyu Liu & Shukuan Zhao & Ouyang Xin, 2019. "Analysis on the Evolution Path and Hotspot of Knowledge Innovation Study Based on Knowledge Map," Sustainability, MDPI, vol. 11(19), pages 1-14, October.
    14. Chung-Yen Yu & Yung-Ting Chuang & Hsi-Peng Kuan, 2017. "Understanding Faculty Collaboration and Productivity: A Case Study," Asian Social Science, Canadian Center of Science and Education, vol. 13(3), pages 1-1, March.
    15. Aurora A. C. Teixeira & Pedro Cosme Vieira & Ana Patrícia Abreu, 2017. "Sleeping Beauties and their princes in innovation studies," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(2), pages 541-580, February.
    16. Bingke Zhu & Hao Fan & Bingbing Xie & Ran Su & Chaofeng Zhou & Jianping He, 2020. "Mapping the Scientific Research on Healthcare Workers’ Occupational Health: A Bibliometric and Social Network Analysis," IJERPH, MDPI, vol. 17(8), pages 1-22, April.
    17. Qiaoyun Yang & Dan Yang & Peng Li & Shilu Liang & Zhenghu Zhang, 2021. "A Bibliometric and Visual Analysis of Global Community Resilience Research," IJERPH, MDPI, vol. 18(20), pages 1-25, October.
    18. Qibin Chen & Guilian Fan & Wei Na & Jiming Liu & Jianguo Cui & Hongyan Li, 2019. "Past, Present, and Future of Groundwater Remediation Research: A Scientometric Analysis," IJERPH, MDPI, vol. 16(20), pages 1-17, October.
    19. Rakas, Marija & Hain, Daniel S., 2019. "The state of innovation system research: What happens beneath the surface?," Research Policy, Elsevier, vol. 48(9), pages 1-1.
    20. Chen, Zhenhua, 2023. "Socioeconomic Impacts of high-speed rail: A bibliometric analysis," Socio-Economic Planning Sciences, Elsevier, vol. 85(C).
