
SAO2Vec: Development of an algorithm for embedding the subject–action–object (SAO) structure using Doc2Vec

Authors

  • Sunhye Kim
  • Inchae Park
  • Byungun Yoon

Abstract

In natural-language processing, the subject–action–object (SAO) structure is used to convert unstructured textual data into structured textual data comprising subjects, actions, and objects. This structure is suitable for analyzing the key elements of technology, as well as the relationships between these elements. However, analysis using the existing SAO structure requires a substantial number of manual processes because this structure does not represent the context of the sentences. Thus, we introduce the concept of SAO2Vec, in which SAO is used to embed the vectors of sentences and documents, for use in text mining in the analysis of technical documents. First, the technical documents of interest are collected, and SAO structures are extracted from them. Then, sentence vectors are extracted through the Doc2Vec algorithm and are updated using word vectors in the SAO structure. Finally, SAO vectors are drawn using an updated sentence vector with the same SAO structure. In addition, document vectors are derived from the document’s SAO vectors. The results of an experiment in the Internet of things field indicate that the SAO2Vec method produces 3.1% better accuracy than the Doc2Vec method and 115.0% better accuracy than SAO frequency alone. This proves that the proposed SAO2Vec algorithm can be used to improve grouping and similarity analysis by including both the meanings and the contexts of technical elements.
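
The pipeline the abstract describes (sentence vectors updated with SAO word vectors, then pooled into SAO vectors and document vectors) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the update or pooling rules, so the weighted blend in `update_sentence_vector`, the plain averaging in `build_sao_vectors` and `document_vector`, the `weight` parameter, and all toy vectors below are assumptions made only for illustration. In practice the sentence vectors would come from a trained Doc2Vec model rather than being written by hand.

```python
from collections import defaultdict
from math import sqrt


def mean_vec(vectors):
    """Element-wise mean of a non-empty collection of equal-length vectors."""
    vectors = list(vectors)
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def update_sentence_vector(sentence_vec, sao_word_vecs, weight=0.5):
    """Blend a sentence vector with the mean of its S, A, O word vectors.

    Assumption: a linear blend controlled by `weight`; the paper's actual
    update rule is not given in the abstract.
    """
    sao_mean = mean_vec(sao_word_vecs)
    return [(1.0 - weight) * s + weight * w
            for s, w in zip(sentence_vec, sao_mean)]


def build_sao_vectors(tagged_sentences):
    """Average the updated sentence vectors that share the same SAO triple.

    `tagged_sentences` is an iterable of (sao_triple, updated_vector) pairs.
    Assumption: simple averaging is used as the pooling step.
    """
    groups = defaultdict(list)
    for sao_key, vec in tagged_sentences:
        groups[sao_key].append(vec)
    return {key: mean_vec(vecs) for key, vecs in groups.items()}


def document_vector(sao_keys, sao_vecs):
    """Derive a document vector as the mean of its SAO vectors (assumed)."""
    return mean_vec(sao_vecs[key] for key in sao_keys)


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


# Toy 2-dimensional data (hypothetical values, not from the paper).
word_vecs = {
    "sensor": [1.0, 0.0],
    "measure": [0.0, 1.0],
    "temperature": [1.0, 1.0],
}
sao = ("sensor", "measure", "temperature")
raw_sentence_vecs = [[0.0, 0.0], [2.0, 2.0]]  # stand-ins for Doc2Vec output

sao_words = [word_vecs[term] for term in sao]
updated = [(sao, update_sentence_vector(vec, sao_words))
           for vec in raw_sentence_vecs]

sao_vecs = build_sao_vectors(updated)
doc_vec = document_vector([sao], sao_vecs)
similarity = cosine(doc_vec, [1.0, 1.0])
```

With vectors built this way, grouping and similarity analysis reduce to comparing `doc_vec` values with `cosine`, which is one common choice of similarity measure and is likewise an assumption here.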

Suggested Citation

  • Sunhye Kim & Inchae Park & Byungun Yoon, 2020. "SAO2Vec: Development of an algorithm for embedding the subject–action–object (SAO) structure using Doc2Vec," PLOS ONE, Public Library of Science, vol. 15(2), pages 1-26, February.
  • Handle: RePEc:plo:pone00:0227930
    DOI: 10.1371/journal.pone.0227930

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0227930
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0227930&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Zhigao Liu & Yimei Yin & Weidong Liu & Michael Dunford, 2015. "Visualizing the intellectual structure and evolution of innovation systems research: a bibliometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(1), pages 135-158, April.
    2. Martinčić-Ipšić, Sanda & Margan, Domagoj & Meštrović, Ana, 2016. "Multilayer network of language: A unified framework for structural analysis of linguistic subsystems," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 457(C), pages 117-128.

    Citations

    Cited by:

    1. Jeon, Eunji & Yoon, Naeun & Sohn, So Young, 2023. "Exploring new digital therapeutics technologies for psychiatric disorders using BERTopic and PatentSBERTa," Technological Forecasting and Social Change, Elsevier, vol. 186(PA).
    2. Alina Irina Popescu, 2020. "Long-Term City Innovation Trajectories and Quality of Urban Life," Sustainability, MDPI, vol. 12(24), pages 1-19, December.
    3. Liu, Zhenfeng & Feng, Jian & Uden, Lorna, 2023. "Technology opportunity analysis using hierarchical semantic networks and dual link prediction," Technovation, Elsevier, vol. 128(C).
    4. Wang, Liang & Li, Munan, 2024. "An exploration method for technology forecasting that combines link prediction with graph embedding: A case study on blockchain," Technological Forecasting and Social Change, Elsevier, vol. 208(C).
    5. Wang, Jinfeng & Zhang, Zhixin & Feng, Lijie & Lin, Kuo-Yi & Liu, Peng, 2023. "Development of technology opportunity analysis based on technology landscape by extending technology elements with BERT and TRIZ," Technological Forecasting and Social Change, Elsevier, vol. 191(C).
    6. Jeon, Daeseong & Ahn, Joon Mo & Kim, Juram & Lee, Changyong, 2022. "A doc2vec and local outlier factor approach to measuring the novelty of patents," Technological Forecasting and Social Change, Elsevier, vol. 174(C).
    7. Roh, Taeyeoun & Yoon, Byungun, 2023. "Discovering technology and science innovation opportunity based on sentence generation algorithm," Journal of Informetrics, Elsevier, vol. 17(2).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xuefeng Wang & Shuo Zhang & Yuqin Liu, 2022. "ITGInsight–discovering and visualizing research fronts in the scientific literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6509-6531, November.
    2. Shiwangi Singh & Sanjay Dhir, 2019. "Structured review using TCCM and bibliometric analysis of international cause-related marketing, social marketing, and innovation of the firm," International Review on Public and Nonprofit Marketing, Springer;International Association of Public and Non-Profit Marketing, vol. 16(2), pages 335-347, December.
    3. Zhenhua Chen & Laurie A. Schintler, 2023. "Rediscovering regional science: Positioning the field's evolving location in science and society," Journal of Regional Science, Wiley Blackwell, vol. 63(3), pages 617-642, June.
    4. Chen, Xiaoyan & Liu, Yisheng, 2020. "Visualization analysis of high-speed railway research based on CiteSpace," Transport Policy, Elsevier, vol. 85(C), pages 1-17.
    5. Muhammad Ashraf Fauzi, 2023. "Social media in disaster management: review of the literature and future trends through bibliometric analysis," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 118(2), pages 953-975, September.
    6. Ruth Zárate Rueda & Yolima Ivonne Beltrán Villamizar & Luis Eduardo Becerra Ardila, 2023. "A Retrospective Approach to Pro-Environmental Behavior from Environmental Education: An Alternative from Sustainable Development," Sustainability, MDPI, vol. 15(6), pages 1-19, March.
    7. Chung-Yen Yu & Yung-Ting Chuang & Hsi-Peng Kuan, 2017. "Understanding Faculty Collaboration and Productivity: A Case Study," Asian Social Science, Canadian Center of Science and Education, vol. 13(3), pages 1-1, March.
    8. Aurora A. C. Teixeira & Pedro Cosme Vieira & Ana Patrícia Abreu, 2017. "Sleeping Beauties and their princes in innovation studies," Scientometrics, Springer;Akadémiai Kiadó, vol. 110(2), pages 541-580, February.
    9. Veronica Paul Kundy & Kamini Shah, 2024. "The knowledge base of financial technology: a bibliometric analysis review," SN Business & Economics, Springer, vol. 4(7), pages 1-22, July.
    10. Massimo Aria & Michelangelo Misuraca & Maria Spano, 2020. "Mapping the Evolution of Social Research and Data Science on 30 Years of Social Indicators Research," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 149(3), pages 803-831, June.
    11. Shuo Yang & Lanxia Zhang & Lele Wang, 2023. "Key Factors of Sustainable Development of Organization: Bibliometric Analysis of Organizational Citizenship Behavior," Sustainability, MDPI, vol. 15(10), pages 1-20, May.
    12. Wenting Yang & Jiantong Zhang & Ruolin Ma, 2020. "The Prediction of Infectious Diseases: A Bibliometric Analysis," IJERPH, MDPI, vol. 17(17), pages 1-19, August.
    13. Liyan Huang & Hong Ching Goh & Rosli Said, 2023. "Understanding the social integration process of rural–urban migrants in urban china: a bibliometrics review," Journal of Population Research, Springer, vol. 40(4), pages 1-34, December.
    14. Zhiwen Su & Mingyu Zhang & Wenbing Wu, 2021. "Visualizing Sustainable Supply Chain Management: A Systematic Scientometric Review," Sustainability, MDPI, vol. 13(8), pages 1-25, April.
    15. Lenka Soták-Benedeková & Jana Rybárová & Dana Tometzová & Andrea Seňová & Radim Rybár, 2025. "Comprehensive Analysis of Rural Tourism Development: Historical Evolution, Current Trends, and Future Prospects," Sustainability, MDPI, vol. 17(3), pages 1-41, January.
    16. Solomija Buk & Yuri Krynytskyi & Andrij Rovenchak, 2019. "Properties Of Autosemantic Word Networks In Ukrainian Texts," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 22(06), pages 1-22, December.
    17. Li, Daitian & Malerba, Franco, 2024. "Technological change and the evolution of the links across sectoral systems: The case of mobile communications," Technovation, Elsevier, vol. 130(C).
    18. Naina Narang & Seema Gupta & Naliniprava Tripathy, 2023. "A bibliometric analysis of governance mechanisms in dividend decisions: an overview and emerging trends," International Journal of Disclosure and Governance, Palgrave Macmillan, vol. 20(4), pages 410-430, December.
    19. Dejian Yu, 2015. "A scientometrics review on aggregation operator research," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(1), pages 115-133, October.
    20. Abderahman Rejeb & Karim Rejeb & Andrea Appolloni & Mohammad Iranmanesh & Horst Treiblmaier & Sandeep Jagtap, 2022. "Exploring Food Supply Chain Trends in the COVID-19 Era: A Bibliometric Review," Sustainability, MDPI, vol. 14(19), pages 1-33, September.
