
A multi-view method of scientific paper classification via heterogeneous graph embeddings

Author

Listed:
  • Yiqin Lv

    (National University of Defense Technology)

  • Zheng Xie

    (National University of Defense Technology)

  • Xiaojing Zuo

    (National University of Defense Technology)

  • Yiping Song

    (National University of Defense Technology)

Abstract

Scientific papers can be classified based on their contents or their citations. To improve performance on this task, we represent papers as nodes and integrate their contents and citations into a heterogeneous graph with two types of edges. One type represents the semantic similarity between papers, derived from their titles and abstracts. The other represents the citation relationship between papers and the journals or conference proceedings of their references. We use a contrastive learning method to embed the nodes of the heterogeneous graph into a vector space, and then feed the paper node vectors into classifiers such as a decision tree or a multilayer perceptron. We conduct experiments on three datasets of scientific papers: the Microsoft Academic Graph with 63,211 papers in 20 classes, the Proceedings of the National Academy of Sciences with 38,243 papers in 18 classes, and the American Physical Society with 443,845 papers in 5 classes. The experimental results on the multi-class task show that our multi-view method achieves a classification accuracy of up to 98%, outperforming state-of-the-art methods.
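
The two edge types described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the field names `text` and `ref_venues`, the function `build_graph`, and the similarity threshold are all hypothetical, and bag-of-words cosine similarity is only a stand-in for whatever semantic similarity measure the paper uses. The contrastive embedding and the classifier stages are not shown.

```python
import math
from collections import Counter
from itertools import combinations

# Toy records; every identifier, field name, and value is hypothetical.
papers = {
    "p1": {"text": "graph embedding for citation networks",
           "ref_venues": ["Scientometrics", "Journal of Informetrics"]},
    "p2": {"text": "citation networks and graph embedding methods",
           "ref_venues": ["Scientometrics"]},
    "p3": {"text": "protein folding in yeast cells",
           "ref_venues": ["PLOS ONE"]},
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term counts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(records, threshold=0.5):
    """Return the two edge sets of a toy heterogeneous graph.

    "semantic": paper-paper edges whose text similarity reaches the
                (assumed) threshold.
    "citation": paper-venue edges, one per distinct venue that appears
                among a paper's references.
    """
    bags = {pid: Counter(rec["text"].lower().split())
            for pid, rec in records.items()}
    semantic = []
    for u, v in combinations(sorted(records), 2):
        score = cosine(bags[u], bags[v])
        if score >= threshold:
            semantic.append((u, v, round(score, 3)))
    citation = sorted({(pid, venue)
                       for pid, rec in records.items()
                       for venue in rec["ref_venues"]})
    return {"semantic": semantic, "citation": citation}


graph = build_graph(papers)
for edge_type, edges in graph.items():
    print(edge_type, edges)
```

In this sketch, papers and venues are distinct node types, so a paper is linked both to textually similar papers and to the venues its references appeared in. An embedding method would then be trained on both edge sets together.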

Suggested Citation

  • Yiqin Lv & Zheng Xie & Xiaojing Zuo & Yiping Song, 2022. "A multi-view method of scientific paper classification via heterogeneous graph embeddings," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(8), pages 4847-4872, August.
  • Handle: RePEc:spr:scient:v:127:y:2022:i:8:d:10.1007_s11192-022-04419-1
    DOI: 10.1007/s11192-022-04419-1

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-022-04419-1
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Citations


    Cited by:

    1. Xie, Zheng & Lv, Yiqin & Song, Yiping & Wang, Qi, 2024. "Data labeling through the centralities of co-reference networks improves the classification accuracy of scientific papers," Journal of Informetrics, Elsevier, vol. 18(2).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Barbara McGillivray & Gard B. Jenset & Khalid Salama & Donna Schut, 2022. "Investigating patterns of change, stability, and interaction among scientific disciplines using embeddings," Palgrave Communications, Palgrave Macmillan, vol. 9(1), pages 1-15, December.
    2. Lee, O-Joun & Jeon, Hyeon-Ju & Jung, Jason J., 2021. "Learning multi-resolution representations of research patterns in bibliographic networks," Journal of Informetrics, Elsevier, vol. 15(1).
    3. Ana Teresa Santos & Sandro Mendonça, 2022. "Do papers (really) match journals’ “aims and scope”? A computational assessment of innovation studies," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(12), pages 7449-7470, December.
    4. He, Chaocheng & Liu, Fuzhen & Dong, Ke & Wu, Jiang & Zhang, Qingpeng, 2023. "Research on the formation mechanism of research leadership relations: An exponential random graph model analysis approach," Journal of Informetrics, Elsevier, vol. 17(2).
    5. Diego Kozlowski & Jennifer Dusdal & Jun Pang & Andreas Zilian, 2021. "Semantic and relational spaces in science of science: deep learning models for article vectorisation," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5881-5910, July.
    6. Krzysztof Przystupa & Mykola Beshley & Olena Hordiichuk-Bublivska & Marian Kyryk & Halyna Beshley & Julia Pyrih & Jarosław Selech, 2021. "Distributed Singular Value Decomposition Method for Fast Data Processing in Recommendation Systems," Energies, MDPI, vol. 14(8), pages 1-24, April.
    7. Yuan Chih Fu & Marcelo Marques & Yuen-Hsien Tseng & Justin J. W. Powell & David P. Baker, 2022. "An evolving international research collaboration network: spatial and thematic developments in co-authored higher education research, 1998–2018," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(3), pages 1403-1429, March.
    8. Lea Helmers & Franziska Horn & Franziska Biegler & Tim Oppermann & Klaus-Robert Müller, 2019. "Automating the search for a patent’s prior art with a full text similarity search," PLOS ONE, Public Library of Science, vol. 14(3), pages 1-17, March.
    9. Tianshuang Qiu & Chuanming Yu & Yunci Zhong & Lu An & Gang Li, 2021. "A scientific citation recommendation model integrating network and text representations," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 9199-9221, November.
    10. Yonghe Lu & Jiayi Luo & Ying Xiao & Hou Zhu, 2021. "Text representation model of scientific papers based on fusing multi-viewpoint information and its quality assessment," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6937-6963, August.
    11. Zafar Ali & Guilin Qi & Pavlos Kefalas & Shah Khusro & Inayat Khan & Khan Muhammad, 2022. "SPR-SMN: scientific paper recommendation employing SPECTER with memory network," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6763-6785, November.
    12. Jiang, Zhuoren & Lin, Tianqianjin & Huang, Cui, 2023. "Deep representation learning of scientific paper reveals its potential scholarly impact," Journal of Informetrics, Elsevier, vol. 17(1).
    13. Zafar Ali & Irfan Ullah & Amin Khan & Asim Ullah Jan & Khan Muhammad, 2021. "An overview and evaluation of citation recommendation models," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4083-4119, May.
