
Generating keyphrases for readers: A controllable keyphrase generation framework

Author

Listed:
  • Yi Jiang
  • Rui Meng
  • Yong Huang
  • Wei Lu
  • Jiawei Liu

Abstract

Keyphrases are widely used in Information Retrieval (IR) and Natural Language Processing (NLP) tasks, which has driven growing interest in automatic keyphrase prediction. However, these statistically important phrases contribute less and less to the related tasks, because end-to-end learning lets models capture the important semantic information of a text directly. Keyphrases likewise do little to help readers quickly grasp a paper's main idea, because the relationship between a keyphrase and the paper is not made explicit to readers. We therefore propose generating keyphrases with specific functions, to bridge the semantic gap between readers and information producers, and we verify in a user experiment that the keyphrase function assists users' comprehension. We propose a controllable keyphrase generation framework (CKPG) that uses the keyphrase function as a control code to generate categorized keyphrases, and we implement it on Transformer, BART, and T5, respectively. For the Computer Science domain, the macro-averaged P@5, R@5, and F1@5 on the Paper with Code dataset reach up to 0.680, 0.535, and 0.558, respectively. These experimental results indicate the effectiveness of the CKPG models.
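
The abstract reports macro-averaged P@5, R@5, and F1@5 but does not spell out the evaluation protocol. The following is a minimal sketch of how such scores are commonly computed, not the authors' evaluation code. The function names `prf_at_k` and `macro_average` are hypothetical, and the sketch assumes case-insensitive exact matching without stemming, precision divided by the number of predictions actually kept (at most k), and an unweighted mean over per-document scores.

```python
from typing import Iterable, List, Sequence, Tuple


def _normalise(phrase: str) -> str:
    # Assumed normalisation: lowercase and collapse whitespace (no stemming).
    return " ".join(phrase.lower().split())


def prf_at_k(predicted: Sequence[str], gold: Iterable[str], k: int = 5) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over the top-k predictions for one document."""
    gold_set = {_normalise(g) for g in gold}

    # Keep the first k distinct predictions, preserving rank order.
    top_k: List[str] = []
    for phrase in predicted:
        phrase = _normalise(phrase)
        if phrase not in top_k:
            top_k.append(phrase)
        if len(top_k) == k:
            break

    hits = sum(1 for phrase in top_k if phrase in gold_set)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def macro_average(scores: Iterable[Tuple[float, float, float]]) -> Tuple[float, float, float]:
    """Unweighted mean of per-document (P, R, F1) triples."""
    rows = list(scores)
    if not rows:
        return 0.0, 0.0, 0.0
    n = len(rows)
    return (
        sum(r[0] for r in rows) / n,
        sum(r[1] for r in rows) / n,
        sum(r[2] for r in rows) / n,
    )
```

Under this kind of macro averaging, F1 is averaged per document rather than derived from the averaged precision and recall, so the reported F1 need not equal the harmonic mean of the reported P and R.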

Suggested Citation

  • Yi Jiang & Rui Meng & Yong Huang & Wei Lu & Jiawei Liu, 2023. "Generating keyphrases for readers: A controllable keyphrase generation framework," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(7), pages 759-774, July.
  • Handle: RePEc:bla:jinfst:v:74:y:2023:i:7:p:759-774
    DOI: 10.1002/asi.24749

    Download full text from publisher

    File URL: https://doi.org/10.1002/asi.24749
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lu Huang & Xiang Chen & Yi Zhang & Changtian Wang & Xiaoli Cao & Jiarun Liu, 2022. "Identification of topic evolution: network analytics with piecewise linear representation and word embedding," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5353-5383, September.
    2. Youngok Choi & Sue Yeon Syn, 2016. "Characteristics of tagging behavior in digitized humanities online collections," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 67(5), pages 1089-1104, May.
    3. Katchanov, Yurij L. & Markova, Yulia V., 2022. "Dynamics of senses of new physics discourse: Co-keywords analysis," Journal of Informetrics, Elsevier, vol. 16(1).
    4. Yuzhuo Wang & Chengzhi Zhang & Kai Li, 2022. "A review on method entities in the academic literature: extraction, evaluation, and application," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2479-2520, May.
    5. Saeed-Ul Hassan & Naif R. Aljohani & Mudassir Shabbir & Umair Ali & Sehrish Iqbal & Raheem Sarwar & Eugenio Martínez-Cámara & Sebastián Ventura & Francisco Herrera, 2020. "Tweet Coupling: a social media methodology for clustering scientific publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 973-991, August.
    6. Pengcheng Li & Wei Lu & Qikai Cheng, 2022. "Generating a related work section for scientific papers: an optimized approach with adopting problem and method information," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(8), pages 4397-4417, August.
    7. Wang, Shiyun & Mao, Jin & Lu, Kun & Cao, Yujie & Li, Gang, 2021. "Understanding interdisciplinary knowledge integration through citance analysis: A case study on eHealth," Journal of Informetrics, Elsevier, vol. 15(4).
    8. Ying Lian & Xiaofeng Lin & Xuefan Dong & Shengjie Hou, 2022. "A Normalized Rich-Club Connectivity-Based Strategy for Keyword Selection in Social Media Analysis," Sustainability, MDPI, vol. 14(13), pages 1-19, June.
    9. Yonghe Lu & Jiayi Luo & Ying Xiao & Hou Zhu, 2021. "Text representation model of scientific papers based on fusing multi-viewpoint information and its quality assessment," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6937-6963, August.
    10. Iqra Safder & Saeed-Ul Hassan, 2019. "Bibliometric-enhanced information retrieval: a novel deep feature engineering approach for algorithm searching from full-text publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(1), pages 257-277, April.
    11. Shiyun Wang & Jin Mao & Yujie Cao & Gang Li, 2022. "Integrated knowledge content in an interdisciplinary field: identification, classification, and application," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6581-6614, November.
    12. Xinyuan Zhang & Qing Xie & Chaemin Song & Min Song, 2022. "Mining the evolutionary process of knowledge through multiple relationships between keywords," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 2023-2053, April.
    13. Jianrong Yao & Xiangliang Guo & Lu Wang & Hui Jiang, 2022. "Understanding Green Consumption: A Literature Review Based on Factor Analysis and Bibliometric Method," Sustainability, MDPI, vol. 14(14), pages 1-13, July.
    14. Bowen Ma & Chengzhi Zhang & Yuzhuo Wang & Sanhong Deng, 2022. "Enhancing identification of structure function of academic articles using contextual information," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(2), pages 885-925, February.
    15. Guillaume Cabanac & Ingo Frommholz & Philipp Mayr, 2018. "Bibliometric-enhanced information retrieval: preface," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 1225-1227, August.
    16. Gaizka Garechana & Rosa Río-Belver & Enara Zarrabeitia & Izaskun Alvarez-Meaza, 2022. "TeknoAssistant: a domain specific tech mining approach for technical problem-solving support," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5459-5473, September.
    17. Marcelo Oliveira Passos & Priscila Lujan Gonzalez & Mathias Schneid Tessmann & Daniel Abreu Pereira Uhr, 2022. "The greatest co-authorships of finance theory literature (1896–2006): scientometrics based on complex networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(10), pages 5841-5862, October.
    18. Nasrin Asadi & Kambiz Badie & Maryam Tayefeh Mahmoudi, 2019. "Automatic zone identification in scientific papers via fusion techniques," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 845-862, May.
    19. Luo, Zhuoran & Lu, Wei & He, Jiangen & Wang, Yuqi, 2022. "Combination of research questions and methods: A new measurement of scientific novelty," Journal of Informetrics, Elsevier, vol. 16(2).
    20. Yingyi Zhang & Chengzhi Zhang, 2024. "Extracting problem and method sentence from scientific papers: a context-enhanced transformer using formulaic expression desensitization," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(6), pages 3433-3468, June.
