
Assessing the impact of software on science: A bootstrapped learning of software entities in full-text papers

Author

Listed:
  • Pan, Xuelian
  • Yan, Erjia
  • Wang, Qianqian
  • Hua, Weina

Abstract

Although software has helped researchers conduct research, little is known about the impact of software on science. To fill this gap, this article proposes an improved bootstrapping method to extract software entities from full-text papers and assess their impact on science. Evaluation results show that the proposed entity extraction system outperforms three baseline methods in extracting software entities from full-text papers. The proposed method is then used to learn software entities from all papers published in PLoS ONE in 2014. More than 2000 unique software entities are obtained, accounting for more than 20,000 mentions and more than 7000 citations. The paper finds that software is commonly used in the scientific community, but that a substantial share of that use goes uncited.

Suggested Citation

  • Pan, Xuelian & Yan, Erjia & Wang, Qianqian & Hua, Weina, 2015. "Assessing the impact of software on science: A bootstrapped learning of software entities in full-text papers," Journal of Informetrics, Elsevier, vol. 9(4), pages 860-871.
  • Handle: RePEc:eee:infome:v:9:y:2015:i:4:p:860-871
    DOI: 10.1016/j.joi.2015.07.012

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157715300602
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.joi.2015.07.012?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Erjia Yan & Cassidy R. Sugimoto, 2011. "Institutional interactions: Exploring social, cognitive, and geographic relationships between institutions as demonstrated through citation networks," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 62(8), pages 1498-1514, August.
    2. Heather Piwowar, 2013. "Value all research products," Nature, Nature, vol. 493(7431), pages 159-159, January.
    3. Daniele Fanelli, 2010. "Do Pressures to Publish Increase Scientists' Bias? An Empirical Support from US States Data," PLOS ONE, Public Library of Science, vol. 5(4), pages 1-7, April.
    4. Erjia Yan & Cassidy R. Sugimoto, 2011. "Institutional interactions: Exploring social, cognitive, and geographic relationships between institutions as demonstrated through citation networks," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(8), pages 1498-1514, August.
    5. Leonardo Candela & Donatella Castelli & Paolo Manghi & Alice Tani, 2015. "Data journals: A survey," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(9), pages 1747-1762, September.
    6. Jacob, Brian A. & Lefgren, Lars, 2011. "The impact of research grant funding on scientific productivity," Journal of Public Economics, Elsevier, vol. 95(9), pages 1168-1177.
    7. Xianwen Wang & Di Liu & Kun Ding & Xinran Wang, 2012. "Science funding and research output: a study on 10 countries," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(2), pages 591-599, May.
    8. Vincent Larivière & Cassidy R. Sugimoto & Benoit Macaluso & Staša Milojević & Blaise Cronin & Mike Thelwall, 2014. "arXiv E-prints and the journal of record: An analysis of roles and relationships," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 65(6), pages 1157-1169, June.
    9. Stefanie Haustein & Isabella Peters & Cassidy R. Sugimoto & Mike Thelwall & Vincent Larivière, 2014. "Tweeting biomedicine: An analysis of tweets and citations in the biomedical literature," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 65(4), pages 656-669, April.
    10. Yan, Erjia & Guns, Raf, 2014. "Predicting and recommending collaborations: An author-, institution-, and country-level analysis," Journal of Informetrics, Elsevier, vol. 8(2), pages 295-309.

    Citations

    Cited by:

    1. Xuelian Pan & Erjia Yan & Weina Hua, 2016. "Disciplinary differences of software use and impact in scientific literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(3), pages 1593-1610, December.
    2. Robert Tomaszewski, 2023. "Visibility, impact, and applications of bibliometric software tools through citation analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(7), pages 4007-4028, July.
    3. Kai Li & Jason Rollins & Erjia Yan, 2018. "Web of Science use in published research and review papers 1997–2017: a selective, dynamic, cross-domain, content-based analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(1), pages 1-20, April.
    4. Li, Kai & Chen, Pei-Ying & Yan, Erjia, 2019. "Challenges of measuring software impact through citations: An examination of the lme4 R package," Journal of Informetrics, Elsevier, vol. 13(1), pages 449-461.
    5. Yuzhuo Wang & Kai Li, 2024. "How do official software citation formats evolve over time? A longitudinal analysis of R programming language packages," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 3997-4019, July.
    6. Wang, Yuzhuo & Zhang, Chengzhi, 2020. "Using the full-text content of academic articles to identify and evaluate algorithm entities in the domain of natural language processing," Journal of Informetrics, Elsevier, vol. 14(4).
    7. Tong, Tong & Wang, Wanru & Ye, Fred Y., 2024. "A complement to the novel disruption indicator based on knowledge entities," Journal of Informetrics, Elsevier, vol. 18(2).
    8. Caifan Du & Johanna Cohoon & Patrice Lopez & James Howison, 2021. "Softcite dataset: A dataset of software mentions in biomedical and economic research publications," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(7), pages 870-884, July.
    9. Yuzhuo Wang & Chengzhi Zhang & Kai Li, 2022. "A review on method entities in the academic literature: extraction, evaluation, and application," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2479-2520, May.
    10. Lu Jiang & Xinyu Kang & Shan Huang & Bo Yang, 2022. "A refinement strategy for identification of scientific software from bioinformatics publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 3293-3316, June.
    11. Pan, Xuelian & Yan, Erjia & Cui, Ming & Hua, Weina, 2018. "Examining the usage, citation, and diffusion patterns of bibliometric mapping software: A comparative study of three tools," Journal of Informetrics, Elsevier, vol. 12(2), pages 481-493.
    12. Yingyi Zhang & Chengzhi Zhang, 2024. "Extracting problem and method sentence from scientific papers: a context-enhanced transformer using formulaic expression desensitization," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(6), pages 3433-3468, June.
    13. Enrique Orduña-Malea & Rodrigo Costas, 2021. "Link-based approach to study scientific software usage: the case of VOSviewer," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(9), pages 8153-8186, September.
    14. Bikun Chen & Dannan Deng & Zhouyan Zhong & Chengzhi Zhang, 2020. "Exploring linguistic characteristics of highly browsed and downloaded academic articles," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(3), pages 1769-1790, March.
    15. Li, Kai & Yan, Erjia, 2018. "Co-mention network of R packages: Scientific impact and clustering structure," Journal of Informetrics, Elsevier, vol. 12(1), pages 87-100.
    16. Pan, Xuelian & Yan, Erjia & Cui, Ming & Hua, Weina, 2019. "How important is software to library and information science research? A content analysis of full-text publications," Journal of Informetrics, Elsevier, vol. 13(1), pages 397-406.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Feiheng Luo & Aixin Sun & Mojisola Erdt & Aravind Sesagiri Raamkumar & Yin-Leng Theng, 2018. "Exploring prestigious citations sourced from top universities in bibliometrics and altmetrics: a case study in the computer science discipline," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(1), pages 1-17, January.
    2. Lili Yuan & Yanni Hao & Minglu Li & Chunbing Bao & Jianping Li & Dengsheng Wu, 2018. "Who are the international research collaboration partners for China? A novel data perspective based on NSFC grants," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(1), pages 401-422, July.
    3. Pan, Xuelian & Yan, Erjia & Cui, Ming & Hua, Weina, 2019. "How important is software to library and information science research? A content analysis of full-text publications," Journal of Informetrics, Elsevier, vol. 13(1), pages 397-406.
    4. Liwei Zhang & Jue Wang, 2021. "What affects publications’ popularity on Twitter?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 9185-9198, November.
    5. Bornmann, Lutz, 2014. "Do altmetrics point to the broader impact of research? An overview of benefits and disadvantages of altmetrics," Journal of Informetrics, Elsevier, vol. 8(4), pages 895-903.
    6. Xu, Fang & Ou, Guiyan & Ma, Tingcan & Wang, Xianwen, 2021. "The consistency of impact of preprints and their journal publications," Journal of Informetrics, Elsevier, vol. 15(2).
    7. Hottenrott, Hanna & Lawson, Cornelia, 2013. "Fishing for Complementarities: Competitive Research Funding and Research Productivity," Department of Economics and Statistics Cognetti de Martiis LEI & BRICK - Laboratory of Economics of Innovation "Franco Momigliano", Bureau of Research in Innovation, Complexity and Knowledge, Collegio 201318, University of Turin.
    8. Yan, Erjia & Ding, Ying & Cronin, Blaise & Leydesdorff, Loet, 2013. "A bird's-eye view of scientific trading: Dependency relations among fields of science," Journal of Informetrics, Elsevier, vol. 7(2), pages 249-264.
    9. Mojisola Erdt & Aarthy Nagarajan & Sei-Ching Joanna Sin & Yin-Leng Theng, 2016. "Altmetrics: an analysis of the state-of-the-art in measuring research impact on social media," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(2), pages 1117-1166, November.
    10. Stefan Hennemann, 2012. "Evaluating the performance of geographical locations within scientific networks using an aggregation—randomization—re-sampling approach (ARR)," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(12), pages 2393-2404, December.
    11. Houcemeddine Turki & Mohamed Ali Hadj Taieb & Mohamed Ben Aouicha & Ajith Abraham, 2020. "Nature or Science: what Google Trends says," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 1367-1385, August.
    12. Zhiqi Wang & Wolfgang Glänzel & Yue Chen, 2020. "The impact of preprints in Library and Information Science: an analysis of citations, usage and social attention indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 1403-1423, November.
    13. Guo Chen & Lu Xiao & Chang-ping Hu & Xue-qin Zhao, 2015. "Identifying the research focus of Library and Information Science institutions in China with institution-specific keywords," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(2), pages 707-724, May.
    14. Erjia Yan, 2014. "Topic-based Pagerank: toward a topic-level scientific evaluation," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(2), pages 407-437, August.
    15. Giovanni Abramo & Ciriaco Andrea D’Angelo & Flavia Costa, 2020. "Does the geographic proximity effect on knowledge spillovers vary across research fields?," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(2), pages 1021-1036, May.
    16. Wang, Jue & Zhang, Liwei, 2018. "Proximal advantage in knowledge diffusion: The time dimension," Journal of Informetrics, Elsevier, vol. 12(3), pages 858-867.
    17. He, Bing & Ding, Ying & Yan, Erjia, 2012. "Mining patterns of author orders in scientific publications," Journal of Informetrics, Elsevier, vol. 6(3), pages 359-367.
    18. Shan Jiang & Hsinchun Chen, 2019. "Examining patterns of scientific knowledge diffusion based on knowledge cyber infrastructure: a multi-dimensional network approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1599-1617, December.
    19. Yongjun Zhu & Erjia Yan, 2015. "Dynamic subfield analysis of disciplines: an examination of the trading impact and knowledge diffusion patterns of computer science," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(1), pages 335-359, July.
    20. Yuanyuan Liu & Qiang Wu & Shijie Wu & Yong Gao, 2021. "Weighted citation based on ranking-related contribution: a new index for evaluating article impact," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(10), pages 8653-8672, October.
