The citation advantage of linking publications to research data

Author

Listed:
  • Giovanni Colavizza
  • Iain Hrynaszkiewicz
  • Isla Staden
  • Kirstie Whitaker
  • Barbara McGillivray

Abstract

Efforts to make research results open and reproducible are increasingly reflected by journal policies encouraging or mandating authors to provide data availability statements. As a consequence, there has been a strong uptake of data availability statements in recent literature. Nevertheless, it is still unclear what proportion of these statements actually contain well-formed links to data, for example via a URL or permanent identifier, and whether there is added value in providing such links. We consider 531,889 journal articles published by PLOS and BMC, develop an automatic system for labelling their data availability statements according to four categories based on their content and the type of data availability they display, and finally analyze the citation advantage of different statement categories via regression. We find that, following mandated publisher policies, data availability statements become very common. In 2018, 93.7% of 21,793 PLOS articles and 88.2% of 31,956 BMC articles had data availability statements. Data availability statements containing a link to data in a repository (rather than stating that data are available on request or included as supporting information files) are a fraction of the total. In 2017 and 2018, 20.8% of PLOS publications and 12.2% of BMC publications provided data availability statements containing a link to data in a repository. We also find an association between articles that include statements that link to data in a repository and up to 25.36% (± 1.07%) higher citation impact on average, using a citation prediction model. We discuss the potential implications of these results for authors (researchers) and journal publishers who make the effort of sharing their data in repositories. All our data and code are made available in order to reproduce and extend our results.
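
To make the labelling step in the abstract more concrete, here is a minimal keyword-based sketch of sorting data availability statements into four categories. It is not the authors' system. The abstract names only three kinds of statement (repository link, available on request, supporting information files); the fourth category, all category names, the regular expressions, the priority order, and the sample statements are assumptions made for illustration. In particular, any URL or DOI-like string is treated here as a repository link, which a real classifier would need to check more carefully.

```python
import re

# Hypothetical patterns, not taken from the paper.
# A URL or DOI-like identifier is used as a rough proxy for "link to a repository".
LINK = re.compile(r"https?://\S+|\b10\.\d{4,9}/\S+", re.IGNORECASE)
ON_REQUEST = re.compile(r"\b(up)?on (reasonable )?request\b", re.IGNORECASE)
IN_PAPER = re.compile(
    r"\bsupporting information\b|\bsupplementary\b"
    r"|\bwithin the (paper|article|manuscript)\b",
    re.IGNORECASE,
)


def label_statement(statement: str) -> str:
    """Return one of four assumed category names for a statement."""
    text = statement.strip()
    if not text:
        return "other_or_missing"
    # Assumed priority: a link outranks the weaker forms of availability.
    if LINK.search(text):
        return "repository_link"
    if IN_PAPER.search(text):
        return "in_paper_or_supporting_files"
    if ON_REQUEST.search(text):
        return "on_request"
    return "other_or_missing"


if __name__ == "__main__":
    # Made-up example statements.
    samples = [
        "The dataset is deposited at https://example.org/dataset/123.",
        "All relevant data are within the paper and its Supporting Information files.",
        "Data are available from the corresponding author on reasonable request.",
        "",
    ]
    for s in samples:
        print(repr(s), "->", label_statement(s))
```

Labels produced this way could then serve as a categorical predictor in a citation regression, which is the analysis the abstract describes; the regression itself is not sketched here.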

Suggested Citation

  • Giovanni Colavizza & Iain Hrynaszkiewicz & Isla Staden & Kirstie Whitaker & Barbara McGillivray, 2020. "The citation advantage of linking publications to research data," PLOS ONE, Public Library of Science, vol. 15(4), pages 1-18, April.
  • Handle: RePEc:plo:pone00:0230416
    DOI: 10.1371/journal.pone.0230416

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0230416
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0230416&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Andreas Strotmann & Dangzhi Zhao, 2012. "Author name disambiguation: What difference does it make in author-based citation analysis?," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(9), pages 1820-1833, September.
    2. Richard Klavans & Kevin W. Boyack, 2017. "Which Type of Citation Analysis Generates the Most Accurate Taxonomy of Scientific and Technical Knowledge?," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(4), pages 984-998, April.
    3. David Giofrè & Geoff Cumming & Luca Fresc & Ingrid Boedker & Patrizio Tressoldi, 2017. "The influence of journal submission guidelines on authors' reporting of statistics and use of open research practices," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-15, April.
    4. Wanli Liu & Rezarta Islamaj Doğan & Sun Kim & Donald C. Comeau & Won Kim & Lana Yeganova & Zhiyong Lu & W. John Wilbur, 2014. "Author name disambiguation for PubMed," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 65(4), pages 765-781, April.
    5. Heather A Piwowar & Roger S Day & Douglas B Fridsma, 2007. "Sharing Detailed Research Data Is Associated with Increased Citation Rate," PLOS ONE, Public Library of Science, vol. 2(3), pages 1-5, March.
    6. Andreas Strotmann & Dangzhi Zhao, 2012. "Author name disambiguation: What difference does it make in author‐based citation analysis?," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 63(9), pages 1820-1833, September.
    7. Thelwall, Mike & Wilson, Paul, 2014. "Regression for citation data: An evaluation of different methods," Journal of Informetrics, Elsevier, vol. 8(4), pages 963-971.
    8. Michael P. Milham & R. Cameron Craddock & Jake J. Son & Michael Fleischmann & Jon Clucas & Helen Xu & Bonhwang Koo & Anirudh Krishnakumar & Bharat B. Biswal & F. Xavier Castellanos & Stan Colcombe & A, 2018. "Assessment of the impact of shared brain imaging data on the scientific literature," Nature Communications, Nature, vol. 9(1), pages 1-7, December.

    Citations

    Cited by:

    1. Andreas Daniel & Stefan Jakowatz & Nadeshda Jung & Lydia Kleine & Aleksander Kocaj & Alexia Meyermann & Kati Mozygemba & Alexander Schuster, 2023. "Die Erfassung von Publikationen aus der Datennutzung: Verfahren, Herausforderungen und Nutzen. Ein Erfahrungsbericht von Forschungsdatenzentren," RatSWD Working Papers 281, German Data Forum (RatSWD).
    2. Dengsheng Wu & Huidong Wu & Jianping Li, 2024. "Citation advantage of positive words: predictability, temporal evolution, and universality in varied quality journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 4275-4293, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ciriaco Andrea D’Angelo & Nees Jan Eck, 2020. "Collecting large-scale publication data at the level of individual researchers: a practical proposal for author name disambiguation," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(2), pages 883-907, May.
    2. Jinseok Kim & Jinmo Kim & Jason Owen-Smith, 2019. "Generating automatically labeled data for author name disambiguation: an iterative clustering method," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(1), pages 253-280, January.
    3. Jinseok Kim, 2019. "A fast and integrative algorithm for clustering performance evaluation in author name disambiguation," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(2), pages 661-681, August.
    4. Alexander Karlsson & Björn Hammarfelt & H. Joe Steinhauer & Göran Falkman & Nasrine Olson & Gustaf Nelhans & Jan Nolin, 2015. "Modeling uncertainty in bibliometrics and information retrieval: an information fusion approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2255-2274, March.
    5. Jinseok Kim, 2018. "Evaluating author name disambiguation for digital libraries: a case of DBLP," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 1867-1886, September.
    6. Jinseok Kim & Jason Owen-Smith, 2021. "ORCID-linked labeled data for evaluating author name disambiguation at scale," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(3), pages 2057-2083, March.
    7. Song, Min & Kim, Erin Hea-Jin & Kim, Ha Jin, 2015. "Exploring author name disambiguation on PubMed-scale," Journal of Informetrics, Elsevier, vol. 9(4), pages 924-941.
    8. Mike Thelwall, 2020. "Mid-career field switches reduce gender disparities in academic publishing," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(3), pages 1365-1383, June.
    9. Chengliang Wang & Xiaojiao Chen & Teng Yu & Yidan Liu & Yuhui Jing, 2024. "Education reform and change driven by digital technology: a bibliometric study from a global perspective," Palgrave Communications, Palgrave Macmillan, vol. 11(1), pages 1-17, December.
    10. Dangzhi Zhao & Andreas Strotmann, 2020. "Telescopic and panoramic views of library and information science research 2011–2018: a comparison of four weighting schemes for author co-citation analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 255-270, July.
    11. Kim, Jinseok & Diesner, Jana, 2015. "The effect of data pre-processing on understanding the evolution of collaboration networks," Journal of Informetrics, Elsevier, vol. 9(1), pages 226-236.
    12. Mi Zhou & Biyu Bian & Weiming Zhu & Li Huang, 2021. "A Half Century of Research on Childhood and Adolescent Depression: Science Mapping the Literature, 1970 to 2019," IJERPH, MDPI, vol. 18(18), pages 1-20, September.
    13. Liu, Meijun & Hu, Xiao, 2021. "Will collaborators make scientists move? A Generalized Propensity Score analysis," Journal of Informetrics, Elsevier, vol. 15(1).
    14. Lutz Bornmann & Werner Marx, 2014. "How to evaluate individual researchers working in the natural and life sciences meaningfully? A proposal of methods based on percentiles of citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(1), pages 487-509, January.
    15. Xuan Zhen Liu & Hui Fang, 2014. "Scientific group leaders’ authorship preferences: an empirical investigation," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(2), pages 909-925, February.
    16. Freeman, Richard B. & Huang, Wei, 2014. "Collaborating With People Like Me: Ethnic Co-authorship within the US," IZA Discussion Papers 8432, Institute of Labor Economics (IZA).
    17. Uyen-Phuong Nguyen & Philip Hallinger, 2020. "Assessing the Distinctive Contributions of Simulation & Gaming to the Literature, 1970-2019: A Bibliometric Review," Simulation & Gaming, vol. 51(6), pages 744-769, December.
    18. Xie, Qing & Zhang, Xinyuan & Song, Min, 2021. "A network embedding-based scholar assessment indicator considering four facets: Research topic, author credit allocation, field-normalized journal impact, and published time," Journal of Informetrics, Elsevier, vol. 15(4).
    19. Xianru Shang & Zijian Liu & Chen Gong & Zhigang Hu & Yuexuan Wu & Chengliang Wang, 2024. "Knowledge mapping and evolution of research on older adults’ technology acceptance: a bibliometric study from 2013 to 2023," Palgrave Communications, Palgrave Macmillan, vol. 11(1), pages 1-21, December.
    20. Jinseok Kim & Liang Tao & Seok-Hyoung Lee & Jana Diesner, 2016. "Evolution and structure of scientific co-publishing network in Korea between 1948–2011," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(1), pages 27-41, April.
