Printed from https://ideas.repec.org/a/spr/scient/v93y2012i3d10.1007_s11192-012-0810-x.html

Ten challenges in modeling bibliographic data for bibliometric analysis

Author

Listed:
  • Alfio Ferrara

    (Università degli Studi di Milano)

  • Silvia Salini

    (Università degli Studi di Milano)

Abstract

The complexity and variety of bibliographic data are growing, and efforts to define new methodologies and techniques for bibliometric analysis are intensifying. In this complex scenario, one of the most crucial issues is the quality of data and the capability of bibliometric analysis to cope with multiple data dimensions. Although the problem of enforcing a multidimensional approach to the analysis and management of bibliographic data is not new, a reference design pattern and a specific conceptual model for multidimensional analysis of bibliographic data are still missing. In this paper, we discuss ten of the most relevant challenges for bibliometric analysis when dealing with multidimensional data, and we propose a reference data model that, according to different goals, can help analysis designers and bibliographic experts in working with large collections of bibliographic data.

Suggested Citation

  • Alfio Ferrara & Silvia Salini, 2012. "Ten challenges in modeling bibliographic data for bibliometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(3), pages 765-785, December.
  • Handle: RePEc:spr:scient:v:93:y:2012:i:3:d:10.1007_s11192-012-0810-x
    DOI: 10.1007/s11192-012-0810-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-012-0810-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    Citations

    Cited by:

    1. Jeong, Yujin & Park, Inchae & Yoon, Byungun, 2019. "Identifying emerging Research and Business Development (R&BD) areas based on topic modeling and visualization with intellectual property right data," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 655-672.
    2. Chyi-Kwei Yau & Alan Porter & Nils Newman & Arho Suominen, 2014. "Clustering scientific documents with topic modeling," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(3), pages 767-786, September.
    3. Massimo FLORIO & Francesco GIFFONI, 2019. "L’impatto sociale della produzione di scienza su larga scala: come governarlo?," Departmental Working Papers 2019-05, Department of Economics, Management and Quantitative Methods at Università degli Studi di Milano.
    4. Francesca De Battisti & Silvia Salini, 2013. "Robust analysis of bibliometric data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 22(2), pages 269-283, June.
    5. Bornmann, Lutz, 2019. "Does the normalized citation impact of universities profit from certain properties of their published documents – such as the number of authors and the impact factor of the publishing journals? A mult," Journal of Informetrics, Elsevier, vol. 13(1), pages 170-184.
    6. Chen, Guo & Xiao, Lu, 2016. "Selecting publication keywords for domain analysis in bibliometrics: A comparison of three methods," Journal of Informetrics, Elsevier, vol. 10(1), pages 212-223.
    7. Sabine Loudcher & Wararat Jakawat & Edmundo Pavel Soriano Morales & Cécile Favre, 2015. "Combining OLAP and information networks for bibliographic data analysis: a survey," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(2), pages 471-487, May.
    8. Charles H. Cho & Tiphaine Jérôme & Jonathan Maurice, 2022. "Assessing the impact of environmental accounting research: evidence from citation and journal data," Post-Print hal-03770661, HAL.
    9. Francesca De Battisti & Alfio Ferrara & Silvia Salini, 2015. "A decade of research in statistics: a topic model approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(2), pages 413-433, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guillaume Cabanac, 2012. "Shaping the landscape of research in information systems from the perspective of editorial boards: A scientometric study of 77 leading journals," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(5), pages 977-996, May.
    2. Bar-Ilan, Judit, 2008. "Informetrics at the beginning of the 21st century—A review," Journal of Informetrics, Elsevier, vol. 2(1), pages 1-52.
    3. Gagolewski, Marek, 2011. "Bibliometric impact assessment with R and the CITAN package," Journal of Informetrics, Elsevier, vol. 5(4), pages 678-692.
    4. John Panaretos & Chrisovaladis Malesios, 2009. "Assessing scientific research performance and impact with single indices," Scientometrics, Springer;Akadémiai Kiadó, vol. 81(3), pages 635-670, December.
    5. Guillaume Cabanac, 2012. "Shaping the landscape of research in information systems from the perspective of editorial boards: A scientometric study of 77 leading journals," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 63(5), pages 977-996, May.
    6. Guillaume Cabanac, 2013. "Experimenting with the partnership ability φ-index on a million computer scientists," Scientometrics, Springer;Akadémiai Kiadó, vol. 96(1), pages 1-9, July.
    7. Parul Khurana & Kiran Sharma, 2022. "Impact of h-index on author’s rankings: an improvement to the h-index for lower-ranked authors," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(8), pages 4483-4498, August.
    8. Loizides, Orestis-Stavros & Koutsakis, Polychronis, 2017. "On evaluating the quality of a computer science/computer engineering conference," Journal of Informetrics, Elsevier, vol. 11(2), pages 541-552.
    9. Michel Zitt, 2015. "Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2223-2245, March.
    10. Petridis, Konstantinos & Malesios, Chrisovalantis & Arabatzis, Garyfallos & Thanassoulis, Emmanuel, 2013. "Efficiency analysis of forestry journals: Suggestions for improving journals’ quality," Journal of Informetrics, Elsevier, vol. 7(2), pages 505-521.
    11. Mallig, Nicolai, 2010. "A relational database for bibliometric analysis," Journal of Informetrics, Elsevier, vol. 4(4), pages 564-580.
    12. Fernanda Morillo & Ignacio Santabárbara & Javier Aparicio, 2013. "The automatic normalisation challenge: detailed addresses identification," Scientometrics, Springer;Akadémiai Kiadó, vol. 95(3), pages 953-966, June.
    13. Waltman, Ludo, 2016. "A review of the literature on citation impact indicators," Journal of Informetrics, Elsevier, vol. 10(2), pages 365-391.
    14. Mallig, Nicolai, 2010. "A relational database for bibliometric analysis," Discussion Papers "Innovation Systems and Policy Analysis" 22, Fraunhofer Institute for Systems and Innovation Research (ISI).
    15. Simon Fritzsch & Philipp Scharner & Gregor Weiß, 2021. "Estimating the relation between digitalization and the market value of insurers," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 88(3), pages 529-567, September.
    16. Michael Hall, C., 2011. "Publish and perish? Bibliometric analysis, journal ranking and the assessment of research quality in tourism," Tourism Management, Elsevier, vol. 32(1), pages 16-27.
    17. Javier Ruiz-Castillo, 2012. "The evaluation of citation distributions," SERIEs: Journal of the Spanish Economic Association, Springer;Spanish Economic Association, vol. 3(1), pages 291-310, March.
    18. Magnone, Edoardo, 2013. "A scientometric look at calendar events," Journal of Informetrics, Elsevier, vol. 7(1), pages 101-108.
    19. Lutz Bornmann & Alexander Butz & Klaus Wohlrabe, 2018. "What are the top five journals in economics? A new meta-ranking," Applied Economics, Taylor & Francis Journals, vol. 50(6), pages 659-675, February.
    20. Carolin Michels & Ulrich Schmoch, 2012. "The growth of science and database coverage," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(3), pages 831-846, December.
