IDEAS home Printed from https://ideas.repec.org/a/spr/scient/v127y2022i3d10.1007_s11192-022-04266-0.html

Accounting for quality in data integration systems: a completeness-aware integration approach

Author

Listed:
  • Cinzia Daraio

    (Sapienza University of Rome)

  • Simone Leo

    (Sapienza University of Rome)

  • Monica Scannapieco

    (ISTAT)

Abstract

Ensuring the quality of integrated data is one of the main problems of data integration systems. In multi-national and historical data integration systems, where the “space” and “time” dimensions play a relevant role, it is especially important to build the integration layer so that the final user accesses a layer that is, by design, as complete as possible. In this paper, we propose a method for accessing data in multipurpose data infrastructures, such as data integration systems, that (i) relieves the final user of the need to access single data sources and, at the same time, (ii) maximizes the amount of information available to the user at the integration layer. The method is a completeness-aware integration approach: it gives the user ready access to the maximum information obtainable from the integrated data system, without requiring a preliminary data quality analysis of each database included in the system. Providing data quality information at the integrated level thus extends the functions of the individual data sources and opens the data infrastructure to additional uses. This may be a first step in moving from data infrastructures towards knowledge infrastructures. A case study on the research infrastructure for science and innovation studies shows the usefulness of the proposed approach.
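
As an illustration only, the idea of a completeness-aware integration layer could be sketched in Python as below. The abstract does not describe the authors' algorithm, so everything here is an assumption: the (country, year) keys standing in for the "space" and "time" dimensions, the variable names, the definition of completeness as the share of non-missing variables, and the rule of serving the most complete source for each key.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical types: a record maps variable names to values (None = missing),
# and a source maps (country, year) keys to records.
Key = Tuple[str, int]
Record = Dict[str, Optional[float]]
Source = Dict[Key, Record]


def completeness(record: Record, variables: List[str]) -> float:
    """Share of the requested variables that are present and not None."""
    filled = sum(1 for v in variables if record.get(v) is not None)
    return filled / len(variables)


def integrate(sources: Dict[str, Source], variables: List[str]) -> Dict[Key, dict]:
    """For every key found in any source, keep the most complete record.

    Ties are broken by source name (alphabetical order), an arbitrary
    choice made only to keep the sketch deterministic.
    """
    keys = set()
    for source in sources.values():
        keys.update(source.keys())

    integrated = {}
    for key in sorted(keys):
        best_name, best_score = None, -1.0
        for name in sorted(sources):
            if key not in sources[name]:
                continue  # this source has nothing for this country/year
            score = completeness(sources[name][key], variables)
            if score > best_score:
                best_name, best_score = name, score
        integrated[key] = {
            "source": best_name,
            "completeness": best_score,
            "record": sources[best_name][key],
        }
    return integrated


# Toy data, invented for this example.
source_a: Source = {
    ("IT", 2019): {"students": 1000.0, "staff": None},
    ("IT", 2020): {"students": 1100.0, "staff": 90.0},
}
source_b: Source = {
    ("IT", 2019): {"students": 1000.0, "staff": 85.0},
    ("FR", 2019): {"students": None, "staff": None},
}

merged = integrate({"a": source_a, "b": source_b}, ["students", "staff"])
for key, entry in merged.items():
    print(key, entry["source"], entry["completeness"])
```

In this sketch the user queries only `merged` and sees, next to each record, which source it came from and how complete it is, so no per-source quality check is needed before use.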

Suggested Citation

  • Cinzia Daraio & Simone Leo & Monica Scannapieco, 2022. "Accounting for quality in data integration systems: a completeness-aware integration approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(3), pages 1465-1490, March.
  • Handle: RePEc:spr:scient:v:127:y:2022:i:3:d:10.1007_s11192-022-04266-0
    DOI: 10.1007/s11192-022-04266-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-022-04266-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11192-022-04266-0?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Cinzia Daraio & Andrea Bonaccorsi, 2017. "Beyond university rankings? Generating new indicators on universities by linking data in open platforms," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(2), pages 508-529, February.
    2. Cinzia Daraio, 2017. "A framework for the Assessment of Research and its impacts," DIAG Technical Reports 2017-04, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
    3. Marco Angelini & Cinzia Daraio & Maurizio Lenzerini & Francesco Leotta & Giuseppe Santucci, 2020. "Performance model’s development: a novel approach encompassing ontology-based data access and visual analytics," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 865-892, November.
    4. Cinzia Daraio & Maurizio Lenzerini & Claudio Leporelli & Henk F. Moed & Paolo Naggar & Andrea Bonaccorsi & Alessandro Bartolucci, 2016. "Data integration for research and innovation policy: an Ontology-Based Data Management approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(2), pages 857-871, February.
    5. Atanu Sengupta & Sanjoy De, 2020. "Review of Literature," India Studies in Business and Economics, in: Assessing Performance of Banks in India Fifty Years After Nationalization, chapter 0, pages 15-30, Springer.
    6. Hamid Ekbia & Michael Mattioli & Inna Kouper & G. Arave & Ali Ghazinejad & Timothy Bowman & Venkata Ratandeep Suri & Andrew Tsou & Scott Weingart & Cassidy R. Sugimoto, 2015. "Big data, bigger dilemmas: A critical review," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(8), pages 1523-1545, August.

    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Okwori, E. & Viklander, M. & Hedström, A., 2024. "Data integration in asset management of municipal pipe networks in Sweden: Challenges, gaps, and potential drivers," Utilities Policy, Elsevier, vol. 86(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Marco Angelini & Cinzia Daraio & Maurizio Lenzerini & Francesco Leotta & Giuseppe Santucci, 2020. "Performance model’s development: a novel approach encompassing ontology-based data access and visual analytics," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 865-892, November.
    2. Marco Angelini & Cinzia Daraio & Maurizio Lenzerini & Francesco Leotta & Giuseppe Santucci, 2019. "Performance Model’s development: A Novel Approach encompassing Ontology-Based Data Access and Visual Analytics," DIAG Technical Reports 2019-11, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
    3. Cinzia Daraio, 2017. "A framework for the Assessment of Research and its impacts," DIAG Technical Reports 2017-04, Department of Computer, Control and Management Engineering, Universita' degli Studi di Roma "La Sapienza".
    4. Cristina Blasi Casagran & Colleen Boland & Elena Sánchez-Montijano & Eva Vilà Sanchez, 2021. "The Role of Emerging Predictive IT Tools in Effective Migration Governance," Politics and Governance, Cogitatio Press, vol. 9(4), pages 133-145.
    5. He Tingting, 2021. "Comparing Money and Time Donation: What Do Experiments Tell Us?," Marketing of Scientific and Research Organizations, Sciendo, vol. 41(3), pages 65-94, September.
    6. Alberto Cerezo-Narváez & Andrés Pastor-Fernández & Manuel Otero-Mateo & Pablo Ballesteros-Pérez, 2022. "The Influence of Knowledge on Managing Risk for the Success in Complex Construction Projects: The IPMA Approach," Sustainability, MDPI, vol. 14(15), pages 1-30, August.
    7. Rafidah Md Noor & Nadia Bella Gustiani Rasyidi & Tarak Nandy & Raenu Kolandaisamy, 2020. "Campus Shuttle Bus Route Optimization Using Machine Learning Predictive Analysis: A Case Study," Sustainability, MDPI, vol. 13(1), pages 1-24, December.
    8. Dominika Ehrenbergerová & Martin Hodula & Zuzana Gric, 2022. "Does capital-based regulation affect bank pricing policy?," Journal of Regulatory Economics, Springer, vol. 61(2), pages 135-167, April.
    9. Cinzia Daraio & Simone Di Leo & Loet Leydesdorff, 2022. "Using the Leiden Rankings as a Heuristics: Evidence from Italian universities in the European landscape," LEM Papers Series 2022/08, Laboratory of Economics and Management (LEM), Sant'Anna School of Advanced Studies, Pisa, Italy.
    10. Mohammed Khaled Al-Hanawi & Rubayyat Hashmi & Sarh Almubark & Ameerah M. N. Qattan & Mohammad Habibullah Pulok, 2020. "Socioeconomic Inequalities in Uptake of Breast Cancer Screening among Saudi Women: A Cross-Sectional Analysis of a National Survey," IJERPH, MDPI, vol. 17(6), pages 1-13, March.
    11. Ortega, José Luis, 2021. "How do media mention research papers? Structural analysis of blogs and news networks using citation coupling," Journal of Informetrics, Elsevier, vol. 15(3).
    12. Richard Grieveson & Michael Landesmann & Isilda Mara, 2021. "Potential Mobility from Africa, Middle East and EU Neighbouring Countries to Europe," wiiw Working Papers 199, The Vienna Institute for International Economic Studies, wiiw.
    13. Eric W. K. See-To & Eric W. T. Ngai, 2018. "Customer reviews for demand distribution and sales nowcasting: a big data approach," Annals of Operations Research, Springer, vol. 270(1), pages 415-431, November.
    14. Pham, Hanh Song Thi & Petersen, Bent, 2021. "The bargaining power, value capture, and export performance of Vietnamese manufacturers in global value chains," International Business Review, Elsevier, vol. 30(6).
    15. Wafa Alwakid & Sebastian Aparicio & David Urbano, 2021. "The Influence of Green Entrepreneurship on Sustainable Development in Saudi Arabia: The Role of Formal Institutions," IJERPH, MDPI, vol. 18(10), pages 1-23, May.
    16. Gary Gereffi, 2020. "What does the COVID-19 pandemic teach us about global value chains? The case of medical supplies," Journal of International Business Policy, Palgrave Macmillan, vol. 3(3), pages 287-301, September.
    17. E. Denny, 2022. "Long-term Energy Cost Labelling for Appliances: Evidence from a Randomised Controlled Trial in Ireland," Journal of Consumer Policy, Springer, vol. 45(3), pages 369-409, September.
    18. Kentaka Aruga & Md. Monirul Islam & Yoshihiro Zenno & Arifa Jannat, 2022. "Developing Novel Technique for Investigating Guidelines and Frameworks: A Text Mining Comparison between International and Japanese Green Bonds," JRFM, MDPI, vol. 15(9), pages 1-17, August.
    19. Eric. W. K. See-To & Yang Yang, 2017. "Market sentiment dispersion and its effects on stock return and volatility," Electronic Markets, Springer;IIM University of St. Gallen, vol. 27(3), pages 283-296, August.
    20. Lenka Mynaříková & Lukáš Novotný, 2020. "Knowledge Society Failure? Barriers in the Use of ICTs and Further Teacher Education in the Czech Republic," Sustainability, MDPI, vol. 12(17), pages 1-19, August.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.