Printed from https://ideas.repec.org/a/gam/jijerp/v19y2022i7p4272-d786225.html

Optimizing the Retrieval of the Vital Status of Cancer Patients for Health Data Warehouses by Using Open Government Data in France

Author

Listed:
  • Olivier Lauzanne

    (Analytics Department & Data Factory, Institut de Cancérologie de l’Ouest, F-44805 Nantes-Angers, France)

  • Jean-Sébastien Frenel

    (Oncology Department, Institut de Cancérologie de l’Ouest, F-44805 Nantes-Angers, France
    Center for Research in Cancerology and Immunology Nantes-Angers, INSERM UMR 1232, Nantes University and Angers University, F-44307 Nantes-Angers, France)

  • Mustapha Baziz

    (Analytics Department & Data Factory, Institut de Cancérologie de l’Ouest, F-44805 Nantes-Angers, France)

  • Mario Campone

    (Oncology Department, Institut de Cancérologie de l’Ouest, F-44805 Nantes-Angers, France
    Center for Research in Cancerology and Immunology Nantes-Angers, INSERM UMR 1232, Nantes University and Angers University, F-44307 Nantes-Angers, France)

  • Judith Raimbourg

    (Oncology Department, Institut de Cancérologie de l’Ouest, F-44805 Nantes-Angers, France
    Center for Research in Cancerology and Immunology Nantes-Angers, INSERM UMR 1232, Nantes University and Angers University, F-44307 Nantes-Angers, France)

  • François Bocquet

    (Analytics Department & Data Factory, Institut de Cancérologie de l’Ouest, F-44805 Nantes-Angers, France
    The Law and Social Change (DCS) Laboratory, UMR CNRS 6297, F-40000 Nantes, France)

Abstract

Electronic Medical Records (EMR) and Electronic Health Records (EHR) often lack critical information about a patient's death, even though the date of death is essential in oncology research for assessing survival outcomes, particularly when evaluating the efficacy of new therapeutic approaches. We used French open government data covering 1970 to September 2021 to identify deceased patients and matched them with patient data collected between January 2015 and November 2021 from the data warehouse of the Institut de Cancérologie de l’Ouest (ICO; Integrated Center of Oncology, the third largest cancer center in France). To this end, we evaluated two algorithms for deterministic record linkage: an exact matching algorithm and a fuzzy matching algorithm. Because we lacked reference data, we assessed the algorithms by estimating the number of homonyms that could lead to false links, using the same open dataset of deceased persons in France. The exact matching algorithm doubled the number of dates of death in the ICO data warehouse, and the fuzzy matching algorithm tripled it. The homonym study indicated a low risk of misidentification, with precision values of 99.96% for exact matching and 99.68% for fuzzy matching. However, estimating the number of false negatives proved more difficult than anticipated. Nevertheless, open government data can be a valuable way to improve the completeness of the date-of-death variable for oncology patients in data warehouses.
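
The two linkage strategies named in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the record fields (`last`, `first`, `birth`, `death`), the accent-stripping normalization, the use of `difflib.SequenceMatcher` as the similarity measure, the 0.85 threshold, and the rule of keeping only unambiguous links. The sample records are invented.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Strip accents and non-letters, then uppercase (assumed normalization)."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(c for c in decomposed if c.isalpha()).upper()


def exact_match(patient: dict, death: dict) -> bool:
    """Exact rule: normalized names and birth date must all be identical."""
    return (
        normalize(patient["last"]) == normalize(death["last"])
        and normalize(patient["first"]) == normalize(death["first"])
        and patient["birth"] == death["birth"]
    )


def fuzzy_match(patient: dict, death: dict, threshold: float = 0.85) -> bool:
    """Fuzzy rule (hypothetical): same birth date, names similar enough."""
    if patient["birth"] != death["birth"]:
        return False
    a = normalize(patient["last"]) + " " + normalize(patient["first"])
    b = normalize(death["last"]) + " " + normalize(death["first"])
    return SequenceMatcher(None, a, b).ratio() >= threshold


def link(patients: list, deaths: list, matcher) -> dict:
    """Map patient id to date of death, keeping only unambiguous links."""
    links = {}
    for p in patients:
        candidates = [d for d in deaths if matcher(p, d)]
        if len(candidates) == 1:  # several candidates would suggest homonyms
            links[p["id"]] = candidates[0]["death"]
    return links


def precision(true_links: int, false_links: int) -> float:
    """Share of produced links that are correct: TP / (TP + FP)."""
    return true_links / (true_links + false_links)


# Invented sample records, for illustration only.
PATIENTS = [
    {"id": 1, "last": "Dupont", "first": "Hélène", "birth": "1950-03-12"},
    {"id": 2, "last": "Martin", "first": "Jean", "birth": "1948-07-01"},
    {"id": 3, "last": "Bernard", "first": "Luc", "birth": "1960-01-01"},
]
DEATHS = [
    {"last": "DUPONT", "first": "HELENE", "birth": "1950-03-12", "death": "2020-05-01"},
    {"last": "MARTEN", "first": "JEAN", "birth": "1948-07-01", "death": "2019-11-23"},
    {"last": "BERNARD", "first": "LUC", "birth": "1961-01-01", "death": "2018-02-14"},
]

exact_links = link(PATIENTS, DEATHS, exact_match)
fuzzy_links = link(PATIENTS, DEATHS, fuzzy_match)
```

In this toy data the exact rule links only patient 1, while the fuzzy rule also links patient 2 despite the misspelled surname; patient 3 is linked by neither rule because the birth dates differ. A looser rule retrieves more dates of death at the cost of more potential false links, which is the trade-off the reported precision values quantify.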

Suggested Citation

  • Olivier Lauzanne & Jean-Sébastien Frenel & Mustapha Baziz & Mario Campone & Judith Raimbourg & François Bocquet, 2022. "Optimizing the Retrieval of the Vital Status of Cancer Patients for Health Data Warehouses by Using Open Government Data in France," IJERPH, MDPI, vol. 19(7), pages 1-12, April.
  • Handle: RePEc:gam:jijerp:v:19:y:2022:i:7:p:4272-:d:786225

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/19/7/4272/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/19/7/4272/
    Download Restriction: no

    Citations

    Cited by:

    1. François Bocquet & Mario Campone & Marc Cuggia, 2022. "The Challenges of Implementing Comprehensive Clinical Data Warehouses in Hospitals," IJERPH, MDPI, vol. 19(12), pages 1-6, June.
    2. François Bocquet & Judith Raimbourg & Frédéric Bigot & Victor Simmet & Mario Campone & Jean-Sébastien Frenel, 2023. "Opportunities and Obstacles to the Development of Health Data Warehouses in Hospitals in France: The Recent Experience of Comprehensive Cancer Centers," IJERPH, MDPI, vol. 20(2), pages 1-13, January.
