IDEAS home Printed from https://ideas.repec.org/a/kap/jgeosy/v26y2024i4d10.1007_s10109-023-00435-8.html

Unveiling the impact of machine learning algorithms on the quality of online geocoding services: a case study using COVID-19 data

Author

Listed:
  • Batuhan Kilic

    (Yildiz Technical University)

  • Onur Can Bayrak

    (Yildiz Technical University)

  • Fatih Gülgen

    (Yildiz Technical University)

  • Mert Gurturk

    (Yildiz Technical University)

  • Perihan Abay

    (Kanuni Sultan Süleyman Research and Training Hospital)

Abstract

Addresses are a key component of everyday mobility. Global map platforms and location-based services use address data to pinpoint geographically referenced locations. Geocoding provided by online platforms is useful for spatially tracking reported cases and controls in the spatial analysis of infectious diseases such as COVID-19. The first and most critical phase of the geocoding process is address matching. However, because of typographical errors, inconsistent abbreviations, and incomplete or malformed addresses, matching can seldom be performed with 100% accuracy. The purpose of this research is to examine the ability of machine learning classifiers to measure the consistency of address matching results produced by online geocoding services, and to identify the best-performing classifier. The performance of seven machine learning classifiers was compared using several text similarity measures, which assess the match scores between the input address data and the services' output. The test data came from four distinct online geocoding services applied to 925 addresses in Türkiye. The findings show that the Random Forest classifier was the most accurate in the address matching procedure. While the results hold for similar datasets in Türkiye, additional research is required to determine whether they apply to data from other countries.
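The abstract does not name the text similarity measures it uses, so the following is only a minimal sketch of how match scores between an input address and a geocoder's returned address might be computed. The three measures chosen here (token Jaccard, normalized Levenshtein, and the `difflib` sequence ratio), the normalization rules, and the function names are all assumptions for illustration, not the authors' implementation. Scores like these could then serve as feature vectors for a classifier such as a Random Forest.

```python
from difflib import SequenceMatcher


def normalize(address: str) -> list[str]:
    """Lowercase, replace commas and periods with spaces, split into tokens.

    Assumed normalization; the paper's preprocessing is not described in the abstract.
    """
    cleaned = address.lower().replace(",", " ").replace(".", " ")
    return cleaned.split()


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the classic two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity_features(input_address: str, returned_address: str) -> dict[str, float]:
    """Return hypothetical match-score features, each in the range [0, 1]."""
    tokens_in = normalize(input_address)
    tokens_out = normalize(returned_address)
    text_in, text_out = " ".join(tokens_in), " ".join(tokens_out)

    set_in, set_out = set(tokens_in), set(tokens_out)
    union = set_in | set_out
    longest = max(len(text_in), len(text_out))

    return {
        # Share of distinct tokens common to both addresses.
        "jaccard": len(set_in & set_out) / len(union) if union else 1.0,
        # Edit distance rescaled so that 1.0 means identical strings.
        "levenshtein": 1.0 - levenshtein(text_in, text_out) / longest if longest else 1.0,
        # difflib's matching-block ratio over the normalized strings.
        "ratio": SequenceMatcher(None, text_in, text_out).ratio(),
    }


if __name__ == "__main__":
    # Made-up address pair, not taken from the study's data.
    scores = similarity_features(
        "Atatürk Cad. No 5, Istanbul",
        "Atatürk Caddesi No 5 Istanbul",
    )
    print(scores)
```

Keeping every score in [0, 1] makes the measures directly comparable across addresses of different lengths, which is convenient when they are combined into a single feature vector per address and service.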

Suggested Citation

  • Batuhan Kilic & Onur Can Bayrak & Fatih Gülgen & Mert Gurturk & Perihan Abay, 2024. "Unveiling the impact of machine learning algorithms on the quality of online geocoding services: a case study using COVID-19 data," Journal of Geographical Systems, Springer, vol. 26(4), pages 601-622, October.
  • Handle: RePEc:kap:jgeosy:v:26:y:2024:i:4:d:10.1007_s10109-023-00435-8
    DOI: 10.1007/s10109-023-00435-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10109-023-00435-8
    File Function: Abstract
    Download Restriction: Access to full text is restricted to subscribers.



    More about this item

    Keywords

    Address matching; COVID-19; Geocoding; Machine learning; Random forest;

    JEL classification:

    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • C52 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Evaluation, Validation, and Selection
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
    • I18 - Health, Education, and Welfare - - Health - - - Government Policy; Regulation; Public Health


