Printed from https://ideas.repec.org/p/wbk/wbrwps/9488.html

Applying Machine Learning and Geolocation Techniques to Social Media Data (Twitter) to Develop a Resource for Urban Planning

Author

Listed:
  • Milusheva, Svetoslava Petkova
  • Marty, Robert Andrew
  • Bedoya Arguelles, Guadalupe
  • Williams, Sarah Elizabeth
  • Resor, Elizabeth Landsdowne
  • Legovini, Arianna

Abstract

With all the recent attention focused on big data, it is easy to overlook that basic vital statistics remain difficult to obtain in most of the world. This project set out to test whether an openly available dataset (Twitter) could be transformed into a resource for urban planning and development. The hypothesis is tested by creating road traffic crash location data, which are scarce in most resource-poor environments but essential for addressing the number one cause of mortality for children over age five and young adults. The research project scraped 874,588 traffic-related tweets in Nairobi, Kenya, applied a machine learning model to capture the occurrence of a crash, and developed an improved geoparsing algorithm to identify its location. The project geolocated 32,991 crash reports in Twitter for 2012-20 and clustered them into 22,872 unique crashes to produce one of the first crash maps for Nairobi. A motorcycle delivery service was dispatched in real time to verify a subset of crashes, showing 92 percent accuracy. Using a spatial clustering algorithm, portions of the road network (less than 1 percent) were identified where 50 percent of the geolocated crashes occurred. Even with limitations in the representativeness of the data, the results can provide urban planners useful information to target road safety improvements where resources are limited.
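
The abstract states that geolocated crash reports were clustered into unique crashes but does not describe the rule used. As an illustration only, the Python sketch below merges reports that fall within a distance threshold and a time window of one another, using single-linkage grouping with a union-find structure. The function names, the 500 m radius, and the 4-hour window are hypothetical choices for this example and are not taken from the paper.

```python
import math
from datetime import datetime, timedelta


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def cluster_reports(reports, radius_m=500.0, window=timedelta(hours=4)):
    """Group crash reports that are close in both space and time.

    reports: list of (timestamp, lat, lon) tuples.
    radius_m, window: hypothetical thresholds, not values from the paper.
    Returns one integer cluster label per report, in input order.
    """
    parent = list(range(len(reports)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison is O(n^2); adequate for a sketch, slow at scale.
    for i in range(len(reports)):
        ti, lati, loni = reports[i]
        for j in range(i + 1, len(reports)):
            tj, latj, lonj = reports[j]
            if abs(ti - tj) <= window and haversine_m(lati, loni, latj, lonj) <= radius_m:
                parent[find(i)] = find(j)

    # Relabel roots as consecutive integers in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(reports))]


# Made-up example reports near central Nairobi.
example_reports = [
    (datetime(2019, 5, 1, 8, 0), -1.2864, 36.8172),
    (datetime(2019, 5, 1, 8, 10), -1.2870, 36.8175),  # near the first, 10 min later
    (datetime(2019, 5, 1, 8, 5), -1.3000, 36.9000),   # several km away
    (datetime(2019, 5, 2, 8, 0), -1.2864, 36.8172),   # same place, next day
]
example_labels = cluster_reports(example_reports)
```

In this made-up example the first two reports share a label, while the distant report and the next-day report each receive their own. Single linkage can chain many nearby reports into one large cluster, so a production version would likely need a stricter rule and a spatial index.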

Suggested Citation

  • Milusheva, Svetoslava Petkova & Marty, Robert Andrew & Bedoya Arguelles, Guadalupe & Williams, Sarah Elizabeth & Resor, Elizabeth Landsdowne & Legovini, Arianna, 2020. "Applying Machine Learning and Geolocation Techniques to Social Media Data (Twitter) to Develop a Resource for Urban Planning," Policy Research Working Paper Series 9488, The World Bank.
  • Handle: RePEc:wbk:wbrwps:9488

    Download full text from publisher

    File URL: http://documents.worldbank.org/curated/en/407261607111342557/pdf/Applying-Machine-Learning-and-Geolocation-Techniques-to-Social-Media-Data-Twitter-to-Develop-a-Resource-for-Urban-Planning.pdf
    Download Restriction: no

    More about this item

    Keywords

    ICT Applications; Disease Control & Prevention; Public Health Promotion; Road Safety; Intelligent Transport Systems; Transport Services; Crime and Society
