
Longa: An automated speech recognition tool for Bantu languages

Author

Listed:
  • Mganga, Nelson
  • Jones-Garcia, Eliot
  • Monsalue, Andrea Gardeazabal
  • Koo, Jawoo

Abstract

Farm Radio International (FRI) and the CGIAR Research Initiative on Digital Innovation have collaborated on the development of an end-to-end automatic speech recognition pipeline for the transcription, translation, and analysis of Swahili and Luganda. This task is particularly challenging due to the number of languages used by FRI's clients and the limited training data available for speech recognition in African languages. The tool is named 'Longa', or 'Let's chat' in Swahili. Longa will be used to answer the surplus of phone calls from smallholder farmers asking questions about radio programs, which FRI does not presently have the capacity to address. When fully implemented, Longa should allow FRI to design its broadcasts more closely in line with the needs of farmers and better deliver insights to those most in need, such as female and youth farmers. Key results from the collaboration include: (1) a series of design principles, developed iteratively and collaboratively to reflect the common values and goals of FRI and the CGIAR; (2) a proof of concept for Longa, building on open-source models and open-access corpora, to be shared with the developer community upon completion of the final tool; (3) a 10% improvement upon the state-of-the-art automatic speech recognition in Luganda radio-speech performance and accuracy; (4) some improvement in performance with audio enhancement processes using real-world data; and (5) proof that fine-tuning is an effective approach to expanding Longa to new languages. The next steps of the collaboration will focus on the analysis and interpretation of an aggregation of farmer phone calls and on integration with the existing FRI workflow and software.

Suggested Citation

  • Mganga, Nelson & Jones-Garcia, Eliot & Monsalue, Andrea Gardeazabal & Koo, Jawoo, 2024. "Longa: An automated speech recognition tool for Bantu languages," CGIAR Initiative Publications Digital Innovation, International Food Policy Research Institute (IFPRI).
  • Handle: RePEc:fpr:cgiarp:137177

    Download full text from publisher

    File URL: https://cgspace.cgiar.org/bitstreams/f657054f-dcf6-4228-bccd-78f9a3f99b28/download
    Download Restriction: no
