IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2409.08379.html
   My bibliography  Save this paper

The Impact of Large Language Models on Open-source Innovation: Evidence from GitHub Copilot

Author

Listed:
  • Doron Yeverechyahu
  • Raveesh Mayya
  • Gal Oestreicher-Singer

Abstract

Generative AI (GenAI) has been shown to enhance individual productivity in a guided setting. While it is also likely to transform processes in a collaborative work setting, it is unclear what trajectory this transformation will follow. Collaborative environment is characterized by a blend of origination tasks that involve building something from scratch and iteration tasks that involve refining on others' work. Whether GenAI affects these two aspects of collaborative work and to what extent is an open empirical question. We study this question within the open-source development landscape, a prime example of collaborative innovation, where contributions are voluntary and unguided. Specifically, we focus on the launch of GitHub Copilot in October 2021 and leverage a natural experiment in which GitHub Copilot (a programming-focused LLM) selectively rolled out support for Python, but not for R. We observe a significant jump in overall contributions, suggesting that GenAI effectively augments collaborative innovation in an unguided setting. Interestingly, Copilot's launch increased maintenance-related contributions, which are mostly iterative tasks involving building on others' work, significantly more than code-development contributions, which are mostly origination tasks involving standalone contributions. This disparity was exacerbated in active projects with extensive coding activity, raising concerns that, as GenAI models improve to accommodate richer context, the gap between origination and iterative solutions may widen. We discuss practical and policy implications to incentivize high-value innovative solutions.

Suggested Citation

  • Doron Yeverechyahu & Raveesh Mayya & Gal Oestreicher-Singer, 2024. "The Impact of Large Language Models on Open-source Innovation: Evidence from GitHub Copilot," Papers 2409.08379, arXiv.org.
  • Handle: RePEc:arx:papers:2409.08379
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2409.08379
    File Function: Latest version
    Download Restriction: no
    ---><---

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2409.08379. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.