The Impact of Large Language Models on Open-source Innovation: Evidence from GitHub Copilot
Doron Yeverechyahu, Raveesh Mayya, Gal Oestreicher-Singer
arXiv - ECON - General Economics, arXiv:2409.08379, published 2024-09-12
Abstract
Generative AI (GenAI) has been shown to enhance individual productivity in a
guided setting. While it is also likely to transform processes in a
collaborative work setting, it is unclear what trajectory this transformation
will follow. A collaborative environment is characterized by a blend of
origination tasks, which involve building something from scratch, and iteration
tasks, which involve refining others' work. Whether, and to what extent, GenAI
affects these two aspects of collaborative work is an open empirical question.
We study this question within the open-source development landscape, a prime
example of collaborative innovation, where contributions are voluntary and
unguided. Specifically, we focus on the launch of GitHub Copilot in October
2021 and leverage a natural experiment in which GitHub Copilot (a
programming-focused LLM) selectively rolled out support for Python, but not for
R. We observe a significant jump in overall contributions, suggesting that
GenAI effectively augments collaborative innovation in an unguided setting.
Interestingly, Copilot's launch increased maintenance-related contributions,
which are mostly iterative tasks involving building on others' work,
significantly more than code-development contributions, which are mostly
origination tasks involving standalone contributions. This disparity was
exacerbated in active projects with extensive coding activity, raising concerns
that, as GenAI models improve to accommodate richer context, the gap between
origination and iterative solutions may widen. We discuss practical and policy
implications for incentivizing high-value innovative solutions.
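
The natural experiment described above compares a language that received Copilot support (Python) with one that did not (R), before and after the launch. A common way to read such a design is a difference-in-differences comparison. The abstract does not name the estimator, so this is an assumption rather than the authors' method. In the minimal sketch below, the function name `did_estimate` and all contribution counts are hypothetical, invented only to show the arithmetic; they are not data or results from the paper.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences on group means.

    Returns (change in treated group) minus (change in control group).
    The control group's change stands in for what would have happened
    to the treated group without the intervention.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical contributions per project-month (illustrative only,
# NOT taken from the paper).
python_pre = [10, 12, 11]   # Python projects before the launch
python_post = [15, 17, 16]  # Python projects after the launch
r_pre = [8, 9, 10]          # R projects before the launch
r_post = [9, 10, 11]        # R projects after the launch

effect = did_estimate(python_pre, python_post, r_pre, r_post)
print(f"Estimated effect on contributions: {effect}")
```

Under this reading, the same calculation could be run separately for maintenance-related and code-development contributions, and the two estimates compared to see which task type responded more strongly. A real analysis would also need controls, fixed effects, and a check that the two groups followed parallel trends before the launch, none of which this sketch attempts.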