That’s a wrap: GitHub Innovation Graph in 2024

Discover the latest trends and insights on public software development activity on GitHub with the release of Q2 & Q3 2024 data for the Innovation Graph.


This is our first GitHub Innovation Graph data release in 2025 and our first data release after celebrating the Innovation Graph’s first birthday, so we’d like to reflect a bit on how things have gone so far and share our hopes and dreams for the future.

Innovation Graph: a look back

We created the Innovation Graph to help make GitHub data more easily available to researchers, policymakers, and developers. Unfortunately, no one’s made a Dataset Success Graph yet to make dataset success metrics more easily available, so in lieu of that, here’s a chart of the github/innovationgraph repo’s stars, annotated with open source-related events to help explain the increases over time:

A line chart of the cumulative star count of the github/innovationgraph repo over time, from September 2023 through January 2025. The number of stars increased rapidly in the two weeks following its launch, then grew steadily (but more slowly) for the subsequent year and a quarter. The line chart is annotated with various events, including conferences, data releases, academic paper publications, and news articles.

The Innovation Graph might not get VC funding anytime soon with these rookie numbers, but we’re heartened to see the steady growth over the past year and a quarter, and we’re hopeful that this imperfect proxy is at least somewhat correlated with usefulness. Let’s dive into each event category below.

Innovation Graph data releases

We’ve released five quarters’ worth of additional data since the launch of the Innovation Graph (mostly like clockwork; thanks for your patience with this latest release, as it’s been a busy couple of quarters!). With each release, we’ve seen modest increases in star count. Taken together with the results of a recently published working paper on the association between semantic versioning and adoption of software packages, we have to wonder if we’re leaving stars on the table by only bumping the patch version with each dataset release, as new major releases are apparently associated with significantly greater growth in adoption than patch (or minor) releases.

Academic papers

Speaking of papers, here’s a roundup of some that caught our attention because they lined up with research questions about the open source ecosystem that were on our wishlist, or because they grappled with the impact of AI on software production, or both.

The Value of Open Source Software

The demand-side value of open source software was estimated to be $8.8 trillion.

Generative AI and the Nature of Work

Access to GitHub Copilot was found to cause open source maintainers to do proportionally more coding work and less project management work, as well as explore more lucrative programming languages. Check out our researcher Q&A with a couple of the co-authors for more details.

  • Hoffmann, Manuel and Boysel, Sam and Nagle, Frank and Peng, Sida and Xu, Kevin, “Generative AI and the Nature of Work” (October 27, 2024). Harvard Business School Strategy Unit Working Paper No. 25-021, Harvard Business Working Paper No. 25-021, CESifo Working Paper Series No. 11479, Available at SSRN: https://ssrn.com/abstract=5007084 or http://dx.doi.org/10.2139/ssrn.5007084

Measuring Software Innovation with Open Source Software Development Data

Major releases of OSS packages were found to count as a unit of innovation complementary to scientific publications, patents, and standards, offering applications for policymakers, managers, and researchers.

  • Brown, Eva Maxfield, et al. “Measuring Software Innovation with Open Source Software Development Data.” arXiv preprint arXiv:2411.05087 (2024).

Open Source Software Policy in Industry Equilibrium

Global and domestic impacts of China and US government restrictions on, disincentives against, and subsidies for open source software contribution were estimated using simulations. Restrictions were found to be ineffective at increasing domestic investment into OSS; disincentives were found to increase costs both domestically and globally; and subsidies were found to both increase domestic investment and decrease costs globally.

  • Gortmaker, Jeff. “Open Source Software Policy in Industry Equilibrium.” Working Paper (2024).

Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data

Availability of ChatGPT was found to cause an increase in Git pushes per 100,000 inhabitants and increases in developer engagement across high-level languages like Python and JavaScript, while the effects on domain-specific languages like HTML and SQL varied. Learn more through our researcher Q&A with a couple of the co-authors.

  • Quispe, Alexander, and Rodrigo Grijalba. “Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data.” arXiv preprint arXiv:2406.11046 (2024).

From GitHub to GDP: A framework for measuring open source software innovation

A methodology was developed to generate estimates of investment in OSS that are consistent with the U.S. national accounting methods used for measuring software investment. U.S. investment in OSS in 2019 was estimated to be $37.8 billion with a current-cost net stock of $74.3 billion.

  • Gizem Korkmaz, J. Bayoán Santiago Calderón, Brandon L. Kramer, Ledia Guci, Carol A. Robbins, “From GitHub to GDP: A framework for measuring open source software innovation,” Research Policy, Volume 53, Issue 3, 2024, 104954, ISSN 0048-7333, https://doi.org/10.1016/j.respol.2024.104954.

Conferences

There were too many conferences to list, but we know that the Innovation Graph and other datasets on GitHub activity made a splash in presentations at the following conferences (in part because many of the above papers were presented at these conferences):

News publications

There were also too many news articles to list, but we were happy to see The Economist and Rest of World using Innovation Graph data, and we’ll consider adding year-over-year percentage growth charts to the Innovation Graph site to save data journalists some time.
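
In the meantime, year-over-year growth is straightforward to compute by hand. Here is a minimal Python sketch, assuming quarterly values keyed by (year, quarter); the `yoy_growth` helper, the `git_pushes` name, and the numbers are hypothetical and for illustration only, not actual Innovation Graph data or its file layout:

```python
def yoy_growth(series):
    """Percentage change of each quarter versus the same quarter one year earlier.

    `series` maps (year, quarter) -> value. Quarters with no prior-year
    baseline (or a zero baseline) are skipped rather than guessed.
    """
    growth = {}
    for (year, quarter), value in series.items():
        previous = series.get((year - 1, quarter))
        if previous:
            growth[(year, quarter)] = (value - previous) * 100 / previous
    return growth


# Made-up numbers for illustration; not real Innovation Graph figures.
git_pushes = {
    (2023, 2): 1000,
    (2023, 3): 1200,
    (2024, 2): 1100,
    (2024, 3): 1500,
}

growth = yoy_growth(git_pushes)
for (year, quarter), pct in sorted(growth.items()):
    print(f"{year} Q{quarter}: {pct:+.1f}% year over year")
```

Comparing each quarter with the same quarter a year earlier, rather than with the previous quarter, keeps seasonal swings from showing up as growth.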

Reports

We’re glad to have had the continued opportunity to contribute to the 2023 and 2024 editions of the WIPO Global Innovation Index and similarly, to the 2023 and 2024 Stanford AI Index Reports. And of course, we’re thrilled with the excitement and energy that the annual State of the Octoverse report generates around GitHub data each year, with its narrative structure that helps contextualize a wide array of topics.

This year, to complement macro-level reports that use large-scale aggregate data like those above, we also supported two surveys to shed light on the composition, motivations, and perspectives of the open source community and its funders: the 2024 edition of the Open Source Survey and the inaugural Open Source Software Funding Survey.

Our hopes and dreams for the future

In a word: more. 2024 was our first full year of the Innovation Graph and it’s been nothing short of delightful to meet new collaborators, reconnect with existing ones, and continue building an evidence base to demonstrate the impact of open source. As we kick off 2025, we look forward to even more collaboration to come!
