Last year, we launched a technical preview of GitHub Copilot, a new AI pair programmer that plugs into your editor and offers coding suggestions in real time. Although we offered only a limited number of seats, people who started using GitHub Copilot told us it became an indispensable part of their daily workflows.
Now, GitHub Copilot is generally available to all developers. The feedback we have heard, and continue to hear, supports our core thesis: AI can help make developers more productive and happier while coding. Even so, we wanted to test our theory and see if GitHub Copilot itself actually leads to higher productivity among developers.
To find out, our research and engineering teams partnered to combine qualitative survey data from more than 2,000 U.S.-based developers with anonymized usage data. The goal was to determine whether developers feel that GitHub Copilot makes them more productive, and whether the data shows they actually are more productive when using it.
This is the first of several studies we’re doing around GitHub Copilot, and the early results are promising. Let’s dive in.
If you pair-program with a friend or colleague, does that make you more productive? Most people agree that even if a friend’s suggestions aren’t perfect, working with someone else typically helps you reach your coding goals faster, produce better end products, and learn something new while doing it. Academic researchers have also found evidence that pair programming improves productivity [1, 2].
In contrast, if you try to solve a math problem with a calculator that often gives wrong answers, would you find that useful? Probably not. The difference is that what we value most in a calculator is precision. Not many people turn to a calculator for inspiration.
In a sense, GitHub Copilot is a bit like a pair programmer with a calculator attached. It’s really good at the fiddly stuff, and I can trust it to close all my brackets in the right order—which comes in handy.
But recently, I was on a flight without internet—and consequently I was left without GitHub Copilot. What I missed about it wasn’t its precision at closing brackets, but its larger flashes of insight. For example, suggestions of whole patterns or pre-populated boilerplate I only had to adapt slightly. Or valiant attempts at expressions that weren’t yet exactly what I wanted, but helped get me started.
We built GitHub Copilot to help make developers happier and more productive by keeping them focused on what matters most: building great software.
But the word “productivity” in development covers a wide range of practical meanings. Do developers ideally want to save keystrokes or avoid searches on Google and Stack Overflow? Should GitHub Copilot help them stay in the flow by giving them highly accurate solutions on mechanical, calculator-like tasks? Or should it inspire them with speculative stubs that might help unblock them when they’re stuck?
We’re in pretty uncharted territory with GitHub Copilot, so the first thing to do was to ask people through a survey. Then, we checked their answers against anonymized user data to determine whether their sense that GitHub Copilot boosted their productivity was reflected in how they were actually using it.
In total, we surveyed more than 2,000 U.S.-based developers and compared their answers with user data from the same time period. We focused on answering three questions:
- Do people feel like GitHub Copilot makes them more productive?
- Is that feeling reflected in any objective usage measurements?
- Which usage measurements best reflect that feeling?
As part of the team that developed GitHub Copilot, I found it incredibly gratifying to hear survey respondents describe how GitHub Copilot empowers them in a multitude of ways. We also discovered a strong connection to our objective usage data. For example, we counted the number of characters contributed by GitHub Copilot, the number of retained suggestions, and how often GitHub Copilot made suggestions in the first place. All of these measurements correlated with reported usefulness and improved productivity.
Yet we got the strongest connection by simply dividing the number of accepted suggestions by the number of shown suggestions. This acceptance rate captures how many of the code suggestions GitHub Copilot produces are deemed promising enough to accept.
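As a rough illustration of that ratio, here is a minimal Python sketch. The `UsageCounts` class, its field names, and the example numbers are assumptions made up for this example; they are not taken from our telemetry pipeline or from the study data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageCounts:
    """Hypothetical per-developer counters (illustrative names only)."""
    shown: int      # suggestions displayed to the developer
    accepted: int   # suggestions the developer accepted


def acceptance_rate(counts: UsageCounts) -> Optional[float]:
    """Return accepted / shown, or None when nothing was shown."""
    if counts.shown == 0:
        # Avoid dividing by zero for developers who saw no suggestions.
        return None
    return counts.accepted / counts.shown


# Invented numbers, for illustration only.
example = UsageCounts(shown=200, accepted=54)
print(acceptance_rate(example))  # 0.27
```

Returning `None` for developers with no shown suggestions is one possible design choice; it keeps them from being counted as having a 0% acceptance rate.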
Developers who report the highest productivity gains with GitHub Copilot also accept the largest share of shown code suggestions.
When we sorted users into quartiles according to how useful they reported GitHub Copilot to be, there was a stark difference between the groups: the acceptance rate of completions was much higher for those who had reported the biggest productivity gains.
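The quartile comparison can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the analysis code from the study: the function name, the 1-to-5 usefulness scores, and every data point below are invented for the example.

```python
from statistics import mean


def mean_rate_by_quartile(users):
    """Group users into quartiles by reported usefulness.

    `users` is a list of (reported_usefulness, acceptance_rate) pairs.
    Returns four mean acceptance rates, lowest-reported quartile first.
    """
    if len(users) < 4:
        raise ValueError("need at least four users to form quartiles")
    # Rank by the survey score; ties keep their input order (stable sort).
    ranked = sorted(users, key=lambda user: user[0])
    n = len(ranked)
    means = []
    for q in range(4):
        start, end = q * n // 4, (q + 1) * n // 4
        means.append(mean([rate for _, rate in ranked[start:end]]))
    return means


# Invented data: (survey score from 1 to 5, acceptance rate).
users = [
    (1, 0.10), (2, 0.14), (3, 0.20), (3, 0.22),
    (4, 0.25), (4, 0.27), (5, 0.30), (5, 0.34),
]
for quartile, rate in enumerate(mean_rate_by_quartile(users), start=1):
    print(f"Quartile {quartile}: mean acceptance rate {rate:.2f}")
```

With real data, users who share a survey score can land on either side of a quartile boundary, so a more careful analysis would need an explicit rule for ties.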
We found developers didn’t care that much if they needed to rework the suggestion, as long as GitHub Copilot gave them a suitable starting point.
And this makes sense: GitHub Copilot isn’t designed to build software by itself. It’s designed to offer helpful suggestions that make it easier to stay in the flow. In other words, GitHub Copilot offers developers the parts but leaves it up to them to assemble and design the finished product.
We’ve written an academic research paper that presents these findings along with general background on the code suggestion acceptance rates we’re seeing among people who use GitHub Copilot. Have a look for a deeper and more systematic dive into topics like retention, language differences, and weekend coding. We presented this paper at PLDI’s MAPS ’22.
But everyone writes code differently, so how will our findings apply to you? Try out GitHub Copilot today, and let us know what benefits you discover.