Research: Quantifying GitHub Copilot’s impact on code quality

Findings show that code quality is better across the board and developers felt more confident, too.

Mario Rodriguez

October 10, 2023


Today, we’re releasing new research on GitHub Copilot Chat. Using natural language, developers in our study turned to GitHub Copilot Chat for real-time guidance, tips, troubleshooting, remediation, and solutions tailored to their specific coding challenges, all without leaving the IDE.

Our research found that the quality of the code authored and reviewed was better across the board with GitHub Copilot Chat enabled, even though none of the developers had used the feature before.

85% of developers felt more confident in their code quality when authoring code with GitHub Copilot and GitHub Copilot Chat.
Code reviews were more actionable and completed 15% faster with GitHub Copilot Chat.
88% of developers reported maintaining flow state with GitHub Copilot Chat because they felt more focused, less frustrated, and enjoyed coding more, too.

Last year, our research uncovered that developers using GitHub Copilot code 55% faster. But working fast is just one part of the picture—in many cases, there has traditionally been a tradeoff between doing something quickly and doing something right. As artificial intelligence continues to write code for an increasing number of developers, ensuring good code quality is even more important.

What does high-quality code look like?

To measure code quality, we developed a rubric of five metrics that are used internally at GitHub and that also align with academic¹ and industry² standards. Participants used the metrics to differentiate between strong code and code that slows them down.

Readable

Does the code follow the language’s idioms and naming patterns? Code that is difficult to read makes it more challenging to maintain, improve, and document.

Reusable

Is the code written so that it can be reused? Code reuse is a cornerstone of developer collaboration. It saves time and energy, breaks down silos, and creates consistency across a codebase.

Concise

Does the code adhere to DRY (don’t repeat yourself)? The less repetitive the code is, the easier it’ll be to read, understand, and build upon. Complex code can lead to bugs and issues that will be tough to remediate.
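As a small illustration of DRY, here is a minimal Python sketch. It is not taken from the study; the function names and validation rules are hypothetical. The same "strip and reject empty values" check is written once in a helper and reused for each field rather than repeated.

```python
# Hypothetical example: names and rules are illustrative, not from the study.


def _require_non_empty(value: str, field: str) -> str:
    """Shared check, written once and reused for every field."""
    cleaned = value.strip()
    if not cleaned:
        raise ValueError(f"{field} must not be empty")
    return cleaned


def normalize_user(name: str, email: str) -> dict[str, str]:
    """Build a normalized record by reusing the single validation helper."""
    return {
        "name": _require_non_empty(name, "name"),
        "email": _require_non_empty(email, "email").lower(),
    }
```

If the rule changes later (say, a maximum length), it only needs to change in one place.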

Maintainable

Is the code written in a way that makes the functionality clear, transparent, and relevant to the problem at hand? Well-maintained code means that developers minimize dependencies. Maintainable code also affects developers’ ability to search for and reuse code.

Resilient

Does the code anticipate and handle errors? Resilient code will maintain its functionality (or at least have minimal disruption) if there are errors. This goes a long way toward ensuring that code will, simply put, work.
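To make the idea of resilience concrete, here is a minimal Python sketch. It is an illustration only; the settings format, the `DEFAULT_SETTINGS` values, and the `load_settings` name are assumptions, not part of the study. The function anticipates malformed or unexpected input and falls back to defaults instead of failing.

```python
import json

# Hypothetical defaults, used whenever the input cannot be trusted.
DEFAULT_SETTINGS = {"retries": 3}


def load_settings(raw: str) -> dict:
    """Parse JSON settings, falling back to defaults when the input is bad."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Malformed input: keep working with known-good defaults.
        return dict(DEFAULT_SETTINGS)
    if not isinstance(parsed, dict):
        # Valid JSON but the wrong shape (for example a list or a number).
        return dict(DEFAULT_SETTINGS)
    # Merge so that keys missing from the input still get default values.
    return {**DEFAULT_SETTINGS, **parsed}
```

Whether silently falling back is the right choice depends on the application; in some cases logging or surfacing the error would be more appropriate.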

Using GitHub Copilot correlates with better code quality

In this study, we investigated whether GitHub Copilot and its chatbot functionalities would improve perceived quality of the code produced, reduce time required to review the code, and produce code that passes unit testing. And by every measure, developers felt their coding improved when using GitHub Copilot.

85% of developers felt more confident in their code quality when authoring code with GitHub Copilot and Copilot Chat

GitHub Copilot Chat is a chat interface that lets you interact with GitHub Copilot to ask coding-related questions and receive answers directly within a supported IDE. The chat interface provides access to coding information and support without requiring you to navigate documentation or search online forums. Copilot Chat is currently supported in Visual Studio Code and Visual Studio.

Here’s how to enable GitHub Copilot Chat.

Overall, developers told us they felt more confident because coding is easier, more error-free, more readable, more reusable, more concise, more maintainable, and more resilient with GitHub Copilot and GitHub Copilot Chat than when they’re coding without it.

The code was so clean, I could just look at the code, know what was going on, and bring it over fairly easily to my codebase. So, I felt the code quality was very clean and easy to understand. When I was doing the pull request review using Copilot, I saw how good it was at generating proper error handling code.

- Senior Software Engineer at a Fortune 500 company (study participant)

Code reviews were more actionable and completed 15% faster than without GitHub Copilot Chat (that’s first-time users, too!)

This is the part where we get to talk about quality and speed, because yes, it’s possible to have both.

Developers noted that using GitHub Copilot Chat for code reviews improved the quality of their code (when compared to doing code review without it). Those code reviews were 15% faster with GitHub Copilot Chat. A higher percentage of comments were accepted, too. In fact, almost 70% of participants accepted comments from reviewers using GitHub Copilot Chat.

These results show the impact GitHub Copilot Chat has on collaboration, and underscore the potential impact of scaling it across larger engineering teams in bigger organizations. Reducing time spent on pull requests and code reviews means developers can focus on higher-priority changes. And better quality code from the start means less code that needs to be rolled back later or put through additional testing.

One of the more actionable comments I received was related to conceptual understanding of my code–I want other developers to be able to understand what I'm doing, so this was good feedback.

- Software Engineer (study participant)

88% of developers reported maintaining flow state with GitHub Copilot Chat because they felt more focused, less frustrated, and enjoyed coding more, too

Last year’s research found 60-75% of developers using GitHub Copilot reported feeling more fulfilled in their job, less frustrated when coding, and better positioned to focus on more satisfying work. In this year’s study, 88% of participants similarly felt less frustrated and more focused. One reason is that staying in the IDE means less time spent searching and more time staying in that coveted focused flow state.

Because I’ve had industry experience for over a decade, I usually prefer coding with a notepad or a simplistic platform and Copilot Chat was a really useful tool for checking basic things without me needing to go to Google or Stack Overflow. And Copilot has a pretty neat syntax edition.

- Senior Systems Validation Engineer (study participant)

How we set up the study

In the study, the goal was to simulate the process of authoring code in a controlled setting, having the code reviewed, and incorporating the changes suggested in the code review. So, each participant was asked to author code, review code, and then review the suggestions from code review and incorporate changes.

We recruited 36 participants with between five and 10 years of software development experience. In the study, the participants authored and reviewed code both with and without GitHub Copilot Chat. (Participants had some experience using GitHub Copilot and no experience using GitHub Copilot Chat.)

Participants were asked to author API endpoints for an HTTP service that creates, reads, and deletes objects. They were randomly assigned to use GitHub Copilot Chat for one part of the task (either the create endpoint or the read and delete endpoints) and to work without it for the other. Before using GitHub Copilot Chat, developers were shown a brief video on its functionality. The participants created one pull request for their work on the create API endpoint, and another for the read and delete portion.
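For readers who want a feel for the kind of task involved, here is a minimal Python sketch of create, read, and delete operations. The study does not specify the language, framework, or data model participants used, so everything here is an assumption: the `ObjectService` class, the in-memory storage, and the HTTP-style status codes are hypothetical stand-ins for real endpoints.

```python
import itertools


class ObjectService:
    """Hypothetical in-memory stand-in for a create/read/delete HTTP service."""

    def __init__(self) -> None:
        self._objects: dict[int, dict] = {}
        self._ids = itertools.count(1)  # simple incrementing object IDs

    def create(self, payload: dict) -> tuple[int, dict]:
        """Analogue of a POST endpoint: store the payload and return its ID."""
        if not isinstance(payload, dict):
            return 400, {"error": "payload must be a JSON object"}
        object_id = next(self._ids)
        self._objects[object_id] = dict(payload)
        return 201, {**payload, "id": object_id}

    def read(self, object_id: int) -> tuple[int, dict]:
        """Analogue of a GET endpoint: return the stored object or a 404."""
        stored = self._objects.get(object_id)
        if stored is None:
            return 404, {"error": "not found"}
        return 200, {**stored, "id": object_id}

    def delete(self, object_id: int) -> tuple[int, dict]:
        """Analogue of a DELETE endpoint: remove the object or return a 404."""
        if self._objects.pop(object_id, None) is None:
            return 404, {"error": "not found"}
        return 204, {}
```

A short usage example: `ObjectService().create({"name": "widget"})` returns a 201 status with the stored object and its new ID, and reading or deleting an unknown ID returns a 404 rather than raising an exception.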

After authoring the code for the API endpoints, the participants compared how using GitHub Copilot Chat impacted the quality of the code they wrote. Specifically, they were asked whether the task was easier to complete, whether the code had fewer errors, and whether the code was more readable, reusable, concise, maintainable, and resilient.

After the session, developers were assigned the two pull requests that another participant in the study had authored. They were blind to which pull request had been authored with Copilot and which without, and were asked to review each one and suggest how the code could be improved. They then rated the process of conducting the review with and without GitHub Copilot Chat, and rated the quality of the code using the rubric above, measuring whether the code was readable, reusable, and well-architected.

After another participant had reviewed their code, the participants who originally authored it went through the comments on their pull requests to decide which were helpful in improving the quality of the code and how actionable the comments were. Again, these participants were blind to which pull request had been reviewed with Copilot Chat and which hadn’t.

The promise of GitHub Copilot Chat: better quality code, faster

We know there’s a difference between doing something fast and doing something well. With GitHub Copilot Chat, it turns out you can have both.

We built GitHub Copilot and GitHub Copilot Chat to improve the lives of developers by helping them focus, keeping them in the flow, and empowering them to find more joy in their work. The results have shown that these AI tools are doing that and more—we look forward to building what’s next.

Acknowledgements

We are very grateful to all the developers who participated in this study. We always love hearing how we can make GitHub better for you! GitHub Customer Research conducted this research with help and consultation from GitHub Next.

¹ Börstler, J., Bennin, K.E., Hooshangi, S. et al. Developers talking about code quality. Empir Software Eng 28, 128 (2023). https://doi.org/10.1007/s10664-023-10381-0
² Ghani, U. (2023, September 18). 5 code review best practices. Work Life by Atlassian. https://www.atlassian.com/blog/add-ons/code-review-best-practices
