How to generate unit tests with GitHub Copilot: Tips and examples
Learn how to generate unit tests with GitHub Copilot and get specific examples, a tutorial, and best practices.
Developers writing enough unit tests? Sure, and my code never has bugs on a Friday afternoon.
Whether you’re an early-career developer or a seasoned professional, writing tests—or writing enough tests—is a challenge. That’s especially true with unit tests, which help developers catch bugs early, validate code, aid with refactoring, improve code quality, and play a core role in Test-Driven Development (TDD).
All of this to say, you can save a lot of time (and write better, more robust code) by automating your test generation—and AI coding tools are making that easier and quicker than ever.
GitHub Copilot, GitHub’s AI-powered coding assistant, helps generate test cases on the fly and can save you time. I’ll be honest: I heavily rely on GitHub Copilot to generate tests in my own workflows—but I still manually write a number of them to help formulate my thoughts.
In this article, I’ll walk you through why unit tests are essential, how GitHub Copilot can assist with generating unit tests, and practical tips for getting the most from Copilot’s test generation capabilities. We’ll also dive into specific examples across languages and frameworks so you can get started with using Copilot to generate unit tests.
Oh, and if you’re curious, I used Anthropic’s Claude model to generate the unit test examples you’ll find later in this article (in case you missed it, GitHub Copilot offers support for Anthropic’s Claude, Google’s Gemini, and OpenAI’s o1 models).
Let’s jump in.
Why unit tests matter (and what differentiates good unit tests from bad ones)
If you already know all of this, feel free to skip past this section—but just in case you don’t, unit tests are fundamental to creating reliable, maintainable software. When you’re writing code, testing individual units, such as functions or classes, can help you ensure each component works as expected. This improves the codebase’s integrity, simplifies debugging, and fosters collaboration, as other developers can understand and trust the code.
The challenge, however, is that writing unit tests is often time consuming—and far too often, it can be easy to write unit tests that have less value than they should. Simply writing tests because you’re told to or because you’re trying to check off a box doesn’t make them useful; you need to understand their purpose and ensure they add value.
You should always start with the purpose of your unit tests and the ultimate audience and role they’ll play. Here are a few helpful things to consider:
- Consider your testing philosophy. Are you looking to isolate classes and dependencies or write high-level tests that validate overall behavior against your requirements? It’s not an either/or question—but you should consider exactly what outcome you’re looking to achieve.
- Define the purpose—and audience—of your tests. Clearly state the purpose of each test to help future developers know when it’s safe to delete them. Tests should support requirements, classes, or APIs clearly. Tests should also be written with their audience in mind. Maybe you’re looking to satisfy a product owner, help with QA, educate new team members, or enable refactoring work.
- Focus on utility. Always prioritize what’s most useful and needed for your projects. TDD, for instance, requires practice and should improve your speed and confidence instead of slowing you down.
How GitHub Copilot helps generate unit tests
GitHub Copilot uses generative AI to provide real-time code suggestions in your IDE and via chat-based functions in your IDE and across your GitHub projects.
Based on the context in your code or chat-based queries (or even slash commands you use after highlighting specific code blocks), it can suggest relevant unit tests, covering typical scenarios like edge cases, common inputs, and failure modes. This ability to anticipate and generate test code can lead to better code coverage and more resilient applications.
So, how does this work in practice? Imagine you’re testing a piece of business logic—like validating your inputs with a regular expression. Writing unit tests can feel (and often is) repetitive and time consuming because you need to test various edge cases to ensure the code works as expected.
Instead of manually writing every test case, you can use GitHub Copilot to generate tests on your behalf by highlighting your code or logic, and let Copilot suggest unit tests to cover a range of inputs and edge cases.
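As a rough illustration of that scenario, here is a minimal sketch in Python. The `is_valid_sku` function and its pattern are hypothetical stand-ins (they are not from any real project), and the table of cases is the kind of repetitive test data Copilot is good at drafting.

```python
import re
import unittest

# Hypothetical rule for illustration: three uppercase letters, a hyphen, four digits.
SKU_PATTERN = re.compile(r"[A-Z]{3}-\d{4}")


def is_valid_sku(value: str) -> bool:
    """Return True when value matches the assumed SKU format exactly."""
    return SKU_PATTERN.fullmatch(value) is not None


class TestIsValidSku(unittest.TestCase):
    def test_sku_cases(self):
        cases = [
            ("ABC-1234", True),    # typical valid input
            ("abc-1234", False),   # lowercase letters
            ("ABC-123", False),    # too few digits
            ("ABC-12345", False),  # too many digits
            ("ABC1234", False),    # missing hyphen
            ("", False),           # empty string
            (" ABC-1234", False),  # leading whitespace
        ]
        for value, expected in cases:
            # subTest reports each failing input separately
            with self.subTest(value=value):
                self.assertEqual(is_valid_sku(value), expected)
```

Saved to a file, a suite like this can be run with `python -m unittest`. Each row in the table is one edge case you would otherwise type out by hand.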
There are a number of ways to generate unit tests with GitHub Copilot. For instance, you can select the code you want to test, right-click in your IDE, and select Copilot->Generate Tests. You can also use the slash command /tests in your IDE (highlight the code or logic block you want to test first). And you always have GitHub Copilot Chat, both in your IDE and across your online GitHub experience, which you can prompt to find existing tests or generate new ones.
When should you avoid using GitHub Copilot to generate unit tests?
I tend to write tests manually in the same scenarios where I write code manually, because I know what I want, so I just do it and get it done. But sometimes I need to formulate my thoughts, and the process of manually writing code can help me determine what I’m trying to do and how to do it. From there, I ask GitHub Copilot to expand what I’ve already built.
Key benefits of using GitHub Copilot to generate unit tests
Even though I don’t use GitHub Copilot for every unit test, I use it for a lot of them. Some of the biggest benefits I find when using GitHub Copilot to generate unit tests include:
- Saving time on routine tasks. Unit tests are perfect candidates for automation because of their repetitive nature. With Copilot, you can offload much of the grunt work, letting you focus on coding features rather than manually writing test cases.
- Supporting TDD. TDD involves writing tests before implementing the code itself—a process that can feel daunting when other autocompletion tools don’t offer any suggestions. Copilot changes the game here. It “trusts” your description of the application you’re building, helping you generate tests for functionalities that don’t exist yet. For example, you can describe an app’s functionality to Copilot, and it will generate tests for those features. Then, you can build the app to meet the requirements of those tests to put TDD into play in your workflow.
- Increasing test coverage. By letting Copilot handle initial test generation, you can quickly cover a broad range of cases. You can then refine and extend those tests, ensuring they meet your exact requirements. This iterative process improves confidence in your test suite and the code it verifies.
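To make the TDD point concrete, here is a minimal sketch in Python. The `apply_discount` function and its rules are assumptions made up for this illustration: the tests are what you might get from describing the behavior to Copilot first, and the implementation is the code you would then write to make them pass.

```python
import unittest


# Step 2 (written after the tests): the smallest implementation that passes.
# apply_discount and its rules are hypothetical, invented for this sketch.
def apply_discount(price: float, percent: float) -> float:
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


# Step 1 (written first): tests derived from a plain-language description,
# "apply a percentage discount to a price and reject percentages outside 0-100".
class TestApplyDiscount(unittest.TestCase):
    def test_applies_percentage(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_zero_percent_returns_original_price(self):
        self.assertEqual(apply_discount(80.0, 0), 80.0)

    def test_full_discount_returns_zero(self):
        self.assertEqual(apply_discount(80.0, 100), 0.0)

    def test_rejects_percent_out_of_range(self):
        for percent in (-1, 101):
            with self.subTest(percent=percent):
                with self.assertRaises(ValueError):
                    apply_discount(80.0, percent)
```

In a real TDD loop the test class would exist on its own first and fail; the function appears above it here only so the block runs as a single file.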
Best practices for using GitHub Copilot to generate unit tests
During my time using GitHub Copilot for test generation, I’ve come away with a number of personal best practices that may prove useful.
- Highlight the code you want to test. You always want to highlight the code or logic you want Copilot to focus on when generating tests or before using the slash command. In my experience, this feels incredibly intuitive, but I often hear questions from a lot of first timers.
- Be specific in your prompts about what you want to test. Copilot doesn’t code like humans do. If I create a function, for instance, I focus on what the function does and how it works. Copilot doesn’t truly read code; it just evaluates patterns. So, if you know there is a specific part of the function you’re looking to test, tell Copilot to “look for this” or look for a specific piece of logic.
- Provide context. When using Copilot, make sure to add comments or docstrings explaining the intended behavior of your code. You can also use a #file reference in Copilot Chat to point Copilot at existing tests you’ve written. This helps Copilot generate more accurate and meaningful tests.
- Review suggestions carefully. Just like with human-generated code, never trust any tests Copilot generates without going through your normal review process. Review the output yourself, run it through linters, and check the code.
- Be flexible and iterative. At the end of the day, unit tests are code that effectively describes other code. The first iteration of generated tests may not be exactly what you’re looking for. I find that sometimes it won’t generate mock objects, and sometimes it will hallucinate. Don’t be afraid to reframe your prompt or question.
- Ask Copilot if you’re missing any tests. You can always prompt Copilot with the question “is there anything I’m not testing?” and Copilot will—in my experience—provide a number of tests I hadn’t considered around edge cases, requirement verifications, and more. Try it out for yourself; it’s something I’ve found incredibly helpful. I also like using Copilot to generate tests for error conditions and code paths that generate expected failures. Testing for these is just as important as testing with good inputs so you know your application can handle errors gracefully.
- Use test coverage tools. Use coverage tools (like Jest’s coverage in JavaScript or Cobertura in Java) to assess Copilot’s test coverage and fill in any gaps. 💡 And here’s a pro tip: if you combine Copilot with a code coverage tool, you can quickly find untested code paths, use Copilot to generate tests for them, and reduce the risk of unforeseen errors.
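A few of the practices above (giving Copilot a docstring to work from, covering error paths, and adding mock objects when the first draft leaves them out) can be sketched together in Python. Everything named here, including `get_price_in_cents` and the `client.fetch_price` method, is hypothetical and exists only for this example.

```python
import unittest
from unittest.mock import Mock


def get_price_in_cents(client, sku: str) -> int:
    """Look up a price through client and return it in whole cents.

    Intended behavior (the kind of context that helps test generation):
    - client.fetch_price(sku) returns a price in dollars as a float.
    - A missing SKU makes the client raise KeyError, which is re-raised
      as ValueError with a readable message.
    """
    try:
        dollars = client.fetch_price(sku)
    except KeyError as exc:
        raise ValueError(f"Unknown SKU: {sku}") from exc
    return round(dollars * 100)


class TestGetPriceInCents(unittest.TestCase):
    def test_converts_dollars_to_cents(self):
        client = Mock()  # stands in for the hypothetical price service
        client.fetch_price.return_value = 19.99
        self.assertEqual(get_price_in_cents(client, "ABC-1234"), 1999)
        client.fetch_price.assert_called_once_with("ABC-1234")

    def test_unknown_sku_raises_value_error(self):
        client = Mock()
        client.fetch_price.side_effect = KeyError("ABC-0000")
        with self.assertRaises(ValueError) as context:
            get_price_in_cents(client, "ABC-0000")
        self.assertEqual(str(context.exception), "Unknown SKU: ABC-0000")
```

The docstring spells out the error contract, which is the detail a generated test is most likely to miss if it only sees the function body.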
Generating unit tests with GitHub Copilot: Practical examples
Now, let’s look at some practical examples in Python and TypeScript—two of the biggest languages used on GitHub today.
Example 1: Creating unit tests in Python with GitHub Copilot
Suppose we have a Python function to check that the price of something is greater than 0 and less than or equal to 1000. Let’s use Copilot to generate test cases for it.
    def validate_price(price: float):
        if price <= 0:
            raise ValueError("Price must be greater than 0")
        if price > 1000:
            raise ValueError("Price must be less than or equal to 1000")
With Copilot enabled, you’d want to enter the following slash command and prompt in Copilot Chat: /tests create tests for validate_price. Check the edge cases as well.
From there, you would see an output like this (note: since LLMs are nondeterministic, your output may look a bit different):
    import unittest
    from generate_unit_tests import validate_price
    class TestValidatePrice(unittest.TestCase):
        def test_price_zero(self):
            with self.assertRaises(ValueError) as context:
                validate_price(0)
            self.assertEqual(str(context.exception), "Price must be greater than 0")
        def test_price_negative(self):
            with self.assertRaises(ValueError) as context:
                validate_price(-10)
            self.assertEqual(str(context.exception), "Price must be greater than 0")
        def test_price_above_limit(self):
            with self.assertRaises(ValueError) as context:
                validate_price(1500)
            self.assertEqual(str(context.exception), "Price must be less than or equal to 1000")
        def test_price_edge_case_zero(self):
            with self.assertRaises(ValueError) as context:
                validate_price(0)
            self.assertEqual(str(context.exception), "Price must be greater than 0")
        def test_price_edge_case_max(self):
            try:
                validate_price(1000)
            except ValueError:
                self.fail("validate_price() raised ValueError unexpectedly!")
    if __name__ == '__main__':
        unittest.main()
Here, we see GitHub Copilot generating unit tests to validate the behavior of the validate_price function. Here’s a breakdown of what each test does:
- test_price_zero: Confirms that calling validate_price(0) raises a ValueError with the message “Price must be greater than 0”. This tests the lower bound condition for an invalid price.
- test_price_negative: Confirms that calling validate_price(-10) raises a ValueError with the same message, ensuring negative values are handled correctly.
- test_price_above_limit: Validates that calling validate_price(1500) raises a ValueError with the message “Price must be less than or equal to 1000”, testing the upper bound for invalid prices.
- test_price_edge_case_zero: A duplicate of test_price_zero (it checks the same condition). It is redundant and can be removed.
- test_price_edge_case_max: Verifies that calling validate_price(1000) does not raise an error. If an exception is raised, the test fails.
While these tests work, they can also be improved with a follow-up prompt to remove redundancies (such as test_price_edge_case_zero and test_price_zero) and add a test to validate prices within the acceptable range (for example, validate_price(500)). Also remember: GitHub Copilot is powered by nondeterministic LLMs that don’t always produce the same results, so you should always review code suggested by Copilot and re-prompt Copilot to improve that code.
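For illustration, here is roughly what the suite could look like after such a follow-up prompt. This is a hand-written sketch rather than actual Copilot output, and it repeats validate_price inline (copied from above) so the block runs on its own.

```python
import unittest


def validate_price(price: float):
    if price <= 0:
        raise ValueError("Price must be greater than 0")
    if price > 1000:
        raise ValueError("Price must be less than or equal to 1000")


class TestValidatePrice(unittest.TestCase):
    def test_rejects_zero_and_negative_prices(self):
        # One test covers both lower-bound failures, replacing the duplicates.
        for price in (0, -10):
            with self.subTest(price=price):
                with self.assertRaises(ValueError) as context:
                    validate_price(price)
                self.assertEqual(
                    str(context.exception), "Price must be greater than 0"
                )

    def test_rejects_price_above_limit(self):
        with self.assertRaises(ValueError) as context:
            validate_price(1500)
        self.assertEqual(
            str(context.exception), "Price must be less than or equal to 1000"
        )

    def test_accepts_prices_within_range(self):
        # 500 is the new in-range case; 1000 is the inclusive upper bound.
        for price in (0.01, 500, 1000):
            with self.subTest(price=price):
                self.assertIsNone(validate_price(price))
```

Folding the two zero-price tests into one parameterized test removes the duplication without losing the negative-price case.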
Example 2: Creating unit tests in Jest for TypeScript with GitHub Copilot
Now, let’s create a similar function in TypeScript to check whether a given numeric price falls within a specific range. It should look something like this:
    function validatePrice(price: number) {
      if (price < 0) {
        throw new Error('Price must be greater than 0');
      }
      if (price > 1000) {
        throw new Error('Price must be less than 1000');
      }
    }
You’d then enter the same slash command and prompt in Copilot Chat that we used above: /tests create tests for validate_price. Check the edge cases as well.
The generated unit tests would look something like this (since LLMs are nondeterministic, your output may look a bit different):
    describe('validatePrice', () => {
      // Valid cases
      it('should accept valid price in middle range', () => {
        expect(() => validatePrice(500)).not.toThrow();
      });
      it('should accept price of 0', () => {
        expect(() => validatePrice(0)).not.toThrow();
      });
      it('should accept price of 1000', () => {
        expect(() => validatePrice(1000)).not.toThrow();
      });
      // Error cases
      it('should throw error for negative price', () => {
        expect(() => validatePrice(-1)).toThrow('Price must be greater than 0');
      });
      it('should throw error for price over 1000', () => {
        expect(() => validatePrice(1001)).toThrow('Price must be less than 1000');
      });
      // Edge cases
      it('should accept small decimal price', () => {
        expect(() => validatePrice(0.01)).not.toThrow();
      });
      it('should accept price close to maximum', () => {
        expect(() => validatePrice(999.99)).not.toThrow();
      });
      it('should throw error for NaN', () => {
        expect(() => validatePrice(NaN)).toThrow('Price must be greater than 0');
      });
      it('should throw error for Infinity', () => {
        expect(() => validatePrice(Infinity)).toThrow('Price must be less than 1000');
      });
    });
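You can see Copilot generated unit tests in Jest that mirror the Python unit tests above: they cover expected cases, error cases, and edge cases for the validatePrice function. As with the Python output, review them before trusting them. The NaN test, for instance, fails against the function as written, because every comparison with NaN evaluates to false and validatePrice(NaN) returns without throwing. Note, too, that this TypeScript version accepts a price of 0 (it checks price < 0), whereas the Python version rejects it.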
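Python has the same NaN blind spot: ordered comparisons with NaN are always false, so validate_price(float("nan")) from Example 1 passes silently. Here is a minimal sketch of one possible fix, assuming you do want to reject non-finite values. The `validate_price_strict` name and its extra error message are my own additions for illustration, not part of the original function.

```python
import math


def validate_price_strict(price: float) -> None:
    # Assumed extra rule: reject NaN and infinities before the range checks,
    # because NaN would otherwise slip past both comparisons below.
    if not math.isfinite(price):
        raise ValueError("Price must be a finite number")
    if price <= 0:
        raise ValueError("Price must be greater than 0")
    if price > 1000:
        raise ValueError("Price must be less than or equal to 1000")
```

Whether non-finite input should be an error at all is a requirements question, which is exactly the kind of thing a generated test can surface but not decide for you.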
Take this with you
Unit testing is a vital part of software development, but it can be tedious and time-consuming. GitHub Copilot automates much of this process, making it easier to generate meaningful, comprehensive tests without the grunt work. Whether you’re validating complex business logic, working in a TDD workflow, or expanding an existing test suite, Copilot can be a powerful ally.
The key to getting the most out of Copilot lies in clear communication and iteration. Be specific in your prompts, highlight the code you want tested, and don’t hesitate to refine your prompt (or Copilot’s output). Use tools like slash commands or Copilot Chat to provide broader context or request additional test cases. And while Copilot can speed up the process, always make sure to review and validate any generated tests to ensure accuracy. In the meantime, happy testing!