Discover the exciting enhancements in GitHub that empower Machine Learning practitioners to do more.
From planning and tracking our work on GitHub Issues to using GitHub Discussions to gather your feedback and running our developer environments in Codespaces, we pride ourselves on using GitHub to build GitHub, and we love sharing how we use our own products in the hopes it’ll inspire new ways for you and your teams to use them.
Even before we officially released GitHub Actions in 2018, we were already using it to automate all kinds of things behind the scenes at GitHub. If you don’t already know, GitHub Actions brings platform-native automation and CI/CD that responds to any webhook event on GitHub (you can learn more in this article). We’ve seen some incredible GitHub Actions from open source communities and enterprise companies alike with more than 12,000 community-built actions in the GitHub Marketplace.
Now, we want to share a few ways we use GitHub Actions to build GitHub. Let’s dive in.
In 2019, we announced the creation of the GitHub Security Lab as a way to bring security researchers, open source maintainers, and companies together to secure open source software. Since then, we’ve been busy doing everything from giving advice on how to write secure code, to explaining vulnerabilities in important open source projects, to keeping our GitHub Advisory Database up-to-date.
In short, it’s fair to say our Security Lab team is busy. And it shouldn’t surprise you to know that they’re using GitHub Actions to automate their workflows, tests, and project management processes.
One particularly interesting way our Security Lab team uses GitHub Actions is to automate a number of processes related to reporting vulnerabilities to open source projects. They also use actions to automate processes related to the CodeQL bug bounty program, but I’ll focus on the vulnerability reporting here.
Any GitHub employee who discovers a vulnerability in an open source project can report it via the Security Lab. We help them to create a vulnerability report, take care of reporting it to the project maintainer, and track the fix and the disclosure.
To start this process, we created an issue form template that GitHub employees can use to report a vulnerability:
A screenshot of an Issue template GitHub employees use to report vulnerabilities.
The issue form triggers an action that automatically generates a report template (with details such as the reporter’s name that is filled out automatically). We ask the vulnerability reporter to enter the URL of a private repository, which is where the report template will be created (as an issue), so that the details of the vulnerability can be discussed confidentially.
Every vulnerability report is assigned a unique ID, such as GHSL-2021-1001. The action generates these unique IDs automatically and adds them to the report template. We generate the unique IDs by creating empty issues in a special-purpose repository and use the issue numbers as the IDs. This is a great improvement over our previous system, which involved using a shared spreadsheet to generate GHSL IDs and introduced a lot more potential for error due to having to manually fill out the template.
For most people, reporting a vulnerability is not something that they do every day. The issue form and automatically-generated report template help to guide the reporter, so that they give the Security Lab all the information they need when they report the issue to the maintainer.
CodeQL plays a big part in keeping the software ecosystem secure—both as a tool we use internally to bolster our own platform security and as a freely available tool for open source communities, companies, and developers to use.
If you’re not familiar, CodeQL is a semantic code analysis engine that enables developers to query code as if it were data. This makes it easier to find vulnerabilities across a codebase and create reusable queries (or leverage queries that others have developed).
The CodeQL Team at GitHub leverages a lot of automation in their day-to-day workflows. Yet one of the most interesting applications they use GitHub Actions for is large-scale regression testing of CodeQL implementation changes. In addition to recurring nightly experiments, most CodeQL pull requests also use custom experiments for investigating the CodeQL performance and output changes a merge would result in.
The typical experiment runs the standard github/codeql-action queries on a curated set of open source projects, recording performance and output metrics to perform comparisons that answers questions such as “how much faster does my optimization make the queries?” and “does my query improvement produce new security alerts?”
Let me repeat that for emphasis: They’ve built an entire regression testing system on GitHub Actions. To do this, they use two kinds of GitHub Actions workflows:
- One-off, dynamically-generated workflows that run the github/codeql-action on individual open source projects. These workflows are similar to what codeql-action users would write manually, but also contain additional code that collects data for the experiments.
- Periodically run workflows that generate and trigger the above workflows for any ongoing experiments and later compose the resulting data into digestible reports.
The elasticity of GitHub Actions is crucial for making the entire system work, both in terms of compute and storage. Experiments on hundreds of projects trivially parallelize to hundreds of on-demand action runners, causing even large experiments to finish quickly, while the storage of large experiment outputs is handled transparently through workflow artifacts.
Several other GitHub features are used to make the experiments accessible to the engineers through a single platform with the two most visible being:
- Issues: The status of every experiment is tracked through an ordinary GitHub issue that is updated automatically by a workflow. Upon completion of the experiment, the relevant engineers are notified. This enables easy discussions of experiment outcomes, and also enables cross-referencing experiments and any associated pull requests.
- Rich content: Detailed reports for the changes observed in an experiment are presented as ordinary markdown files in a GitHub repository that can easily be viewed through a browser.
And while this isn’t exactly a typical use case for GitHub Actions, it illustrates how flexible it is—and how much you can do with it. After all, most organizations have dedicated infrastructure to perform regression testing at the scale we do. At GitHub, we’re able to use our own products to solve the problem in a non-standard way.
Every week, the GitHub Mobile Team updates our mobile app with new features, bug fixes, and improvements. Additionally, GitHub Actions plays an integral role in their release process, helping to deliver release candidates to our more than 8,000 beta testers.
Our Mobile team is comparatively small compared to other teams at GitHub, so automating any number of processes is incredibly impactful. It lets them focus more on building code and new features, and removes repetitive tasks that otherwise would take hours to manually process each week.
That means they’ve thought a good deal about how to best leverage GitHub Actions to save the most amount of time possible when building and releasing GitHub Mobile updates.
This chart below shows all the steps included in building and delivering a mobile app update. The gray steps are automated, while the blue steps are manually orchestrated. The automated steps include running a shell command, creating a branch, opening a pull request, creating an issue and GitHub release, and assigning a developer.
A workflow diagram of GitHub’s release process with automated steps represented in gray and manual steps represented in green.
Another thing our team focused on was to make it possible for anyone to be a release captain. By making a computer do things that a human might have to learn or be trained on, makes it easier for any of our engineers to know what to do to get a new version of GitHub Mobile out to users.
This is a great example of CI/CD in action at GitHub. It also shows firsthand what GitHub Actions does best: automating workflows to let developers focus more on coding and less on repetitive tasks.
Of course, we also use GitHub Actions to automate a bunch of non-technical tasks, like spinning up status updates and sending automated notifications on chat applications.
After talking with some of our internal teams, I wanted to showcase some of my favorite internal examples I’ve seen of Hubbers using GitHub Actions to streamline their workflows (and have a bit of fun, too).
Our Intranet team uses GitHub Actions to add updates to our intranet whenever changes are made to a specified directory. In practice, this means that anyone at GitHub with the right permissions can share messages with the company by adding a file to a repository. This then triggers a GitHub Actions workflow that turns that file into a public-facing message that’s shared to our intranet and automatically to a Slack channel.
At GitHub, we have technical program management teams that are responsible for making sure the trains arrive on time and things get built and shipped. Part of their job includes building out weekly status reports for visibility into development projects, including progress, anticipated timelines, and potential blockers. To speed up this process, our technical program teams use GitHub Actions to automate the compilation of all of their individual reports into an all-up program status dashboard.
Here’s a fun one for you: Our Ecosystem Applications team built a custom GitHub Actions workflow that combines team photos they take at their weekly meetings and turns it into a GIF. And if that wasn’t enough, that same workflow also automatically uploads that GIF to their team README. In the words of our Senior Engineer, Jake Wilkins, “I’m not sure when or why we started taking team photos, but when we got access to GitHub Actions it was an obvious thing to do.”
Whether you need to build a CI/CD pipeline or want to step up your Twitter game, GitHub Actions offers powerful automation across GitHub (and outside of it, too). With more than 12,000 pre-built community actions in the GitHub Marketplace, it’s easy to start bringing simple and complex automations to your workflows so you can focus on what matters most: building great code.