How to secure your GitHub Actions workflows with CodeQL
In the last few months, we secured more than 75 GitHub Actions workflows in open source projects, disclosing more than 90 different vulnerabilities. Out of this research, we produced new support for workflows in CodeQL, empowering you to secure yours.
The situation: growing number of insecure workflows
If you have read our series about keeping your GitHub Actions and workflows secure, you already have a good understanding of common vulnerabilities in GitHub Actions and how to solve them.
- Keeping your GitHub Actions and workflows secure Part 1: Preventing pwn requests
- Keeping your GitHub Actions and workflows secure Part 2: Untrusted input
- Keeping your GitHub Actions and workflows secure Part 3: How to trust your building blocks
Unfortunately, we found that these vulnerabilities are still quite common, mostly because of a lack of awareness of how the moving parts interact with each other and what the impact of these vulnerabilities may be for your organization or repository.
To help prevent the introduction of vulnerabilities, identify them in existing workflows, and even fix them using GitHub Copilot Autofix, CodeQL support has been added for GitHub Actions. The new CodeQL packs can be used by code scanning to scan both existing and new workflows. As code scanning and Copilot Autofix are free for OSS repositories, all public GitHub repositories will have access to these new queries, empowering detection and remediation of these vulnerabilities.
In the rest of this post, we’ll cover:
- What we added to our existing CodeQL support, including actions as a first-class language, taint tracking, and Bash support;
- What models and queries we developed;
- What vulnerabilities and what new patterns we discovered as part of this work.
Previous attempt
Previously, there was a single CodeQL query capable of identifying simplistic code injections in GitHub workflows. However, this query had several limitations. First, it was bundled with the JavaScript QL packs, meaning users had to enable JavaScript scanning even if they had no JavaScript code in their repositories, which was confusing and misleading. Additionally, the representation of GitHub workflow syntax and grammar was incomplete, making it difficult to express more complex patterns using the existing Abstract Syntax Tree (AST) of GitHub Actions, which is used by static analysis tools such as CodeQL. Most importantly, the CodeQL support for GitHub workflows did not previously include Taint Tracking support and models for non-straightforward sources of untrusted data or dangerous operations.
Taint Tracking is key!
So what is Taint Tracking and how important is it?
Through the previous query for Code Injection, we were able to identify simplistic vulnerabilities such as those cases where a known user-controlled property gets directly interpolated into a Run script:
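As a minimal, hypothetical illustration (the workflow, job, and step names are assumptions and are not taken from a real project), direct interpolation of a user-controlled property looks like this:

```yaml
# Hypothetical workflow: names are illustrative only.
name: Greet issue
on:
  issues:
    types: [opened]

jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      - name: Print the issue title (vulnerable)
        # The expression is expanded into the script text before Bash runs it,
        # so a title containing $(id) is executed as a command substitution.
        run: |
          echo "New issue: ${{ github.event.issue.title }}"
```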
That was a great starting point, but what about cases such as the following?
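A hypothetical workflow with that shape might look like the sketch below. The `CI` workflow name, the `pr-info` artifact, and the `pr_number.txt` file are all assumptions made for illustration.

```yaml
# Hypothetical workflow: every name below is illustrative only.
name: Report results
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]

jobs:
  report:
    runs-on: ubuntu-latest
    steps:
      - name: Download the artifact produced by the triggering run
        uses: actions/download-artifact@v4
        with:
          name: pr-info
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}

      - name: Expose a file from the artifact as a step output
        id: pr
        run: |
          # The file content comes from the artifact, so it is untrusted.
          number="$(cat pr_number.txt)"
          echo "number=$number" >> "$GITHUB_OUTPUT"

      - name: Use the output (vulnerable)
        # The tainted output is pasted into the script text before Bash runs it.
        run: |
          echo "Processing PR ${{ steps.pr.outputs.number }}"
```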
The first step of the path is the download of an artifact.
The second and third steps are setting the content of a file from the artifact as the output of the workflow step.
In the last step, the value from the previous step is interpolated in an unsafe manner in a Run script leading to a potential code injection.
In the case above, the source of untrusted data is not simply a GitHub Event Context access of a known untrusted property (for example, `github.event.pull_request.body`) but rather the download of an artifact. Should all artifacts be considered untrusted? Certainly not. However, in this instance, where the workflow is triggered by a `workflow_run` event with no branch filters and where an artifact is downloaded from the triggering workflow (`github.event.workflow_run.workflow_id`), the artifact should be considered untrusted. When decompressed, it may pollute the Runner’s workspace by writing to files in unexpected locations. Consequently, from that step onward, all files in the workspace should be considered untrusted. This example highlights a non-trivial pattern that we need to express using the new actions AST representation to identify sources of untrusted data.
Identifying sources of untrusted data is only the first step towards uncovering more complex injection vulnerabilities. In the example above, it is crucial to understand Bash scripts to determine if they are reading from untrusted files and inserting data into shell variables. It is essential to comprehend how these variables may flow into the output of a step, and, subsequently, how the output flows through different steps, jobs, composite actions, or reusable workflows until they reach a potentially dangerous sink. This understanding is what Taint Tracking and Control Flow will achieve.
In summary, as illustrated in the CodeQL alert above, we can now identify non-obvious sources of untrusted data (for example, `git` or `gh` commands or third-party actions) and, more importantly, track this untrusted data throughout complex workflows. These workflows involve multiple steps, jobs, actions, and even entire workflows, allowing us to better understand where this data is used and to report potential vulnerabilities effectively.
Bash support
GitHub’s workflows can execute various scripts, with Bash scripts being among the most common. The new CodeQL packs for GitHub Actions offer basic support for Bash, helping to identify tainted data originating from Bash scripts. For example, commands such as `git diff-tree` obtain a list of changed files, and in a pull request those file names are chosen by the author. Tainted data can also flow through a script when it reads an attacker-controlled file or environment variable into a step’s output or another environment variable. A pertinent example of such a vulnerability could be found in a workflow of the Azure CLI repository.
In the alert above, we can see how untrusted data, such as a pull request’s title, is assigned to the `TITLE` environment variable. This variable is then read and processed by several commands, resulting in a new `message` variable that gets redirected to the special file pointed to by the `GITHUB_ENV` variable. A malicious actor could craft a title that results in a multiline `message`, allowing them to inject arbitrary environment variables into subsequent steps. This, in turn, would enable the attacker to exfiltrate secrets used in the workflow.
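The pattern can be sketched with a hypothetical workflow. This is not the Azure CLI workflow itself: the trigger, the step names, and the use of `printf '%b'` as the processing command are assumptions made for illustration.

```yaml
# Hypothetical workflow: not the Azure CLI workflow, names are illustrative.
name: PR title check
on:
  pull_request_target:
    types: [opened, edited]

jobs:
  title:
    runs-on: ubuntu-latest
    steps:
      - name: Build a message from the title (vulnerable)
        env:
          TITLE: ${{ github.event.pull_request.title }}
        run: |
          # printf '%b' expands backslash escapes, so a title containing "\n"
          # turns into a message that spans several lines.
          message="$(printf '%b' "$TITLE")"
          # Every extra line is parsed as another NAME=value assignment.
          echo "message=$message" >> "$GITHUB_ENV"

      # Hardened alternative to the step above (use one or the other).
      - name: Build a message from the title (hardened)
        env:
          TITLE: ${{ github.event.pull_request.title }}
        run: |
          message="$(printf '%b' "$TITLE")"
          # Drop carriage returns and newlines so only one line is written.
          message="$(printf '%s' "$message" | tr -d '\r\n')"
          echo "message=$message" >> "$GITHUB_ENV"
```

Stripping line breaks is only one possible fix. Avoiding `GITHUB_ENV` for untrusted values altogether is the more conservative choice.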
The new CodeQL packs are able to parse Bash scripts. While they don’t yet generate a full AST, they already allow us to understand elements such as assignments, pipelines, and redirections, enabling us to report subtle vulnerabilities like the one mentioned above.
Models
As explained in “Keeping your GitHub Actions and workflows secure Part 2: Untrusted input,” GitHub’s event context is the most common source of untrusted data. Properties such as `github.event.issue.title`, `github.event.pull_request.head.ref`, or `github.event.comment.body` are typical sources of untrusted data. However, any third-party action may introduce untrusted data. For instance, an action that returns a list of filenames changed in a pull request should be considered a source of untrusted data. Similarly, actions that parse an issue body or comment for a command or structured data should also be treated as sources of untrusted data.
The same applies to actions that pass data from one of their inputs to their outputs or into an environment variable, therefore acting as taint steps (summaries). Actions such as `actions/github-script`, `azure/cli`, or `azure/powershell` execute code passed through their inputs, so they should be considered sinks for Code Injection, similar to a step’s Run script.
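To make the sink behavior concrete, here is a minimal sketch using `actions/github-script`. The workflow and step names are hypothetical. The first step interpolates untrusted data into the script source, and the second passes the same data through a step-scoped environment variable instead.

```yaml
# Hypothetical workflow: names are illustrative only.
name: Triage comment
on:
  issue_comment:
    types: [created]

jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - name: Log the comment (vulnerable)
        uses: actions/github-script@v7
        with:
          # The expression is pasted into the JavaScript source before it runs.
          script: |
            console.log("${{ github.event.comment.body }}")

      # Safer alternative to the step above.
      - name: Log the comment (safer)
        uses: actions/github-script@v7
        env:
          BODY: ${{ github.event.comment.body }}
        with:
          # The script text is now constant; the data arrives as a string value.
          script: |
            console.log(process.env.BODY)
```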
We have analyzed thousands of popular third-party actions and identified a number of models now incorporated into the analysis:
- 62 sources
- 129 summaries
- 2199 sinks
Queries
The previous support for GitHub Actions contained a single query for code injection, whereas the new CodeQL packs incorporate 18 new queries, including the Code Injection and Environment Variable Injection queries mentioned above.
- Execution of Untrusted Code
- Execution of Untrusted Code (TOCTOU)
- Artifact Poisoning
- Code Injection
- Environment Variable Injection
- Path Injection
- Unpinned Action Tag
- Improper Access Control
- Excessive Secrets Exposure
- Secrets In Artifacts
- Expression is Always True
- Unmasked Secret Exposure
- Cache Poisoning (Code Injection, Direct Cache, Untrusted Code)
- Use of Known Vulnerable Actions
- Missing Action Permissions
- Argument Injection (Experimental)
- Code Execution on Self Hosted Runners (Experimental)
- Output Clobbering (Experimental)
Results
For the past few months, we have been testing the new queries on thousands of open source projects to validate their accuracy and performance. The results have been very impressive, allowing us to identify and report vulnerabilities in numerous critical organizations and repositories, such as Microsoft, Azure, GitHub, Eclipse, Jupyter, Adobe, AWS, Cloudflare, Discord, Hibernate, HuggingFace, and Apache.
The table below shows all the repositories affected, along with their GitHub stars, to give an idea of the impact that a supply chain attack could have had in these projects:
Repository | Stars |
---|---|
ant-design/ant-design | 92,412 |
Excalidraw/excalidraw | 84,021 |
apache/superset | 62,589 |
withastro/astro | 46,604 |
Stirling-Tools/Stirling-PDF | 44,988 |
geekan/MetaGPT | 44,901 |
Kong/kong | 39,221 |
LAION-AI/Open-Assistant | 37,045 |
appsmithorg/appsmith | 34,352 |
gradio-app/gradio | 33,709 |
DIYgod/RSSHub | 33,432 |
calcom/cal.com | 32,282 |
milvus-io/milvus | 30,299 |
k3s-io/k3s | 28,010 |
discordjs/discord.js | 25,390 |
element-plus/element-plus | 24,488 |
cilium/cilium | 20,150 |
monkeytypegame/monkeytype | 15,635 |
amplication/amplication | 15,196 |
docker-mailserver/docker-mailserver | 14,643 |
jupyterlab/jupyterlab | 14,167 |
openimsdk/open-im-server | 14,041 |
quarkusio/quarkus | 13,771 |
espressif/arduino-esp32 | 13,609 |
sympy/sympy | 12,967 |
ionic-team/stencil | 12,561 |
zephyrproject-rtos/zephyr | 10,819 |
qgis/QGIS | 10,569 |
trinodb/trino | 10,413 |
OpenFeign/feign | 9,490 |
marimo-team/marimo | 7,583 |
dream-num/univer | 7,021 |
aws/karpenter-provider-aws | 6,782 |
hibernate/hibernate-orm | 5,976 |
ant-design-blazor/ant-design-blazor | 5,809 |
litestar-org/litestar | 5,511 |
New vulnerability patterns
Having triaged and reported numerous alerts, we have identified some common patterns that often lead to vulnerabilities in GitHub workflows:
Misuse of pull_request_target trigger
The `pull_request_target` event trigger, while offering powerful automation capabilities in GitHub Actions, harbors a dark side filled with potential security pitfalls. This event trigger, designed to execute workflows within the context of the pull request’s base branch, has special characteristics that severely increase the impact of any vulnerability. A workflow activated by `pull_request_target` and triggered from a fork operates with significant privileges, in contrast to the `pull_request` event:
- It is able to read repository and organization secrets.
- It is allowed to have write permissions.
- It circumvents the usual safeguards of pull request approvals, allowing workflows to run unimpeded even if approval mechanisms are configured for standard `pull_request` events.
- It normally runs in the context of the default branch, which, as we will see, may allow malicious actors to poison the Actions cache and move laterally to other, more privileged workflows even when all permissions are removed from the vulnerable workflow.
When working with workflows triggered by `pull_request_target`, we have to be very careful and pay special attention to the following scenarios:
- Code execution from untrusted sources. The ability to execute code from forked pull requests is a double-edged sword. A malicious actor can submit a seemingly innocuous pull request that, when triggered by `pull_request_target`, unleashes havoc. This malicious code, running in the context of the target repository’s environment, could exfiltrate secrets or even tamper with repository contents and releases. The danger lies in the inadvertent checkout and execution of code from untrusted sources. Common mistakes include using the `actions/checkout` action with a reference to the pull request’s head branch.
- Time of check to time of use (TOCTOU) attacks. Even with approval requirements in place (such as requiring the pull request to have a specific label that attackers are not able to set on their own), attackers can leverage the element of time to bypass security measures. A malicious actor can submit a seemingly harmless pull request, patiently wait for approval, and then swiftly update the pull request with malicious code before the workflow execution. The workflow, relying on a mutable reference like a branch name, falls prey to this “time of check to time of use” (TOCTOU) attack, unwittingly executing the newly injected malicious code. Confusion surrounding context variables, specifically `head.ref` and `head.sha`, can lead to vulnerabilities. `head.ref`, pointing to the branch, is susceptible to manipulation by attackers. In contrast, `head.sha`, referencing the specific commit, provides a reliable and immutable pointer to the reviewed and approved code. Using the incorrect variable can create an opening for attackers to inject malicious code after approval.
- Lurking threats in non-default branches. Vulnerabilities often linger in non-default branches, even after being addressed in the default branch. An attacker can target these vulnerable versions of the workflows residing in non-default branches, exploiting them by submitting pull requests specifically to those branches.
- Cache poisoning. As a mitigation, developers may strip out all read and write permissions from these workflows when they need to run untrusted code. However, the seemingly innocuous `permissions: {}` configuration can unexpectedly pave the way for cache poisoning attacks. This attack vector involves injecting malicious content into the cache, which can subsequently affect other workflows relying on the poisoned cache entries. Even if a workflow doesn’t have write access to the repository, it can still poison the cache, leading to potential code execution or data manipulation in other workflows that utilize the compromised cache.
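The first two scenarios can be sketched with a hypothetical workflow. The Node.js project and the step names are assumptions. The workflow checks out the pull request’s code in a privileged context and then runs scripts from it.

```yaml
# Hypothetical workflow: names and the npm-based project are illustrative.
name: PR checks
on: pull_request_target

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the pull request code (untrusted)
        uses: actions/checkout@v4
        with:
          # head.ref is a branch name that can move after a review;
          # head.sha pins one commit, but that code is still attacker-controlled.
          ref: ${{ github.event.pull_request.head.ref }}
          repository: ${{ github.event.pull_request.head.repo.full_name }}

      - name: Run project scripts (executes the untrusted code)
        run: |
          npm install
          npm test
```

Switching `ref` to `head.sha` removes the TOCTOU window but not the code execution itself, since any install or test script in the pull request still runs with the workflow’s privileges.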
If we really need to use this trigger event, there are a few ways to harden the workflows to prevent any abuses:
- Repository checks. Implementing stringent repository checks is crucial for thwarting attacks originating from forked pull requests. One effective method is to configure workflows to execute only for pull requests originating from the base repository, effectively blocking any attempts from external forks. This can be achieved by using conditions such as `github.event.pull_request.head.repo.owner.login == 'myorg'` to restrict workflow execution.
- Actor checks. Verifying the permissions of the pull request author is another layer of defense. By restricting workflow execution to trusted actors, such as members of the organization or approved collaborators, the risk of malicious code injection from unauthorized sources can be significantly reduced. Never use a hardcoded list of usernames, since users may lose their permissions over time, and abandoned usernames may be claimed by an attacker.
- Workflow splitting. The above mitigations may not be useful if we need to run a workflow for forks or arbitrary users. In those cases, splitting workflows into unprivileged and privileged components is a powerful security strategy. Unprivileged workflows, triggered by `pull_request` events, handle the initial processing of pull requests without access to sensitive secrets or write permissions. Privileged workflows, activated by `workflow_run` events, are invoked only after the unprivileged workflow has completed its checks. This separation ensures that potentially malicious code from forked pull requests never executes within a privileged context. The unprivileged workflow will generally need to communicate and pass information to the privileged one. This is a crucial security boundary, and, as we will see in the next section, any data coming from the unprivileged workflow should be considered untrusted and potentially dangerous.
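As a rough sketch of two of these hardening options, the fragment below shows a repository check and the unprivileged half of a split workflow. File names, job names, and the `pr-info` artifact are hypothetical. The repository check compares `head.repo.full_name` with `github.repository`, which is a variant of the owner check mentioned above and is chosen here as an assumption.

```yaml
# .github/workflows/pr-privileged.yml (hypothetical file name)
# Repository check: skip the job for pull requests coming from forks.
name: Privileged PR job
on: pull_request_target

jobs:
  privileged:
    if: github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
    steps:
      - run: echo "Runs only for branches of the base repository"
---
# .github/workflows/pr-build.yml (hypothetical file name)
# Unprivileged half of a split: no secrets, read-only token.
name: PR build
on: pull_request

permissions:
  contents: read

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Record the pull request number for the privileged workflow
        run: |
          echo "${{ github.event.number }}" > pr_number.txt
      - uses: actions/upload-artifact@v4
        with:
          name: pr-info
          path: pr_number.txt
```

The privileged half would be a separate `workflow_run` workflow that downloads `pr-info` and treats its content as untrusted.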
Security boundaries and workflow_run event
The `workflow_run` event trigger in GitHub Actions is designed to automate tasks based on the execution or completion of another workflow. It may grant write permissions and access to secrets even if the triggering workflow doesn’t have such privileges. While this is beneficial for tasks like labeling pull requests based on test results, it poses significant security risks if not used carefully.
The `workflow_run` trigger poses a risk because it can often be initiated by an attacker. Some maintainers were surprised by this, believing that their triggering workflows, which were run on events such as `release`, were safe. This assumption was based on the idea that since an attacker couldn’t trigger a new release, they shouldn’t be able to initiate the triggering workflow or the subsequent `workflow_run` workflow.
The reality is that an attacker can submit a pull request that modifies the triggering workflow and even replaces its triggering events. Since `pull_request` workflows run in the context of the pull request’s `HEAD` branch, the modified workflow will run and, upon completion, will be able to trigger an existing `workflow_run` workflow. The danger arises from the fact that even if the triggering `pull_request` workflow is not privileged, the triggered `workflow_run` workflow will have access to secrets and write-scoped tokens. This enables privilege escalation attacks, allowing attackers to execute malicious code with elevated permissions within the CI/CD pipeline.
Another significant pitfall with the `workflow_run` event trigger is artifact poisoning. Artifacts are files generated during a workflow run that can be shared with other workflows. Attackers can poison these artifacts by uploading malicious content through a pull request. When a `workflow_run` workflow downloads and uses these poisoned artifacts, it can lead to arbitrary code execution or other malicious activities within the privileged workflow. The issue is that many `workflow_run` workflows do not verify the contents of downloaded artifacts before using them, making them vulnerable to various attacks.
Securing `workflow_run` workflows requires a multi-faceted approach. By understanding the inherent risks and implementing the recommended mitigations, developers can leverage the automation benefits of `workflow_run` while minimizing the potential for security compromises.
Effective mitigations
- Limit workflow scope with branch filters: Specify the branches where `workflow_run` workflows can be triggered using the `branches` filter. This helps restrict the scope of potential attacks by preventing them from being triggered on branches from forks.
- Verify event origin: Incorporate a check like `github.event.workflow_run.event != 'pull_request'` to prevent `workflow_run` workflows from being triggered by pull requests from forks. (Inside a `workflow_run` workflow, `github.event_name` is always `workflow_run`, so the event of the triggering run has to be read from the payload.) This adds an extra layer of protection by ensuring that the triggering workflow originates from a trusted source.
- Treat artifacts as untrusted: Treat all downloaded artifacts as potentially malicious and implement rigorous validation checks before using them. Always unzip artifacts to a temporary directory like `/tmp` to prevent potential file overwrites, that is, the pollution of the workspace.
- Avoid defining environment variables: Minimize the use of environment variables, especially when handling untrusted data. Environment variables can be vulnerable to injection attacks, potentially allowing attackers to modify their values and execute malicious code.
- Handle output variables with caution: Exercise caution when defining output variables from an artifact’s content, as they can also be manipulated by attackers. Always validate the contents of output variables (for example, check that a value is a number rather than an arbitrary string) before using them in subsequent steps or other workflows.
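Some of these mitigations can be sketched in a hypothetical privileged workflow. The `PR build` workflow name, the `pr-info` artifact, and the `pr_number.txt` file are assumptions. The sketch extracts the artifact outside the workspace, validates the value before it becomes an output, and never interpolates it into a script.

```yaml
# Hypothetical privileged workflow: all names are illustrative only.
name: PR report
on:
  workflow_run:
    workflows: ["PR build"]
    types: [completed]

permissions:
  actions: read
  pull-requests: write

jobs:
  report:
    runs-on: ubuntu-latest
    if: github.event.workflow_run.conclusion == 'success'
    steps:
      - name: Download the artifact outside the workspace
        uses: actions/download-artifact@v4
        with:
          name: pr-info
          path: ${{ runner.temp }}/pr-info
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}

      - name: Validate the value before exposing it as an output
        id: pr
        run: |
          number="$(cat "$RUNNER_TEMP/pr-info/pr_number.txt")"
          # Accept digits only; anything else aborts the job.
          if [[ ! "$number" =~ ^[0-9]+$ ]]; then
            echo "Unexpected PR number in artifact" >&2
            exit 1
          fi
          echo "number=$number" >> "$GITHUB_OUTPUT"

      - name: Comment on the pull request
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          # Step-scoped variable holding an already validated number.
          PR_NUMBER: ${{ steps.pr.outputs.number }}
        run: |
          gh pr comment "$PR_NUMBER" --repo "$GITHUB_REPOSITORY" --body "Build finished."
```

A workflow that does not need to process pull requests at all could additionally use the `branches` filter or an event-origin condition. They are left out here because this sketch is meant to consume pull request artifacts.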
Non-effective mitigations
- Repository checks: We found several workflows (for example, AWS Karpenter Provider, or Cloudflare Workers SDK) relying solely on repository checks, such as verifying the repository owner (`github.repository_owner == 'myorg'`), which is not effective in mitigating `workflow_run` risks since the workflow always runs in the context of the default branch, which belongs to the organization.
IssueOops: Security pitfalls with issue_comment trigger
The `issue_comment` event trigger in GitHub Actions is a powerful tool for automating workflows based on comments on issues and pull requests. When applied in the context of IssueOps, it can streamline tasks like running commands in response to specific comments. However, this convenience comes with significant security risks that must be carefully considered.
- TOCTOU vulnerabilities: Similar to the `pull_request_target` event trigger, workflows using `issue_comment` can be vulnerable to TOCTOU attacks. If the workflow checks out and executes code from a pull request based on an issue comment, an attacker could exploit the time window between the comment and the workflow execution. An attacker might initially submit a harmless pull request and wait for an administrator to review it and approve the workflow run by adding a comment. Once the approval is given, the attacker could quickly update the pull request with malicious code, which would then be executed by the workflow.
- Bypassing pull request approval mechanisms: The `issue_comment` event trigger is not subject to the pull request approval mechanisms intended to prevent abuse. Even if the workflows triggered by `pull_request` require approvals, an attacker can trigger an `issue_comment` workflow by simply adding a comment to the pull request, potentially executing malicious code without any review.
Mitigating the risks
- Shifting to label gates: Instead of relying on issue comments to trigger critical workflows, consider adopting a label gates approach. Label gates use labels to trigger specific actions, allowing for more granular control and better security. Since the labeled activity type for a `pull_request` trigger contains details about the latest commit SHA of the pull request, there is no need for the workflow to resolve a pull request number into the latest commit SHA, as is the case with `issue_comment`, and, therefore, there is no window for an attacker to modify the pull request. Remember to use the commit SHA rather than the HEAD reference to prevent TOCTOU vulnerabilities.
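A label gate could be sketched as follows. The label name and the test script are hypothetical, and `pull_request_target` is used on the assumption that the job needs privileges. The key point is that the checkout uses the commit SHA carried by the labeled event.

```yaml
# Hypothetical workflow: label name and script are illustrative only.
name: Privileged tests
on:
  pull_request_target:
    types: [labeled]

jobs:
  test:
    # "safe to test" is a hypothetical label that only maintainers can apply.
    if: github.event.label.name == 'safe to test'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Commit recorded in the labeled event, not a mutable branch name.
          ref: ${{ github.event.pull_request.head.sha }}
      - name: Run the tests
        run: |
          ./run-tests.sh  # hypothetical script
```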
Ineffective or incomplete mitigations
- Actor checks: Relying solely on actor checks (verifying the identity or permissions of the commenter) is ineffective, because the actor triggering the workflow might not be the one trying to exploit it. This is the case for TOCTOU vulnerabilities, where an attacker can submit a legitimate pull request, wait for an admin to trigger an IssueOp, and then swiftly mutate the pull request by adding a new commit with malicious code in it.
- Date checks: Comparing timestamps to verify that the last commit occurred before the triggering comment is also unreliable. Currently, GitHub has no reliable way to figure out the date when a commit was pushed to a repository. An attacker can forge commit dates, rendering these checks useless in preventing malicious code execution.
- Repository checks: Verifying the origin repository is not a useful mitigation for `issue_comment` event triggers. The `issue_comment` event always executes within the context of the target repository’s default branch, making repository checks redundant.
Wrapping up
The new CodeQL support for GitHub Actions is in public preview. The new QL packs allow you to scan your repository for a variety of vulnerabilities in GitHub Actions, helping prevent supply chain attacks in the OSS software we all depend on! If you want to give them a try, take one of the following steps depending on your case:
- Your repository is newly configured for default setup: Code scanning will automatically attempt to analyze actions if any workflows are present on the default branch. You have nothing to do; keep calm and fix the potential alerts.
- Your repository is already configured for default setup: You need to edit the Default Setup settings and explicitly enable actions analysis.
- Your repository is using advanced setup: You just have to add `actions` to the language matrix.
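For advanced setup, a minimal sketch of a CodeQL workflow with `actions` in the language matrix might look like this. The branch name and job layout are assumptions, and an existing workflow would only need the extra matrix entry.

```yaml
# Sketch of an advanced-setup workflow; branch names are assumptions.
name: CodeQL
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
      contents: read
    strategy:
      matrix:
        # Add "actions" next to any languages already being analyzed.
        language: [actions]
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
      - uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{ matrix.language }}"
```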
Stay secure!