How to secure your GitHub Actions workflows with CodeQL

In the last few months, we secured more than 75 GitHub Actions workflows in open source projects, disclosing more than 90 different vulnerabilities. Out of this research, we produced new support for workflows in CodeQL, empowering you to secure yours.

The situation: growing number of insecure workflows

If you have read our series about keeping your GitHub Actions and workflows secure, you already have a good understanding of common vulnerabilities in GitHub Actions and how to solve them.

Unfortunately, we found that these vulnerabilities are still quite common, mostly because of a lack of awareness of how the moving parts interact with each other and what the impact of these vulnerabilities may be for your organization or repository.

To help prevent the introduction of vulnerabilities, identify them in existing workflows, and even fix them using GitHub Copilot Autofix, CodeQL support has been added for GitHub Actions. The new CodeQL packs can be used by code scanning to scan both existing and new workflows. As code scanning and Copilot Autofix are free for OSS repositories, all public GitHub repositories will have access to these new queries, empowering detection and remediation of these vulnerabilities.

In the rest of this post, we’ll cover:

  • What we added to our existing CodeQL support, including actions as a first-class language, taint tracking, and Bash support.
  • What models and queries we developed.
  • What vulnerabilities and new patterns we discovered as part of this work.

Previous attempt

Previously, there was a single CodeQL query capable of identifying simplistic code injections in GitHub workflows. However, this query had several limitations. First, it was bundled with the JavaScript QL packs, meaning users had to enable JavaScript scanning even if they had no JavaScript code in their repositories, which was confusing and misleading. Additionally, the representation of GitHub workflow syntax and grammar was incomplete, making it difficult to express more complex patterns using the existing Abstract Syntax Tree (AST) of GitHub Actions, which is used by static analysis tools such as CodeQL. Most importantly, the CodeQL support for GitHub workflows did not previously include Taint Tracking support and models for non-straightforward sources of untrusted data or dangerous operations.

Taint Tracking is key!

So what is Taint Tracking and how important is it?

With the previous Code Injection query, we could identify only simple vulnerabilities, such as cases where a known user-controlled property is interpolated directly into a run script:

[Image: code snippet in which a known user-controlled property is interpolated directly into a run script]
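Because the original snippet is not reproduced here, the following is a minimal, hypothetical sketch of the same pattern rather than the workflow shown in the post. The workflow, job, and step names are invented for illustration. The first step interpolates an issue title straight into the script text. The second passes the same value through an environment variable, so the shell receives it as data.

```yaml
# Hypothetical workflow, for illustration only.
name: Greet issue author
on:
  issues:
    types: [opened]

permissions: {}

jobs:
  greet:
    runs-on: ubuntu-latest
    steps:
      # Unsafe: the expression is expanded before the shell starts, so a
      # title such as `$(id)` becomes part of the script and is executed.
      - name: Echo title (unsafe)
        run: echo "New issue: ${{ github.event.issue.title }}"

      # Safer: the value reaches the script through an environment
      # variable and is never parsed as shell syntax.
      - name: Echo title (safer)
        env:
          TITLE: ${{ github.event.issue.title }}
        run: echo "New issue: $TITLE"
```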

That was a great starting point, but what about cases such as the following?

The first step of the path is the download of an artifact.

[Image: code snippet representing an artifact download]

The second and third steps are setting the content of a file from the artifact as the output of the workflow step.

[Image: code snippets setting the content of a file from the artifact as the output of the workflow step]

In the last step, the value from the previous step is interpolated unsafely into a run script, leading to a potential code injection.

[Image: code snippet identifying a potential code injection]
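The four steps described above can be sketched as a single hypothetical workflow. This is not the workflow from the original alert: the workflow names, the artifact name `pr-info`, and the file `pr_number.txt` are assumptions made for illustration.

```yaml
# Hypothetical privileged workflow, for illustration only.
name: Comment on PR
on:
  workflow_run:
    workflows: ["CI"]        # assumed name of the triggering workflow
    types: [completed]
    # No `branches` filter: runs that originate from forks are accepted.

permissions:
  actions: read
  pull-requests: write

jobs:
  comment:
    runs-on: ubuntu-latest
    steps:
      # Step 1 (source): the artifact was produced by the triggering run
      # and is extracted into the workspace.
      - uses: actions/download-artifact@v4
        with:
          name: pr-info      # assumed artifact name
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}

      # Steps 2 and 3 (taint steps): a file from the artifact is read and
      # its content becomes a step output.
      - id: pr
        run: echo "number=$(cat pr_number.txt)" >> "$GITHUB_OUTPUT"

      # Step 4 (sink): the output is expanded into the script text, so
      # its content is interpreted by the shell.
      - run: echo "Processing PR ${{ steps.pr.outputs.number }}"
```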

In the case above, the source of untrusted data is not simply a GitHub Event Context access of a known untrusted property (for example, github.event.pull_request.body) but rather the download of an artifact. Should all artifacts be considered untrusted? Certainly not. However, in this instance, where the workflow is triggered by a workflow_run event with no branch filters and where an artifact is downloaded from the triggering workflow (github.event.workflow_run.workflow_id), the artifact should be considered untrusted. When decompressed, it may pollute the Runner’s workspace by writing to files in unexpected locations. Consequently, from that step onward, all files in the workspace should be considered untrusted. This example highlights a non-trivial pattern that we need to express using the new actions AST representation to identify sources of untrusted data.

Identifying sources of untrusted data is only the first step towards uncovering more complex injection vulnerabilities. In the example above, it is crucial to understand Bash scripts to determine if they are reading from untrusted files and inserting data into shell variables. It is essential to comprehend how these variables may flow into the output of a step, and, subsequently, how the output flows through different steps, jobs, composite actions, or reusable workflows until they reach a potentially dangerous sink. This understanding is what Taint Tracking and Control Flow will achieve.

In summary, as illustrated in the CodeQL alert above, we can now identify non-obvious sources of untrusted data (for example, git or gh commands or third-party actions) and, more importantly, track this untrusted data through complex workflows, across multiple steps, jobs, actions, and even separate workflows. This allows us to better understand where the data is used and to report potential vulnerabilities effectively.

Bash support

GitHub’s workflows can execute various scripts, with Bash scripts being among the most common. The new CodeQL packs for GitHub Actions offer basic support for Bash, helping to identify tainted data originating from Bash scripts. For example, commands such as git diff-tree return a list of changed files, and in a pull request those file names are chosen by the author. Tainted data can also flow through a script that reads an attacker-controlled file or environment variable and writes it into a step’s output or another environment variable. An example of such a vulnerability was found in a workflow of the Azure CLI repository.

[Image: CodeQL alert “Environment variable built from user-controlled sources”, with a code snippet showing how untrusted data, such as a pull request’s title, is assigned to the TITLE environment variable]

In the alert above, we can see how untrusted data, such as a pull request’s title, is assigned to the TITLE environment variable. This variable is then read and processed by several commands, resulting in a new message variable that gets redirected to the special file pointed to by the GITHUB_ENV variable. A malicious actor could craft a title that results in a multiline message, allowing them to inject arbitrary environment variables into subsequent steps. This, in turn, would enable the attacker to exfiltrate secrets used in the workflow.
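The pattern described above can be sketched as follows. This is a hypothetical fragment, not the Azure CLI workflow: the `sed` expression and the `MESSAGE` variable name are assumptions, and stripping line breaks is only one possible hardening.

```yaml
# Hypothetical steps, for illustration only.
steps:
  - name: Build message (unsafe)
    env:
      TITLE: ${{ github.event.pull_request.title }}
    run: |
      # `message` is derived from attacker-controlled text.
      message="$(echo "$TITLE" | sed 's/^\[.*\] *//')"
      # If `message` spans several lines, every extra line is parsed as
      # another NAME=value entry for the following steps.
      echo "MESSAGE=$message" >> "$GITHUB_ENV"

  - name: Build message (safer)
    env:
      TITLE: ${{ github.event.pull_request.title }}
    run: |
      # Remove CR/LF so the value cannot occupy more than one line.
      message="$(printf '%s' "$TITLE" | tr -d '\r\n')"
      echo "MESSAGE=$message" >> "$GITHUB_ENV"
```

Reading the title through an environment variable already avoids direct script injection. The remaining problem in the first step is the unvalidated write to the file referenced by GITHUB_ENV.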

The new CodeQL packs are able to parse Bash scripts. While they don’t yet generate a full AST, they already allow us to understand elements such as assignments, pipelines, and redirections, enabling us to report subtle vulnerabilities like the one mentioned above.

Models

As explained in “Keeping your GitHub Actions and workflows secure Part 2: Untrusted input,” GitHub’s event context is the most common source of untrusted data. Properties such as github.event.issue.title, github.event.pull_request.head.ref, or github.event.comment.body are typical sources of untrusted data. However, any third-party action may introduce untrusted data. For instance, an action that returns a list of filenames changed in a pull request should be considered a source of untrusted data. Similarly, actions that parse an issue body or comment for a command or structured data should also be treated as sources of untrusted data.

The same applies to actions that pass data from one of their inputs to their outputs or into an environment variable, and therefore act as taint steps (summaries). Finally, actions such as actions/github-script, azure/cli, or azure/powershell execute the script they are given, so they should be considered sinks for Code Injection, just like a step’s run script.
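As a hedged illustration of an action acting as a sink, the hypothetical fragment below uses actions/github-script. In the first step the expression is pasted into the JavaScript source before it runs. In the second, the value is passed through an environment variable and read with `process.env`. The `BODY` variable name is an assumption.

```yaml
# Hypothetical steps, for illustration only.
steps:
  # Unsafe: the comment body becomes part of the JavaScript source, so a
  # body containing `"); ... //` can break out of the string literal.
  - uses: actions/github-script@v7
    with:
      script: |
        console.log("Comment: ${{ github.event.comment.body }}")

  # Safer: the script text is constant and the value is read at run time.
  - uses: actions/github-script@v7
    env:
      BODY: ${{ github.event.comment.body }}
    with:
      script: |
        console.log(`Comment: ${process.env.BODY}`)
```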

We have analyzed thousands of popular third-party actions and identified a number of models now incorporated into the analysis:

  • 62 sources
  • 129 summaries
  • 2199 sinks

Queries

The previous support for GitHub Actions contained a single query for code injection. The new CodeQL packs incorporate 18 queries, including the Code Injection and Environment Variable Injection queries mentioned above:

  • Execution of Untrusted Code
  • Execution of Untrusted Code (TOCTOU)
  • Artifact Poisoning
  • Code Injection
  • Environment Variable Injection
  • Path Injection
  • Unpinned Action Tag
  • Improper Access Control
  • Excessive Secrets Exposure
  • Secrets In Artifacts
  • Expression is Always True
  • Unmasked Secret Exposure
  • Cache Poisoning (Code Injection, Direct Cache, Untrusted Code)
  • Use of Known Vulnerable Actions
  • Missing Action Permissions
  • Argument Injection (Experimental)
  • Code Execution on Self Hosted Runners (Experimental)
  • Output Clobbering (Experimental)

Results

For the past few months, we have been testing the new queries on thousands of open source projects to validate their accuracy and performance. The results have been very impressive, allowing us to identify and report vulnerabilities in numerous critical organizations and repositories, such as Microsoft, Azure, GitHub, Eclipse, Jupyter, Adobe, AWS, Cloudflare, Discord, Hibernate, HuggingFace, and Apache.

The table below shows all the repositories affected, along with their GitHub stars, to give an idea of the impact that a supply chain attack could have had in these projects:

| Repository | Stars |
| --- | --- |
| ant-design/ant-design | 92,412 |
| Excalidraw/excalidraw | 84,021 |
| apache/superset | 62,589 |
| withastro/astro | 46,604 |
| Stirling-Tools/Stirling-PDF | 44,988 |
| geekan/MetaGPT | 44,901 |
| Kong/kong | 39,221 |
| LAION-AI/Open-Assistant | 37,045 |
| appsmithorg/appsmith | 34,352 |
| gradio-app/gradio | 33,709 |
| DIYgod/RSSHub | 33,432 |
| calcom/cal.com | 32,282 |
| milvus-io/milvus | 30,299 |
| k3s-io/k3s | 28,010 |
| discordjs/discord.js | 25,390 |
| element-plus/element-plus | 24,488 |
| cilium/cilium | 20,150 |
| monkeytypegame/monkeytype | 15,635 |
| amplication/amplication | 15,196 |
| docker-mailserver/docker-mailserver | 14,643 |
| jupyterlab/jupyterlab | 14,167 |
| openimsdk/open-im-server | 14,041 |
| quarkusio/quarkus | 13,771 |
| espressif/arduino-esp32 | 13,609 |
| sympy/sympy | 12,967 |
| ionic-team/stencil | 12,561 |
| zephyrproject-rtos/zephyr | 10,819 |
| qgis/QGIS | 10,569 |
| trinodb/trino | 10,413 |
| OpenFeign/feign | 9,490 |
| marimo-team/marimo | 7,583 |
| dream-num/univer | 7,021 |
| aws/karpenter-provider-aws | 6,782 |
| hibernate/hibernate-orm | 5,976 |
| ant-design-blazor/ant-design-blazor | 5,809 |
| litestar-org/litestar | 5,511 |

New vulnerability patterns

Having triaged and reported numerous alerts, we have identified some common patterns that often lead to vulnerabilities in GitHub workflows:

Misuse of pull_request_target trigger

The pull_request_target event trigger offers powerful automation capabilities in GitHub Actions, but it comes with serious security pitfalls. Because it is designed to execute workflows in the context of the pull request’s base branch, any vulnerability in such a workflow has a much greater impact. In contrast to the pull_request event, a workflow activated by pull_request_target and triggered from a fork operates with significant privileges:

  • It is able to read repository and organization secrets.
  • It is allowed to have write permissions.
  • It circumvents the usual safeguards of pull request approvals, allowing workflows to run unimpeded even if approval mechanisms are configured for standard pull_request events.
  • It normally runs in the context of the default branch, which, as we will see, may allow malicious actors to poison the Actions cache and move laterally to other, more privileged workflows, even when all permissions have been removed from the vulnerable workflow.

When working with workflows triggered by pull_request_target, we have to pay special attention to the following scenarios:

  • Code execution from untrusted sources: The ability to execute code from forked pull requests is a double-edged sword. A malicious actor can submit a seemingly innocuous pull request that, when triggered by pull_request_target, unleashes havoc. This malicious code, running in the context of the target repository’s environment, could exfiltrate secrets or even tamper with repository contents and releases. The danger lies in the inadvertent checkout and execution of code from untrusted sources. Common mistakes include using the actions/checkout action with a reference to the pull request’s head branch.
  • Time of check to time of use (TOCTOU) attacks. Even with approval requirements in place (such as requiring the pull request to have a specific label that attackers are not able to set on their own), attackers can leverage the element of time to bypass security measures. A malicious actor can submit a seemingly harmless pull request, patiently wait for approval, and then swiftly update the pull request with malicious code before the workflow execution. The workflow, relying on a mutable reference like a branch name, falls prey to this “time of check to time of use” (TOCTOU) attack, unwittingly executing the newly injected malicious code. Confusion surrounding context variables, specifically head.ref and head.sha, can lead to vulnerabilities. head.ref, pointing to the branch, is susceptible to manipulation by attackers. In contrast, head.sha, referencing the specific commit, provides a reliable and immutable pointer to the reviewed and approved code. Using the incorrect variable can create an opening for attackers to inject malicious code after approval.

  • Lurking threats in non-default branches. Vulnerabilities often linger in non-default branches, even after being addressed in the default branch. An attacker can target these vulnerable versions of the workflows residing in non-default branches, exploiting them by submitting pull requests specifically to those branches.

  • Cache poisoning. As a mitigation, developers may strip all read and write permissions from these workflows when they need to run untrusted code. However, the seemingly innocuous permissions: {} configuration does not prevent cache poisoning attacks. This attack vector involves injecting malicious content into the cache, which can subsequently affect other workflows relying on the poisoned cache entries. Even if a workflow doesn’t have write access to the repository, it can still poison the cache, leading to potential code execution or data manipulation in other workflows that use the compromised cache.
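The first two scenarios above often appear together. The hypothetical workflow below is a sketch of that combination, not a workflow taken from any of the reported projects. The label name `safe-to-test` and the `npm` commands are assumptions.

```yaml
# Hypothetical vulnerable workflow, for illustration only.
name: PR checks
on:
  pull_request_target:
    types: [opened, synchronize, labeled]

jobs:
  test:
    # Assumed approval gate: only maintainers can add the label.
    if: contains(github.event.pull_request.labels.*.name, 'safe-to-test')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          repository: ${{ github.event.pull_request.head.repo.full_name }}
          # Mutable reference: it is resolved when the step runs, so
          # commits pushed after the label was added are checked out too.
          ref: ${{ github.event.pull_request.head.ref }}

      # Executes code from the pull request in a job that can read
      # secrets and may hold a write-scoped token.
      - run: npm ci && npm test
```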

If we really need to use this event trigger, there are a few ways to harden the workflows against abuse:

  • Repository checks. Implementing stringent repository checks is crucial for thwarting attacks originating from forked pull requests. One effective method is to configure workflows to execute only for pull requests originating from the base repository, effectively blocking any attempts from external forks. This can be achieved by using conditions such as github.event.pull_request.head.repo.owner.login == 'myorg' to restrict workflow execution.
  • Actor checks. Verifying the permissions of the pull request author is another layer of defense. By restricting workflow execution to trusted actors, such as members of the organization or approved collaborators, the risk of malicious code injection from unauthorized sources can be significantly reduced. Never use a hardcoded list of user names: users may lose their permissions over time, and an abandoned user name can be claimed by an attacker.

  • Workflow splitting. The above mitigations may not be useful if we need to run a workflow for forks or arbitrary users. In those cases, splitting workflows into unprivileged and privileged components is a powerful security strategy. Unprivileged workflows, triggered by pull_request events, handle the initial processing of pull requests without access to sensitive secrets or write permissions. Privileged workflows, activated by workflow_run events, are invoked only after the unprivileged workflow has completed its checks. This separation ensures that potentially malicious code from forked pull requests never executes within a privileged context. The unprivileged workflow will generally need to communicate and pass information to the privileged one. This is a crucial security boundary, and, as we will see in the next section, any data coming from the unprivileged workflow should be considered untrusted and potentially dangerous.
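Workflow splitting can be sketched as two hypothetical workflow files, shown here as two YAML documents. The file names, workflow names, artifact name, and the `make test` command are all assumptions. The privileged half is reduced to a placeholder step, because how it should consume the artifact depends on the project.

```yaml
# File 1 (hypothetical): .github/workflows/pr-build.yml
# Unprivileged: runs untrusted code, has no secrets and a read-only token.
name: PR build
on: pull_request
permissions:
  contents: read
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test             # assumed build command
      - run: |
          mkdir -p out
          # The pull request number is an integer supplied by GitHub.
          echo "${{ github.event.pull_request.number }}" > out/pr_number.txt
      - uses: actions/upload-artifact@v4
        with:
          name: pr-info
          path: out/
---
# File 2 (hypothetical): .github/workflows/pr-report.yml
# Privileged: never checks out or executes code from the pull request.
name: PR report
on:
  workflow_run:
    workflows: ["PR build"]
    types: [completed]
permissions:
  actions: read
  pull-requests: write
jobs:
  report:
    runs-on: ubuntu-latest
    steps:
      # Placeholder: everything in the artifact is untrusted input and
      # must be validated before it is used.
      - run: echo "Download and validate the artifact here."
```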

Security boundaries and workflow_run event

The workflow_run event trigger in GitHub Actions is designed to automate tasks based on the execution or completion of another workflow. It may grant write permissions and access to secrets even if the triggering workflow doesn’t have such privileges. While this is beneficial for tasks like labeling pull requests based on test results, it poses significant security risks if not used carefully.

The workflow_run trigger poses a risk because it can often be initiated by an attacker. Some maintainers were surprised by this, believing that their triggering workflows, which were run on events such as release, were safe. This assumption was based on the idea that since an attacker couldn’t trigger a new release, they shouldn’t be able to initiate the triggering workflow or the subsequent workflow_run workflow.

The reality is that an attacker can submit a pull request that modifies the triggering workflow and even replaces its triggering events. Since pull_request workflows run in the context of the pull request’s HEAD branch, the modified workflow will run and, upon completion, will trigger any existing workflow_run workflow that listens for it. The danger is that the triggered workflow_run workflow has access to secrets and write-scoped tokens even though the triggering pull_request workflow had neither. This enables privilege escalation attacks, allowing attackers to execute malicious code with elevated permissions within the CI/CD pipeline.

Another significant pitfall with the workflow_run event trigger is artifact poisoning. Artifacts are files generated during a workflow run that can be shared with other workflows. Attackers can poison these artifacts by uploading malicious content through a pull request. When a workflow_run workflow downloads and uses these poisoned artifacts, it can lead to arbitrary code execution or other malicious activities within the privileged workflow. The issue is that many workflow_run workflows do not verify the contents of downloaded artifacts before using them, making them vulnerable to various attacks.

Securing workflow_run workflows requires a multi-faceted approach. By understanding the inherent risks and implementing the recommended mitigations, developers can leverage the automation benefits of workflow_run while minimizing the potential for security compromises.

Effective mitigations

  • Limit workflow scope with branch filters: Specify the branches where workflow_run workflows can be triggered using the branches filter. This helps restrict the scope of potential attacks by preventing them from being triggered on branches from forks.
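  • Verify event origin: Incorporate a check like github.event.workflow_run.event != 'pull_request' to prevent workflow_run workflows from being triggered by pull requests from forks. Note that github.event_name is always workflow_run inside such a workflow, so the check must read the event of the triggering run. This adds an extra layer of protection by ensuring that the triggering workflow originates from a trusted source.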

  • Treat artifacts as untrusted: Treat all downloaded artifacts as potentially malicious and implement rigorous validation checks before using them. Always unzip artifacts to a temporary directory like /tmp to prevent potential file overwrites, that is, the pollution of the workspace.

    • Avoid defining environment variables: Minimize the use of environment variables, especially when handling untrusted data. Environment variables can be vulnerable to injection attacks, potentially allowing attackers to modify their values and execute malicious code.
    • Handle output variables with caution: Exercise caution when defining output variables from an artifact’s content, as they can also be manipulated by attackers. Always validate the contents of output variables (for example, check that a value expected to be a number really is a number and not an arbitrary string) before using them in subsequent steps or other workflows.
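The mitigations above can be combined as in the following hypothetical workflow. It is a sketch under stated assumptions, not a complete recipe: the workflow names, the artifact name `ci-summary`, and the file `test_count.txt` are invented for illustration.

```yaml
# Hypothetical privileged workflow, for illustration only.
name: Post-CI report
on:
  workflow_run:
    workflows: ["CI"]            # assumed name of the triggering workflow
    types: [completed]
    branches: [main]             # branch filter

permissions:
  actions: read

jobs:
  report:
    # Event-origin check: skip runs whose triggering workflow was started
    # by a pull_request event.
    if: github.event.workflow_run.event != 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: ci-summary       # assumed artifact name
          # Extract outside the workspace so no workspace file can be
          # overwritten.
          path: ${{ runner.temp }}/artifacts
          run-id: ${{ github.event.workflow_run.id }}
          github-token: ${{ secrets.GITHUB_TOKEN }}

      - id: summary
        env:
          ARTIFACT_DIR: ${{ runner.temp }}/artifacts
        run: |
          count="$(cat "$ARTIFACT_DIR/test_count.txt")"
          # Accept digits only and reject everything else.
          if ! [[ "$count" =~ ^[0-9]+$ ]]; then
            echo "Unexpected value in test_count.txt" >&2
            exit 1
          fi
          echo "count=$count" >> "$GITHUB_OUTPUT"

      # The validated output is still passed as data, not as script text.
      - env:
          TEST_COUNT: ${{ steps.summary.outputs.count }}
        run: echo "Tests executed: $TEST_COUNT"
```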

Non-effective mitigations

  • Repository checks: We found several workflows (for example, in AWS Karpenter Provider and Cloudflare Workers SDK) relying solely on repository checks, such as verifying the repository owner (github.repository_owner == 'myorg'). This is not effective in mitigating workflow_run risks: the workflow always runs in the context of the default branch, which belongs to the organization, so the check always passes.

IssueOops: Security pitfalls with issue_comment trigger

The issue_comment event trigger in GitHub Actions is a powerful tool for automating workflows based on comments on issues and pull requests. When applied in the context of IssueOps, it can streamline tasks like running commands in response to specific comments. However, this convenience comes with significant security risks that must be carefully considered.

  • TOCTOU vulnerabilities: Similar to the pull_request_target event trigger, workflows using issue_comment can be vulnerable to TOCTOU attacks. If the workflow checks out and executes code from a pull request based on an issue comment, an attacker could exploit the time window between the comment and the workflow execution. An attacker might initially submit a harmless pull request, waiting for an administrator to review and approve the workflow by adding a comment. Once the approval is given, the attacker could quickly update the pull request with malicious code, which would then be executed by the workflow.
  • Bypassing pull request approval mechanisms: The issue_comment event trigger is not subject to the pull request approval mechanisms intended to prevent abuse. Even if workflows triggered by pull requests require approval, an attacker can trigger an issue_comment workflow by simply adding a comment to the pull request, potentially executing malicious code without any review.
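The TOCTOU window described above can be illustrated with a hypothetical IssueOps workflow. The `/test` command, the author-association check, and the `npm` commands are assumptions. The point of the sketch is that the event payload carries a pull request number rather than a commit, so the code is resolved only when the checkout step runs.

```yaml
# Hypothetical vulnerable IssueOps workflow, for illustration only.
name: Run tests on command
on:
  issue_comment:
    types: [created]

jobs:
  test:
    # Runs only for `/test` comments on pull requests written by owners
    # or members.
    if: >-
      github.event.issue.pull_request &&
      startsWith(github.event.comment.body, '/test') &&
      contains(fromJSON('["OWNER","MEMBER"]'), github.event.comment.author_association)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Resolved at checkout time: a commit pushed between the
          # comment and this step is what gets checked out.
          ref: refs/pull/${{ github.event.issue.number }}/head

      # Executes whatever the pull request contains at that moment.
      - run: npm ci && npm test
```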

Mitigating the risks

  • Shifting to label gates: Instead of relying on issue comments to trigger critical workflows, consider adopting a label gates approach. Label gates use labels to trigger specific actions, allowing for more granular control and better security. Since the labeled activity type for a pull_request trigger contains details about the latest commit SHA of the pull request, there is no need for the workflow to resolve a pull request number into the latest commit SHA, as is the case with issue_comment, and, therefore, there is no window for an attacker to modify the pull request. Remember to use the commit SHA rather than the HEAD reference to prevent TOCTOU vulnerabilities.
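A label gate could look like the hypothetical workflow below. It assumes the privileged pull_request_target variant of the trigger, whose labeled event carries the same pull request details, and a label named `safe-to-test` that only maintainers can add. The code still comes from the pull request, so this is only as safe as the review done before labeling.

```yaml
# Hypothetical label-gated workflow, for illustration only.
name: Gated PR tests
on:
  pull_request_target:
    types: [labeled]

permissions:
  contents: read

jobs:
  test:
    # Run only when the gate label itself was just added.
    if: github.event.label.name == 'safe-to-test'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # Immutable: the commit recorded in the event payload at the
          # moment the label was added.
          ref: ${{ github.event.pull_request.head.sha }}
      - run: npm ci && npm test
```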

Ineffective or incomplete mitigations

  • Actor checks: Relying solely on actor checks (verifying the identity or permissions of the commenter) is ineffective, because the actor triggering the workflow might not be the one trying to exploit it. This is the case for TOCTOU vulnerabilities, where an attacker can submit a legitimate pull request, wait for an admin to trigger an IssueOp, and then swiftly mutate the pull request by adding a new commit containing malicious code.

  • Date checks: Comparing timestamps to verify that the last commit occurred before the triggering comment is also unreliable. Currently, GitHub has no reliable way to figure out the date when a commit was pushed to a repository. An attacker can forge commit dates, rendering these checks useless in preventing malicious code execution.

  • Repository checks: Verifying the origin repository is not a useful mitigation for issue_comment event triggers. The issue_comment event always executes within the context of the target repository’s default branch, making repository checks redundant.

Wrapping up

The new CodeQL support for GitHub Actions is in public preview. The new QL packs allow you to scan your repository for a variety of vulnerabilities in GitHub Actions, helping prevent supply chain attacks in the open source software we all depend on! If you want to give them a try, take one of the following steps depending on your case:

  • Your repository is newly configured for default setup: Code scanning will automatically attempt to analyze actions if any workflows are present on the default branch. You have nothing to do; keep calm and fix the potential alerts.
  • Your repository is already configured for default setup: You need to edit the Default Setup settings and explicitly enable actions analysis.
  • Your repository is using advanced setup: You just have to add actions to the language matrix.
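For advanced setup, the change could look like the hypothetical workflow below. The file name, triggers, and branch names are assumptions, and a real matrix would list `actions` next to the languages the repository already scans.

```yaml
# Hypothetical .github/workflows/codeql.yml, for illustration only.
name: CodeQL
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write     # needed to upload results
      contents: read
    strategy:
      fail-fast: false
      matrix:
        language: [actions]      # add alongside your existing languages
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
      - uses: github/codeql-action/analyze@v3
        with:
          category: "/language:${{ matrix.language }}"
```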

Stay secure!
