codeql

Code scanning can now be enabled on repositories before they contain CodeQL supported languages

February 5, 2024

Code scanning can now be enabled on repositories even if they don’t contain any code written in the languages currently supported by CodeQL. Default setup will automatically trigger the first scan when a supported language is detected on the default branch. This means users can now enable code scanning using default setup, for example on empty repositories, and have confidence that they will be automatically protected in the future when the languages in the repository change to include supported languages.

This also takes effect from the organization level so you can bulk-enable code scanning on repositories without CodeQL supported languages.

This change is now on GitHub.com and will be available in GitHub Enterprise Server 3.13. For more information, see “About code scanning default setup.”

CodeQL 2.16.1: Swift 5.9.2 Support, New Queries, and Scanned-File Count Changes

January 30, 2024

CodeQL 2.16.1 is now available to users of GitHub code scanning on github.com, and all new functionality will also be included in GHES 3.13. Users of GHES 3.12 or older can upgrade their CodeQL version.

Important changes in this release include:

Swift 5.9.2 is now supported.

We added a new query for Swift, swift/weak-password-hashing, to detect the use of inappropriate hashing algorithms for password hashing, and a new query for Java, java/exec-tainted-environment, to detect the injection of environment variable names or values from remote input.

We improved the tracking of flows from handler methods of a PageModel class to the corresponding Razor Page (.cshtml) file, which may result in additional alerts from some queries.

JavaScript now supports doT templates, and Go added support for AWS Lambda functions and the fasthttp framework.

In the previous version, 2.16.0, we announced that we would update the way we measure the number of scanned files in the code scanning UI. This change is now live for JavaScript/TypeScript, Python, Ruby, Swift, and C#.

For a full list of changes, please refer to the complete changelog for version 2.16.1.

CodeQL 2.16: Python Dependency Installation Disabled, New Queries, and Bug Fixes

January 23, 2024

CodeQL 2.16.0 is now available to users of GitHub code scanning on github.com, and all new functionality will also be included in GHES 3.13. Users of GHES 3.12 or older can upgrade their CodeQL version.

Important changes in this release include:

In July 2023, we disabled automatic dependency installation for new CodeQL code scanning setups when analyzing Python code. With the release of CodeQL 2.16.0, we have disabled dependency installation for all existing configurations as well. This change should lead to a decrease in analysis time for projects that were installing dependencies during analysis, without any significant impact on results. A fallback environment variable flag is available to ease the transition, but will be removed in CodeQL 2.17.0. No action is required for Default setup users. Advanced setup users that had previously set the setup-python-dependencies option in their CodeQL code scanning workflows are encouraged to remove it, as it no longer has any effect.

We fixed a bug that could cause CodeQL to consume more memory than configured when using the --ram flag. If you have used this flag to manually override the memory allocation limit for CodeQL, you may be able to increase it slightly to more closely match the system’s available memory. No action is required for users of the CodeQL Action (on github.com or in GHES) who are not using this flag, as memory limits are calculated automatically.

We added two new C/C++ queries: one detects pointer lifetime issues, and the other identifies instances where the return value of scanf is not checked correctly. We added a new Java query that detects uses of weakly random values, which an attacker may be able to predict. Furthermore, we improved the precision of several other queries and fixed potential false positives in them.

The measure of scanning Go files in the code scanning UI now includes partially extracted files, as this more accurately reflects the source of extracted information even when parts of a file could not be analyzed. We will gradually roll this change out for all supported languages in the near future.

We fixed a bug that led to errors in build commands for Swift analyses on macOS that included the codesign tool.

For a full list of changes, please refer to the complete changelogs for versions 2.16.0 and 2.15.5.

Code scanning: deprecation of CodeQL Action v2

January 12, 2024

On December 13, 2023, we released CodeQL Action v3, which runs on the Node.js 20 runtime. CodeQL Action v2 will be deprecated at the same time as GHES 3.11, which is currently scheduled for December 2024.

How does this affect me?

Default setup

Users of code scanning default setup do not need to take any action in order to automatically move to CodeQL Action v3.

Advanced setup

Users of code scanning advanced setup need to change their workflow files in order to start using CodeQL Action v3.

Users of GitHub.com and GitHub Enterprise Server 3.12 (and newer)

All users of GitHub code scanning (which by default uses the CodeQL analysis engine) on GitHub Actions on the following platforms should update their workflow files:

GitHub.com (including open source repositories, users of GitHub Teams and GitHub Enterprise Cloud)
GitHub Enterprise Server (GHES) 3.12 (and newer)

Users of the above-mentioned platforms should update their CodeQL workflow file(s) to refer to the new v3 version of the CodeQL Action. Note that the upcoming release of GitHub Enterprise Server 3.12 will ship with v3 of the CodeQL Action included.

Users of GitHub Enterprise Server 3.11

While GHES 3.11 does support Node 20 Actions, it does not ship with CodeQL Action v3. Users who want to migrate to v3 on GHES 3.11 should request that their system administrator enables GitHub Connect to download v3 onto GHES before updating their workflow files.

Users of GitHub Enterprise Server 3.10 (and older)

GHES 3.10 (and earlier) does not support running Actions using the Node 20 runtime and is therefore unable to run CodeQL Action v3. Please upgrade to a newer version of GitHub Enterprise Server prior to changing your CodeQL Action workflow files.
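The three GHES cases above can be summarized as a small decision helper. This is only an illustrative sketch: the function name and the returned labels are hypothetical and are not part of any GitHub tooling. It simply encodes the version rules stated in this post.

```python
def codeql_action_v3_status(ghes_version: str) -> str:
    """Classify a GHES version by what is needed before using CodeQL Action v3.

    The returned labels are hypothetical names chosen for this sketch.
    """
    # Compare only the major and minor components, e.g. "3.11.4" -> (3, 11).
    major, minor = (int(part) for part in ghes_version.split(".")[:2])

    if (major, minor) >= (3, 12):
        # GHES 3.12 and newer ship with CodeQL Action v3.
        return "ready"
    if (major, minor) == (3, 11):
        # GHES 3.11 supports Node 20 Actions but does not ship v3;
        # an administrator must enable GitHub Connect to download it.
        return "needs-github-connect"
    # GHES 3.10 and older cannot run Node 20 Actions at all.
    return "upgrade-ghes-first"


for version in ("3.10", "3.11", "3.12"):
    print(version, codeql_action_v3_status(version))
```

The sketch assumes a plain dotted version string; anything with a prefix or suffix would need to be normalized first.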

Exactly what do I need to change?

To upgrade to CodeQL Action v3, open your CodeQL workflow file(s) in the .github directory of your repository and look for references to:

github/codeql-action/init@v2
github/codeql-action/autobuild@v2
github/codeql-action/analyze@v2
github/codeql-action/upload-sarif@v2

These entries need to be replaced with their v3 equivalents:

github/codeql-action/init@v3
github/codeql-action/autobuild@v3
github/codeql-action/analyze@v3
github/codeql-action/upload-sarif@v3
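If you maintain many workflow files, the replacement above could be scripted. The following is a minimal sketch, not an official migration tool: the function name and the sample workflow text are made up for illustration, and it deliberately rewrites only references pinned to exactly `@v2`, leaving fuller version pins (such as a hypothetical `@v2.22.0`) and other actions untouched.

```python
import re

# Matches references such as "github/codeql-action/init@v2" and captures the
# action path. The negative lookahead skips longer pins like "@v2.22.0" or "@v20".
V2_REFERENCE = re.compile(r"(github/codeql-action/[\w-]+)@v2(?![\w.])")


def upgrade_codeql_action_refs(workflow_text: str) -> str:
    """Return workflow_text with CodeQL Action @v2 references changed to @v3."""
    return V2_REFERENCE.sub(r"\1@v3", workflow_text)


# Hypothetical workflow fragment used only to demonstrate the rewrite.
example = """\
steps:
  - uses: actions/checkout@v4
  - uses: github/codeql-action/init@v2
  - uses: github/codeql-action/autobuild@v2
  - uses: github/codeql-action/analyze@v2
"""

print(upgrade_codeql_action_refs(example))
```

Review the resulting diff before committing, since a workflow may reference the action in ways this pattern does not anticipate.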

Can I use Dependabot to help me with this upgrade?

Yes, you can! For more details on how to configure Dependabot to automatically upgrade your Actions dependencies, please see this page.

What happens in December 2024?

In December 2024, CodeQL Action v2 will be officially deprecated (at the same time as the GHES 3.11 deprecation). At that point, no new updates will be made to CodeQL Action v2, which means that new CodeQL analysis capabilities will only be available to users of CodeQL Action v3. We will keep a close eye on the migration progress across GitHub. If many workflow files still refer to CodeQL Action v2, we might consider scheduling one or more brownout moments later in the year to increase awareness.

Code scanning is now more adaptable to your codebase with CodeQL threat model settings for Java (beta)

December 20, 2023

Use CodeQL threat model settings for Java (beta) to adapt CodeQL's code scanning analysis to detect the most relevant security vulnerabilities in your code.

No two codebases are the same, and each is subject to different security risks and threats. Such risks and threats can be captured in a codebase's threat model which, in turn, depends on how the code has been designed and will be deployed. To understand the threat model, you need to know what type of data is untrusted and poses a threat to the codebase. Additionally, you need to know how that untrusted (or tainted) data interacts with the application. For example, one codebase might only consider data from remote network requests to be untrusted, whereas another might also consider data from local files to be tainted.

CodeQL can perform security analysis on all such codebases, but it needs to have the right context. It needs the threat model in order to behave slightly differently on different codebases. That way, CodeQL can include (or exclude) the appropriate sources of tainted data during its analysis, and flag up the most relevant security vulnerabilities to developers who work on the code.

CodeQL's default threat model works for the vast majority of codebases. It considers data from remote sources (such as HTTP requests) as tainted. Using new CodeQL threat model settings for Java, you can now optionally mark local sources of data as tainted. This includes data from local files, command-line arguments, environment variables, and databases. You can enable the local threat model option in code scanning to help security teams and developers uncover and fix more potential security vulnerabilities in their code.

CodeQL threat model settings can be configured in repositories running code scanning with CodeQL via default setup in the GitHub UI. Alternatively, you can specify it through advanced setup (in an Actions workflow file).

If your repository is running code scanning default setup on Java code, go to the Code security and analysis settings and click Edit configuration under Code scanning default setup. Here, you can change the threat model to Remote and local sources. For more information, see the documentation on including local sources of tainted data in default setup.

If your repository is running code scanning advanced setup on Java code, you can customize the CodeQL threat model by editing the code scanning workflow file. For more information, see the documentation on extending CodeQL coverage with threat models. If you run the CodeQL CLI on the command line or in third-party CI/CD, you can specify a --threat-model when running a code scanning analysis. For more information, see the CodeQL CLI documentation.

CodeQL threat model settings (beta) in code scanning default setup are available on GitHub.com for repositories containing Java code. They will ship in GitHub Enterprise Server 3.13.

Code scanning default setup is now available for self-hosted runners on GitHub.com

December 19, 2023

Code scanning default setup is now available for self-hosted runners on GitHub.com. To use default setup for code scanning, assign the code-scanning label to your runner. Default setup now uses actions/github-script instead of the GH CLI. If your organization has a policy which limits GitHub Actions you will need to allow this action in your policy.

Code scanning detects assigned runners at the moment default setup is enabled. As a result, if a runner is assigned to a repository that is already running default setup, you must disable and re-enable default setup to start using the runner.

Larger runners are supported in beta, with two limitations: you can define only one larger runner at the org level with the label code-scanning, and Swift analysis is not supported.

For more information, see “Using labels with self-hosted runners.”

This is now available on GitHub.com. Self-hosted runners for default setup have been supported since GitHub Enterprise Server 3.9.

CodeQL 2.15.4: Performance Improvements and Updated Language Support

December 13, 2023

CodeQL 2.15.4 is rolling out to users of GitHub code scanning on github.com this week, and all new functionality will also be included in GHES 3.12. Users of GHES 3.11 or older can upgrade their CodeQL version.

Important changes in this release include:

Performance improvements on large runners (instances with 8 to 16 vCPUs) lead to a reduction in end-to-end analysis time of between 5% and 15%, due to more effective parallelization. Where possible, upgrading to larger instances is recommended for projects that currently use 4 or fewer vCPUs and take more than 10 minutes to analyze.
Analysis times for C and C++ code bases of any size are reduced on average by 6%.
TypeScript 5.3, Java 21 and Python 3.12 are now supported.
We have resolved a problem causing scan timeouts on macOS (the default for Swift analysis). This problem affected up to 10% of scans for some projects. Although timeouts may still occur, they are now expected in less than 0.5% of scans. We are actively addressing the remaining issues.

For a full list of changes, please refer to the complete changelog for version 2.15.4.

CodeQL 2.15.3 supports Swift 5.9, improves C# analysis speed

November 22, 2023

CodeQL 2.15.3 is rolling out to users of GitHub code scanning on github.com this week, and all new functionality will also be included in GHES 3.12. Users of GHES 3.11 or older can upgrade their CodeQL version.

Important changes in this release include:

CodeQL now runs more than 400 security checks across all supported languages when configured with the Default suite, 10% more compared to a year ago
CIL extraction for C# code bases is now disabled by default, which improves query execution time for C# CodeQL databases by up to 25%
Swift code bases using Swift 5.9.1 can now be analyzed using CodeQL, and two new security queries have been added
We’ve also improved the depth and quality of existing queries

For a full list of changes, please refer to the complete changelog for version 2.15.3.

Code scanning default setup automatically includes all CodeQL supported languages

October 23, 2023

Code scanning default setup now automatically attempts to analyze all CodeQL supported languages in a repository. This means default setup supports all CodeQL languages at the organization level, including enabling code scanning from an organization's Security Overview coverage page or settings page.

Previously, users would have to manually include the languages C, C++, C#, Java, or Kotlin in a default setup analysis, and enabling these languages was not supported at the organization level. Now, code scanning default setup automatically attempts to analyze all languages supported by CodeQL in a repository. If any analyses fail, the failed language will be automatically deselected from the code scanning configuration. Any alerts from the successfully analyzed languages will be shown on GitHub. This means code scanning will automatically set up the best possible configuration to get started easily with CodeQL and show the most relevant alerts to developers.

A warning banner is shown in the repository settings page if any languages fail and are deselected. The "edit configuration" page shows all languages in the configuration, and allows users to change the language selection if required. For more information about the languages and versions supported by CodeQL and code scanning, see Supported languages and frameworks. To learn more about code scanning, see About code scanning.

This change is already available on GitHub.com and will be available in GitHub Enterprise Server 3.12.

Code scanning with CodeQL supports Go 1.21

October 20, 2023

To enable developers to write code as securely as possible in their language of choice and using the latest features available, we constantly update code scanning with CodeQL. As such we are happy to announce that CodeQL now supports analyzing code written in Go 1.21.

Go 1.21 support is available by default in GitHub.com code scanning, CodeQL version 2.14.6, and GHES 3.11. For more information about the languages and versions supported by CodeQL and code scanning, see Supported languages and frameworks. To learn more about code scanning, see About code scanning.

CodeQL code scanning deprecates ML-powered alerts

September 29, 2023

In February 2022, we introduced experimental CodeQL queries that utilize machine learning to identify more potential vulnerabilities. This feature was only available for JavaScript / TypeScript code and was available to code scanning users that enabled the optional security-extended or security-and-quality query suites.

We disabled this experimental feature for new code scanning users in June 2023. Today, we're sunsetting it for all users.

Any currently open code scanning alerts from these queries (Rule ID starts with js/ml-powered/) will be closed. Closed alerts will still be visible in the code scanning alerts view in your repository’s Security tab. The complete history of each alert will remain accessible by clicking on the alert.

CodeQL will continue to run the existing non-ML versions of these queries and provide you with highly precise and actionable alerts.

We’ve learned a lot from the feedback and experience of the repositories that participated in this experiment, and we’ve since ramped up our investment in AI-powered security technology. This new technology is already boosting our ability to cover more sources and sinks of untrusted data in order to significantly increase the coverage and depth of all queries.

Easily customize code scanning using CodeQL model packs for Java (beta)

September 19, 2023

With CodeQL model packs for Java, users can improve their code scanning results by ensuring that any custom Java libraries and frameworks used by their codebase are recognised by CodeQL.

The out-of-the-box CodeQL threat models provide great coverage for identifying large numbers of potential vulnerabilities in GitHub repositories using code scanning. We are continually working to improve CodeQL's ability to recognize and track potential sources of untrusted data to potentially-vulnerable locations ('sinks'). To do that, we keep a close eye on the most widely-used open-source libraries and frameworks. That way, CodeQL can recognize untrusted data that enters an application through, for example, commonly-used web frameworks. We are even using advances in AI to boost our threat modeling efforts and help developers write even more secure code.

There will always be cases which are not covered by CodeQL's standard threat models, such as custom-built or inner-sourced frameworks and libraries. Using CodeQL's new model pack functionality for Java (beta), security teams and security-conscious developers can create custom models that help CodeQL detect and flag additional security vulnerabilities. These custom model packs work seamlessly in GitHub code scanning, which means developers get the most relevant code scanning alerts during their day-to-day work.

CodeQL model packs are part of the CodeQL package management ecosystem. The packs contain structured data which describe whether a method within a library is a taint source, sink, or propagator (also known as a flow summary). You can create CodeQL model packs for Java using the CodeQL model editor, a new feature in the CodeQL extension for VS Code. The CodeQL model editor includes support for:

identifying methods in your codebase that aren't recognised by the standard CodeQL analysis
interactively classifying those methods as a source, sink, or summary
automatically generating a CodeQL model pack that can be easily added to code scanning.

For more information about using CodeQL model packs in code scanning, see:

For more information about using the CodeQL model editor, see Using the CodeQL model editor.

Code scanning with CodeQL improves support for Java codebases that use Project Lombok

September 1, 2023

Code scanning with CodeQL now supports Java codebases that use Project Lombok. Previously, code scanning users were able to scan Java applications that contained Lombok code, but all the contents of files containing Lombok code were either skipped or users had to apply a workaround to prepare the applications for scanning. The improved support means that code with Lombok features will be automatically scanned without requiring any workaround.

As more code will now be analyzed by the CodeQL engine, we can establish more accurate data flow (or lack thereof) through Lombok code. This might have an impact on the number of alerts produced by a scan. The most common scenario is that additional alerts appear in the newly-analyzed code. Conversely, there is a very small chance that some existing alerts are closed.

Improved support for Java applications built using Lombok is available for code scanning users on GitHub.com starting today and for GitHub Enterprise Server users starting with 3.11. The CodeQL CLI will provide out-of-the-box support starting with the upcoming version 2.14.4. Security researchers can set up the CodeQL CLI and VS Code extension by following these instructions.

Code scanning default setup now analyzes on a weekly schedule

August 22, 2023

In addition to scanning pushes and pull requests, code scanning default setup now also analyzes repositories on a weekly schedule. This ensures that a scan with the most recent version of CodeQL is run regularly on your code, better protecting both active and inactive repositories. Users therefore always benefit from the continuously released CodeQL engine and query improvements, which could uncover new potential vulnerabilities.

When setting up code scanning, the fixed time for the weekly scan is randomly chosen. The scan will take place at the same time every week, and the schedule is displayed after the setup is completed, so you can easily see when the next scheduled analysis will occur. The scheduled analysis will be automatically disabled if a repository has seen no activity for 6 months. Opening a PR or pushing to the repo will re-enable the scheduled analysis.

This has shipped to GitHub.com and will be released with GitHub Enterprise Server 3.11.

New dataflow API for writing custom CodeQL queries

August 14, 2023

We have released a new API for people who write custom CodeQL queries that make use of dataflow analysis. The new API offers additional flexibility, prevents common pitfalls of the old API, and improves query evaluation performance by 5%. Whether you’re writing CodeQL queries for personal interest or participating in the bounty programme to help us secure the world’s code, this post will help you move from the old API to the new one.

This API change is relevant only for users who write their own custom CodeQL queries. Code scanning users who use GitHub’s standard CodeQL query suites will not need to make any changes.

With the introduction of the new dataflow API, the old API will be deprecated. The old API will continue to work until December 2024; the CodeQL CLI will start emitting deprecation warnings in December 2023.

To demonstrate how to update CodeQL queries from the old to the new API, consider this example query which uses the soon-to-be-deprecated API:

class SensitiveLoggerConfiguration extends TaintTracking::Configuration {
  SensitiveLoggerConfiguration() { this = "SensitiveLoggerConfiguration" } // 6: characteristic predicate with dummy string value (see below)

  override predicate isSource(DataFlow::Node source) { source.asExpr() instanceof CredentialExpr }

  override predicate isSink(DataFlow::Node sink) { sinkNode(sink, "log-injection") }

  override predicate isSanitizer(DataFlow::Node sanitizer) {
    sanitizer.asExpr() instanceof LiveLiteral or
    sanitizer.getType() instanceof PrimitiveType or
    sanitizer.getType() instanceof BoxedType or
    sanitizer.getType() instanceof NumberType or
    sanitizer.getType() instanceof TypeType
  }

  override predicate isSanitizerIn(DataFlow::Node node) { this.isSource(node) }
}

import DataFlow::PathGraph

from SensitiveLoggerConfiguration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "This $@ is written to a log file.", source.getNode(),
  "potentially sensitive information"

To convert the query to the new API:

1. You use a module instead of a class. A CodeQL module does not extend anything; instead, it implements a signature. For both data flow and taint tracking configurations, this is DataFlow::ConfigSig, or DataFlow::StateConfigSig if FlowState is needed.
2. Previously, you chose between data flow and taint tracking by extending DataFlow::Configuration or TaintTracking::Configuration. Now you make that choice by instantiating either the DataFlow::Global<..> or the TaintTracking::Global<..> parameterized module with your implementation of the shared signature.
3. Predicates no longer override anything, because you are defining a module.
4. The concepts of sanitizers and barriers are now unified under isBarrier, which applies to both taint tracking and data flow configurations. You must use isBarrier instead of isSanitizer and isBarrierIn instead of isSanitizerIn.
5. Similarly, instead of the taint tracking predicate isAdditionalTaintStep, you use isAdditionalFlowStep.
6. A characteristic predicate with a dummy string value is no longer needed.
7. Do not use the generic DataFlow::PathGraph. Instead, import the PathGraph directly from the module you are using, for example SensitiveLoggerFlow::PathGraph in the updated version of the example query below.
8. Similarly, you use the PathNode type from the resulting module and not from DataFlow.
9. Since you no longer have a configuration class, you use the module directly in the from and where clauses. Instead of using, for example, cfg.hasFlowPath or cfg.hasFlow from a configuration object cfg, you use flowPath or flow from the module you’re working with.

Taking all of the above changes into account, here’s what the updated query looks like:

module SensitiveLoggerConfig implements DataFlow::ConfigSig {  // 1: module always implements DataFlow::ConfigSig or DataFlow::StateConfigSig
  predicate isSource(DataFlow::Node source) { source.asExpr() instanceof CredentialExpr } // 3: no need to specify 'override'
  predicate isSink(DataFlow::Node sink) { sinkNode(sink, "log-injection") }

  predicate isBarrier(DataFlow::Node sanitizer) {  // 4: 'isBarrier' replaces 'isSanitizer'
    sanitizer.asExpr() instanceof LiveLiteral or
    sanitizer.getType() instanceof PrimitiveType or
    sanitizer.getType() instanceof BoxedType or
    sanitizer.getType() instanceof NumberType or
    sanitizer.getType() instanceof TypeType
  }

  predicate isBarrierIn(DataFlow::Node node) { isSource(node) } // 4: isBarrierIn instead of isSanitizerIn

}

module SensitiveLoggerFlow = TaintTracking::Global<SensitiveLoggerConfig>; // 2: TaintTracking selected 

import SensitiveLoggerFlow::PathGraph  // 7: the PathGraph specific to the module you are using

from SensitiveLoggerFlow::PathNode source, SensitiveLoggerFlow::PathNode sink  // 8 & 9: using the module directly
where SensitiveLoggerFlow::flowPath(source, sink)  // 9: using the flowPath from the module 
select sink.getNode(), source, sink, "This $@ is written to a log file.", source.getNode(),
  "potentially sensitive information"
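
As a variation on the query above, the following minimal sketch shows how the same configuration module could be reused for plain data flow rather than taint tracking, and how the flow predicate could be used when path explanations are not needed. It assumes SensitiveLoggerConfig from the query above is in scope; the module name SensitiveLoggerDataFlow and the alert message are illustrative choices, not part of the original query.

```ql
// Assumption: SensitiveLoggerConfig is defined exactly as in the updated query above.
// Instantiating DataFlow::Global instead of TaintTracking::Global selects
// plain data flow (value-preserving steps only) for the same configuration.
module SensitiveLoggerDataFlow = DataFlow::Global<SensitiveLoggerConfig>;

// 'flow' relates plain nodes, so no PathGraph import or PathNode types are needed.
from DataFlow::Node source, DataFlow::Node sink
where SensitiveLoggerDataFlow::flow(source, sink)
select sink, "This log argument receives $@.", source, "potentially sensitive information"
```

Because the configuration and the analysis kind are now separate, switching between the two analyses is a one-line change to the module instantiation, and the configuration itself stays untouched.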

While not covered in this example, you can also implement the DataFlow::StateConfigSig signature if flow-state is needed. You then instantiate DataFlow::GlobalWithState or TaintTracking::GlobalWithState with your implementation of that signature. Another change specific to flow-state is that instead of using DataFlow::FlowState, you now define a FlowState class as a member of the module. This is useful for using types other than string as the state (e.g. integers, booleans). An example of this implementation can be found here.
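
A minimal sketch of what a flow-state configuration might look like is shown below. It is not the linked example: the module names and the three helper predicates (isRawInput, isDangerousUse, isValidationStep) are hypothetical placeholders with empty bodies, and the boolean state meaning "validated" is an illustrative assumption.

```ql
import java
import semmle.code.java.dataflow.DataFlow
import semmle.code.java.dataflow.TaintTracking

// Hypothetical placeholders: replace the bodies with real source, sink, and step logic.
predicate isRawInput(DataFlow::Node n) { none() }

predicate isDangerousUse(DataFlow::Node n) { none() }

predicate isValidationStep(DataFlow::Node pred, DataFlow::Node succ) { none() }

module UnvalidatedInputConfig implements DataFlow::StateConfigSig {
  // FlowState is declared as a member of the module.
  // Illustrative choice: false = not yet validated, true = validated.
  class FlowState = boolean;

  predicate isSource(DataFlow::Node source, FlowState state) {
    isRawInput(source) and state = false
  }

  predicate isSink(DataFlow::Node sink, FlowState state) {
    isDangerousUse(sink) and state = false
  }

  // A step that also changes the state from "not validated" to "validated".
  predicate isAdditionalFlowStep(
    DataFlow::Node node1, FlowState state1, DataFlow::Node node2, FlowState state2
  ) {
    isValidationStep(node1, node2) and state1 = false and state2 = true
  }
}

module UnvalidatedInputFlow = TaintTracking::GlobalWithState<UnvalidatedInputConfig>;

from DataFlow::Node source, DataFlow::Node sink
where UnvalidatedInputFlow::flow(source, sink)
select sink, "This use is reached by $@ in the unvalidated state.", source, "raw input"
```

With the placeholder bodies set to none(), the query compiles but reports nothing; it is meant only to show where FlowState, the state-aware predicates, and GlobalWithState fit together.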

This functionality is available with CodeQL version 2.13.0. To start writing your own custom CodeQL queries, follow these instructions to set up the CodeQL CLI and the VS Code extension.

