Skip to content

New dataflow API for writing custom CodeQL queries

We have released a new API for people who write custom CodeQL queries which make use of dataflow analysis. The new API offers additional flexibility, improvements that prevent common pitfalls with the old API, and improves query evaluation performance by 5%. Whether you’re writing CodeQL queries for personal interest, or are participating in the bounty programme to help us secure the world’s code: this post will help you move from the old API to the new one.

This API change is relevant only for users who write their own custom CodeQL queries. Code scanning users who use GitHub’s standard CodeQL query suites will not need to make any changes.

With the introduction of the new dataflow API, the old API will be deprecated. The old API will continue to work until December 2024; the CodeQL CLI will start emitting deprecation warnings in December 2023.

To demonstrate how to update CodeQL queries from the old to the new API, consider this example query which uses the soon-to-be-deprecated API:

class SensitiveLoggerConfiguration extends TaintTracking::Configuration {
  SensitiveLoggerConfiguration() { this = "SensitiveLoggerConfiguration" } // 6: characteristic predicate with dummy string value (see below)

  override predicate isSource(DataFlow::Node source) { source.asExpr() instanceof CredentialExpr }

  override predicate isSink(DataFlow::Node sink) { sinkNode(sink, "log-injection") }

  override predicate isSanitizer(DataFlow::Node sanitizer) {
    sanitizer.asExpr() instanceof LiveLiteral or
    sanitizer.getType() instanceof PrimitiveType or
    sanitizer.getType() instanceof BoxedType or
    sanitizer.getType() instanceof NumberType or
    sanitizer.getType() instanceof TypeType
  }

  override predicate isSanitizerIn(DataFlow::Node node) { this.isSource(node) }
}

import DataFlow::PathGraph

from SensitiveLoggerConfiguration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "This $@ is written to a log file.",
 source.getNode(),
  "potentially sensitive information"

To convert the query to the new API:

  1. You use a module instead of a class. A CodeQL module does not extend anything, it instead implements a signature. For both data flow and taint tracking configurations this is DataFlow::ConfigSig or DataFlow::StateConfigSigif FlowState is needed.
  2. Previously, you would choose between data flow or taint tracking by extending DataFlow::Configuration or TaintTracking::Configuration. Instead, now you define your data or taint flow by instantiating either the DataFlow::Global<..> or TaintTracking::Global<..> parameterized modules with your implementation of the shared signature and this is where the choice between data flow and taint tracking is made.
  3. Predicates no longer override anything, because you are defining a module.
  4. The concepts of sanitizers and barriers are now unified under isBarrier and it applies to both taint tracking and data flow configurations. You must use isBarrier instead of isSanitizer and isBarrierIn instead of isSanitizerIn.
  5. Similarly, instead of the taint tracking predicate isAdditionalTaintStep you use isAdditionalFlowStep .
  6. A characteristic predicate with a dummy string value is no longer needed.
  7. Do not use the generic DataFlow::PathGraph. Instead, the PathGraph will be imported directly from the module you are using. For example, SensitiveLoggerFlow::PathGraph in the updated version of the example query below.
  8. Similar to the above, you’ll use the PathNode type from the resulting module and not from DataFlow.
  9. Since you no longer have a configuration class, you’ll use the module directly in the from and where clauses. Instead of using e.g. cfg.hasFlowPath or cfg.hasFlow from a configuration object cfg, you’ll use flowPath or flow from the module you’re working with.

Taking all of the above changes into account, here’s what the updated query looks like:

module SensitiveLoggerConfig implements DataFlow::ConfigSig {  // 1: module always implements DataFlow::ConfigSig or DataFlow::StateConfigSig
  predicate isSource(DataFlow::Node source) { source.asExpr() instanceof CredentialExpr } // 3: no need to specify 'override'
  predicate isSink(DataFlow::Node sink) { sinkNode(sink, "log-injection") }

  predicate isBarrier(DataFlow::Node sanitizer) {  // 4: 'isBarrier' replaces 'isSanitizer'
    sanitizer.asExpr() instanceof LiveLiteral or
    sanitizer.getType() instanceof PrimitiveType or
    sanitizer.getType() instanceof BoxedType or
    sanitizer.getType() instanceof NumberType or
    sanitizer.getType() instanceof TypeType
  }

  predicate isBarrierIn(DataFlow::Node node) { isSource(node) } // 4: isBarrierIn instead of isSanitizerIn

}

module SensitiveLoggerFlow = TaintTracking::Global<SensitiveLoggerConfig>; // 2: TaintTracking selected 

import SensitiveLoggerFlow::PathGraph  // 7: the PathGraph specific to the module you are using

from SensitiveLoggerFlow::PathNode source, SensitiveLoggerFlow::PathNode sink  // 8 & 9: using the module directly
where SensitiveLoggerFlow::flowPath(source, sink)  // 9: using the flowPath from the module 
select sink.getNode(), source, sink, "This $@ is written to a log file.", source.getNode(),
  "potentially sensitive information"

While not covered in this example, you can also implement the DataFlow::StateConfigSig signature if flow-state is needed. You then instantiate DataFlow::GlobalWithState or TaintTracking::GlobalWithState with your implementation of that signature. Another change specific to flow-state is that instead of using DataFlow::FlowState, you now define a FlowState class as a member of the module. This is useful for using types other than string as the state (e.g. integers, booleans). An example of this implementation can be found here.

This functionality is available with CodeQL version 2.13.0. If you would like to get started with writing your own custom CodeQL queries, follow these instructions to get started with the CodeQL CLI and the VS Code extension.

Today's update brings the ability to set an allowlist for languages within the IntelliJ extension, quickly switch to an annual GitHub Copilot for Individuals plan, and the private preview of code referencing.

Select languages setting within IntelliJ

The previous disabledLanguages configuration is replaced with a new, more flexible languageAllowList configuration. This change allows enabling or disabling all languages at once using the * wildcard.

github-copilot.xml location

The github-copilot.xml file is located at

~/Library/Application Support/JetBrains/<IDE+VERSION>/options/github-copilot.xml

For example the path to github-copilot.xml for IntelliJ version 2022.3 is

~/Library/Application Support/JetBrains/IntelliJIdea2022.3/options/github-copilot.xml

github-copilot.xml for enabling all languages (default behavior)

<application>
  <component name="github-copilot">
    <languageAllowList>
      <map>
        <entry key="*" value="true" />
      </map>
    </languageAllowList>
  </component>
</application>

You can now specify an individual language override if your configuration also includes a wildcard.

github-copilot.xml for disabling all languages except for Kotlin and Java

<application>
  <component name="github-copilot">
    <languageAllowList>
      <map>
        <entry key="*" value="false" />
        <entry key="kotlin" value="true" />
        <entry key="java" value="true" />
      </map>
    </languageAllowList>
  </component>
</application>

A new hidden languageAllowListReadOnly configuration property has been added that makes languageAllowList readonly in the UI.

github-copilot.xml for making the UI setting readonly and enabling all languages

<application>
  <component name="github-copilot">
    <option name="languageAllowListReadOnly" value="true" />
    <languageAllowList>
      <map>
        <entry key="*" value="true" />
      </map>
    </languageAllowList>
  </component>
</application>

An easier way manage your Copilot for Individuals trial and plan

We've heard confusion from users on how to switch between monthly and annual billing. We want you to feel fully in control of your GitHub Copilot plan so we've updated the Plans and Usage page to make it easier to swap between your plan options. Just head down to the Copilot plan section and hit the Manage subscription button to see your options.

We've also added the option to activate your Copilot trial directly from this page and while we'd hate to see you go, if you find that Copilot isn't for you during your trial, you can quickly cancel it before it converts into a paid plan.

DURING TRIAL

Try out code referencing [Private Beta]!

Last week, we announced our private beta for code referencing in Copilot. Learn more by heading to our blog post or join the waitlist today!

Questions, suggestions, or ideas?

Join the conversation in the Copilot community discussion. We'd love to hear from you!

See more

GitHub Advanced Security customers can now perform on-demand validity checks for supported partner patterns, and the alert index view now shows if a secret is active. This builds on our release of enabling automatic validation checks for supported partner patterns back in April.

When the “Automatically verify if a secret is valid” setting is enabled on a repository, users will see a “Verify secret” button on the alert page. This sends the secret to our relevant partner provider to see if the secret is active and updates the status on the alert and index pages.

screenshot of an adafruit io key alert with a verify secret button

As we work with our partners to add support for more secrets, we'll update the "Validity check" column in the documented supported secrets list.

See more