GitHub code scanning is powered by the CodeQL analysis engine. To identify potential security vulnerabilities, you can enable CodeQL to run queries against your codebase. These open source queries are written by members of the community and GitHub security experts, and each query is carefully crafted to recognize as many variants of a particular vulnerability type as possible and provide broad Common Weakness Enumeration (CWE) coverage. Queries are continuously updated to recognize emerging libraries and frameworks. Identifying such libraries is important: it allows us to accurately identify flows of untrusted user data, which are often the root cause of security vulnerabilities.
With the rapid evolution of the open source ecosystem, there is an ever-growing long tail of libraries that are less commonly used. We use examples surfaced by the manually crafted CodeQL queries to train deep learning models to recognize such open source libraries, as well as closed-source libraries developed in-house. Using these models, CodeQL can identify more flows of untrusted user data, and therefore more potential security vulnerabilities.
Want to learn about the machine learning framework powering these alerts? Check out this post, which describes how we trained our deep learning models.
The new experimental analysis is rolled out to all users of code scanning’s security-extended and security-and-quality analysis suites. If you’re already using one of these suites, your code will be analyzed using the new machine learning technology.
If you’re already using code scanning, but not using either of these suites yet, you can enable the new experimental analysis by modifying your code scanning Actions workflow configuration file as follows:
    [...]
    - uses: github/codeql-action/init@v1
      with:
        queries: +security-extended
    [...]
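If you’re new to code scanning, set up a code scanning workflow first, then use the config snippet shown earlier to enable one of the security-extended or security-and-quality analysis suites described above.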
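To show where that `init` step fits, here is a minimal sketch of a complete workflow file. Only the `github/codeql-action/init@v1` step and its `queries: +security-extended` setting come from this post; the workflow name, the triggers, the `main` branch, the `javascript` language value, the `actions/checkout@v2` version, and the `permissions` block are assumptions for illustration, so adjust them to match your repository.

```yaml
# Hypothetical file: .github/workflows/codeql-analysis.yml
# Everything except the init step's `queries` setting is an illustrative assumption.
name: "CodeQL"

on:
  push:
    branches: [main]        # assumed default branch name
  pull_request:
    branches: [main]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # lets the job upload code scanning results

    steps:
      - name: Check out the repository
        uses: actions/checkout@v2

      - name: Initialize CodeQL
        uses: github/codeql-action/init@v1
        with:
          languages: javascript          # assumed; set to the languages you analyze
          # The leading "+" adds this suite to any queries already configured.
          # Use "+security-and-quality" to select the other suite instead.
          queries: +security-extended

      - name: Run the analysis
        uses: github/codeql-action/analyze@v1
```

If your repository already has a code scanning workflow, you likely only need to add the `queries` line to the existing `init` step rather than replace the whole file.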
If the new experimental analysis finds additional results, you will see the new alerts displayed alongside the other code scanning alerts in the “Security” tab of your repository. They will also appear on pull requests. The new alerts are clearly marked with the “Experimental” label.
It’s important to note that while we continue to improve and test our machine learning models, this new experimental analysis can have a higher false-positive rate than our standard CodeQL analysis. As with most machine learning models, the results will improve over time. We encourage everyone to try out this new experimental feature; with your feedback we can continue to innovate and secure the world’s code!