Attacks on Maven proxy repositories
Learn how specially crafted artifacts can be used to attack Maven repository managers. This post describes PoC exploits that can lead to pre-auth remote code execution and poisoning of the local artifacts in Sonatype Nexus and JFrog Artifactory.
As someone who’s been breaking the security of Java applications for many years, I was always curious about the supply chain attacks on Java libraries. In 2019, I accidentally discovered an arbitrary file read vulnerability on search.maven.org, a website that is closely tied to the Maven Central Repository. Maven Central is a place where most of the Java libraries are downloaded from and its security is paramount for all companies who develop in Java. If someone is able to infiltrate Maven Central and replace a popular library, they can get the key to the whole kingdom, as almost every large tech company uses Java.
Last year, I committed myself to have a look at how Maven works under the hood. I decided to challenge myself: Perhaps I could find a way to get inside?
In this blog post, I’ll reveal some intriguing vulnerabilities and CVEs that I’ve recently found in popular Maven repository managers. I’ll illustrate how specially crafted artifacts can be used to attack the repository managers that distribute them. Finally, I’ll demonstrate some exploits that can lead to pre-auth remote code execution and poisoning of the local artifacts.
What is Maven and how does it work?
Apache Maven is a popular tool to build Java projects. One of its widely adopted features allows you to resolve dependencies for the project. In a very simplistic scenario, a developer needs to create an pom.xml file that will list all the dependencies for their project, like this:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
<version>3.3.0</version>
</dependency>
During the build process, the developer executes the maven console tool to download these dependencies for local use. For example, the widely used mvn package
command invokes Maven Artifact Resolver to make the following HTTP request to download this dependency:
GET /org/springframework/boot/spring-boot-starter-web/3.3.0/spring-boot-starter-web-3.3.0.jar
Host: repo.maven.apache.org
Since the artifact can have its own dependencies, maven also fetches the /org/springframework/boot/spring-boot-starter-web/3.3.0/pom.xml
file to identify and download all transitive dependencies.
Downloaded dependencies are stored in the local file system (on MacOS it’s ~/.m2/repository
) following the Maven Repository Layout.
Attack surface
It’s important to note that Maven, as a console tool, is built with some security assumptions in mind:
The purpose of Maven is to perform the actions defined in the supplied pom.xml, which commonly includes compiling and running the associated code and using plugins and dependencies downloaded from the configured repositories.
As such, the Maven security model assumes you trust the pom.xml and the code, dependencies, and repositories used in your build. If you want to use Maven to build untrusted code, it’s up to you to provide the required isolation.
Maven repositories (places from where artifacts are downloaded), on the other hand, are essentially web applications that allow uploading, storing, and downloading compiled artifacts. Their security is crucial, because if a hacker is able to publish or replace a commonly used artifact in them, all repository clients will execute the malicious code from this artifact.
Maven Central and other public repositories
By default, Maven downloads all dependencies from https://repo.maven.apache.org
, the address of the Maven Central repository. It is hardcoded in the default installation of Maven, but can be changed in settings. This website is hosted by Sonatype on AWS S3 buckets and served with Fastly CDN.
Maven Central repository is public. Anybody can publish an artifact to it, but users’ publishing rights are restricted by the groupId
ownership. So, only the company or user who owns the org.springframework.boot
groupId is allowed to publish artifacts with this groupId. To upload artifacts, publishers can use either a new Sonatype Central Portal or a legacy OSSRH (OSS Repository Hosting).
Maven Central has a complex infrastructure hosted by Sonatype. At GitHub Security Lab, we only audit open source code, which means that Sonatype’s website is out of scope for us. Still, I realized that a lot of companies publish through the legacy OSSRH portal (https://oss.sonatype.org/), which is backed by the product Sonatype Nexus 2.
Apart from Maven Central, there are also some other public Maven repositories:
Address | Product | |
---|---|---|
Maven Central (default) | repo1.maven.org , repo.maven.apache.org |
Amazon S3 + Fastly Infrastructure is managed by Sonatype |
Maven Central OSSRH (synced with default) | oss.sonatype.org , s01.oss.sonatype.org |
Sonatype Nexus 2 |
Apache | repository.apache.org |
Sonatype Nexus 2 (behind a proxy) |
Spring | repo.spring.io |
JFrog Artifactory |
Atlassian | packages.atlassian.com |
JFrog Artifactory |
JBoss | repository.jboss.org |
Sonatype Nexus 2 |
As you can see from the table, the biggest repositories are powered by two major products: Sonatype Nexus and JFrog Artifactory. These products are (partially) open source and have free versions that you can test locally.
So in my research, I decided to challenge myself with breaking the security of these repository managers. Additionally, I thought it would be good to also include a completely free open source alternative: Reposilite.
In-house Maven repository managers
While downloading artifacts from Maven Central and other public repositories is free, many companies choose to use their own in-house Maven repository managers for additional benefits, such as:
- Ability to publish and use company’s private artifacts
- Ability to restrict and get clarity on which libraries are used within the organization
- Reduced bandwidth consumption by minimizing external network calls
These in-house repository managers are powered by the same open source products as the public repositories: Nexus, JFrog, and Reposilite.
All of these products support multiple access roles. Typically an anonymous role allows you to download any artifact, a developer’s role can publish new artifacts, and an admin role can manage repositories, users, and enforce policies.
Looking at Proxy mode from a security perspective
Apart from handling artifacts developed within a company, Maven repository managers are also often used as dedicated proxy servers for public Maven repositories. In this mode, when a repository manager handles a request to download an artifact, it first checks if the artifact is available locally. If not, it forwards this request to the upstream repository.
The proxy mode is particularly interesting from the security perspective. First, because it allows even anonymous users to fetch their own artifact from the public repository and plant it in the local repository manager. Second, because in-house repository managers not only store and serve these artifacts, but also try to have a “sneak peek” into their content by expanding archives, analyzing “pom.xml” files, building dependency graphs, checking them for malware, and displaying their content in the Admin UI.
This may introduce a second-order vulnerability when an attacker uploads a specially crafted artifact to the public repository first, and then uses it to attack the in-house manager. As someone who built DAST and SAST products in the past, I know firsthand that these types of issues are very hard to detect with automation, so I decided to have a look at the source code to manually identify some.
Attacks on proxy Mode: Stored XSS
Artifacts published to Maven repositories are normally the JAR archives that contain compiled java classes (with .class
file extension), but technically they can contain arbitrary data and extensions. All the repository managers I tested have their web admin interfaces listening on the same port as the application that serves the artifact’s content. So, what if an artifact’s pom.xml file contains some JavaScript inside?
<?xml version="1.0" encoding="UTF-8"?>
<a:script xmlns:a="http://www.w3.org/1999/xhtml">
alert(`Secret key: ${localStorage.getItem('token-secret')}`)
</a:script>
It turned out that at least two of the tested Repository managers (Reposilite and Sonatype Nexus 2) fall into this basic Stored XSS vulnerability. The problem lies in the fact that the artifact’s content is served via the same origin (protocol/host/port) as the Admin UI. If the artifact contains HTML content with javascript inside, the javascript is executed within the same origin. Therefore, if an authenticated user views the artifact’s content, the javascript inside can make authenticated requests to the Admin area, which can lead to the modification of other artifacts, and subsequently to remote code execution on users who download them.
In case of the Reposilite vulnerability, an XSS payload can be used to access the browser’s local storage where the user’s password (aka “token-secret”) is located. That’s game over, as the same token can be used on another device to access the admin area.
How to protect from that? Obviously, we cannot “escape” the special characters, as it would break the legitimate functionality. Instead, a combination of the following approaches can be used:
- “Content-Security-Policy: sandbox;” header when serving the artifact’s content. This means the resource will be treated as being from a special origin that always fails the same-origin policy (potentially preventing access to data storage/cookies and some JavaScript APIs). It’s an elegant solution to protect the Admin area, but it still allows HTML content rendering, leaving some opportunities for phishing.
- “Content-Disposition: attachment” header. This will prevent the browser from displaying the content entirely, so it just saves it to the “Download” folder. This may affect the UX though.
Example: Look at the advisories for CVE-2024-36115 in Reposilite and CVE-2024-5083 in Sonatype Nexus 2 on the GitHub Security Lab website.
Archive expansion and path traversal
All the tested Repository managers support unpacking the artifact’s files on the server to serve individual files from the archive. Most of them use Java’s ZipInputStream interface, which allows you to do it in-memory only, without storing anything on disk, which makes it safe from path traversal vulnerabilities.
Still, I was able to find one instance of this vulnerability in Reposilite’s support for JavaDoc files.
CVE-2024-36116: Arbitrary file overwrite in Reposilite
JavadocContainerService.kt#L127-L136
jarFile.entries().asSequence().forEach { file ->
if (file.isDirectory) {
return@forEach
}
val path = Paths.get(javadocUnpackPath.toString() + "/" + file.name)
path.parent?.also { parent -> Files.createDirectories(parent) }
jarFile.getInputStream(file).copyToAndClose(path.outputStream())
}.asSuccess<Unit, ErrorResponse>()
The file.name
taken from the archive can contain path traversal characters, such as ‘/../../../anything.txt’, so the resulting extraction path can be outside the target directory.
If the archive is taken from an untrusted source, such as Maven Central, an attacker can craft a special archive to overwrite any local file on the repository manager. In the case of Reposilite, this could lead to remote code execution, for example by placing a new plugin into the $workspace$/plugins
directory. Alternatively, an attacker can overwrite the content of any other package.
CVE-2024-36117: Arbitrary file read in Reposilite
Another CVE I discovered in Reposilite was in the way the expanded javadoc files are served. Reposilite has the GET /javadoc/{repository}/<gav>/raw/<resource>
route to find the file in the exploded archive and return its content to the user.
In that case, the /../.
Since the path is concatenated with the main directory, it opens the possibility to read files outside the target directory.
I reported both of these vulnerabilities using GitHub’s Private Vulnerability Reporting feature. If you want to read more details about them, look at the advisories for CVE-2024-36116: Arbitrary file overwrite in Reposilite and CVE-2024-36117: Leaking internal database in Reposilite.
Name confusion attacks
When a repository manager processes requests to download artifacts, it needs to map the incoming URL path value to the artifact’s coordinates: GroupId, ArtifactId, and Version (commonly known as GAV).
The Maven documentation suggests using the following convention for mapping from URL path to GAV:
/${groupId}/${artifactId}/${baseVersion}/${artifactId}-${version}-${classifier}.${extension}
GroupId can contain multiple forward slashes, which are translated to dots while parsing. For instance, the following URL path:
GET /org/apache/maven/apache-maven/3.8.4/apache-maven-3.8.4-bin.tar.gz HTTP/1.1
will be translated to these coordinates:
groupId: org.apache.maven
artifactId:apache-maven
version: 3.8.4:bin:tar.gz
classifier: bin
Extension: tar.gz
While this operation looks like a simple regexp matching, there is some room for misinterpretation, especially in how the URL decoding, path normalization, and control characters of the URL are handled.
For instance, if the path contains any special url encoded characters, such as “?” (%3b) or “#” (%23), they will be decoded and considered as part of the artifact name:
GET /com/company/artifact/1.0/artifact-1.0.jar%23/xyz/anything.any?isRemote=true
Interpreted by proxy and transferred to upstream as:
/com/company/artifact/1.0/artifact-1.0.jar\#/xyz/anything.any
On the upstream server however, everything after the hash sign will be parsed as hash properties. The path to artifact will be truncated to:
/com/company/artifact/1.0/artifact-1.0.jar#/xyz/anything.any
Essentially, this would allow attackers to create files on the target proxy repository with arbitrary names and extensions, as long as their path starts with a predefined value, for example:
This behavior affects almost every product I tested, but it’s hardly exploitable on its own, as no client would use an artifact with such a weird name.
While testing this, I noticed that JFrog Artifactory also has a special handling for the semicolon character “;”. Artifactory considers everything in the path after semicolon as “path parameters”, but not a part of the artifact name.
GET /com/company1/artifact1/1.0/artifact1-1.0.jar;/../../../../company2/artifact2/2.0/artifact2-2.0.jar
When processing a request like that, JFrog considers artifact1-1.0.jar as the artifact name, but still forwards the full url to the upstream repository. Contrary to that, Nexus 3 and some public servers perform path normalization and reduce this path to /company2/artifact2/2.0/artifact2-2.0.jar
, which is expected according to RFC 3986.
In cases where Artifactory is configured to proxy an external repository, this behavior can lead to a severe vulnerability: artifact poisoning (CVE-2024-6915). Technically, it allows saving any HTTP response from the remote endpoint to an arbitrary artifact on the Artifactory instance.
The straightforward way to exploit that would be to publish a malicious artifact into the upstream repository with any name, and then save it under a commonly used name on Artifactory, (for example, “spring-boot-starter-web”). Then, the next time a client fetches this commonly used artifact, Artifactory will serve malicious content of another package.
In cases when an attacker is unable to publish anything into the upstream repository, it can still be potentially exploitable with an open redirect or a reflected/stored XSS on the upstream server. Artifactory does not check what is passed after “;/../”, so it can be not only an “artifact2-2.0.jar” but any relative URL path.
The ultimate requirement for this exploitation is that the upstream server should perform a path normalization process to consume “/../” characters. Maven Central repository does not perform it, but several other public repositories such as Apache or JitPack do. Moreover, this vulnerability in JFrog Artifactory affects not only Maven repositories, but any other proxy types, such as npm or Docker repositories.
artifact/../ | artifact/%2e%2e/ | |
---|---|---|
Maven Central repo1.maven.org | ✗ | ✗ |
Apache repository.apache.org | ✓* | ✓ |
JitPack jitpack.io | ✓ | ✗ |
npm | ✓ | ✗ |
Docker Registry | ✓ | ✓ |
Rubygems.io | ✗ | ✗ |
Python Package Index (PyPI) | ✗ | ✗ |
GO package registry (gocenter.io) | ✓ | ✗ |
- “✓” means path traversal is accepted by repository, “✗” – not
Example CVE-2024-6915: Is it even or is it odd?
To demonstrate its impact in my bug bounty report, I chose to use an Artifactory instance that proxies to npm. Npm has a different layout than Maven, but the core idea is the same: we just need to overwrite a package.json file with the content of another package. In the following request, we simply replace the package.json file of the is-even package with the content of the is-odd package.
When I install this poisoned package from Artifactory, the npm client shows a warning that the name in the package.json file (is-odd) is different from the requested one (is-even), but as the downloaded file is properly formatted and contains the links to archive with the source code, the npm client still downloads and executes it.
The npm client is designed with the assumption that it trusts the source. If the integrity of the source registry is compromised (which was the case for JFrog Artifactory), npm clients cannot really do anything to protect from such malicious artifacts. Even hash checksums can be bypassed if they are tampered with in Artifactory.
When I reported this issue to the JFrog bug bounty program, it was assigned a critical severity and later awarded with a $5000 bounty. Since I’m doing this research as a part of my work at GitHub, we donated the full bounty amount to charity, specifically Cancer Research UK.
Magic parameters for exploiting name confusion attacks
Both Nexus and JFrog support some URL query parameters for proxy repositories. In JFrog Artifactory, the following parameters are accepted:
When attacking proxy repositories, these parameters may be applied on the proxy side, or “smuggled” into the upstream repository by using URL encoding. In both cases, they may alter how one or another repository processes the request, leaving options for potential exploitation.
For example, by applying a :properties
suffix to the path, we can trigger a local redirect to the same URL. By default, Artifactory does not perform path normalization for incoming requests, but with a redirect we can force the path normalization to be performed on the HTTP client side, instead of the server. It may help to perform a path traversal for name confusion attacks.
Nexus 2 also supports a few parameters, but perhaps only these two are interesting for attackers:
Disrupting internal metadata: Nexus 2 RCE (CVE-2024-5082)
Along with the artifacts uploaded by the users, repository managers also store additional metadata, such as checksums, creation date, the user who uploaded it, number of downloads, etc. Most of this data is stored in the database, but some repository managers also store files in the same directory as artifacts. For instance, Nexus 2 stores the following files:
/.meta/repository-metadata.xml
– contains repository properties in XML format-
/.meta/prefixes.txt
-
/.index/nexus-maven-repository-index.properties
-
/.index/nexus-maven-repository-index.gz
-
/.nexus/tmp/<artifact>nx-tmp<random>.nx-upload
– temporary file name during artifact’s upload -
/.nexus/attributes/<artifact-name>
– for every artifact, Nexus creates this json file with artifact’s metadata
The last file is the only one that Nexus prohibits access to. Indeed if you try to download or upload a file that starts with /.nexus/attributes/
, Nexus rejects this request:
At the same time, I figured out that we can circumvent this protection by using a different prefix (/nexus/service/local/repositories/test/content/
instead of /nexus/content/repositories/test/
) and by using a double slash before the .nexus/attributes
:
Reading local attributes is probably not that interesting for attackers, but the same bug can be abused to overwrite them using PUT HTTP requests instead of GET. By default, Nexus does not allow you to update artifact’s content in its release repositories, but we can update the attributes of any maven-metadata.xml
file:
For exploitation purposes, I discovered a supported attribute that is particularly interesting: “contentGenerator”:”velocity”. If present, this attribute changes the way how the artifact’s content is rendered, enabling resolution of Velocity templates in the artifact’s content. So if we upload the ‘maven-metadata.xml’ file with the following content:
And then reissue the previous PUT request to update the attributes, the content of the ‘maven-metadata.xml’ file will be rendered as a Velocity template.
Sweet, as the velocity template I used in the example above triggers the execution of the java.lang.Runtime.getRuntime().exec("id")
command.
It’s not a real RCE unless it’s pre-auth
To overwrite the metadata in the previous requests, I used a ‘PUT’ request method that requires the cookie of a user who has sufficient privileges to upload artifacts. This severely reduces the potential impact, as obtaining even a low-privileged account on the target repository might be a difficult task. Still, it wouldn’t be like myself if I didn’t try to find a way to exploit it without any authentication.
One of the ways to achieve that would be combining this vulnerability with the stored XSS (CVE-2024-5083) in proxy repositories that I discovered earlier. Planting an XSS payload would not require any permissions on Nexus. Still, that XSS requires an administrator user to view an artifact, with a valid session, so the exploitation is still not that clean.
Another way to trigger this vulnerability would be through a ‘proxy’ repository. If an attacker is able to publish an artifact into the upstream repository, it’s possible to exploit this vulnerability without any authentication on Nexus.
You may reasonably assume that publishing an artifact with a Maven Group ID that starts with ‘.nexus/attributes’ may be unrealistic in popular upstream repositories like Maven Central, Apache Snapshots or JitPack. While I could not test this myself in their production systems, I noticed that one may publish an artifact with the group ID of ‘org.example’ and then force Nexus to save it as /.nexus/attributes/…
with the same path traversal trick as in the name confusion attacks:
GET /nexus/service/local/repositories/apache-snapshots/content//.nexus/attributes/%252e./%252e./com/sbt/ignite/ignite-bom/maven-metadata.xml
When processing this request, Nexus will decode the URL path to /.nexus/attributes/%2e./%2e./com/sbt/ignite/ignite-bom/maven-metadata.xml
and forward it to the Apache Snapshots upstream repository. Then, Apache’s repository will (quite reasonably) perform URI normalization and return the content of the file. This would allow you to store an artifact with an arbitrary name from Apache Snapshots in the /.nexus/attributes/
directory.
Apache Snapshots is enabled by default in the Nexus installation. Also, as I mentioned earlier, pulling artifacts from it does not require any permissions on Nexus—it can be done with a simple GET request without any cookies.
A real attacker would probably try to publish their own artifact to the Apache Snapshots repository and therefore use it to attack all Nexus instances worldwide. Additionally, it’s possible to enumerate all the Apache user names and their emails. Perhaps some of their credentials can be found on websites that accumulate leaked passwords, but testing these kinds of attacks lies beyond the legal and moral scope of my research.
Summary
As we can see, using repository managers such as Nexus, JFrog, and Reposilite in proxy mode can introduce an attack surface that is otherwise only available to authenticated users.
All tested solutions not only store and serve artifacts, but also perform complex parsing and indexing operations on them. Therefore, a specially crafted artifact can be used to attack the repository manager that processes it. This opens a possibility for XSS, XXE, archive expansion, and path traversal attacks.
The innate URL decoding mechanism along with special characters sparks parsing discrepancies. All repository managers parse URLs differently and cache proxied artifacts locally, which can lead to cache poisoning vulnerabilities, such as CVE-2024-6915 in JFrog, for example.
The major public and private Maven repositories are powered by just a few partially open source solutions. Although these solutions are already backed by reputable companies with strong security teams and bug bounty programs, it’s still possible to find critical vulnerabilities in them.
Lastly, these kinds of attacks are not even specific to Maven, but for all other dependency ecosystems, whether its NPM, Docker, RubyGems or anything else. I encourage every hacker to test this ‘proxy repository’ functionality in other products as well, as this may bring about many fruitful findings.
Note: I presented this research at the Ekoparty Security Conference in November 2024.
Tags:
Written by
Related posts
How to secure your GitHub Actions workflows with CodeQL
In the last few months, we secured 75+ GitHub Actions workflows in open source projects, disclosing 90+ different vulnerabilities. Out of this research we produced new support for workflows in CodeQL, empowering you to secure yours.
Announcing CodeQL Community Packs
We are excited to introduce the new CodeQL Community Packs, a comprehensive set of queries and models designed to enhance your code analysis capabilities. These packs are tailored to augment…
Uncovering GStreamer secrets
In this post, I’ll walk you through the vulnerabilities I uncovered in the GStreamer library and how I built a custom fuzzing generator to target MP4 files.