Getting RCE in Chrome with incomplete object initialization in the Maglev compiler

In this post, I’ll exploit CVE-2023-4069, a type confusion in Chrome that allows remote code execution (RCE) in the renderer sandbox of Chrome by a single visit to a malicious site.

In this post I’ll exploit CVE-2023-4069, a type confusion vulnerability that I reported in July 2023. The vulnerability—which allows remote code execution (RCE) in the renderer sandbox of Chrome by a single visit to a malicious site—is found in v8, the Javascript engine of Chrome. It was filed as bug 1465326 and subsequently fixed in version 115.0.5790.170/.171.

Vulnerabilities like this are often the starting point for a “one-click” exploit, which compromises the victim’s device when they visit a malicious website. What’s more, renderer RCE in Chrome allows an attacker to compromise and execute arbitrary code in the Chrome renderer process. That being said, the renderer process has limited privilege and such a vulnerability needs to be chained with a second “sandbox escape” vulnerability (either another vulnerability in the Chrome browser process or one in the operating system) to compromise Chrome itself or the device.

While many of the most powerful and sophisticated “one-click” attacks are highly targeted, and average users may be more at risk from less sophisticated attacks such as phishing, users should still keep Chrome up-to-date and enable automatic updates, as vulnerabilities in v8 can often be exploited relatively quickly.

The current vulnerability, CVE-2023-4069, exists in the Maglev compiler, a new mid-tier JIT compiler in Chrome that optimizes Javascript functions based on previous knowledge of the input types. This kind of optimization is called speculative optimization and care must be taken to make sure that these assumptions on the inputs are still valid when the optimized code is used. The complexity of the JIT engine has led to many security issues in the past and has been a popular target for attackers.
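As a rough illustration of what speculation means for script code, the sketch below warms up a function with number inputs and then calls it with strings. Whether and when a given tier compiles the function is up to the engine, and the iteration count is an arbitrary assumption. The point is only that optimized code must detect the broken assumption and still return the right result.

```javascript
// Warm up `add` with numbers so the engine can collect "number + number"
// feedback. The loop count is an arbitrary choice for illustration.
function add(a, b) {
  return a + b;
}

let sum = 0;
for (let i = 0; i < 10000; i++) {
  sum = add(sum, 1);
}

// Break the speculation: the same function now sees strings. Any optimized
// code has to notice this and fall back to a generic path.
const joined = add("mag", "lev");

console.log(sum);     // 10000
console.log(joined);  // maglev
```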

Maglev compiler

The Maglev compiler is a mid-tier JIT compiler used by v8. Compared to the top-tier JIT compiler, TurboFan, Maglev generates less optimized code but with a faster compilation speed. Having multiple JIT compilers is common in Javascript engines, the idea being that with multiple tier compilers, you’ll find a more optimal tradeoff between compilation time and runtime optimization.

Generally speaking, when a function is first run, slow bytecode is generated. As the function is run more often, it may get compiled into more optimized code, starting with the lowest-tier JIT compiler. If the function gets used more often still, its optimization tier gets moved up, resulting in better runtime performance, but at the expense of a longer compilation time. The idea here is that for code that runs often, the runtime cost will likely outweigh the compile time cost. You can consult An Introduction to Speculative Optimization in v8 by Benedikt Meurer for more details of how the compilation process works.

The Maglev compiler is enabled by default starting from version 114 of Chrome. Similar to TurboFan, it goes through the bytecode of a Javascript function, taking into account the feedback that was collected from previous runs, and transforms the bytecode into more optimized code. However, unlike TurboFan, which first transforms bytecodes into a “Sea of Nodes”, Maglev uses an intermediate representation and first transforms bytecodes into SSA (Static Single-Assignment) nodes, which are declared in the file maglev-ir.h. At the time of writing, the compilation process of Maglev consists mainly of two phases of optimizations: the first phase involves building a graph from the SSA nodes, while the second phase consists of optimizing the representations of Phi values.

Object construction in v8

The bug in this post really has more to do with object construction than with Maglev, so now I’ll go through some details and concepts of how v8 handles Javascript object construction. A Javascript function can be used as a constructor and called with the new keyword. When it is called with new, new.target is available in the function scope and refers to the function that new was invoked on. In the following case, new.target is the same as the function itself.

function foo() {
  %DebugPrint(new.target);
}
new foo();  // foo
foo();      // undefined

This, however, is not always the case and new.target may be different from the function itself. For example, in case of a construction via a derived constructor:


class A {
  constructor() {
    %DebugPrint(new.target);
  }
}

class B extends A {
}

new A();  // A
new B();  // B

Another way to have a different new.target is to use the Reflect.construct built-in function:


Reflect.construct(A, [], B);  // B

The signature of Reflect.construct is as follows, which specifies newTarget as the new.target:


Reflect.construct(target, argumentsList, newTarget)
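The cases above can be checked in any engine without the %DebugPrint native, which needs d8's natives syntax. The following sketch records the name of new.target for each kind of invocation; the function and class names are placeholders chosen for this example.

```javascript
// Records what new.target is for each way of invoking a constructor.
const seen = [];

function Plain() {
  seen.push(new.target === undefined ? "undefined" : new.target.name);
}

class Base {
  constructor() {
    seen.push(new.target.name);
  }
}
class Derived extends Base {}
function Other() {}

new Plain();                         // new.target is Plain
Plain();                             // ordinary call: new.target is undefined
new Base();                          // new.target is Base
new Derived();                       // Base's constructor sees Derived
Reflect.construct(Base, [], Other);  // Base's constructor sees Other

console.log(seen.join(", "));
// Plain, undefined, Base, Derived, Other
```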

The Reflect.construct method sheds some light on the role of new.target in object construction. According to the documentation, target is the constructor that is actually executed to create and initialize an object, while newTarget provides the prototype for the created object. For example, the following creates a Function type object and only Function is called.


var x = Reflect.construct(Function, [], Array);

This is consistent with construction via class inheritance:


class A {}

class B extends A {}

var x = new B();
console.log(x.__proto__ == B.prototype);  //<--- true

Although in this case, the derived constructor B does get called. So what is the object that’s actually created? For functions that actually return a value, or for class constructors, the answer is more clear:


function foo() {return [1,2];}
function bar() {}
var x = Reflect.construct(foo, [], bar); //<--- returns [1,2]

but less so otherwise:


function foo() {}
function bar() {}
var x = Reflect.construct(foo, [], bar); //<--- returns object {}, instead of undefined

So even if a function does not return an object, using it as target in Reflect.construct still creates a Javascript object. Roughly speaking, object constructions follow these steps: (see, for example, Generate_JSConstructStubGeneric.)

First a default receiver (the this object) is created using FastNewObject, and then the target function is invoked. If the target function returns an object, then the default receiver is discarded and the return value of target is used as the returned object instead; otherwise, the default receiver is returned.
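These steps can be approximated in script for ordinary (non-class) functions. The sketch below is a userland model only: constructModel is a hypothetical helper, it does not set new.target inside target, and it says nothing about how v8 lays out the receiver internally.

```javascript
// Userland model of the construction steps for ordinary functions.
// constructModel is a hypothetical helper, not a v8 API.
function constructModel(target, args, newTarget = target) {
  // Step 1: create the default receiver; newTarget supplies the prototype.
  const receiver = Object.create(newTarget.prototype);
  // Step 2: invoke target with the receiver as `this`.
  const result = target.apply(receiver, args);
  // Step 3: an object return value replaces the default receiver.
  const isObject =
    (typeof result === "object" && result !== null) ||
    typeof result === "function";
  return isObject ? result : receiver;
}

function returnsArray() { return [1, 2]; }
function returnsNothing() {}
function bar() {}

const fromReturn = constructModel(returnsArray, [], bar);
const fromReceiver = constructModel(returnsNothing, [], bar);

console.log(fromReturn);                                             // [ 1, 2 ]
console.log(Object.getPrototypeOf(fromReceiver) === bar.prototype);  // true
```

Both results agree with what Reflect.construct returns for the same arguments.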

Default receiver object

The default receiver object created by FastNewObject is relevant to this bug, so I’ll explain it in a bit more detail. Most Javascript functions contain an internal field, initial_map. This is a Map object that determines the type and the memory layout of the default receiver object created by this function. In v8, Map determines the hidden type of an object, in particular, its memory layout and the storage of its fields. Readers can consult “JavaScript engine fundamentals: Shapes and Inline Caches” by Mathias Bynens to get a high-level understanding of object types and maps.

When creating the default receiver object, FastNewObject will try to use the initial_map of new.target (new_target) as the Map for the default receiver:


TNode<JSObject> ConstructorBuiltinsAssembler::FastNewObject(
    TNode<Context> context, TNode<JSFunction> target,
    TNode<JSReceiver> new_target, Label* call_runtime) {
  // Verify that the new target is a JSFunction.
  Label end(this);
  TNode<JSFunction> new_target_func =
      HeapObjectToJSFunctionWithPrototypeSlot(new_target, call_runtime);
  ...
  GotoIf(DoesntHaveInstanceType(CAST(initial_map_or_proto), MAP_TYPE),
         call_runtime);
  TNode<Map> initial_map = CAST(initial_map_or_proto);
  TNode<Object> new_target_constructor = LoadObjectField(
      initial_map, Map::kConstructorOrBackPointerOrNativeContextOffset);
  GotoIf(TaggedNotEqual(target, new_target_constructor), call_runtime);  //<--- check
  ...

  BIND(&instantiate_map);
  return AllocateJSObjectFromMap(initial_map, properties.value(), base::nullopt,
                                 AllocationFlag::kNone, kWithSlackTracking);
}

This is curious, as the default receiver should have been created using target, and new_target should only be used to set its prototype. The reason is an optimization that caches both the initial_map of target and the prototype of new_target in the initial_map of new_target, which I’ll explain now.

In the above, FastNewObject has a check (marked as “check” in the above snippet) that makes sure that target is the same as the constructor field of the initial_map. For most functions, initial_map is created lazily, or its constructor field is pointing to itself (new_target in this case). So when new_target is first used to construct an object with a different target, the call_runtime slow path is likely taken, which uses JSObject::New:


MaybeHandle<JSObject> JSObject::New(Handle<JSFunction> constructor,
                                    Handle<JSReceiver> new_target,
                                    Handle<AllocationSite> site) {
  ...
  Handle<Map> initial_map;
  ASSIGN_RETURN_ON_EXCEPTION(
      isolate, initial_map,
      JSFunction::GetDerivedMap(isolate, constructor, new_target), JSObject);
  ...
  Handle<JSObject> result = isolate->factory()->NewFastOrSlowJSObjectFromMap(
      initial_map, initial_capacity, AllocationType::kYoung, site);
  return result;
}

This function calls GetDerivedMap, which may call FastInitializeDerivedMap to create an initial_map in the new_target:


bool FastInitializeDerivedMap(Isolate* isolate, Handle<JSFunction> new_target,
                              Handle<JSFunction> constructor,
                              Handle<Map> constructor_initial_map) {
  ...
  Handle<Map> map =
      Map::CopyInitialMap(isolate, constructor_initial_map, instance_size,
                          in_object_properties, unused_property_fields);
  map->set_new_target_is_base(false);
  Handle<HeapObject> prototype(new_target->instance_prototype(), isolate);
  //Also sets map.prototype to prototype and map.constructor to constructor
  JSFunction::SetInitialMap(isolate, new_target, map, prototype, constructor);
  ...

The initial_map created here is a copy of the initial_map of target (constructor), but with its prototype set to the prototype of new_target and its constructor set to target. This is the only case when the constructor of an initial_map points to a function other than itself and provides the context for the initial_map of new_target to be used in FastNewObject: if the constructor of an initial_map points to a different function, then the initial_map is a copy of the initial_map of the constructor. Checking that new_target.initial_map.constructor equals target, FastNewObject ensures that the initial_map of new_target is a copy of target.initial_map, but with new_target.prototype as its prototype, which is the correct Map to use.
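The caching scheme and the check can be pictured with a toy model in plain Javascript. Everything here (ToyFunction, toySlowNew, toyFastNewObject, the string layouts) is hypothetical and heavily simplified. In particular, this model always caches the derived map, whereas v8 only does so under additional conditions.

```javascript
// Toy model of the initial_map caching described above. All names are
// hypothetical; these are not v8 internals.
class ToyFunction {
  constructor(name, layout) {
    this.name = name;
    this.prototype = { owner: name };
    // A function's own initial_map has the function itself as constructor.
    this.initialMap = { layout, prototype: this.prototype, constructor: this };
  }
}

const pathLog = [];

function toySlowNew(target, newTarget) {
  // Copy target's map, but use newTarget's prototype and record target as
  // the constructor. Then cache the copy on newTarget (simplified policy).
  const derived = {
    layout: target.initialMap.layout,
    prototype: newTarget.prototype,
    constructor: target,
  };
  newTarget.initialMap = derived;
  pathLog.push("slow");
  return { map: derived };
}

function toyFastNewObject(target, newTarget) {
  // The check: only trust newTarget's map if it was derived from target.
  if (newTarget.initialMap.constructor !== target) {
    return toySlowNew(target, newTarget);
  }
  pathLog.push("fast");
  return { map: newTarget.initialMap };
}

const ToyBase = new ToyFunction("Base", "plain object layout");
const ToySub = new ToyFunction("Sub", "sub layout");

toyFastNewObject(ToyBase, ToySub);                  // mismatch: slow path
const second = toyFastNewObject(ToyBase, ToySub);   // cached copy: fast path

console.log(pathLog);             // [ 'slow', 'fast' ]
console.log(second.map.layout);   // plain object layout
```

The object built on the fast path has the layout of the target and the prototype of the new target, which is exactly the combination the check is meant to guarantee.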

The vulnerability

Derived classes often have no-op default constructors, which do not modify receiver objects, for example:


class A {}
class B extends A {}
class C extends B {}
const o = new C();

In this case, when calling new C(), the default constructor calls to B and A are no-op and can be omitted. The FindNonDefaultConstructorOrConstruct bytecode is an optimization to omit redundant calls to no-op default constructors in these cases. In essence, it walks up the chain of super constructors and skips the default constructors that can be omitted. If it can skip all the intermediate default constructors and reach the base constructor, then FastNewObject is called to create the default receiver object. The bytecode is introduced in a derived class constructor:


class A {}
class B extends A {}
new B();

Running the above with the print-bytecode flag in d8 (the standalone version of v8), I can see that FindNonDefaultConstructorOrConstruct is inserted in the bytecode of the derived constructor B:


[generated bytecode for function: B (0x1a820019ba41 <SharedFunctionInfo B>)]
Bytecode length: 45
Parameter count 1
Register count 7
Frame size 56
         0x1a820019be6c @    0 : 19 fe f9          Mov <closure>, r1
 1700 S> 0x1a820019be6f @    3 : 5a f9 fa f5       FindNonDefaultConstructorOrConstruct r1, r0, r5-r6
 ...
         0x1a820019be7e @   18 : 99 0c             JumpIfTrue [12] (0x1a820019be8a @ 30)
 ...
         0x1a820019be8a @   30 : 0b 02             Ldar <this>
         0x1a820019be8c @   32 : ad                ThrowSuperAlreadyCalledIfNotHole
         0x1a820019be8d @   33 : 19 f7 02          Mov r3, <this>
 1713 S> 0x1a820019be90 @   36 : 0d 01             LdaSmi [1]
 1720 E> 0x1a820019be92 @   38 : 32 02 00 02       SetNamedProperty <this>, [0], [2]
         0x1a820019be96 @   42 : 0b 02             Ldar <this>
 1727 S> 0x1a820019be98 @   44 : aa                Return

In particular, if FindNonDefaultConstructorOrConstruct succeeds (returns true), then the default receiver object will be returned immediately.
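The chain of super constructors that the bytecode walks is visible from script, since each derived class has its parent class as its prototype. The sketch below prints that chain and also shows why only default constructors may be skipped: a constructor with a body in the middle of the chain has observable effects and must still run. The class names are placeholders.

```javascript
// Walk the super-constructor chain from the most derived class upwards.
class A {}
class B extends A {}
class C extends B {}

const chain = [];
for (let f = C; f !== Function.prototype; f = Object.getPrototypeOf(f)) {
  chain.push(f.name);
}
console.log(chain.join(" -> "));  // C -> B -> A

// Skipping default constructors must not be observable. A constructor with
// a body in the middle of the chain still has to be called.
const calls = [];
class P {}
class Q extends P {
  constructor() {
    super();
    calls.push("Q");
  }
}
class R extends Q {}
new R();
console.log(calls);  // [ 'Q' ]
```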

The vulnerability happens in the handling of FindNonDefaultConstructorOrConstruct in Maglev.


void MaglevGraphBuilder::VisitFindNonDefaultConstructorOrConstruct() {
  ...
          compiler::OptionalHeapObjectRef new_target_function =
              TryGetConstant(new_target);
          if (kind == FunctionKind::kDefaultBaseConstructor) {
            ValueNode* object;
            if (new_target_function && new_target_function->IsJSFunction()) {
              object = BuildAllocateFastObject(
                  FastObject(new_target_function->AsJSFunction(), zone(),
                             broker()),
                  AllocationType::kYoung);
  ...

If it manages to skip all the default constructors and reach the base constructor, then it’ll check whether new_target is a constant. If that is the case, then BuildAllocateFastObject, instead of FastNewObject, is used to create the receiver object. The problem is that, unlike FastNewObject, BuildAllocateFastObject uses the initial_map of new_target without checking its constructor field:


ValueNode* MaglevGraphBuilder::BuildAllocateFastObject(
    FastObject object, AllocationType allocation_type) {
  ...
  ValueNode* allocation = ExtendOrReallocateCurrentRawAllocation(
      object.instance_size, allocation_type);
  BuildStoreReceiverMap(allocation, object.map);  // new_target.initial_map
  ...
  return allocation;
}

Why is this bad? As explained before, when constructing an object, target, rather than new_target, is called to initialize the object fields. If new_target is not of the same type as target, then creating an object with the initial_map of new_target can leave fields uninitialized:


class A {}
class B extends A {}
var x = Reflect.construct(B, [], Array);

In this case, new_target is Array, so if the initial_map of new_target is used to create x (the receiver), then x is going to be an Array type object, which has a length field that specifies the size of the array and is used for bounds checking. Because B, the target, is what initializes the object, and B knows nothing about the fields of an Array, length is left uninitialized. This problematic scenario is normally prevented by checking that the constructor of new_target.initial_map is B, and the absence of that check in Maglev results in the vulnerability.
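For reference, this is the result a correct engine produces for the snippet above, which can double as a quick sanity check on a patched build. It is a sketch that only inspects the observable result and does not try to trigger Maglev compilation. The receiver is laid out for A, so it is an ordinary object and not an Array, while its prototype comes from new_target.

```javascript
class A {}
class B extends A {}

const x = Reflect.construct(B, [], Array);

// The prototype comes from new_target (Array)...
console.log(Object.getPrototypeOf(x) === Array.prototype);       // true
// ...but the object itself is an ordinary object, not an Array.
console.log(Array.isArray(x));                                   // false
// It has no length field of its own; x.length is inherited from the prototype.
console.log(Object.prototype.hasOwnProperty.call(x, "length"));  // false
```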

There is one problem here: I need new_target to be a constant to reach this code, but when used with Reflect.construct, new_target is an argument to the function and is never going to be a constant. To overcome this, let’s take a look at what TryGetConstant, which is used in FindNonDefaultConstructorOrConstruct to check that new_target is a constant, does:


compiler::OptionalHeapObjectRef MaglevGraphBuilder::TryGetConstant(
    ValueNode* node, ValueNode** constant_node) {
  if (auto result = TryGetConstant(broker(), local_isolate(), node)) {  //<--- 1.
    if (constant_node) *constant_node = node;
    return result;
  }
  const NodeInfo* info = known_node_aspects().TryGetInfoFor(node);      //<--- 2.
  if (info && info->is_constant()) {
    if (constant_node) *constant_node = info->constant_alternative;
    return TryGetConstant(info->constant_alternative);
  }
  return {};
}

When checking whether a node is constant, TryGetConstant first checks if the node is a known global constant (marked as 1. in the above), which will be false in our case. However, it also checks NodeInfo for the node to see if it has been marked as a constant by other nodes (marked as 2. in the above). If the value of the node has been checked against a global constant previously, then its NodeInfo will be set to a constant. If that’s the case, then I can store new.target to a global variable that has not been changed, which will cause Maglev to insert a CheckValue node to ensure that new.target is the same as the global constant:


class A {}

var x = Array;

class B extends A {
  constructor() {
    x = new.target;  //<--- insert CheckValue node to cache new.target as constant (Array)
    super();
  }
}

Reflect.construct(B, [], Array); //<--- Calls `B` as `target` and `Array` as `new_target`

When B is optimized by Maglev and the optimized code is run, Reflect.construct is likely to return an Array with length 0. This is because initially, the free spaces in the heap mostly contain zeroes, so when the created Array uses an uninitialized value as its length, this value is most likely going to be zero. However, once a garbage collection is run, the free spaces in the heap will likely contain some non-trivial values (objects that are freed by garbage collection). By creating some objects in the heap, deleting them, and then triggering a garbage collection, I could carefully arrange the heap to make the uninitialized Array created through the bug take any value as its length. In practice, a rather crude trial-and-error approach (which mostly involves triggering a garbage collection and creating uninitialized Array with the bug until you get it right) is sufficient to give me consistent and reliable results:


//----- Create incorrect Maglev code ------
var x = Array;

class A {}

class B extends A {
  constructor() {
    x = new.target;
    super();
  }
}
function construct() {
  var r = Reflect.construct(B, [], x);
  return r;
}
//Compile optimize code
for (let i = 0; i < 2000; i++) construct();
//-----------------------------------------
//Trigger garbage collection to fill the free space of the heap
new ArrayBuffer(gcSize);
new ArrayBuffer(gcSize);

corruptedArr = construct();  // length of corruptedArr is 0, try again...
corruptedArr = construct();  // length of corruptedArr takes the pointer of an object, which gives a large value
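Why a stale value ends up as the length can be pictured with a toy model of a heap that hands out memory without clearing it. This is an illustration only: ToyHeap, the slot position, and the fill value are all assumptions made for the example, and v8's real allocator and garbage collector are far more involved.

```javascript
// Toy bump-allocated heap: allocation hands out memory without clearing it.
// ToyHeap and its methods are hypothetical, for illustration only.
class ToyHeap {
  constructor(words) {
    this.memory = new Uint32Array(words);  // fresh memory starts zeroed
    this.top = 0;
  }
  allocate(words) {
    const address = this.top;
    this.top += words;
    return address;  // note: the contents are NOT initialized here
  }
  reset() {
    // Stand-in for a garbage collection that frees everything but leaves
    // the old contents in place.
    this.top = 0;
  }
}

const LENGTH_SLOT = 3;  // assumed position of `length` in the toy layout

function allocateArrayWithoutInit(heap) {
  const address = heap.allocate(4);
  return heap.memory[address + LENGTH_SLOT];  // read the uninitialized length
}

const toyHeap = new ToyHeap(64);
const before = allocateArrayWithoutInit(toyHeap);  // fresh heap: reads 0

// Fill the heap with other data, then "collect" it.
toyHeap.reset();
for (let i = 0; i < 16; i++) {
  const address = toyHeap.allocate(4);
  toyHeap.memory.fill(0x0018ed39, address, address + 4);
}
toyHeap.reset();

const after = allocateArrayWithoutInit(toyHeap);  // stale data becomes length
console.log(before, after.toString(16));          // 0 18ed39
```

The first allocation sees zero, matching the first construct() call above, while the allocation after the simulated collection picks up whatever the previous occupant left behind.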

While this already allows out-of-bounds (OOB) access to a Javascript array, which is often sufficient to gain code execution, the situation is slightly more complicated in this case.

Gaining code execution

The Array created via the bug has no elements, so its element store is set to the empty_fixed_array. The main problem is that empty_fixed_array is located in a read-only region of the v8 heap, which means that an OOB write needs to be large enough to pass the entire read-only heap or it’ll just crash on access:


DebugPrint: 0x10560004d5e5: [JSArray]
 - map: 0x10560018ed39  [FastProperties]
 - prototype: 0x10560018e799 
 - elements: 0x105600000219  [HOLEY_SMI_ELEMENTS]   //<------- address of empty_fixed_array
 ...

As you can see above, the lower 32 bits of the address of empty_fixed_array is 0x219, which is fairly small. The lower 32 bits of the address is called the compressed address. In v8, most references are only stored as the lower 32 bits of the full 64-bit pointers in the heap, while the higher 32 bits remain constant and are cached in a register. In particular, v8 objects are referenced using the compressed address and this is an optimization called pointer compression.
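The arithmetic behind pointer compression can be written out with BigInt, using the two addresses from the DebugPrint output above. The helper names are hypothetical and only model the split into a shared upper half and a stored lower half.

```javascript
// Pointer compression arithmetic, using addresses from the output above.
// The helper names are hypothetical.
const LOW_32 = 0xffffffffn;

function compress(fullAddress) {
  return fullAddress & LOW_32;   // the part that is stored in the heap
}
function upperBits(fullAddress) {
  return fullAddress & ~LOW_32;  // the part that is shared by all objects
}
function decompress(base, compressed) {
  return base | compressed;
}

const arrayAddress = 0x10560004d5e5n;     // the JSArray
const emptyFixedArray = 0x105600000219n;  // its elements store

const base = upperBits(arrayAddress);
console.log(base.toString(16));                             // 105600000000
console.log(compress(emptyFixedArray).toString(16));        // 219
console.log(decompress(base, 0x219n) === emptyFixedArray);  // true
```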

As explained in Section “Bypassing the need to get an infoleak” of my other post, the addresses of many objects are very much constant in v8 and depend only on the software version. In particular, the address of empty_fixed_array is the same across different runs and software versions, and more importantly, it remains a small address. This means most v8 objects are going to be placed at an address larger than that of the empty_fixed_array. In particular, with a large enough length, it is possible to access any v8 object.

While, at least in theory, this bug can be exploited, it is still unclear how I can use it to access and modify a specific object of my choice. Although I can use the uninitialized Array created by the bug to search through all objects that are allocated behind empty_fixed_array, doing so is inefficient and I may end up accessing some invalid objects that could result in a crash. It would be good if I could at least have an idea of the addresses of the objects that I created in Javascript.

In a talk that I gave at the POC2022 conference last year, I shared how object addresses in v8 can indeed be predicted accurately by simply knowing the version of Chrome. What I didn’t know then was that, even after a garbage collection, object addresses can still be predicted reliably.


//Triggers garbage collection
new ArrayBuffer(gcSize);
new ArrayBuffer(gcSize);

corruptedArr = construct();
corruptedArr = construct();

var oobDblArr = [0x41, 0x42, 0x51, 0x52, 1.5];  //<---- address remains consistent across runs

For example, in the above situation, the object oobDblArr created after garbage collection remains in the same address fairly consistently across different runs. While the address can sometimes change slightly, it is sufficient to give me a rough starting point to search for oobDblArr in corruptedArr (the Array created from the bug). With this, I can corrupt the length of oobDblArr to gain an OOB access with oobDblArr. The exploit flow is now very similar to the one described in my previous post, and consists of the following steps:

  1. Place an Object Array, oobObjArr, after oobDblArr, and use the OOB read primitive to read the addresses of the objects stored in this array. This allows me to obtain the address of any v8 object.
  2. Place another double array, oobDblArr2, after oobDblArr, and use the OOB write primitive in oobDblArr to overwrite the elements field of oobDblArr2 with an object address. Accessing the elements of oobDblArr2 then allows me to read/write to arbitrary addresses.
  3. While this gives me arbitrary read and write primitives within the v8 heap and also the address of any object, due to the recently introduced heap sandbox in v8, the v8 heap is fairly isolated and I still can’t access arbitrary memory within the renderer process. In particular, I can no longer use the standard method of overwriting the RWX pages that are used for storing Web Assembly code to achieve code execution. Instead, JIT spraying can be used to bypass the heap sandbox.
  4. The idea of JIT spraying is that a pointer to the JIT optimized code of a function is stored in a Javascript Function object. By modifying this pointer using the arbitrary read and write primitives within the v8 heap, I can make this pointer jump to the middle of the JIT code. If I use data structures, such as a double array, to store shell code as floating point numbers in the JIT code, then jumping to these data structures will allow me to execute arbitrary code. I refer readers to this post for more details.
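Several of the steps above read or write raw values through a double array. With pointer compression, each 8-byte double element spans two 32-bit words, so values have to be converted between the two views. A minimal sketch of such conversion helpers follows; the names are hypothetical and the word order assumes a little-endian machine.

```javascript
// Conversion helpers between a double and the two 32-bit words that share
// its storage. Names are hypothetical; word order assumes little-endian.
const scratch = new ArrayBuffer(8);
const asDouble = new Float64Array(scratch);
const asWords = new Uint32Array(scratch);

// Split a double into its low and high 32-bit words.
function doubleToWords(value) {
  asDouble[0] = value;
  return [asWords[0], asWords[1]];
}

// Build the double whose storage holds the two given 32-bit words.
function wordsToDouble(low, high) {
  asWords[0] = low;
  asWords[1] = high;
  return asDouble[0];
}

const [low, high] = doubleToWords(1.5);
console.log(low.toString(16), high.toString(16));  // 0 3ff80000
console.log(wordsToDouble(low, high));             // 1.5
```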

The exploit can be found here with some set up notes.

Conclusion

With different tiers of optimizations in Chrome, the same functionality often needs to be implemented multiple times, each with different and specific optimization considerations. For complex routines that rely on subtle assumptions, this can result in security problems when porting code between different optimizations, as we have seen in this case, where the implementation of FindNonDefaultConstructorOrConstruct in Maglev missed an important check.
