Bypassing ASLR and DEP on Adobe Reader X

Fri 22 June 2012 by guillaume

Due to their complexity and their large deployment on users' machines, Adobe products (especially Flash and Reader) have often received a lot of attention from attackers. Being aware of this fact, Adobe has taken one step forward in security with the latest version of their PDF reader, Adobe Reader X.

Adobe Reader X currently makes use of three different techniques to hold off attackers on Windows:

  • DEP (permanently enabled)
  • ASLR
  • Application sandboxing, with a derivative of Chrome's sandbox implementation

We will see here how a bug in the Chrome sandbox can lead to a full bypass of ASLR and DEP in the renderer process with good reliability (although without breaking the sandbox protection itself). The target will be an up-to-date Adobe Reader 10.1.3 on Windows 7 x64.

A few words about Adobe Reader's sandbox

Sandboxing appeared with Adobe Reader X on Windows. The idea is to run most of the code in a lower-privileged process, the renderer, which has no access to system resources (e.g. filesystem, registry...). Any privileged operation has to be executed by a higher-privileged process, the broker, which validates it against an internal policy to prevent malicious code from spreading on the system.

Adobe's sandbox is actually directly derived from Google Chrome's implementation. I will not describe the internals of the sandbox here, as this is not necessary to understand the rest of this article. If you're interested, I recommend reading the presentation by Paul Sabanal and Mark Vincent Yason at Black Hat US 2011.

The only interesting thing here will be the initialization phase of the sandbox.

When Adobe Reader is launched, the broker process is started first, and it then spawns the lower-privileged renderer process.

However, the broker has to set up a few things before running the renderer:

  • it creates a shared memory zone and a set of events to communicate with the renderer (IPC)
  • it installs hooks on syscalls into the renderer process by patching ntdll.dll code in memory

The renderer is not aware that it is running with low privileges. If no hooks were installed into its address space, most of its Win32 API calls would fail, leading quickly to application failure. Instead of that, system calls requiring higher privileges are intercepted and redirected to the broker with an IPC call. The broker verifies the operation, executes it if access is granted, and returns the result to the renderer through the IPC.

The unexpected behavior

Let's take a look at how hooks are installed by the broker:

  1. System call stubs are located inside the ntdll.dll image of the renderer
  2. A read-write-execute page is allocated with VirtualAllocEx and the original stubs are copied there
  3. The allocated page is set with read-execute rights
  4. JMP instructions are inserted in ntdll.dll functions to redirect to Adobe's custom handlers

Those handlers first try to execute the syscall in the renderer by jumping to the copied syscall stub. If the call fails, the handler sends an IPC request and the syscall is handled by the broker.

If you look at the renderer memory map after the hooks are installed, here's what you will see:
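The fallback logic of such a handler can be modelled in a few lines of C++. This is only a sketch of the behaviour described above, not Adobe's or Chromium's actual code: intercepted_call, local_stub and broker_ipc are hypothetical names, and treating every negative NTSTATUS as "the call failed" is an assumption.

```cpp
#include <cstdint>

// NTSTATUS is a signed 32-bit value; negative values are errors.
using NtStatus = std::int32_t;
using Call = NtStatus (*)();

constexpr bool nt_success(NtStatus status) { return status >= 0; }

// Hypothetical handler: try the relocated stub inside the renderer first,
// and only fall back to the broker (over IPC) if that attempt fails.
constexpr NtStatus intercepted_call(Call local_stub, Call broker_ipc) {
  NtStatus status = local_stub();
  return nt_success(status) ? status : broker_ipc();
}

// Stand-ins for illustration only.
constexpr NtStatus local_ok() { return 0; }  // STATUS_SUCCESS
constexpr NtStatus local_denied() {          // STATUS_ACCESS_DENIED
  return static_cast<NtStatus>(0xC0000022u);
}
constexpr NtStatus broker_granted() { return 0; }
constexpr NtStatus broker_refused() {
  return static_cast<NtStatus>(0xC0000022u);
}
```

The point that matters for the rest of the article is the first step: the relocated stubs are real, directly callable syscall stubs living in the renderer.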

vmmap.PNG

Close Adobe Reader and run it again a few more times. You will see that the page at 0x90000 is almost always mapped at this address. This is the page allocated by the broker containing the syscall stubs. But ASLR is supposed to be enabled on this process!

intlevel.PNG

So what's happening? The answer is simple: Windows does not randomize addresses allocated with VirtualAllocEx. Dynamic memory randomization is actually performed in userland when calling functions like HeapAlloc, and does not happen with the VirtualAlloc and VirtualAllocEx functions.

When this page is allocated by the broker, Windows will simply map it at the first memory address available. Hence the nearly constant memory address of 0x90000. This will always be the case unless the main module image or the first thread stack have already been mapped at this address.
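To make the consequence concrete, here is a toy model of that lowest-address-first placement. It is a sketch of the behaviour described above, not the real memory manager: first_fit, Region and the sample layout are made up for illustration, and the list of used regions is assumed to be sorted by base address. Only the 64 KB allocation granularity and the 0x10000 lower bound are actual Windows constants.

```cpp
#include <cstddef>
#include <cstdint>

struct Region {
  std::uint32_t base;
  std::uint32_t size;
};

constexpr std::uint32_t kGranularity = 0x10000;        // 64 KB allocation granularity
constexpr std::uint32_t kLowestUserAddress = 0x10000;  // first 64 KB is never handed out

constexpr std::uint32_t align_up(std::uint32_t value) {
  return (value + kGranularity - 1) & ~(kGranularity - 1);
}

// Returns the lowest granularity-aligned address where `size` bytes fit.
// `used` must be sorted by base address and must not overlap.
constexpr std::uint32_t first_fit(const Region* used, std::size_t count,
                                  std::uint32_t size) {
  std::uint32_t candidate = kLowestUserAddress;
  for (std::size_t i = 0; i < count; ++i) {
    if (used[i].base >= candidate && used[i].base - candidate >= size)
      break;  // the gap in front of this region is large enough
    std::uint32_t end = align_up(used[i].base + used[i].size);
    if (end > candidate) candidate = end;
  }
  return candidate;
}

// Hypothetical early-startup layout: low memory in use up to 0x90000,
// main module image mapped much higher.
constexpr Region kExampleLayout[] = {
    {0x10000, 0x70000},
    {0x80000, 0x10000},
    {0x400000, 0x100000},
};
```

With the same layout at every launch, first_fit gives the same answer at every launch, which is exactly the problem.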

This page is therefore executable, sits at a predictable address, and contains a few bytes of syscall stub code. Unfortunately those few bytes are sufficient to bypass ASLR and DEP, as we will explain below.

Return-to-syscalls

Adobe Reader is compiled as a 32-bit process. Since we target it on Windows 7 x64, here is what a syscall stub looks like in a WOW64 process:

MOV EAX, syscall_number         ; EAX = syscall number
XOR ECX, ECX                    ; ECX = sign extend
LEA EDX, [ESP + 4]              ; EDX = arguments
CALL DWORD PTR FS:[0C0h]        ; Syscall gate
ADD ESP, 4
RETN nargs                      ; Stdcall return

In this capture, we can see the original syscall 0x51 and the patched version of 0x52 just below (NtCreateFile):

hook.PNG

And the relocated code in the allocated page at 0x90000:

interceptor.PNG

The idea

Let's suppose we are in a situation where we control the return addresses of a vulnerable function. We can then chain calls to any of the hooked system calls. But we face very restrictive constraints:

  • Only a few system calls are available, those requiring higher privileges, and we are sandboxed
  • Some arguments are difficult to create (like pointers to OBJECT_ATTRIBUTES structures)

However, if we had control over EAX, we would be able to call any Windows system call. There is no gadget available to modify EAX in the few bytes of the 0x90000 page, but there is a solution. We can modify EAX with the return values of the Windows system calls, by generating Windows error codes on purpose.

When a syscall succeeds, Windows returns STATUS_SUCCESS (which is 0). When it fails, it returns the error code in EAX OR'ed with the 0xC0000000 constant. For example, 0xC0000005 stands for the STATUS_ACCESS_VIOLATION error code. No Windows syscall is represented by such a large number, but fortunately the Windows kernel will simply ignore the most significant bits passed in EAX. It is consequently possible to make "bad" system calls, get the return value in EAX and then gain access to new system calls.

Example with NtSetInformationProcess

It is possible to disable DEP at runtime with the syscall NtSetInformationProcess. This syscall is not hooked by the sandbox, so we cannot access it directly. Instead we make a call to the hooked NtUnmapViewOfSection with a bad address as an argument. Windows returns STATUS_NOT_MAPPED_VIEW in EAX, which is 0xC0000019. We can then call syscall number 0x19, which is NtSetInformationProcess.

This example is useless in our case because DEP has been permanently enabled and cannot be disabled. But it is possible to do more complex things by chaining errors.
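The arithmetic behind this trick fits in one line. The sketch below assumes that only the low 12 bits of EAX select the service, which is consistent with every pairing used in this article but is my assumption, not something established above; reachable_syscall and the k-prefixed constants are names chosen for illustration. The status values are the documented NTSTATUS codes, and the syscall numbers are the Windows 7 x64 ones quoted in this article.

```cpp
#include <cstdint>

// Assumption: the dispatcher keeps only the low 12 bits of EAX as the index.
constexpr std::uint32_t kServiceIndexMask = 0xFFF;

constexpr std::uint32_t reachable_syscall(std::uint32_t eax_after_failed_call) {
  return eax_after_failed_call & kServiceIndexMask;
}

// Error statuses used in this article.
constexpr std::uint32_t kStatusInvalidInfoClass = 0xC0000003;
constexpr std::uint32_t kStatusNoMemory = 0xC0000017;
constexpr std::uint32_t kStatusNotMappedView = 0xC0000019;
constexpr std::uint32_t kStatusInvalidPageProtection = 0xC0000045;

// Each failure leaves EAX holding the number of another syscall.
static_assert(reachable_syscall(kStatusNotMappedView) == 0x19,
              "NtSetInformationProcess");
static_assert(reachable_syscall(kStatusNoMemory) == 0x17,
              "NtWaitForMultipleObjects");
static_assert(reachable_syscall(kStatusInvalidInfoClass) == 0x03,
              "NtReadFile");
static_assert(reachable_syscall(kStatusInvalidPageProtection) == 0x45,
              "NtCreateEvent");
```

The last three pairings are the ones the rest of this article relies on.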

The NtWaitForMultipleObjects trick

Playing with Windows error codes leads to a straightforward observation: some error codes are impossible to generate (e.g. STATUS_NONEXISTENT_SECTOR). Fortunately, there exists one system call which does not return 0 on success: NtWaitForMultipleObjects. This system call takes an array of up to 64 handles as an argument, and returns the index of the first handle that is in a signaled state.

If we can

  1. create an array of 64 valid handles in memory
  2. control the index of one signaled handle
  3. get access to NtWaitForMultipleObjects

we will be able to set EAX to any value between 0 and 63 and then get access to the first 64 Windows system calls.

Accessing NtWaitForMultipleObjects

The syscall number of NtWaitForMultipleObjects is 0x17. The corresponding error code is STATUS_NO_MEMORY (0xC0000017). We can call the hooked syscall NtMapViewOfSection with

  • A valid section object handle
  • A pointer to a valid writable section size
  • A ZeroBits argument of 20, i.e. the number of high-order bits that must be zero in the allocated address

This will return STATUS_NO_MEMORY. As a valid section handle, we can use 0x4, a constant handle representing the memory shared with the broker. The broker also allocates a read-write memory zone used for the renderer initialization. As with the syscall page, this memory is consistently mapped at the same address, 0x80000. We set the size argument to a pointer into this memory and we trigger the NtMapViewOfSection call. We get STATUS_NO_MEMORY and we now have access to NtWaitForMultipleObjects.

Generating the array of handles

We now need to create an array of 64 valid Windows handles. We have access to

  • NtCreateMutant (already hooked)
  • NtCreateEvent by triggering the STATUS_INVALID_PAGE_PROTECTION error when calling the hooked NtCreateSection

We can then store 63 unsignaled event handles in the RW zone at 0x80000, and create a last, signaled event to complete the array. By varying the base address and size of the array passed to NtWaitForMultipleObjects, we can make EAX take any value between 0 and 63.

However, this method is suboptimal, as generating the handles one by one takes a lot of space in the shellcode. A better solution is to load the array of handles directly from an opened file that we control. Obviously, here, that will be the PDF document being rendered. So this time we need:
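The choice of base and size can be checked with a small simulation. This is a model of the wait-any semantics described above, not a call to the real syscall: simulated_wait_any, window_for_eax and WaitWindow are hypothetical names, and the array is assumed to hold 63 unsignaled handles followed by one signaled handle at index 63.

```cpp
#include <cstddef>

constexpr std::size_t kMaxWaitObjects = 64;  // limit on the handle array

struct WaitWindow {
  std::size_t first;  // index of the first handle passed to the call
  std::size_t count;  // number of handles passed to the call
};

// To obtain EAX == wanted, pass the sub-array that ends on the signaled
// handle and places exactly `wanted` unsignaled handles in front of it.
constexpr WaitWindow window_for_eax(std::size_t wanted) {
  return WaitWindow{kMaxWaitObjects - 1 - wanted, wanted + 1};
}

// Model of a wait-any call: index, relative to the window, of the first
// signaled handle, or -1 where the real call would block or time out.
constexpr int simulated_wait_any(const bool* signaled, WaitWindow window) {
  for (std::size_t i = 0; i < window.count; ++i)
    if (signaled[window.first + i]) return static_cast<int>(i);
  return -1;
}

// Checks that every value from 0 to 63 can be produced this way.
constexpr bool all_values_reachable() {
  bool signaled[kMaxWaitObjects] = {};
  signaled[kMaxWaitObjects - 1] = true;  // only the last handle is signaled
  for (std::size_t wanted = 0; wanted < kMaxWaitObjects; ++wanted) {
    if (simulated_wait_any(signaled, window_for_eax(wanted)) !=
        static_cast<int>(wanted))
      return false;
  }
  return true;
}
```

In the ROP chain this translates into a base pointer of array + 4 * (63 - n) and a count of n + 1 to land on syscall number n.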

  1. The handle of the opened PDF document
  2. A set of hardcoded valid handles in the document. Of course they need to be predictable to be hardcoded.
  3. Access to NtReadFile

We can easily reach NtReadFile by calling NtSetInformationThread with -2 (GetCurrentThread()) and a bad ThreadInformationClass as arguments. The resulting error code will be STATUS_INVALID_INFO_CLASS (0xC0000003). NtReadFile has syscall number 3, so that is fine for us.

The broker creates a lot of event handles (for IPC purposes) before the renderer is even started. As Windows handles are assigned incrementally (in multiples of 4), the broker's events have predictable handle values. Most of those events are unsignaled. You can observe this behavior in the Process Explorer capture below.

handles.PNG

We can for example set up an array of 63 DWORDs of value 0x10 (NtWaitForMultipleObjects accepts duplicates), and load them into memory with NtReadFile. We then make a final call to NtCreateMutant to create the 64th, signaled object.

Finally, we need the handle of our PDF document. We will not bruteforce it, as that would be ugly (and we want to reduce our shellcode size). Thanks to JavaScript we can use an alternative technique: handle spraying!

The JavaScript method openDataObject reads a PDF document embedded as an attachment in the current PDF document. The document is parsed by Adobe Reader, but it is not rendered. Each time this method is called, Adobe Reader extracts the attachment into a temporary file. So we embed our PDF containing the 63 handles in another PDF, and we call openDataObject on it a large number of times. This creates a lot of Windows file handles pointing to our desired PDF, spraying them across our process handle space. Then, if we choose a sufficiently high handle value, we are sure to land on a handle to our PDF.
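For illustration, the bytes of such an array can be generated as follows. This is a sketch: make_handle_blob and HandleBlob are hypothetical names, 0x10 is just the example value used above, and where the blob sits inside the PDF is not modelled here. Handles are 32-bit little-endian values in the WOW64 renderer.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kHandleCount = 63;     // the 64th entry comes from NtCreateMutant
constexpr std::uint32_t kHandleValue = 0x10; // example value of a broker event handle

struct HandleBlob {
  std::uint8_t bytes[kHandleCount * 4];
};

// Serializes kHandleCount copies of kHandleValue as little-endian DWORDs.
constexpr HandleBlob make_handle_blob() {
  HandleBlob blob{};
  for (std::size_t i = 0; i < kHandleCount; ++i) {
    for (std::size_t b = 0; b < 4; ++b) {
      blob.bytes[4 * i + b] =
          static_cast<std::uint8_t>((kHandleValue >> (8 * b)) & 0xFF);
    }
  }
  return blob;
}
```

The result is 252 bytes long, the pattern 10 00 00 00 repeated 63 times.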

You can see the result below:

handle_spraying.PNG

Although they have different names, those files take up the space of a single file in the container document. The PDF attachment will look like this:

pdfhandles_resized.png

The NtReadFile syscall requires a pointer to a 64-bit offset as an argument. We therefore first read an offset stored at the beginning of the file (finding a pointer to 0 is easy), then make another call to NtReadFile to read the array of handles. Adobe Reader allows junk before the PDF header, up to a certain point. To save some time, we also load the final payload from the file during this operation.

In a nutshell,

  1. we put an array of handles, all set to 0x10, into a PDF document
  2. we store this PDF into another document as an attachment
  3. we spray the file handles in JavaScript
  4. we call NtReadFile twice to read the array of handles, then we complete it with NtCreateMutant
  5. we call NtWaitForMultipleObjects on this array with different base addresses and sizes to modify EAX between 0 and 63

At this point, the first 64 Windows system calls are accessible to us.

Copying the shellcode in memory

Among the new syscalls available to us, there are:

  • NtFreeVirtualMemory (0x1B), to delete a piece of allocated memory
  • NtAllocateVirtualMemory (0x15), to allocate some memory with write and execute rights
  • NtWriteVirtualMemory (0x37), equivalent to memcpy if we pass a handle of value -1 (GetCurrentProcess())

We could call NtAllocateVirtualMemory with a NULL address to let the system choose an available spot, but it would be difficult to pass the resulting address as an argument to NtWriteVirtualMemory. Consequently I preferred to ensure a page of memory is free by calling NtFreeVirtualMemory at a fixed address (it does not matter if the call fails), then reallocating that page with writable/executable attributes. The payload read from the PDF document earlier is then copied into that page with NtWriteVirtualMemory.

Executing the payload

Last but not least, we dress up as a ninja and return to our newly allocated address containing our payload.

Mission accomplished!

What about Chrome?

This clearly seems to be a bug in the Chrome sandbox, so is Chrome vulnerable as well?

A patch was committed to the Chromium source tree last February. A new function is now called right after the renderer process creation:

// Reserve a random range at the bottom of the address space in the target
// process to prevent predictable alocations at low addresses.
void PoisonLowerAddressRange(HANDLE process) {
  unsigned int limit;
  rand_s(&limit);
  char* ptr = 0;
  const size_t kMask64k = 0xFFFF;
  // Random range (512k-4.5mb) in 64k steps.
  const char* end = ptr + ((((limit % 4096) + 512) * 1024) & ~kMask64k);
  while (ptr < end) {
    MEMORY_BASIC_INFORMATION memory_info;
    if (!::VirtualQueryEx(process, ptr, &memory_info, sizeof(memory_info)))
      break;
    size_t size = std::min((memory_info.RegionSize + kMask64k) & ~kMask64k,
                           static_cast<SIZE_T>(end - ptr));
    if (ptr && memory_info.State == MEM_FREE)
      ::VirtualAllocEx(process, ptr, size, MEM_RESERVE, PAGE_NOACCESS);
    ptr += size;
  }
}

This code will choose a random address between 0x80000 and 0x470000, taken as a multiple of 0x10000. Every free memory region between 0 and that address will be reserved. Since Windows chooses the first available address for VirtualAllocEx, the entropy is increased to 6 bits (not great, but still better than nothing).
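The 6-bit figure can be verified by enumerating every value the expression for end can take. The sketch below copies that expression from the patch and counts distinct results; poison_end and distinct_end_count are names of my own, and 32-bit arithmetic is assumed, as in the 32-bit renderer.

```cpp
#include <cstdint>

constexpr std::uint32_t kMask64k = 0xFFFF;

// Same computation as `end` in PoisonLowerAddressRange, with ptr == 0.
constexpr std::uint32_t poison_end(std::uint32_t limit) {
  return (((limit % 4096) + 512) * 1024) & ~kMask64k;
}

// poison_end only depends on limit % 4096 and never decreases over
// 0..4095, so counting changes of value counts the distinct outcomes.
constexpr int distinct_end_count() {
  int count = 0;
  std::uint32_t previous = 0;
  for (std::uint32_t r = 0; r < 4096; ++r) {
    std::uint32_t end = poison_end(r);
    if (end != previous) {
      ++count;
      previous = end;
    }
  }
  return count;
}

static_assert(poison_end(0) == 0x80000, "lowest possible end address");
static_assert(poison_end(4095) == 0x470000, "highest possible end address");
static_assert(distinct_end_count() == 64, "64 outcomes, i.e. 6 bits");
```

Each of the 64 outcomes covers exactly 64 consecutive values of limit % 4096, so they are equally likely.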

Conclusion

It is interesting to see how a mistake in the implementation of one security measure can undermine the effectiveness of another. ASLR in particular is a very fragile feature: it only works under the assumption that every executable page in the process address space is randomized. How many products, supposedly protected by ASLR, have we seen that were still exploitable because one DLL wasn't compiled to be ASLR-compatible (oops...)? This fragility speaks for itself here, where a single static page with a few instructions is enough to nullify the protection.

It also shows how much a security measure can depend on the OS. When I first observed the non-randomized page, I browsed the code and couldn't find anything wrong with it, until I ran some tests on VirtualAllocEx myself and realized the result was never randomized. Dear Windows, that is not the expected behavior.

The following proof-of-concept code will create a new thread in Adobe Reader X, set up a stack frame with a ROP shellcode as described in this article and execute a RET instruction. You need to open the linked PDF beforehand so that handle spraying can happen.

Scripts used to generate the PDF:

The MessageBox payload was generated with Metasm. The handle spraying PDF was generated with Origami.

(Update) A new patch has been committed to Chromium by Justin Schuh, increasing the entropy to 16 bits.