Reading the NAND memory

A NAND memory chip is made of multiple pages grouped into blocks. Each page is split into two parts: the actual data area and a smaller spare area that contains error-correcting information and, optionally, metadata. Pages can be read and written, but a page must be erased before new data can be written to it. This is the main physical limitation of NAND storage. Erase operations are done at the block level: all the pages of a block are erased at once. Additionally, pages only support a limited number of erase/write cycles before they present too many errors and become unusable.
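To make the erase-before-write constraint concrete, here is a toy model of a single block. It is a simplified sketch and not a description of any particular chip: the class name NandBlock, the tiny page size and the page count are assumptions chosen for illustration. It relies only on the general NAND behaviour that an erased page reads back as 0xFF and that programming can only clear bits.

```python
class NandBlock:
    """Toy model of one NAND block (hypothetical, for illustration only)."""

    def __init__(self, pages_per_block=4, page_size=8):
        self.page_size = page_size
        self.erase_count = 0
        # An erased NAND page reads back as all 0xFF bytes.
        self.pages = [bytes([0xFF]) * page_size for _ in range(pages_per_block)]

    def program(self, page_no, data):
        """Write one page. Bits can only go from 1 to 0 until the next erase."""
        if len(data) != self.page_size:
            raise ValueError("data must be exactly one page long")
        old = self.pages[page_no]
        # Programming over existing data yields old AND new, not new.
        self.pages[page_no] = bytes(a & b for a, b in zip(old, data))

    def erase(self):
        """Erase the whole block: there is no per-page erase."""
        self.pages = [bytes([0xFF]) * self.page_size for _ in self.pages]
        self.erase_count += 1


blk = NandBlock()
blk.program(0, b"\x0f" * 8)
blk.program(0, b"\xf0" * 8)   # second write without erase corrupts the page
print(blk.pages[0].hex())
```

Writing twice to the same page without an erase leaves the bitwise AND of both writes, which is why the upper layers always write new data to a fresh page instead.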

iOS devices use one or more identical chips addressed by their "chip enable" number or CE. The actual geometry parameters (number of CEs, number of blocks per CE, number of pages per block, page and spare area sizes) depend on the device model and the total storage capacity. A physical address is composed of the "chip enable number" (CE), and a physical page number (PPN) on that chip.

Because of these NAND limitations, a Flash Translation Layer (FTL) is commonly used to let operating systems treat NAND memory as a regular block device while optimizing its lifespan and performance under the hood. The main goal is to reduce the number of erase operations and to spread them evenly across all blocks. From a forensics point of view, the interesting side effect is that when a logical block is overwritten at the block device level, most of the time the older data is not erased immediately at the physical level. Thus, working on the raw image of the NAND can be very useful when searching for deleted data.
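The forensic side effect can be illustrated with a deliberately tiny page-mapping FTL. This is a sketch under simplifying assumptions (no garbage collection, no blocks, no wear leveling), and the ToyFTL class and its method names are hypothetical, not taken from any real implementation.

```python
class ToyFTL:
    """Log-structured toy FTL: every write goes to the next free physical page."""

    def __init__(self, num_physical_pages):
        self.physical = [None] * num_physical_pages   # page contents, None = erased
        self.spare_lpn = [None] * num_physical_pages  # logical page stored in each page
        self.mapping = {}                             # logical page -> current physical page
        self.next_free = 0

    def write(self, lpn, data):
        if self.next_free >= len(self.physical):
            raise RuntimeError("no free page left (a real FTL would garbage-collect)")
        ppn = self.next_free
        self.next_free += 1
        self.physical[ppn] = data
        self.spare_lpn[ppn] = lpn
        # Only the mapping is updated: the previous copy stays on flash.
        self.mapping[lpn] = ppn

    def read(self, lpn):
        return self.physical[self.mapping[lpn]]

    def stale_copies(self, lpn):
        """Older physical copies of a logical page that were never erased."""
        current = self.mapping.get(lpn)
        return [self.physical[p] for p, stored in enumerate(self.spare_lpn)
                if stored == lpn and p != current]


ftl = ToyFTL(8)
ftl.write(5, b"secret v1")
ftl.write(5, b"zeroed")          # "overwrite" at the block device level
print(ftl.read(5), ftl.stale_copies(5))
```

The block device only ever sees the latest version, while the raw image still holds the previous one until a garbage collector reclaims it.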

It is possible to read the raw NAND using openiBoot, but currently USB transfers are quite slow, which makes it impractical for dumping the whole Flash memory.

Starting with iOS 3, a program called ioflashstoragetool can be found on Apple iOS ramdisks. This utility can perform many low-level operations related to flash storage, and can read the raw NAND pages and spare areas (without performing any kind of decryption). These functionalities are exposed by the IOFlashControllerUserClient kernel service.

In iOS 5, most of the functions exposed by this IOKit interface were removed. To create a dump using this interface, we can boot a ramdisk using an older iOS 4 kernel. This works well; however, we then lose the ability to use the existing kernel code to bruteforce the newer iOS 5 keybags. In order to dump the NAND with the iOS 5 kernel, we re-implemented the part of the IOFlashControllerUserClient::externalMethod function responsible for the read functionality. When our NAND dumper runs under the iOS 5 kernel, it replaces this function with one that handles the kIOFlashControllerReadPage selector.

Additionally, we can set the nand-disable-driver boot flag to prevent high-level access to the NAND and make sure it is not modified during the acquisition.

iOS Flash Translation Layer

The Flash Translation Layer used on iOS devices is based on the Samsung Whimory FTL (WMR). The openiBoot team did a great job at reverse-engineering it, so we could understand the mechanisms and structures used. Whimory translation code is split into two layers: VFL and FTL.

The Virtual Flash Layer (VFL) is responsible for remapping bad blocks and presenting an error-free NAND to the FTL layer. The VFL layer knows the physical geometry and translates virtual page numbers used by the FTL to physical addresses (CE number + physical page number).

The FTL layer operates over VFL, and presents the block device interface to the operating system. It translates block device logical page numbers (LPNs) to virtual page numbers, and handles wear leveling and garbage collection of blocks containing outdated data. On devices that support hardware encryption, all pages that contain data structures related to VFL and FTL are encrypted with a static metadata key.

[Figure: ftl_overview.png]

The two FTL subsystems have evolved through the various devices and iOS versions:

  • iOS 1.x and 2.x: “legacy” FTL/VFL
  • iOS >= 3: YaFTL/VSVFL
  • Since iOS 4.1, some of the newer devices are equipped with PPN (Perfect Page New) NAND that uses a specific PPNFTL

PPN devices have their own controller, running a firmware that can be upgraded through the IOFlashControllerUserClient interface, but most of the FTL work seems to still be done in software, using YaFTL on top of a new PPNVFL.

Based on the openiBoot code, we wrote a minimal read-only Python implementation of YaFTL/VSVFL to get the block-device view over raw NAND images. Combined with a Python HFS+ implementation, this allows extraction of the logical partitions to get the equivalent of a dd-image. We then need to understand the YaFTL mechanisms in order to leverage the additional data available in the NAND image.

YaFTL

YaFTL is a page-mapping FTL, where logical pages can be stored anywhere and in any order on the physical media. It is quite similar to DFTL: page mapping information (called index pages) is stored on Flash and cached in memory on access. YaFTL splits the virtual address space presented by the VFL layer into superblocks. A superblock can be seen as a "row" of physical NAND blocks. There are 3 types of superblocks, based on the type of pages they store:

  • Context pages store all the information required to initialize YaFTL, including the userTocPages array that points to up-to-date index pages. They also store erase counters for wear leveling.
  • Index pages store pointers to user pages
  • User pages contain block device data

The following figure summarizes the YaFTL translation process:

[Figure: yaftl.png]

During normal operation, only one superblock of each type is “open” at a given time: pages are written sequentially in a log-block fashion. When the current superblock is full, YaFTL finds a free superblock to continue the process. Outdated user data is only erased when the garbage collector kicks in.

The last pages of Index and User superblocks are used to store the BTOC (block table of contents). For User blocks, the BTOC lists the Logical Page Numbers of all the pages stored in the block. For Index blocks, the BTOC indicates the first LPN pointed to by each index page.

FTL restore is performed at boot when the FTL was not unmounted properly (after a kernel panic or hard reboot for instance) and the latest context information was not committed to flash storage. The FTL restore function has to examine all superblocks (using BTOCs to speed up the process) in order to reconstruct the correct context.
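The idea behind the restore can be sketched as a replay of user superblocks from oldest to newest, where the last write to a logical page wins. This is a simplified illustration and not the actual YaFTL restore code: the function name rebuild_mapping and the input format (one tuple per superblock holding its sequence number, its index, and a BTOC-like list of the LPN written in each page) are assumptions.

```python
def rebuild_mapping(superblocks):
    """Rebuild an LPN -> (superblock index, page offset) map.

    superblocks: iterable of (usn, sb_index, lpns) tuples (hypothetical format).
    lpns lists, in write order, the logical page number held by each page of
    the superblock, or None for a page that was never written.
    """
    mapping = {}
    # Replay superblocks from the oldest to the newest sequence number.
    for usn, sb_index, lpns in sorted(superblocks, key=lambda sb: sb[0]):
        # Pages inside a superblock are written sequentially, so a later
        # offset is a more recent version of the same logical page.
        for offset, lpn in enumerate(lpns):
            if lpn is not None:
                mapping[lpn] = (sb_index, offset)
    return mapping


superblocks = [
    (7, 2, [10, 11, None]),   # newer superblock, still open
    (3, 0, [10, 12, 13]),     # older, full superblock
]
print(rebuild_mapping(superblocks))
```

For a full superblock the list of LPNs would come from the BTOC; for an open one it would have to be read from the spare area of each page.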

Spare area metadata

iOS reserves 12 bytes in the spare area of each page to store metadata. The YaFTL metadata format is described in openiBoot:

// Page types (as defined in the spare data "type" bitfield)
#define PAGETYPE_INDEX          (0x4)   // Index block indicator
#define PAGETYPE_LBN            (0x10)  // User data
#define PAGETYPE_FTL_CLEAN      (0x20)  // FTL context (unmounted, clean)
#define PAGETYPE_VFL            (0x80)  // VFL context

...
typedef struct {
    uint32_t lpn;            // Logical page number
    uint32_t usn;            // Update sequence number
    uint8_t  field_8;
    uint8_t  type;           // Page type
    uint16_t field_A;
} __attribute__((packed)) SpareData;

The lpn field allows the FTL code to check if the translation was correct when reading a page. It is also used during the FTL restore process, to identify pages in “open” superblocks that do not have a BTOC.

The usn field records the global update sequence number at the time the page was written. This number is incremented every time a new version of the FTL context is committed or when a superblock is full and a new one is opened. The usn makes it easy to sort superblocks by age.

Metadata whitening

The hardware encryption only applies to the page data and not to the spare area. On recent devices running iOS 4, metadata stored in the spare area is scrambled through a process called "metadata whitening". The 12 bytes of the SpareData structure are XORed with pseudorandom values depending on the physical page number. The algorithm can be found in openiBoot:

static uint32_t h2fmi_hash_table[256];

...

void h2fmi_init()
{
...
    // This is a very simple PRNG with
    // a preset seed. What are you
    // up to Apple? -- Ricky26
    // Same as in 3GS NAND -- Bluerise
    uint32_t val = 0x50F4546A;
    for(i = 0; i < 256; i++)
    {
        val = (0x19660D * val) + 0x3C6EF35F;

        int j;
        for(j = 1; j < 763; j++)
        {
            val = (0x19660D * val) + 0x3C6EF35F;
        }

        h2fmi_hash_table[i] = val;
    }
...
}

error_t h2fmi_read_single_page(...)
{
...
    if(h2fmi_data_whitening_enabled)
    {
        uint32_t i;
        for(i = 0; i < 3; i++)
            ((uint32_t*)_meta_ptr)[i] ^= h2fmi_hash_table[(i + _page) % ARRAY_SIZE(h2fmi_hash_table)];
    }
...
}

Whether metadata whitening is enabled or not is indicated in the flags field of the NANDDRIVERSIGN special page.
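For offline processing of a dump, the openiBoot excerpt above can be ported to Python as follows. This is a sketch, not code from our tools: the function names are ours, the explicit 32-bit mask reproduces the uint32_t overflow of the C code, and the little-endian word layout is an assumption matching the uint32_t cast on ARM.

```python
import struct

MASK32 = 0xFFFFFFFF


def build_hash_table():
    """Rebuild h2fmi_hash_table: 256 outputs of an LCG with a fixed seed."""
    table = []
    val = 0x50F4546A
    for _entry in range(256):
        # The C code does one step, then 762 more (j = 1..762): 763 in total.
        for _step in range(763):
            val = (0x19660D * val + 0x3C6EF35F) & MASK32
        table.append(val)
    return table


HASH_TABLE = build_hash_table()


def whiten_metadata(meta, page):
    """XOR the first three 32-bit words of the spare metadata.

    XOR is its own inverse, so the same call whitens and unwhitens.
    Bytes past the first 12 are returned unchanged.
    """
    if len(meta) < 12:
        raise ValueError("metadata must be at least 12 bytes long")
    words = struct.unpack("<3I", meta[:12])
    out = [w ^ HASH_TABLE[(i + page) % len(HASH_TABLE)]
           for i, w in enumerate(words)]
    return struct.pack("<3I", *out) + bytes(meta[12:])


plain = bytes(range(12))
scrambled = whiten_metadata(plain, 1000)
print(whiten_metadata(scrambled, 1000) == plain)
```

The page argument plays the role of _page in the C code, that is the physical page number used for the read.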

Recovering deleted files

Once the NAND image is acquired and the geometry is known, we can start digging for deleted files. The first step is to build a lookup table that references all available versions of each logical page. Two methods can be used:

  • Read every spare area in the image to find pages where type is PAGETYPE_LBN, and read the lpn field (bruteforce approach)
  • Loop through each non-empty user superblock, read BTOCs when available or scan all pages if the block is not full
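The bruteforce approach can be sketched as follows. The function name build_lpn_versions and its input format (an iterable of (ce, ppn, spare_bytes) tuples with already unwhitened spare areas) are assumptions made for this example, as is the little-endian layout of the metadata.

```python
import struct
from collections import defaultdict

PAGETYPE_LBN = 0x10  # user data page, from the openiBoot definitions


def build_lpn_versions(spares):
    """Map each logical page number to all its physical versions.

    spares: iterable of (ce, ppn, spare_bytes) tuples (hypothetical format).
    Returns {lpn: [(usn, ce, ppn), ...]} with the highest usn first.
    """
    versions = defaultdict(list)
    for ce, ppn, spare in spares:
        if len(spare) < 12:
            continue
        lpn, usn, _f8, page_type, _fa = struct.unpack_from("<IIBBH", spare)
        if page_type != PAGETYPE_LBN:
            continue  # index, context and erased (0xFF) pages are skipped
        versions[lpn].append((usn, ce, ppn))
    for entries in versions.values():
        # Sort on usn only: pages sharing a usn keep their scan order.
        entries.sort(key=lambda e: e[0], reverse=True)
    return dict(versions)


def make_spare(lpn, usn, page_type=PAGETYPE_LBN):
    return struct.pack("<IIBBH", lpn, usn, 0, page_type, 0)


pages = [
    (0, 10, make_spare(7, 3)),
    (0, 11, make_spare(7, 9)),
    (1, 4, make_spare(8, 5)),
    (0, 12, b"\xff" * 12),      # erased page
]
print(build_lpn_versions(pages))
```

Pages that share the same usn belong to the same superblock, so their relative age has to be derived from their position in that superblock, which this sketch does not attempt.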

Once this lookup table is built, we can then easily access all available versions of a given logical page. In order to recover deleted files in the data partition, we implemented a simple algorithm, similar to the HFS journal carving technique:

  • list all the file IDs in the data partition in its current state: we use the regular FTL translation, and use the EMF key to decrypt data
  • get the location of the current catalog file and attribute file (ranges of LBAs)
  • for each LBA belonging to the catalog file
    • for each available version of the current LBA
      • read the page for this version, decrypt it with the EMF key
      • search the page for catalog file records whose file ID is not present in the current file IDs list (deleted files)
  • repeat the same process on the attribute file to find the encryption keys for the deleted files identified previously (cprotect extended attributes)
  • for each deleted file found
    • loop through all the possible encryption keys and versions of the first logical block until the decrypted contents match a magic number (the magic values of common file headers). See the isDecryptedCorrectly function (which can be improved).
    • if the decryption key and the USN of the first block are found, read the next file blocks using that USN as reference. Another method is to start reading pages starting from the first file block we found, following the FTL “write order”: read until end of superblock, and then continue in the next one with a higher USN and so on until all file blocks are found.
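The key and version search of the last step can be sketched as a double loop with a magic number check. This is an illustration and not the isDecryptedCorrectly function itself: looks_decrypted, find_key_and_version and the short list of magic values are assumptions, and the decrypt callable is a placeholder for the real per-file decryption, which is out of scope here.

```python
# A few well-known file header magics (illustrative, far from exhaustive).
KNOWN_MAGICS = (
    b"\xff\xd8\xff",          # JPEG
    b"\x89PNG\r\n\x1a\n",     # PNG
    b"SQLite format 3\x00",   # SQLite database
    b"bplist00",              # binary property list
)


def looks_decrypted(data):
    """Heuristic: the plaintext starts with a known file header."""
    return any(data.startswith(magic) for magic in KNOWN_MAGICS)


def find_key_and_version(first_block_versions, candidate_keys, decrypt):
    """Try every key against every version of a file's first logical block.

    first_block_versions: list of (usn, ciphertext) tuples.
    decrypt: callable (key, ciphertext) -> bytes, supplied by the caller.
    Returns (key, usn) for the first match, or None.
    """
    for key in candidate_keys:
        for usn, ciphertext in first_block_versions:
            if looks_decrypted(decrypt(key, ciphertext)):
                return key, usn
    return None


# Toy demonstration with a single-byte XOR standing in for real decryption.
def xor_decrypt(key, data):
    return bytes(b ^ key for b in data)


versions = [
    (12, xor_decrypt(0x11, b"garbage!")),
    (9, xor_decrypt(0x5A, b"\xff\xd8\xff\xe0JFIF")),
]
print(find_key_and_version(versions, [0x01, 0x5A], xor_decrypt))
```

A header check like this one only works for file types with a recognizable magic value, which is one reason the real check can be improved.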

This naive algorithm gives good results on “static” files like pictures, where the whole file is written once and never updated. For files such as SQLite databases it would require some more logic to recover consistent snapshots of the successive versions, by detecting writes to the file header or tracking modifications to the catalog file entry (file modification date) for instance.

One file that could be interesting to recover is the system keybag. If an attacker were able to access the first version of the system keybag (when no passcode is set, right after the firmware restore), they could then access all the class keys without having to attack the current user's passcode. However, it is not possible to exploit older versions of the system keybag because of a second layer of encryption: the systembag.kb payload is encrypted with the BAG1 key, which is stored in the effaceable area and regenerated randomly every time a new version of the file is written to disk (when the user changes the passcode). This mechanism was clearly designed to prevent such attacks, as explained in the "Securing application data" talk from Apple WWDC 2010 (Session 209).

iOS 3.x wipe vulnerability

Once we had the ability to read the raw NAND contents, we took another look at the iOS 3 mechanisms. At that time, the whole data partition (including file contents) was encrypted with the EMF key, which was stored (encrypted) in the last logical block of the partition. Since there was no effaceable area at that time, we supposed that this last logical block was managed by the FTL just like the rest of the partition. By acquiring a NAND image right after wiping an iOS 3 device, it is indeed possible to find multiple versions of this logical block: one with the new EMF key generated during the wipe, and the older one that was only overwritten at the block device level. Thus, by using the old key and collecting the old data partition pages (based on the USN of the wipe), it is possible to (partially) reconstruct the wiped data partition. Of course, this vulnerability has been fixed since iOS 4 by the effaceable area, which allows encryption keys to be erased securely.
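Collecting the old pages "based on the USN of the wipe" amounts to picking, for each logical page, the newest version written before the wipe. A minimal sketch, assuming a lookup table of the form {lpn: [(usn, location), ...]} and a known wipe_usn (both the function name and this input format are hypothetical):

```python
def pre_wipe_view(versions, wipe_usn):
    """Return {lpn: location} using the newest version older than the wipe.

    versions: {lpn: [(usn, location), ...]} in any order.
    Logical pages with no surviving pre-wipe version are left out, which is
    why the reconstruction is only partial.
    """
    view = {}
    for lpn, entries in versions.items():
        older = [entry for entry in entries if entry[0] < wipe_usn]
        if older:
            view[lpn] = max(older, key=lambda entry: entry[0])[1]
    return view


versions = {
    1: [(20, "post-wipe"), (8, "old"), (5, "older")],
    2: [(21, "post-wipe only")],
}
print(pre_wipe_view(versions, wipe_usn=15))
```

Pages whose older versions were already reclaimed by the garbage collector simply do not appear in the result.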

The NAND acquisition and carving tools are now available on the iphone-dataprotection repository. Additional details are also available on the wiki. Finally, many thanks to Patrick Wildt and the openiBoot team for their great work on the iOS FTL that allowed us to build these tools.