An easy start

The crackme is a Windows application, and asks for a name and a serial number. The serial must be composed of two parts, separated by a hyphen. The first part is 8 chars long, the second one is 20 chars.

First, the name is hashed with MD5. The resulting hash is converted to a string of 16 lower chars, using a basic encoding function:

for(i = 15; i >= 0; i--)
{
    out[i] = 'a' + (hash[i] % 26);
}

The resulting string is converted to UNICODE and hashed with the RtlHashUnicodeString function. This function is exported by ntdll.dll. It returns a 32 bits hash using the x65599 (?) hashing algorithm. Input string is processed case insensitive. A pseudo-code for this function would be:

def x65599(string_to_hash):
    hash = 0
    s2 = string_to_hash.upper()
    for i in range(len(s2)):
        c = ord(s2[i])
        hash = (hash * 65599) & 0xffffffff
        hash = (hash + c) & 0xffffffff
    return hash

To sum up, a 32 bit hash is obtained from the user name.

Reversing CRC32

The first part of the serial, composed of 8 chars, is hashed using CRC32. The computation is done with the RtlComputeCrc32 function. The initial value of the CRC32 is fixed to 'ESET' (0x45534554). The resulting hash must be equal to the x65599 hash previously computed.

Obtaining a string with a given CRC32 is easy, if you can fix 4 bytes in it. For more information about CRC32 reversing, check this paper. The additional difficulties here are the custom start value and the fact that this string must be written using a custom set of 32 chars: "2WQKHTL3VJBYG6PZCM9AXF0UED5RS7N8".

How to handle such case? I chose to generate a string of 4 random chars taken in the charset, and to fix its CRC adding 4 bytes to it, so that the CRC is equal to the x65599 hash. Then, if the 4 added bytes are in the charset, the problem is solved. Else, I try with another string.

It is a small bruteforce: considering the possibility of each char to be in the charset is 32/256, the probability that the 4 added chars are in the charset is 1/2^12, which is really acceptable.

Here is the Python code used to fix the CRC:

def build_crc_tables():
    for i in range(256):
        fwd = i
        rev = i << 24
        for j in range(8, 0, -1):
            # build normal table
            if (fwd & 1) == 1:
                fwd = (fwd >> 1) ^ 0xedb88320
            else:
                fwd >>= 1
            crc32_table[i] = fwd
            # build reverse table =)
            if rev & 0x80000000 == 0x80000000:
                rev = ((rev ^ 0xedb88320) << 1) | 1
            else:
                rev <<= 1
            crc32_reverse[i] = rev

def crc32forge3(wanted_crc):
    ''' return a 8 chars string whose crc32 is equal to wanted_crc '''
    charset = '2WQKHTL3VJBYG6PZCM9AXF0UED5RS7N8'
           
    start_value = 0x45534554 ^ 0xffffffff # ESET
    while 1:
        # generate random string of 4 chars        
        s = ''.join(charset[random.randint(0, 31)] for i in range(4))
       
        crc = start_value        
        for i in range(4):
            crc = (crc >> 8) ^ crc32_table[(crc ^ ord(s[i])) & 0xff]
       
        s += pack('<L', crc)
        crc = wanted_crc ^ 0xffffffff
       
        for i in range(7, 3, -1):
            crc = (crc << 8) ^ crc32_reverse[crc >> 24] ^ ord(s[i])
            crc &= 0xffffffff
       
        s2 = s[:4] + pack('<L', crc)
       
        # check if added bytes are in the charset
        # else same player play again
        valid = True
        for i in range(4):            
            if charset.find(chr((crc >> (8 * i)) & 0xff)) == -1:
                valid = False
                continue
               
        if valid == True:
            return s2

crc32forge3(x65599_hash) # for example

Finding the first part of the serial is quite immediate.

A strange decimal conversion

The second part of the serial is 20 chars long. It is converted to a 64 bit integer n in the sub_4012DC function. This function is a bit tricky. Basically, it does a decimal string to integer conversion: each time a byte is read, n is multiplied by 10, the serial byte is converted to an integer between 0 and 9, and added to n.

How are the chars converted to a decimal value? This is the difficulty of this part. Input chars are checked like this:

movzx   eax, byte ptr [esi] ; esi points to the input string
                inc     esi
                test    eax, eax
                jz      short end_function
                xor     ebx, ebx
                xor     edx, edx
                sub     eax, 32h        ; eax -= 0x32
                jb      short end_function
                test    eax, 0E0h
                jnz     short end_function
                mov     edx, 1Fh        ; 5 bits set to 1
                mov     ecx, 4

loop1_start:                            ; CODE XREF: sub_4012DC+3D
                cmp     eax, edx
                jz      short loop1_end
                btc     edx, ecx        ; test and complement
                inc     ebx
                dec     ecx
                jns     short loop1_start

loop1_end:                              ; CODE XREF: sub_4012DC+36
                jz      short loop2_end
                mov     ecx, 4

loop2_start:                            ; CODE XREF: sub_4012DC+4F
                cmp     eax, edx
                jz      short loop2_end
                btc     edx, ecx        ; test and complement
                inc     ebx
                dec     ecx
                jns     short loop2_start

loop2_end:                              ; CODE XREF: sub_4012DC:loop1_end
                                        ; sub_4012DC+48
                jnz     short end_function

0x32 is subtracted to each char. Resulting value must be different from 0. Then, two loops are executed. They iterate reading the 5 lower bits of each char. Each loop exits when the read bit is different from the previous one. Once the two loops have been executed, the 5 bits have to be read. That means that to be valid, a char must be composed of

  • a sequence of 1 followed by a sequence of 0,
  • or a sequence of 0 followed by a sequence of 1.

Only 10 5-bit values validate this condition: 11111, 01111, 00111, 00011, 00001, 10000, 11000, 11100, 11110. Adding 0x32 to each of these values, we obtain the 10 bytes charset: 'Q', 'A', '9', '5', '3', '2', 'B', 'J', 'N' and 'P'. The value between 0 and 9 is contained, after the two loops, in the ebx register. It means that it is equal to the number of iterations the two loops have been executed while reading the byte.

To sum up, we know that sub_4012DC converts a string encoded in a 10 bytes custom charset into a 64 bit integer. In order to write a key generator, the inverse function has to be coded. It takes a number as input, and generates the corresponding encoded string. Here is the code:

def decimal_encode(n):    
    charset = 'QA9532BJNP' # btc stuff
    s = ''
   
    for i in range(20):
        s = charset[n % 10] + s
        n /= 10
    return s

ElGamal signature

The last part of the crackme is about a ElGamal signature verification. ElGamal signatures are composed of two numbers (r, s).

Retrieving message and signature

The first part of the serial, whose generation is explained above, is hashed with MD5. A 16 byte digest is obtained. It is used to generate a 64 bit integer. This integer is the r in the ElGamal signature. The function that converts the digest to a 64 bit integer is the following one:

sub_4012C2      proc near               ; CODE XREF: sub_4014C3+C6
                                        ; sub_4014C3+DB

arg_0           = dword ptr  4

                push    esi
                mov     esi, [esp+4+arg_0]
                mov     eax, [esi]
                mov     edx, [esi+4]
                xor     eax, [esi+8]
                xor     edx, [esi+0Ch]
                and     edx, 7FFFFFFFh
                pop     esi
                retn    4
sub_4012C2      endp

Basically it casts the digest into an array of unsigned integers ll, and computes  (ll[0] ^ ll[2] & 0x7fffffffffffffff.

s is the second part of the serial after its conversion to a number using explained above.

The user name is hashed with MD5. The resulting hash is converted to a 64 bit integer with sub_4012C2. Let's call this number h.

Finding group parameters

ElGamal security relies on the discrete logarithm problem. To make it short, it consists in finding x such as g^x = y mod p, given g, y and p. g, y and p are public parameters.

The signature is checked in sub_40121A. This function calls two big integers functions, powmod64 (sub_4011B0) which is a basic square-and-multiply modular exponentiation, and mulmod64 (sub_401178). Before being able to generate valid signatures, public parameters have to be identified.

The group generator is easy to spot: it is present in the signature verification function.

.text:00401237                 mov     dword ptr [edx], 3 ; generator
.text:0040123D                 mov     dword ptr [edx+4], 0
.text:00401244                 push    esi             ; h
.text:00401245                 push    edx
.text:00401246                 push    eax             ; result
.text:00401247                 call    powmod64        ; compute g^h mod p

So g = 3. y is also easy to identify: its value is 0x5353453233444f4e ("NOD32ESS").

p is not a parameter of the powmod64 function, which is a bit confusing. It is actually hardcoded inside this function, more precisely in the mod64 function (sub_40110B):

.text:0040113E                 push    0FFFFFFFFh
.text:00401140                 push    0FFFEA007h
.text:00401145                 push    edx
.text:00401146                 push    eax
.text:00401147                 call    RtlLargeIntegerSubtract

Hence p = 0xFFFFFFFFFFFEA007, which is prime.

Generating signatures

All the parameters have been found, we can now generate a signature. First, the private key has to be found: it is x such as g^x = y mod p.

The order of the generator g=3 is p-1, which is a smooth number:

p-1 = 2 * 3 * 3 * 5 * 7 * 11 * 13 * 3797 * 42701 * 1262887

Hence the private key can be efficiently computed using the Pohlig-Hellman algorithm. I included a version of the algorithm in the keygen source code. It is inspired from the Handbook of Applied Cryptography.

x is quickly found: x = 0x62dd00828bb0e577L

Now we have all the necessary information to generate valid signatures. Wikipedia gives us the formula:

p = 0xfffffffffffea007L
g = 3
y = 0x5353453233444f4eL
x = 0x62dd00828bb0e577L

h = ...
k = ... # random_value such as gcd(k, p-1) == 1

r = pow(g, k, p)
s = ((h - x * r) % (p-1))
s = s * number.inverse(k, p-1) % (p-1)

And the crackme is solved. Almost.

A last problem

The code just above generates r and s values that sign the message h. The problem is different here: remind that r comes from the MD5 of the first part of the serial. It cannot be chosen randomly, and must be considered as a fixed value...

r = pow(g, k, p) is fixed. What is rather annoying is that k, normally chosen at random, is necessary to compute s. How to retrieve k from r? Solving another discrete logarithm problem is necessary...

r depends on the hash of the first part of the serial. This part depends on the hash of the user name. Hence, for each user name, a discrete logarithm has to be solved to generate a valid signature.

Finally, once k has been found, one has to check that it is co-prime with p-1, else k has no inverse mod p-1, and s cannot be computed.

Once all theses steps have been understood, it is time to code a key generator. Source code is attached. The more interesting parts in this source are the CRC32 reversing function, and the runtime DLP solver, which uses Pohlig-Hellman and Pollard's Rho algorithm to compute rather efficiently the logarithms, even if it is written in Python.

I thank the ESET team for this nice crackme. Hiding such tricks in a so small piece of code is tremendous =)