Keygen 3: ELF x64 - Basic KeygenMe

Link: https://www.root-me.org/en/Challenges/Cracking/ELF-x64-Basic-KeygenMe (binary)

The aim of this KeygenMe is to generate a valid key for a given user name:

Find the serial for the “root-me.org” user. The validation password is the serial’s sha256 hash.

The binary asks for a user name or “Login”, but no serial number:

$ ./ch36.bin
[!] x64 NASM Keygen-Me
[A] root-me
[?] Login : root-me.org   <-- typed in

[.|..] THE GAME

Using strings reveals the message that is presumably printed when the correct key is generated:

$ strings ./ch36.bin
[!] x64 NASM Keygen-Me
[A] root-me
[?] Login : [?] Key :
[\o/] Yeah, good job bro, now write a keygen :)
[.|..] THE GAME
.m.key
...

If we open ch36.bin in Hopper, the first things we can notice are the lack of symbols and the small size of this binary (only 960 bytes). There’s one procedure called EntryPoint (which I believe is a name given by Hopper and not an actual symbol), and two anonymous functions. The output does mention NASM, so this was likely built by hand in assembly language and not compiled. It might both make things easier with code that is more readable (since it wasn’t generated) and harder due to the lack of recognizable functions.

There are a few system calls being made, so it’ll be helpful to have a reference table of system calls at hand.

We see the initial message being printed with a syscall for write (rax = 1, rdi = 1 for stdout, rsi = contents, edx = 0x32 for length):

000000000040018a         mov        eax, 0x1
000000000040018f         mov        edi, 0x1
0000000000400194         movabs     rsi, aX64NasmKeygenm    ; "[!] x64 NASM Keygen-Me..."
000000000040019e         mov        edx, 0x32
00000000004001a3         syscall 

The second syscall is a read (rax = 0, edi = 0 for stdin, rsi = 0x600260 for the buffer, edx = 0x20 for the length):

00000000004001a5         mov        eax, 0x0
00000000004001aa         mov        edi, 0x0
00000000004001af         movabs     rsi, 0x600260
00000000004001b9         mov        edx, 0x20
00000000004001be         syscall 

And finally the third one is an open (rax = 2, rdi = 0x40012e for the file name, esi = 0x0 for O_RDONLY):

00000000004001c0         mov        eax, 0x2
00000000004001c5         movabs     rdi, 0x40012e
00000000004001cf         mov        esi, 0x0
00000000004001d4         syscall 
00000000004001d6         cmp        rax, 0xfffffffffffffffe
00000000004001da         je         loc_400215

We can dump the file name with GDB:

gdb$ b *0x040018a
Breakpoint 1 at 0x40018a
gdb$ r
Starting program: ./ch36.bin

Breakpoint 1, 0x000000000040018a in ?? ()
=> 0x000000000040018a:  b8 01 00 00 00  mov    eax,0x1
gdb$ x /8b 0x40012e
0x40012e:   0x2e    0x6d    0x2e    0x6b    0x65    0x79    0x0 0xb8
gdb$ printf "%s", 0x40012e
.m.key

Remember that we saw this value in the output of strings. This is a hardcoded string in .data, something we can confirm with objdump -xs ch36.bin. Here rax contains the return value of open and is being compared to -1; we jump to a different place if that’s the case. This would happen if .m.key is missing, so we’ll likely need to create this file. It might also explain why we were only asked for a login and no serial number.

With the file open and its descriptor in rax, we continue with another read system call. rax is moved to rdi which is the first parameter for read syscall and rax (or eax here) is set to zero to encode the read operation:

00000000004001dd         mov        rdi, rax
00000000004001e0         mov        eax, 0x0
00000000004001e5         movabs     rsi, 0x600280
00000000004001ef         mov        edx, 0x20
00000000004001f4         syscall 

The buffer passed as input was originally initialized with null bytes, and will now contain the first 0x20 or 32 bytes of .m.key:

gdb$ x /32b 0x600280
0x600280:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x600288:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x600290:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x600298:    0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

The addresses for our input buffer and read buffer are then respectively placed in rdi and rsi, a function is called, and then a conditional jump is made based on the value of rax. This is likely what checks the input against the data read from .m.key, since rdi points to the buffer containing our input and rsi points to the buffer we read into:

00000000004001f6         movabs     rdi, 0x600260    ; argument #1, our previous input buffer
0000000000400200         movabs     rsi, 0x600280    ; argument #2, what we just read from `.m.key`
000000000040020a         call       sub_400146       ; unknown behavior so far
000000000040020f         cmp        rax, 0x0         ; checking for a return value of rax=0
0000000000400213         je         loc_400232

Let’s jump to 400146 and look at the code. The procedure is pretty short, and using simple instructions that are easy to follow:

sub_400146:
0000000000400146         call       sub_400135
000000000040014b         cmp        rax, 0x1
000000000040014f         je         loc_40017b

0000000000400151         mov        rbx, rax
0000000000400154         xor        rdx, rdx
0000000000400157         xor        rcx, rcx
000000000040015a         xor        rax, rax
000000000040015d         sub        rbx, 0x1

                     loc_400161:
0000000000400161         cmp        rcx, rbx
0000000000400164         je         loc_400182

0000000000400166         mov        dl, byte [rdi+rcx]
0000000000400169         mov        dh, byte [rsi+rcx]
000000000040016c         mov        al, dl
000000000040016e         sub        al, cl
0000000000400170         add        al, 0x14
0000000000400172         cmp        al, dh
0000000000400174         jne        loc_40017b

0000000000400176         inc        rcx
0000000000400179         jmp        loc_400161

                     loc_40017b:
000000000040017b         mov        eax, 0x1337
0000000000400180         jmp        loc_400185

                     loc_400182:
0000000000400182         xor        rax, rax

                     loc_400185:
0000000000400185         ret 

This procedure starts with an immediate call to 400135, just 17 bytes above our current position. It reads:

sub_400135:
0000000000400135         mov        eax, 0x0

                     loc_40013a:
000000000040013a         cmp        byte [rdi+rax], 0x0
000000000040013e         je         loc_400145

0000000000400140         inc        rax
0000000000400143         jmp        loc_40013a

                     loc_400145:
0000000000400145         ret 

This could hardly be simpler. eax is set to zero, then we compare the byte at rdi + rax to zero and jump to ret if they match; otherwise we increment rax and jump back to the cmp. So this is something like to rax = strlen(rdi), or more accurately:

eax = 0
while(*(rdi+rax) != 0) {
    rax++;
}

side note: It is somewhat unusual to see both eax and rax being used here, it seems that this program is making the assumption that the higher 32 bits of rax will be zero. This is not the only place where this assumption is made or that 32 and 64-bit registers are mixed up. If we called this with rax set to 1<<48, the mov eax, 0x0 would not have any effect and the cmp would be done with rdi + 1<<48 in the first loop, instead of rdi + 0. It would make more sense for the first instruction to read xor rax, rax. In addition, mov eax, 0x0 encodes to B8 00 00 00 00 while the equivalent xor rax, rax only takes 3 bytes with 48 31 C0. Not the most efficient way to do this and a clear indication that the program was written by hand, possibly by someone relatively new to assembly language.

Back to 400146. If rax is 1 we jump almost to the end, at 40017b. There, eax is set to 0x1337 (leet) and we jump to ret. The instruction we jump over is a xor rax, rax which looks like an alternative end to this procedure. If rax was not 1, we continue with a few instructions that operate on our inputs. Recall that rdi points to our input and rsi points to a buffer in which we’ve read the contents of .m.key.

First, rax (our input length) is decremented and stored in rbx, and rax, rcx, rdx are cleared:

0000000000400151         mov        rbx, rax     ; rbx = length(input)
0000000000400154         xor        rdx, rdx     ; rdx = 0
0000000000400157         xor        rcx, rcx     ; rcx = 0
000000000040015a         xor        rax, rax     ; rax = 0
000000000040015d         sub        rbx, 0x1     ; rbx--

If rcx has reached rbx, we jump to the xor rax, rax at the very end.

0000000000400161         cmp        rcx, rbx
0000000000400164         je         loc_400182

It looks like rcx will be our counter as we go over the input.

Now for the core of the loop:

0000000000400166         mov        dl, byte [rdi+rcx]
0000000000400169         mov        dh, byte [rsi+rcx]
000000000040016c         mov        al, dl
000000000040016e         sub        al, cl
0000000000400170         add        al, 0x14
0000000000400172         cmp        al, dh
0000000000400174         jne        loc_40017b

0000000000400176         inc        rcx
0000000000400179         jmp        loc_400161

We move the input byte at offset rcx into dl, and the byte from the same offset in the file into dh. dl is transferred to al, and we subtract cl (the bottom 8 bits from rcx) from al before adding 0x14. If the total isn’t dh, we jump again to 40017b, which is the mov eax, 0x1337. As long as they match, we increment rcx and go back to the cmp at 400161. We can re-write this whole loop as:

top:
    if (rcx == rbx) goto ret_zero;

    al = input[rcx] - cl + 0x14;
    dh = file_contents[rcx];
    if (al != dh) goto ret_1337;

    rcx++
    goto top;

ret_1337:
    rax = 0x1337;
    return;

ret_zero:
    rax = 0;
    return;

So it looks like we’re comparing the contents of .m.key with input[i]-i+0x14 for i covering the length of the input. If they all match we’ll return 0, otherwise we’ll return 0x1337. We saw earlier that there was a conditional jump after the call to this function based on whether rax was zero, so this is likely our success indicator.

In Python, let’s start with the input string, then compute the value above for each byte:

>>> s = 'root-me.org'
>>> list(zip(s, range(len(s))))  # zip with increasing i offset
[('r', 0), ('o', 1), ('o', 2), ('t', 3), ('-', 4), ('m', 5), ('e', 6), ('.', 7), ('o', 8), ('r', 9), ('g', 10)]
>>> ''.join([chr(ord(s[i])-i+0x14) for i in range(len(s))])  # input[i]-i+0x14
'\x86\x82\x81\x85=|s;{}q'

And then try it out:

$ echo -ne "\x86\x82\x81\x85=|s;{}q" > .m.key
$ ./ch36.bin
[!] x64 NASM Keygen-Me
[A] root-me
[?] Login : root-me.org

[\o/] Yeah, good job bro, now write a keygen :)

This wasn’t a particularly difficult keygen. I found that the mistakes made by the author in their use of mismatched registers or inefficient instructions to be the most interesting part, clearly revealing that the code was handcrafted by someone still learning the ropes. Good job nonetheless!

You can download an annotated version of the Hopper file for this binary here.