Crackme 13: buga0205’s Code Linux

Link: http://crackmes.cf/users/buga0205/code_linux/ (now dead, see archive alt.) (binary)

$ ./CodeLinux
*****This is Code Linux Agent!*****
- Enter The Code : 1234

Alert! Unidentified User!

The first thing to notice is the size of this executable, which contains a large amount of code since it was statically linked. It also seems like most symbols were stripped, so Hopper only lists procedures as sub_XXXXXX with hex addresses. By looking at the parameters, we can deduce the names of some of them. For example, sub_8050120 seems to be printf:

08049716         mov        dword [esp+0x20+var_20], aThisIsCodeLinu
0804971d         call       sub_8050120
...
0804978a         mov        dword [esp+0x20+var_20], aNalertUnidenti
08049791         call       sub_8050120

We can rename them as we go along in Hopper. This program also has some basic anti-debugging protection, visible here:

# strace ./CodeLinux
execve("./CodeLinux", ["./CodeLinux"], [/* 8 vars */]) = 0
[...]
ptrace(PTRACE_TRACEME)                  = -1 EPERM (Operation not permitted)
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7730000
write(1, "Alert! I hate debugging stuff,,,"..., 34Alert! I hate debugging stuff,,,\n) = 34

This string is easy to find in the binary:

080496fc         call       sub_8075510
08049701         test       eax, eax
08049703         jns        loc_8049716 ; this is the block below

08049705         mov        dword [esp+0x20+var_20], aAlertIHateDebu
0804970c         call       sub_804fab0
08049711         jmp        loc_8049832

                loc_8049716:
08049716         mov        dword [esp+0x20+var_20], aThisIsCodeLinu

In the past, we’ve replaced the conditional jump at 08049703 with a JMP to bypass the alert; let’s use a different approach here, just for the novelty of it: we’ll replace test eax, eax with xor eax, eax. This zeroes eax and clears the sign flag, so the jns is always taken and the error message is never printed. There is an existing xor eax, eax at 08049862, which shows that it also takes 2 bytes: 0x31 0xC0, where the test was 0x85 0xC0. We can therefore change a single byte at 08049701 to create a new binary.

While we’re at it, let’s remove the delay that comes before “Alert! Unidentified User”. It shows up as a nanosleep call when tracing and seems to last 1 second, which matches the value passed here (presumably a number of microseconds):

08049742         mov        dword [esp], 0xf4240 ; 1 million
08049749         call       sub_80754c0

We can replace 0xf4240 with zeros, which gives us a new binary. Tracing it with an incorrect input shows the fixed nanosleep call:

write(1, "- Enter The Code : ", 19- Enter The Code : )     = 19
read(0, 1234
"1234\n", 1024)                 = 5
nanosleep({tv_sec=0, tv_nsec=0}, NULL)  = 0
write(1, "\n", 1
)                       = 1
write(1, "Alert! Unidentified User! \n", 27Alert! Unidentified User!
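
Both patches are plain byte substitutions, so they can be scripted. The following is a minimal sketch rather than the exact procedure used here: it assumes that the segment containing the code maps file offset 0 at 0x08048000 (common for i386 ELF executables, but worth confirming with readelf -l), and the names va_to_offset, patch, apply_patches and patch_file are made up for this example. The immediate of mov dword [esp], 0xf4240 starts 3 bytes into the instruction, after the C7 04 24 opcode bytes, hence the address 08049745.

```python
ASSUMED_BASE = 0x08048000  # assumption: file offset 0 is mapped at this address


def va_to_offset(va, base=ASSUMED_BASE):
    """Convert a virtual address to a file offset under the assumption above."""
    return va - base


def patch(image, va, expected, replacement):
    """Overwrite bytes at `va`, refusing to touch anything unexpected."""
    off = va_to_offset(va)
    found = bytes(image[off:off + len(expected)])
    if found != expected:
        raise ValueError("unexpected bytes at %#x: %s" % (va, found.hex()))
    image[off:off + len(replacement)] = replacement


def apply_patches(image):
    # test eax, eax (85 C0) becomes xor eax, eax (31 C0)
    patch(image, 0x08049701, b"\x85\xc0", b"\x31\xc0")
    # the 32-bit immediate 0xf4240, stored little-endian, becomes zero
    patch(image, 0x08049745, b"\x40\x42\x0f\x00", b"\x00\x00\x00\x00")
    return image


def patch_file(src, dst):
    """Read `src`, apply both patches, write the result to `dst`."""
    with open(src, "rb") as f:
        image = bytearray(f.read())
    apply_patches(image)
    with open(dst, "wb") as f:
        f.write(image)
```

Calling patch_file("CodeLinux", "CodeLinux.patched") (the output name is arbitrary) would write the modified copy. Checking the expected bytes first means a wrong offset fails loudly instead of corrupting the file.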

With these two out of the way, let’s look at how our input is processed. We enter sub_805c060 with our password on the stack, and make eax point to it. It processes the input 4 bytes at a time, combining each word with telltale constants like 0xfefefeff or 0x1010100. This is a fast implementation of strlen; see glibc’s take on it on StackOverflow.
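
The idea behind these constants can be sketched in Python. This uses the well-known zero-byte test (w - 0x01010101) & ~w & 0x80808080, a close relative of what the binary does rather than a transcription of it; note that adding 0xfefefeff to a 32-bit word is the same as subtracting 0x01010101. The function names are made up for this example.

```python
import struct


def has_zero_byte(word):
    """True if any of the 4 bytes of a 32-bit word is zero."""
    return ((word - 0x01010101) & ~word & 0x80808080) != 0


def fast_strlen(buf):
    """Length of a NUL-terminated string in `buf`, scanning 4 bytes at a time."""
    n = 0
    # Skip whole words as long as they contain no zero byte.
    while n + 4 <= len(buf):
        (word,) = struct.unpack_from("<I", buf, n)
        if has_zero_byte(word):
            break
        n += 4
    # Finish byte by byte to locate the terminator exactly.
    while n < len(buf) and buf[n] != 0:
        n += 1
    return n
```

The word-sized test only says that a terminator is somewhere in the current word, which is why a byte-wise pass is still needed at the end.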

We test the parity of the length and fail if it’s not odd:

0804977e         and        edx, 0x1
08049781         sub        edx, eax
08049783         mov        eax, edx
08049785         cmp        eax, 0x1        ; check for odd length
08049788         jne        sub_80496cc+214 ; jump to 080497a2, over the printf call
0804978a         mov        dword [esp], aNalertUnidenti ; argument #1 for method printf, "\\nAlert! Unidentified User! "
08049791         call       printf                       ; printf
08049796         mov        dword [esp], 0x1
0804979d         call       sub_804f0d0
080497a2         mov        eax, dword [dword_80f3f9c]

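We then take the length plus one (loaded from dword_80f3f9c above), subtract 9, multiply the result by 2, compare it to 0x29 (41), and only continue if it is less than or equal. Going by these instructions, the condition is (length + 1 - 9) * 2 <= 41, so the length can’t be more than 28: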

080497a7         sub        eax, 0x9
080497aa         add        eax, eax
080497ac         cmp        eax, 0x29
080497af         jle        sub_80496cc+253

At 080497fc, we call sub_080495a9, which encodes our input. At 0x080495b5, we observe that 0x80f50a0 contains our input string. After the call to sub_08049535, the same address contains a base-64 encoded version of it.

sub_8049058 then transforms our input. Hopper decompiles it as such:

int sub_8049058(int arg0, int arg1) {
    var_8 = arg1;
    var_4 = arg0;
    var_C = strlen(var_4);
    var_10 = 0x0;
    do {
        eax = var_10;
        if (eax >= var_C) {
            break;
        }
        if (((*(int8_t *)(var_4 + var_10) & 0xff) > 0x60) && ((*(int8_t *)(var_4 + var_10) & 0xff) <= 0x7a)) {
            *(int8_t *)(var_10 + var_4) = (*(int8_t *)(var_4 + var_10) & 0xff) - 0x61;
            if (var_8 + sign_extend_32(*(int8_t *)(var_4 + var_10) & 0xff) < 0x0) {
                *(int8_t *)(var_10 + var_4) = (*(int8_t *)(var_4 + var_10) & 0xff) + 0x1a;
            }
            ecx = sign_extend_32(*(int8_t *)(var_4 + var_10) & 0xff) + var_8;
            *(int8_t *)(var_10 + var_4) = ecx - ((SAR(HIDWORD(ecx * 0x4ec4ec4f), 0x3)) - (SAR(ecx, 0x1f))) * 0x1a;
            *(int8_t *)(var_10 + var_4) = (*(int8_t *)(var_4 + var_10) & 0xff) + 0x61;
        }
        if (((*(int8_t *)(var_4 + var_10) & 0xff) > 0x40) && ((*(int8_t *)(var_4 + var_10) & 0xff) <= 0x5a)) {
            *(int8_t *)(var_10 + var_4) = (*(int8_t *)(var_4 + var_10) & 0xff) - 0x41;
            if (var_8 + sign_extend_32(*(int8_t *)(var_4 + var_10) & 0xff) < 0x0) {
                *(int8_t *)(var_10 + var_4) = (*(int8_t *)(var_4 + var_10) & 0xff) + 0x1a;
            }
            ecx = sign_extend_32(*(int8_t *)(var_4 + var_10) & 0xff) + var_8;
            *(int8_t *)(var_10 + var_4) = ecx - ((SAR(HIDWORD(ecx * 0x4ec4ec4f), 0x3)) - (SAR(ecx, 0x1f))) * 0x1a;
            *(int8_t *)(var_10 + var_4) = (*(int8_t *)(var_4 + var_10) & 0xff) + 0x41;
        }
        var_10 = var_10 + 0x1;
    } while (true);
    return eax;
}

We can recognize some values in the alphabet range: 0x60 is right before 0x61, which is ‘a’. Similarly, 0x7a is ‘z’. var_10 is incremented at each loop, so it’s a counter. var_4 is added to var_10 and dereferenced, so it’s a pointer. Cleaning it up is straightforward:

int sub_8049058(char *arg0, int arg1) {
    length_plus_1 = arg1; // the length of our input, plus one.
    input = arg0; // our input
    length = strlen(input);
    int i = 0;
    do {
        eax = i;
        if (eax >= length) {
            break;
        }
        if ((input[i] >= 'a') && (input[i] <= 'z')) {
            input[i] = input[i] - 'a';
            if (length_plus_1 + input[i] < 0x0) {
                input[i] = input[i] + 26;
            }
            tmp = input[i] + length_plus_1;
            input[i] = tmp - ((SAR(HIDWORD(tmp * 0x4ec4ec4f), 0x3)) - (SAR(tmp, 0x1f))) * 26;
            input[i] = input[i] + 'a';
        }
        if ((input[i] >= 'A') && (input[i] <= 'Z')) {
            input[i] = input[i] - 'A';
            if (length_plus_1 + input[i] < 0x0) {
                input[i] = input[i] + 26;
            }
            tmp = input[i] + length_plus_1;
            input[i] = tmp - ((SAR(HIDWORD(tmp * 0x4ec4ec4f), 0x3)) - (SAR(tmp, 0x1f))) * 26;
            input[i] = input[i] + 'A';
        }
        i++;
    } while (true);
    return eax;
}

If our input is “12345”, the string we operate on is “MTIzNDU=”. Let’s run through the transformation of the first character, with length_plus_1 = 6:

‘M’ - ‘A’ is 0xC (12)
12 + length_plus_1 = 18
if 18 < 0, we’d add 26. Not the case here.
We multiply 0x12 by 0x4EC4EC4F, which gives 0x589d89d8e. The result is split between EDX (higher 32 bits) and EAX (lower 32 bits); HIDWORD is EDX here, so 0x5.
EDX is then shifted right by 3 bits, which reduces it to zero. Similarly, we shift tmp right by 0x1f bits, which also gives zero. This multiply-and-shift sequence is how the compiler divides by 26 without a division instruction: the quotient is 0 here, and subtracting quotient * 26 from tmp leaves the remainder.
The new input is therefore the old input + length_plus_1, mod 26 and scaled back to its original a-z or A-Z range.
chr(0x41 + (ord('M') - ord('A') + 6)) == 'S'
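
To check this reading of the decompiled expression, here is a small Python model of it. It is a sketch under the assumption that the values fit in a signed 32-bit register, and the names magic_rem26, shift_char and shift_string are made up for this example. 0x4EC4EC4F is roughly 2**35 / 26, so taking the high dword and shifting right by 3 more bits divides by 26.

```python
MAGIC = 0x4EC4EC4F  # roughly 2**35 / 26


def magic_rem26(x):
    """x - (x / 26) * 26, computed the way the decompiled code does it."""
    hi = (x * MAGIC) >> 32      # HIDWORD of the signed 64-bit product
    q = (hi >> 3) - (x >> 31)   # SAR by 3, then a correction for negative x
    return x - q * 26


def shift_char(c, shift):
    """Apply the transformation to one character; non-letters pass through."""
    for base in (ord('a'), ord('A')):
        v = ord(c) - base
        if 0 <= v < 26:
            if shift + v < 0:
                v += 26
            return chr(magic_rem26(v + shift) + base)
    return c


def shift_string(s, shift):
    return ''.join(shift_char(c, shift) for c in s)


print(shift_string("MTIzNDU=", 6))  # SZOfTJA=
```

For non-negative values magic_rem26 agrees with Python’s x % 26, which is why the transformation amounts to a Caesar shift applied to letters only.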

Returning from the encoding procedures, we end up at 0x08049820. Our input was transformed in-place, so 0x80f4fa0 contains our shifted base64-encoded input. We then call 0x080495ed, in which the first instructions seem to refer to characters in the alphabet range:

080495ed         push       ebp
080495ee         mov        ebp, esp
080495f0         push       ebx
080495f1         sub        esp, 0x44
080495f4         mov        eax, dword [ebp+arg_0]
080495f7         mov        dword [ebp+var_3C], eax
080495fa         mov        eax, dword [gs:0x14]
08049600         mov        dword [ebp+var_C], eax
08049603         xor        eax, eax
08049605         mov        dword [ebp+var_29], 0x3654634b
0804960c         mov        dword [ebp+var_25], 0x4a40564c
08049613         mov        dword [ebp+var_21], 0x6c315543
0804961a         mov        dword [ebp+var_1D], 0x4a623656
08049621         mov        dword [ebp+var_19], 0x63503456
08049628         mov        dword [ebp+var_15], 0x66305554
0804962f         mov        dword [ebp+var_11], 0x3939657e

Read as little-endian values (least significant byte first) and taken in order, they encode “KcT6LV@JCU1lV6bJV4PcTU0f~e99”. This doesn’t look like base-64, but let’s continue. A loop follows, performing an XOR of each byte with 0x4:

                loc_804964a:
0804964a         mov        edx, dword [ebp+var_30]     ; loop counter, initialized at zero
0804964d         mov        eax, dword [ebp+var_3C]     ; pointer to the string
08049650         add        edx, eax
08049652         mov        ecx, dword [ebp+var_30]     ; the counter again
08049655         mov        eax, dword [ebp+var_3C]
08049658         add        eax, ecx
0804965a         movzx      eax, byte [eax]
0804965d         xor        eax, 0x4
08049660         mov        byte [edx], al               ; store result in place
08049662         add        dword [ebp+var_30], 0x1      ; this is a counter

; [...]
                loc_8049666:
08049666         mov        ebx, dword [ebp+var_30]
08049669         lea        eax, dword [ebp+var_29]
0804966c         mov        dword [esp+0x48+var_48], eax ; argument #1 for method strlen
0804966f         call       strlen                       ; strlen
08049674         cmp        ebx, eax
08049676         jb         loc_804964a                  ; loop over the full string
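
The constant and the effect of the loop can be reproduced in a few lines of Python. This is only a sketch: it stores each dword least significant byte first, which is how x86 lays them out in memory, and applies the XOR to the resulting bytes. The variable names are made up for this example.

```python
import struct

# The seven immediates written to var_29 .. var_11, in order.
DWORDS = [
    0x3654634b, 0x4a40564c, 0x6c315543, 0x4a623656,
    0x63503456, 0x66305554, 0x3939657e,
]

# "<7I" packs seven unsigned 32-bit integers in little-endian byte order.
raw = struct.pack("<7I", *DWORDS)
xored = bytes(b ^ 0x4 for b in raw)

print(raw.decode("ascii"))    # KcT6LV@JCU1lV6bJV4PcTU0f~e99
print(xored.decode("ascii"))  # OgP2HRDNGQ5hR2fNR0TgPQ4bza==
```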

"KcT6LV@JCU1lV6bJV4PcTU0f~e99" XOR’d with 0x4, gives "OgP2HRDNGQ5hR2fNR0TgPQ4bza==" which definitely looks like base-64 data.

This is the string that our input, once base-64 encoded and transformed using the length_plus_1 shift, is compared against. The result of the comparison is tested at 08049699:

08049699         test       eax, eax
0804969b         jne        loc_80496b5 ; jumps right over the success message

0804969d         mov        dword [esp+0x48+var_48], aNyouAreACodeLi            ; argument #1 for method printf, "\\nYou are a Code Linux Memeber!!"
080496a4         call       printf                                              ; printf
080496a9         mov        dword [esp+0x48+var_48], 0x1
080496b0         call       sub_804f0d0

Note that at this point we could reverse the JNE into a JE or just NOP it and this would let us reach the success message. Here’s a patched binary with this change:

$ ./CodeLinux.swapped
*****This is Code Linux Agent!*****
- Enter The Code : 12345

You are a Code Linux Memeber!!

For the sake of it, let’s decode what the secret actually was. Since the length is 19, it was shifted by 20 positions, so we need to reconstruct the string by shifting it by 20 positions in the other direction, subtracting instead of adding:

>>> transform = lambda c: chr(((26 - 20 + c - 0x41) % 26) + 0x41) if c >= 0x41 and c <= 0x5a \
... else chr(((26 - 20 + c - 0x61) % 26) + 0x61) if c >= 0x61 and c <= 0x7a \
... else chr(c)
>>> tmp = ''.join(map(lambda c: transform(ord(c)), "OgP2HRDNGQ5hR2fNR0TgPQ4bza=="))
>>> print(tmp)
UmV2NXJTMW5nX2lTX0ZmVW4hfg==
>>> import base64
>>> base64.b64decode(tmp)
b'Rev5rS1ng_iS_FfUn!~'

In action:

$ ./CodeLinux
*****This is Code Linux Agent!*****
- Enter The Code : Rev5rS1ng_iS_FfUn!~

You are a Code Linux Memeber!!
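
As a cross-check, the whole pipeline can be modelled in the forward direction. This is a simplified sketch of the check as reconstructed in this walkthrough, not a transcription of the binary: it ignores the parity and length tests, uses a plain % 26 for the shift, and the names shift_letters, TARGET and check are made up for this example.

```python
import base64

# The stack constant after the XOR with 0x4.
TARGET = "OgP2HRDNGQ5hR2fNR0TgPQ4bza=="


def shift_letters(s, shift):
    """Caesar-shift letters by `shift`, leaving every other character alone."""
    out = []
    for c in s:
        if 'a' <= c <= 'z':
            out.append(chr((ord(c) - ord('a') + shift) % 26 + ord('a')))
        elif 'A' <= c <= 'Z':
            out.append(chr((ord(c) - ord('A') + shift) % 26 + ord('A')))
        else:
            out.append(c)
    return ''.join(out)


def check(code):
    """Base-64 encode, shift by len(code) + 1, compare with the target."""
    encoded = base64.b64encode(code.encode("ascii")).decode("ascii")
    return shift_letters(encoded, len(code) + 1) == TARGET


print(check("Rev5rS1ng_iS_FfUn!~"))  # True
print(check("12345"))                # False
```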