Practical Reverse Engineering, Exercise 1

Intro

So I decided to buy Practical Reverse Engineering (Bruce Dang, Alexandre Gazet, Elias Bachaalany, ISBN:978-1-118-78731-1).

After reading the first 11 pages I ran into the first exercise and got stuck. I started looking for help online, but not much was out there. I found a post on reddit.com/r/ReverseEngineering about this book, but it was about 4 months old and seemed dead. I also found a blog by someone else who is doing these exercises and posting solutions online. So I decided to create my own blog and post my solutions, so that others can compare solutions and, if stuck, find some help.

Here is my solution to Exercise 1 (Page 11):

Note: I am reading this book to learn about reverse engineering and have no previous experience with it. Before starting this book I had experience with Java, C and ARM assembly, as well as some basic concepts of computer architecture and organisation.

Let’s reverse

[ebp+8] and [ebp+0Ch] are memory locations relative to ebp, the base pointer of the current stack frame. Usually this indicates that the code is accessing the arguments of a function.

Let’s start off by analysing each line of code:

8B 7D 08    mov    edi, [ebp+8]

If you know anything about how stacks and function calls are organised you may guess what this does. For those that do not, read Function and Stack Frames, Standard Entry Sequence. Under the usual 32-bit calling conventions (cdecl, for example), the caller pushes the function's parameters onto the stack in reverse order, so the first parameter ends up at [ebp+8], just above the saved ebp and the return address. This line therefore loads the first parameter passed to this function into edi.
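To make the offsets concrete, here is a minimal sketch that models a 32-bit stack frame as an array of 4-byte slots. The layout assumes a cdecl-style call and the standard entry sequence; the names read_frame and the SLOT_ constants are made up for illustration and do not come from the book.

```c
#include <stdint.h>

/* Hypothetical model: slots of a 32-bit stack frame, counted upwards
   from the address held in ebp. */
enum {
    SLOT_SAVED_EBP   = 0, /* [ebp+0]   caller's ebp, saved by "push ebp" */
    SLOT_RETURN_ADDR = 1, /* [ebp+4]   pushed by the call instruction    */
    SLOT_ARG1        = 2, /* [ebp+8]   first argument                    */
    SLOT_ARG2        = 3  /* [ebp+0Ch] second argument                   */
};

/* Read the 4-byte slot found `offset` bytes above the frame base. */
static uint32_t read_frame(const uint32_t *ebp, unsigned offset)
{
    return ebp[offset / 4];
}
```

With a frame laid out this way, read_frame(frame, 0x8) yields the first argument and read_frame(frame, 0xC) the second, which matches the two memory operands used in the exercise.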

8B D7    mov    edx, edi

This line makes a copy of the address we got in line 1, probably to avoid having to read it from memory again later.

33 C0    xor eax, eax

This line XORs eax with itself. Any value XORed with itself is 0, so eax is now 0, which leaves 0x00, the NULL character, in al.

83 C9 FF    or    ecx, 0FFFFFFFFh

This line just loads the value 0xFFFFFFFF into ecx. This is either 4,294,967,295, if we assume that this value is an unsigned int, or -1, if we assume it is a signed int.
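Both instructions are idioms for loading a constant without caring what the register held before. A small C sketch of the bit arithmetic (the function names are invented for this example):

```c
#include <stdint.h>

/* xor r, r: any value XORed with itself is 0. */
static uint32_t xor_self(uint32_t r)
{
    return r ^ r;
}

/* or r, 0FFFFFFFFh: ORing with all ones sets every bit, whatever r held. */
static uint32_t or_all_ones(uint32_t r)
{
    return r | 0xFFFFFFFFu;
}
```

Whatever value is passed in, xor_self returns 0 and or_all_ones returns 0xFFFFFFFF, which reads as -1 when interpreted as a signed 32-bit integer.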

F2 AE    repne scasb

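According to the book, the REP prefix repeats an instruction up to ecx times, decrementing ecx after each repetition. REPNE adds a second stop condition: it repeats while ecx is not 0 and the zero flag (ZF) is clear, which for SCASB means it repeats while the byte just compared was not equal to al.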

SCASB compares al with the byte found at edi in memory and then updates edi: it increments edi if the DF flag in the EFLAGS register is 0 and decrements it if DF is 1 (we assume DF is 0 here, which is the usual state). The REPNE prefix decrements ecx on every repetition. In other words, the pair moves through memory one byte at a time looking for, in this case, a NULL byte.

If you know how strings are organised in memory you can guess what we are working with. Yes, it's a string! The instructions are a good indicator too: REPNE SCASB is commonly used to find a terminating byte, as in strlen(), and REP STOS is commonly used to fill a buffer, as in memset().

Here is how the string “Hello” is stored in memory: the five characters, one byte each, followed by a NULL byte.

[Figure: String “Hello” stored in memory. Picture from TutorialsPoint.com]

Ignore the Address line in the picture.

If repne scasb were performed on the string in the image, it would start with ‘H’ and compare it to the NULL byte in al. Since they are not equal, it would move on to the next byte and repeat until it finds a NULL character (also written as “\0”).
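The scan can be modelled in C. This is only a sketch of the instruction's behaviour under the assumption that DF is 0; the struct regs type and the function name are hypothetical and hold just the registers that repne scasb touches.

```c
#include <stdint.h>

/* Hypothetical register file with only what repne scasb uses. */
struct regs {
    uint8_t        al;   /* byte to search for              */
    uint32_t       ecx;  /* repeat counter                  */
    const uint8_t *edi;  /* current scan position           */
    int            zf;   /* zero flag from the last compare */
};

/* Model of "repne scasb" with DF = 0 (edi moves forward). */
static void repne_scasb(struct regs *r)
{
    while (r->ecx != 0) {
        r->zf = (*r->edi == r->al); /* scasb: compare al with [edi]    */
        r->edi++;                   /* scasb: advance edi              */
        r->ecx--;                   /* prefix: count down              */
        if (r->zf)                  /* repne: stop once a match set ZF */
            break;
    }
}
```

Starting with al = 0, ecx = 0xFFFFFFFF and edi pointing at "Hello", the loop runs six times (five letters plus the NULL byte), leaving edi one past the NULL byte and ecx at 0xFFFFFFF9.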

83 C1 02    add ecx, 2

This line adds 2 to ecx.

During repne scasb, ecx was decremented once for every byte examined, including the NULL byte that ended the scan. For a string of n characters, ecx therefore went from -1 to -1 - (n + 1) = -(n + 2). Adding 2 cancels the initial -1 and the extra decrement for the NULL byte, leaving ecx = -n.

F7 D9    neg ecx

This negates ecx (two's complement negation: invert all the bits and add 1). ecx was acting as a counter of how many bytes were scanned, going from 0xFFFFFFFF to 0xFFFFFFFE to 0xFFFFFFFD and so on. You can imagine that rather than counting 1, 2, 3, 4, the program counted -1, -2, -3, -4. After the add in line 6, ecx held -n, so negating it gives n, the number of characters in the string. This may be a bit hard to follow, but it is efficient: ecx has to be decremented by the REPNE prefix anyway, so no separate register is needed as a counter.
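The counter arithmetic is easier to check when replayed step by step. The sketch below uses unsigned 32-bit wraparound to stand in for the register; length_from_counter is a name invented for this example, and n is the number of characters before the NULL byte.

```c
#include <stdint.h>

/* Replays what happens to ecx for a string of n characters. */
static uint32_t length_from_counter(uint32_t n)
{
    uint32_t ecx = 0xFFFFFFFFu; /* or ecx, 0FFFFFFFFh : ecx = -1           */
    ecx -= n + 1;               /* repne scasb        : n chars + the NULL */
    ecx += 2;                   /* add ecx, 2         : ecx = -n           */
    ecx = 0u - ecx;             /* neg ecx            : ecx = n            */
    return ecx;
}
```

For “Hello”, n is 5: ecx goes from 0xFFFFFFFF to 0xFFFFFFF9 during the scan, to 0xFFFFFFFB after the add, and to 5 after the neg.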

8A 45 0C    mov al, [ebp+0Ch]

This line copies the lowest 8 bits of the value at [ebp+0Ch] (the second argument to this mystery function) into al.

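Each argument occupies 4 bytes on the stack. If we passed ‘A’ (65 in decimal, 41 in hexadecimal) as the second argument, it would be stored in memory as the bytes 41 00 00 00, with ‘A’ at the lowest address, because x86 is little-endian. This instruction loads only the byte at [ebp+0Ch], which is the ‘A’, into al and ignores the three zero bytes.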
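A short sketch of that byte layout, with the little-endian order written out by hand so it behaves the same on any host. The helper names store_le32 and load_al are made up for illustration.

```c
#include <stdint.h>

/* Write a 32-bit value the way x86 stores it: least significant byte at
   the lowest address. */
static void store_le32(uint8_t slot[4], uint32_t value)
{
    slot[0] = (uint8_t)(value & 0xFFu);
    slot[1] = (uint8_t)((value >> 8) & 0xFFu);
    slot[2] = (uint8_t)((value >> 16) & 0xFFu);
    slot[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* mov al, [slot]: a one-byte load reads only the lowest-address byte. */
static uint8_t load_al(const uint8_t slot[4])
{
    return slot[0];
}
```

Storing 'A' with store_le32 produces the bytes 41 00 00 00, and load_al picks up the 0x41.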

8B FA    mov edi,edx

This sets edi back to the value it had after line 1, which was saved in edx by line 2. In other words, it resets edi to point to the start of the string we are working with.

This line could be replaced with

 mov edi, [ebp+8]

but this involves a memory access, which is usually many times slower than a register access. So the compiler saved edi in edx in line 2 to improve the performance of this function.

F3 AA    rep stosb 

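We know what the rep prefix does (it repeats the stosb instruction ecx times). Each stosb writes al to the byte at edi and advances edi. Remember that ecx is now equal to the number of characters in our string, so every character before the NULL byte is overwritten with al.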
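As with the scan, the fill can be sketched in C. This models the behaviour only, assumes DF is 0, and uses a hypothetical function name; the registers are passed in as parameters.

```c
#include <stdint.h>

/* Model of "rep stosb" with DF = 0: store al at [edi], ecx times.
   Returns the final value of edi. */
static uint8_t *rep_stosb(uint8_t *edi, uint8_t al, uint32_t ecx)
{
    while (ecx != 0) {
        *edi = al;  /* stosb: write al to [edi] */
        edi++;      /* stosb: advance edi       */
        ecx--;      /* prefix: count down       */
    }
    return edi;
}
```

Called on a buffer holding "Hello" with al = 'x' and ecx = 5, it turns the buffer into "xxxxx" and leaves the NULL terminator untouched.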

8B C2    mov eax, edx

This is the last line of our mystery function so we can assume that this function returns a pointer to the string it just edited. We may assume this since edx points to this string and eax is usually used for returning values from a function.

Summary

If we sum up what we discovered while taking a look at these instructions one at a time:

  • We count the number of characters in a string before the NULL character.
  • We write that many copies of a given character over the same string.

We also know that this mystery function takes two parameters: a pointer to a string and a character. It returns the address of the edited string at the end. From this we can assume that this function looks like this:

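#include <string.h>

/* Overwrite every character before the NULL byte with ch. */
char *blank_string(char *string, char ch) {
    size_t length = strlen(string);  /* repne scasb, add ecx, neg ecx */
    memset(string, ch, length);      /* rep stosb */
    return string;                   /* mov eax, edx */
}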
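To check the reconstruction, the listing can be translated one instruction at a time into C and compared against blank_string. This is a sketch, not output from a decompiler: the variables are named after the registers, mystery is a made-up name, DF is assumed to be 0, and blank_string is repeated so the block compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The reconstruction from above, repeated so this block stands alone. */
static char *blank_string(char *string, char ch)
{
    size_t length = strlen(string);
    memset(string, ch, length);
    return string;
}

/* Instruction-by-instruction translation of the exercise listing. */
static char *mystery(char *arg1, char arg2)
{
    char *edi = arg1;               /* mov edi, [ebp+8]    */
    char *edx = edi;                /* mov edx, edi        */
    char al = 0;                    /* xor eax, eax        */
    uint32_t ecx = 0xFFFFFFFFu;     /* or ecx, 0FFFFFFFFh  */
    int zf = 0;
    while (ecx != 0 && !zf) {       /* repne scasb         */
        zf = (*edi == al);
        edi++;
        ecx--;
    }
    ecx += 2;                       /* add ecx, 2          */
    ecx = 0u - ecx;                 /* neg ecx             */
    al = arg2;                      /* mov al, [ebp+0Ch]   */
    edi = edx;                      /* mov edi, edx        */
    while (ecx != 0) {              /* rep stosb           */
        *edi++ = al;
        ecx--;
    }
    return edx;                     /* mov eax, edx        */
}
```

Running both functions on separate copies of "Hello" with 'x' as the second argument leaves "xxxxx" in each buffer, and both return the pointer they were given. An empty string is left unchanged by both.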

If you wish to obtain fully compilable assembly code, you can head over to this blog by Johannes Bader. He has written an interesting article on the same code, in which he walks through it line by line in gdb.

Hope this helped some people.
Feel free to point out mistakes below in the comments.
