EvilZone
Hacking and Security => Reverse Engineering => : TheWormKill June 01, 2014, 09:50:23 PM
-
Hi guys,
I finally have something I think is worth contributing. I wrote a small tool that takes a file containing an assembly function
and generates a C stub for it. It tries to recognize the local variables and their datatypes, the arguments and the function name.
The return type may be added. It isn't free of faults: it merely guesses, so a pointer may be recognized as an int.
But it helps me quite a bit.
Note: the program expects a single function in its own file, generated via gcc -S -masm=intel func.c
Run it like this: python analyzer.py path/to/file.asm (the extension isn't important ^^)
Written in Python.
Greets, TheWormKill
-
I found a tiny "bug" in your script:
line 6: 'self_var_str = []' should be 'self.var_str = []' (replace underscore with dot)
-
Oh, thanks! It didn't affect how the program works, which is why I didn't notice it. Fixed it.
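One possible reason a typo like this goes unnoticed, sketched under the assumption that the attribute is assigned again somewhere else before it is first read. The class and method names below are made up for illustration and are not taken from analyzer.py:

```python
class Analyzer:
    def __init__(self):
        # Typo: this binds a throwaway local name, not an attribute on self.
        self_var_str = []

    def analyze(self):
        # Hypothetical: the attribute is assigned here before anything reads it,
        # so the missing attribute from __init__ never raises an error.
        self.var_str = ["int var_4;"]
        return self.var_str


a = Analyzer()
has_attr_before = hasattr(a, "var_str")  # the typo line created no attribute
result = a.analyze()
```

Python raises no error for the misspelled line because assigning to a new local name is always legal; it would only fail if some code read self.var_str before assigning it.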
-
Aaah, I don't get it. What is supposed to be written in ASM, the program or the function? And then C comes in for some reason. This is confusing. For the record, I see that the program itself is in Python. How could this be of use to me?
-
Assume you have a function in ASM (as specified in the first post). The program analyzes it and generates a C stub that contains the parameters and local variables as well as the function name. That's how it's intended to work.
Consider this function:
_reverse:
    push ebp
    mov ebp, esp
    sub esp, 24
    mov eax, DWORD PTR [ebp+8]
    mov DWORD PTR [esp], eax
    call _string_length
    mov DWORD PTR [ebp-4], eax
    mov eax, DWORD PTR [ebp+8]
    mov DWORD PTR [ebp-12], eax
    mov eax, DWORD PTR [ebp+8]
    mov DWORD PTR [ebp-16], eax
    mov DWORD PTR [ebp-8], 0
L3:
    mov eax, DWORD PTR [ebp-4]
    dec eax
    cmp eax, DWORD PTR [ebp-8]
    jle L4
    lea eax, [ebp-16]
    inc DWORD PTR [eax]
    lea eax, [ebp-8]
    inc DWORD PTR [eax]
    jmp L3
L4:
    mov DWORD PTR [ebp-8], 0
L6:
    mov edx, DWORD PTR [ebp-4]
    mov eax, edx
    sar eax, 31
    shr eax, 31
    lea eax, [edx+eax]
    sar eax
    cmp eax, DWORD PTR [ebp-8]
    jle L2
    mov eax, DWORD PTR [ebp-16]
    movzx eax, BYTE PTR [eax]
    mov BYTE PTR [ebp-17], al
    mov edx, DWORD PTR [ebp-16]
    mov eax, DWORD PTR [ebp-12]
    movzx eax, BYTE PTR [eax]
    mov BYTE PTR [edx], al
    mov edx, DWORD PTR [ebp-12]
    movzx eax, BYTE PTR [ebp-17]
    mov BYTE PTR [edx], al
    lea eax, [ebp-12]
    inc DWORD PTR [eax]
    lea eax, [ebp-16]
    dec DWORD PTR [eax]
    lea eax, [ebp-8]
    inc DWORD PTR [eax]
    jmp L6
L2:
    leave
    ret
The output will be:
reverse(int arg_8){
int var_4;
int var_8;
char * var_12;
char * var_16;
char var_17;
}
Which is an approximation of the original:
void reverse(char *string)
{
    int length, c;
    char *begin, *end, temp;
    length = string_length(string);
    begin = string;
    end = string;
    for ( c = 0 ; c < ( length - 1 ) ; c++ )
        end++;
    for ( c = 0 ; c < length/2 ; c++ )
    {
        temp = *end;
        *end = *begin;
        *begin = temp;
        begin++;
        end--;
    }
}
It tries to recover the arguments, the local variables and their datatypes, although it's far from perfect (for example, it doesn't recognize some pointers). The idea is to create a stub that can be used as a starting point when rewriting ASM code in C.
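For anyone curious how this kind of guessing can work, here is a minimal sketch. It is not the code of analyzer.py: the function name guess_stub and the regexes are made up for illustration, and it assumes 32-bit, Intel-syntax GCC output that uses an ebp frame. Positive offsets from ebp are treated as arguments, negative ones as locals, the access size gives the base type, and a slot counts as a pointer only when its value is loaded into a register that is then dereferenced.

```python
import re

# Stack accesses relative to ebp, with an optional size such as "DWORD PTR".
SLOT = re.compile(r"(?:(BYTE|WORD|DWORD) PTR )?\[ebp([+-]\d+)\]")
# A register being loaded with the 4-byte value stored in a stack slot.
LOAD = re.compile(r"mov\s+(e[a-d]x), DWORD PTR \[ebp([+-]\d+)\]")
# A memory access through a plain register, e.g. "BYTE PTR [eax]".
DEREF = re.compile(r"(BYTE|WORD|DWORD) PTR \[(e[a-d]x)\]")
C_TYPE = {"BYTE": "char", "WORD": "short", "DWORD": "int"}
REGS = ("eax", "ebx", "ecx", "edx")


def guess_stub(asm_text):
    """Guess a C stub from one function's assembly (hypothetical helper)."""
    name = None
    slots = {}   # offset from ebp -> guessed C type (None until a size is seen)
    loaded = {}  # register -> stack offset whose value it currently holds
    for raw in asm_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            if name is None:  # the first label is taken as the function name
                name = line[:-1].lstrip("_")
            continue
        slot = SLOT.search(line)
        if slot:
            size, offset = slot.group(1), int(slot.group(2))
            if slots.get(offset) is None:
                slots[offset] = C_TYPE.get(size)
        # A dereference of a register that holds a slot's value marks a pointer.
        deref = DEREF.search(line)
        if deref and deref.group(2) in loaded:
            slots[loaded[deref.group(2)]] = C_TYPE[deref.group(1)] + " *"
        # Forget a register once it is used as a destination (conservative).
        parts = line.split(None, 1)
        dest = parts[1].split(",")[0].strip() if len(parts) > 1 else ""
        if parts[0] == "call":
            loaded.clear()
        elif dest in REGS:
            loaded.pop(dest, None)
        load = LOAD.match(line)
        if load:
            loaded[load.group(1)] = int(load.group(2))
    args = sorted(o for o in slots if o > 0)
    local_vars = sorted((o for o in slots if o < 0), reverse=True)
    params = ", ".join(f"{slots[o] or 'int'} arg_{o}" for o in args)
    body = "".join(f"{slots[o] or 'int'} var_{-o};\n" for o in local_vars)
    return f"{name}({params}){{\n{body}}}"
```

Run on the _reverse listing above, this sketch also reports int arg_8: the argument is only copied around and never dereferenced directly, which is exactly the kind of pointer such a heuristic misses.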
Does this explanation help?
-
Yup! That explains a lot. It will work for RE projects. Problem is, I don't have any.
-
IDA Basic or Hopper does this, but it's a great little project for yourself. You can also pay $800 + $1800 for IDA Pro and a decompiler.
-
So it is actually a partial decompiler for ASM -> C.
Nice project. Something you can extend and build upon for months.
-
Thanks! I'm doing a large project for school which will be counted as an exam, so if I'm good enough I'll get a better grade; if I'm not... well, I won't "use" the work. Quite simple, but since I have two final exams in subjects that aren't my best (English and history), that might help me get into a decent university.
-
Good luck with the exam, and great work! I hope you will continue developing this project! It will be very useful!
-
Thanks! It will be reworked as soon as I get my new PC, and I still need to upload a few other tools.