EvilZone
Programming and Scripting => Projects and Discussion => : Satan911 April 06, 2011, 01:45:15 AM
-
Simple demonstration of inline ASM efficiency
Comparing decryption time in C versus ASM
Introduction
So I was doing a small assignment for school not long ago: a simple exercise to practice inline ASM by translating a C function into ASM. It took a few minutes and I moved on. Today I was working on something a lot bigger in ASM and wondered whether programming directly in ASM is more efficient, performance-wise, than a high-level language like C. I decided to use the code from that old exercise to make a small demonstration.
The Code
The code is really simple. The program decrypts a string encrypted using a Caesar cipher with a shift of 4. So, to get a 'b' in clear text, you'll see an 'f' in the encrypted string.
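To make the shift concrete, here is a minimal sketch of the same byte shift as a standalone helper. The name shift_string and its parameters are made up for illustration and are not part of the original program. Note that, like the program in this post, it shifts raw byte values and does not wrap around inside the alphabet the way a textbook Caesar cipher would.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): add `shift` to every
 * byte of the NUL-terminated string `in` and store the result in `out`,
 * followed by a terminating NUL.
 * `out` must have room for strlen(in) + 1 bytes.
 * A positive shift encrypts, a negative shift decrypts. */
void shift_string(const char *in, char *out, int shift)
{
    size_t length = strlen(in);
    size_t i;

    for (i = 0; i < length; i++) {
        out[i] = (char)(in[i] + shift);
    }
    out[length] = '\0';
}
```

Calling shift_string("b", buffer, 4) stores "f" in buffer, and calling it with a shift of -4 on the encrypted message from the post undoes the encryption.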
C version: (decrypt_c.c)
/*************************************************
* Author: Satan911
* Description: Simple demonstration of inline ASM efficiency
* Date: April 2011
**************************************************/
#include <stdio.h>

char encrypted_message[25]="Wexer=55$D$Izmp~sri2svk";
char decrypted_message[25];   /* global, so it starts zero-filled */

void decrypt() {
    /* decrypted_message[i] = encrypted_message[i] - 4; */
    int i = 0;
    while(encrypted_message[i] != '\0')
    {
        decrypted_message[i] = encrypted_message[i] - 4;
        i++;
    }
    /* No '\0' is written here: the zero-filled global already ends the string */
}

int main(void) {
    /* To test performance */
    int j = 0;
    while(j < 100000000)
    {
        decrypt();
        j++;
    }
    printf("Encrypted message: \t%s\nDecrypted message: \t%s\n",encrypted_message, decrypted_message);
    return 0;
}
Pastebin (with syntax highlighting): http://pastebin.com/9Up2DrN6
With inline ASM: (decrypt_asm.c) - Might wanna check the Pastebin below for proper indenting
/*************************************************
* Author: Satan911
* Description: Simple demonstration of inline ASM efficiency
* Date: April 2011
**************************************************/
#include <stdio.h>

char encrypted_message[25]="Wexer=55$D$Izmp~sri2svk";
char decrypted_message[25];

void decrypt() {
    /* decrypted_message[i] = encrypted_message[i] - 4; */
    /* Basic asm: GCC is not told that %eax, %ecx and %edx are modified */
    asm(
        "xor %ecx, %ecx\n\t"                        /* %ecx = 0 (used as i here) */
        "xor %eax, %eax\n\t"                        /* %eax = 0, so %al = 0 */
        "bouclefor:\n\t"                            /* start of the loop */
        "movb encrypted_message(%ecx), %dl\n\t"     /* %dl = encrypted_message[i] */
        "cmp %dl, %al\n\t"                          /* compare %dl with %al (which is 0) */
        "je fin\n\t"                                /* jump to fin: if %dl == 0 (end of string) */
        "sub $4, %dl\n\t"                           /* %dl = %dl - 4 */
        "movb %dl, decrypted_message(%ecx)\n\t"     /* decrypted_message[i] = encrypted_message[i] - 4 */
        "incl %ecx\n\t"                             /* %ecx += 1 (i++) */
        "jmp bouclefor\n\t"                         /* jump back to bouclefor: (while loop in C) */
        "fin:\n\t"
        "movb %dl, decrypted_message(%ecx)\n\t"     /* %dl is 0 here: store \0 at the end of the string */
    );
}

int main(void) {
    /* To test performance */
    int j = 0;
    while(j < 100000000)
    {
        decrypt();
        j++;
    }
    printf("Encrypted message: \t%s\nDecrypted message: \t%s\n",encrypted_message, decrypted_message);
    return 0;
}
Pastebin (with syntax highlighting): http://pastebin.com/AFAD8AzP
Note: The ASM here uses AT&T syntax. It works great with GCC, and it's also the syntax GCC emits when it compiles a program (we'll use that later). The C code could be written differently, but I tried to keep it as close to the ASM code as I could. I think they are pretty much equivalent now.
If you read the code, you're probably wondering why I call decrypt() 100000000 times. It's because the decryption is so simple that if you only run it once, you won't notice any difference between the C and ASM versions. Repeating a function many times is a technique we actually use in software development to measure how efficient it is.
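The repeat-and-measure idea can also be done inside the program instead of with an external timer. The sketch below is only an illustration: time_repeated, sample_work and sample_calls are hypothetical names, and it measures CPU time with the standard clock() function rather than whatever the screenshot below used.

```c
#include <time.h>

/* Hypothetical stand-in workload: counts how often it was called. */
volatile unsigned long sample_calls = 0;

void sample_work(void)
{
    sample_calls++;
}

/* Hypothetical helper: call fn() `iterations` times and return the CPU
 * time that took, in seconds, as reported by clock(). */
double time_repeated(void (*fn)(void), long iterations)
{
    clock_t start = clock();
    long i;

    for (i = 0; i < iterations; i++) {
        fn();
    }

    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

With the programs from this post, the call would look like time_repeated(decrypt, 100000000). A single call is usually far too short to measure, which is why the iteration count has to be large.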
Decryption Time
(http://i.imgur.com/NVZB4.png)
The time command runs a command or program and reports how long it took and what resources it used.
So I compiled both versions using the same command and then ran both with time. The results are pretty clear here: the C version took almost 3x as long as the ASM version to decrypt the message 100000000 times. But why?
I'll try to explain the 'why' a little bit here. First, here's the ASM code generated by GCC for the C version of the program.
# gcc -S -O decrypt_c.c
-S tells GCC to stop after generating the ASM code instead of assembling it, and -O turns on optimization.
This is a shortened version showing only the decrypt() function. See the Pastebin link for the whole code.
        .file   "decrypt_c.c"
        .text
.globl decrypt
        .type   decrypt, @function
decrypt:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        movzbl  encrypted_message, %edx
        testb   %dl, %dl
        je      .L4
        movl    $0, %eax
        movl    $decrypted_message, %ebx
        movl    $encrypted_message, %ecx
.L3:
        subl    $4, %edx
        movb    %dl, (%ebx,%eax)
        addl    $1, %eax
        movzbl  (%ecx,%eax), %edx
        testb   %dl, %dl
        jne     .L3
.L4:
        popl    %ebx
        popl    %ebp
        ret
        .size   decrypt, .-decrypt
        .section        .rodata.str1.4,"aMS",@progbits,1
        .align 4
.LC0:
        .string "Encrypted message: \t%s\nDecrypted message: \t%s\n"
        .text
Pastebin: http://pastebin.com/kr9WgnKi
Basically a compiler works this way:
Source code -> ASM code -> Machine code -> Executable
(Of course there are more steps than that but you get the idea)
I won't go through the whole ASM listing because it would take a while, but the code generated by GCC (even optimized) is still bigger and a bit more complicated than the code I wrote. Keep in mind that my ASM code could be even shorter, but the version you saw is a bit easier to understand.
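For anyone who wants to follow the generated listing without reading every instruction, here is one rough way to read the decrypt() part of it back into C. This is only a sketch of the control flow, not something GCC produces: the function name and its parameters are made up, and the real code works on the two global arrays directly.

```c
/* Rough C reading of the GCC-generated decrypt() shown earlier.
 * Hypothetical name and parameters, for illustration only.
 * Like the listing, it never stores the terminating '\0', so `out`
 * must already be zero-filled (true for the global array in the post). */
void decrypt_gcc_shape(const char *in, char *out)
{
    unsigned char c = (unsigned char)in[0];  /* movzbl encrypted_message, %edx */
    int i;

    if (c == '\0') {                         /* testb %dl, %dl ; je .L4 */
        return;
    }

    i = 0;                                   /* movl $0, %eax */
    do {                                     /* .L3: */
        out[i] = (char)(c - 4);              /* subl $4, %edx ; movb %dl, (%ebx,%eax) */
        i++;                                 /* addl $1, %eax */
        c = (unsigned char)in[i];            /* movzbl (%ecx,%eax), %edx */
    } while (c != '\0');                     /* testb %dl, %dl ; jne .L3 */
}
```

The compiler checks the first character once before the loop and then tests at the bottom of the loop, so each pass through .L3 needs only one conditional jump.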
Conclusion
Even though the compilers we use now are far more efficient than what we had a few years ago, they are still not perfect, and a human brain is still more capable of writing short and efficient ASM. Don't get me wrong: nobody would write big programs in ASM just to save a few seconds. But this whole thread is just a proof of concept to show that it can indeed be worthwhile to use inline ASM for some functions, like the one I showed you.
That's about it. If you have any questions, I'll do my best to answer. I tried to make this clear enough for anyone to read and understand, and I hope you enjoyed it.
-
Interesting article, bro. So the summary is:
ASM is as powerful as C, or more so?
-
As far as I can tell, the primary reason for this being slow in C is that you used a math operation (d[i] -= 4) that performs software integer bounds checks. Disassemble the two resulting binaries to see how it worked. Also, did you compile with gcc -O3 on both?
gh0st - I'd say ASM is more powerful than C, simply because you can do things like sysenter/sysexit. However, since you can inline ASM into C, it's probably best to use C for the most part.
-
Interesting article, bro. So the summary is:
ASM is as powerful as C, or more so?
Computers are built with layers of abstractions on top of each other.
At the lowest level (for a programmer, at least) everything is binary. Binary strings give the processor its instructions about which transistors to switch (1/0, on/off). Assembly language gives these instructions a more readable format: essentially, an assembly (ASM) instruction represents a string of bits. What a C compiler does is take the program you wrote in that language (your code) and compile it into a list of assembly instructions, and then into a binary file for execution. Inline ASM can sometimes make things more efficient because the compiler doesn't always pick the most efficient way, so a human can use inline assembly to fine-tune what the compiler does.
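As a small illustration of "an instruction is a string of bits", the sketch below pairs three 32-bit x86 instructions in AT&T syntax with the bytes an assembler emits for them, and formats those bytes as hex. The struct, the table and the function name are all made up for this example; only the byte encodings themselves come from the x86 instruction set.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record: one instruction and its machine-code bytes. */
struct encoded_insn {
    const char *mnemonic;      /* AT&T syntax */
    unsigned char bytes[4];    /* machine code */
    size_t length;             /* number of bytes used */
};

const struct encoded_insn samples[] = {
    { "xor %eax, %eax", { 0x31, 0xC0 },       2 },
    { "sub $4, %dl",    { 0x80, 0xEA, 0x04 }, 3 },
    { "ret",            { 0xC3 },             1 },
};

/* Write the instruction bytes into `out` as space-separated hex,
 * for example "31 c0". Returns `out`. */
char *format_bytes(const struct encoded_insn *insn, char *out, size_t out_size)
{
    size_t used = 0;
    size_t i;

    if (out_size == 0) {
        return out;
    }
    out[0] = '\0';

    for (i = 0; i < insn->length && used < out_size; i++) {
        int written = snprintf(out + used, out_size - used, "%s%02x",
                               i ? " " : "", insn->bytes[i]);
        if (written < 0) {
            break;
        }
        used += (size_t)written;
    }
    return out;
}
```

Looping over samples and printing each mnemonic next to format_bytes() of it gives a tiny table of what the assembler step turns each line of ASM into.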
The Compile Process:
C/C++
|
v
Assembly
|
v
BINARY
Anyway, I think I explained that semi-correctly and somewhat clearly. Hope that helps.