Author Topic: The PE(EXE) Structure  (Read 3881 times)


Offline seci

The PE(EXE) Structure
« on: May 28, 2011, 04:11:47 am »
Hello again. It has been a few days since I posted anything at Evilzone. I have been reading a lot of papers on assembly, disassembly, CPU architecture, the PE/COFF structure documentation and so on. Now I am back with more questions!

Also, for those of you who have been reading my other topics: I will update them with solutions once they are solved. I am still working on translating bytes into assembly instructions, and I should make some progress tonight or tomorrow.


Now, to my question. Or rather, verification.
I just want someone with experience to verify that I got things right. Once it has been verified, this topic can probably serve as guidance for others too :) So I will write up what I have read and how I interpreted the PE structure below this text block. Then someone who knows his stuff can read through it and see if I got anything wrong. There might be a question or two in the text too. Thanks in advance :)
-------------------------------


The PE(Portable Executable) Structure

Any PE file consists of the following structures (roughly; we will go into more detail later). The list follows the order in which they appear in the file.
  • DOS MZ Header. These are the very first bytes of any PE file. It starts with the 'MZ' magic and ends with e_lfanew, the file offset of the PE Header. (In-depth details are given alongside the structures below.)
  • DOS Stub. A small 16-bit DOS program (a few dozen bytes of machine code plus a message) that prints "This program cannot be run in DOS mode" and exits if the file is started under DOS. Windows itself ignores the stub, and the linker can be told to use a different one (e.g. the MSVC linker's /STUB option), but the DOS MZ Header in front of it is always required.
  • PE Signature, part of the PE Header. It is the first 4 bytes of IMAGE_NT_HEADERS and is there to verify that this is in fact a PE. The signature of a valid PE file is the ASCII letters 'PE' followed by two zero bytes (50 45 00 00).
  • PE Header(s). This is the vital part of a PE; it contains information about the entire PE structure. It also contains two inner headers and one inner inner header. They are listed below.
  • PE Header - File Header. This header contains further vital information about the file. Details are listed in the header structures below.
  • PE Header - Optional Header. Despite its name, this header is NOT optional for executable images (it is only optional for object files). This is the largest header in the PE Header. Once again, it contains tons of information about the PE layout/structure, among other things the PE entry point (the first instruction that is executed once the program runs).
  • PE Header - Optional Header - DataDirectory. The DataDirectory member is an array of IMAGE_DATA_DIRECTORY structures that describe the various directories in the PE: Import Directory, Export Directory, Resource Directory and so on. In short, each entry tells you where to find that directory's data and how large it is.
  • Section Header(s). One IMAGE_SECTION_HEADER structure for each section in the file, typically 4-6 of them. The structure contains the section name, size, address and so on. More on sections further down. The number of sections is set in PE Headers -> File Header -> NumberOfSections.
  • Data Directories. These are the directories mentioned above, containing various information: resources, imported DLLs and so on. (I haven't read so much about all of these yet.) Note that they are not a separate region of the file: the data each entry points to is normally stored inside one of the sections below (for example resources in .rsrc).
  • Sections. Sections are large chunks of bytes that can be anything: code, resources, icon(s), data and so on. Each section has its own "rules": some are read-only, some are executable and some are readable and writable data. The sections are typically called ".text", ".data", ".rsrc", ".reloc". The number of sections is set by NumberOfSections in the File Header. The limit of 16 applies to the data directories, not to sections; the PE/COFF specification notes that the Windows loader limits the number of sections to 96. Each section has an IMAGE_SECTION_HEADER structure describing it, containing its name, size, address, file offset and "rules" (Characteristics).
  • Sections ...
  • [...]
  • EOF. End Of File
The structures (C/++ syntax)

DOS MZ Header
Code:
typedef struct _IMAGE_DOS_HEADER {  // DOS .EXE header
    USHORT e_magic;         // Magic number                                       // 2 bytes
    USHORT e_cblp;          // Bytes on last page of file                         // 2 bytes
    USHORT e_cp;            // Pages in file                                     // 2 bytes
    USHORT e_crlc;          // Relocations                                       // 2 bytes
    USHORT e_cparhdr;       // Size of header in paragraphs                      // 2 bytes
    USHORT e_minalloc;      // Minimum extra paragraphs needed                   // 2 bytes
    USHORT e_maxalloc;      // Maximum extra paragraphs needed                    // 2 bytes
    USHORT e_ss;            // Initial (relative) SS value                        // 2 bytes
    USHORT e_sp;            // Initial SP value                                   // 2 bytes
    USHORT e_csum;          // Checksum                                           // 2 bytes
    USHORT e_ip;            // Initial IP value                                    // 2 bytes
    USHORT e_cs;            // Initial (relative) CS value                       // 2 bytes
    USHORT e_lfarlc;        // File address of relocation table                  // 2 bytes
    USHORT e_ovno;          // Overlay number                                     // 2 bytes
    USHORT e_res[4];        // Reserved words                                     // 2*4 bytes
    USHORT e_oemid;         // OEM identifier (for e_oeminfo)                     // 2 bytes
    USHORT e_oeminfo;       // OEM information; e_oemid specific                 // 2 bytes
    USHORT e_res2[10];      // Reserved words                                    // 2 * 10 bytes
    LONG   e_lfanew;        // File address of new exe header                     // 4 bytes
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
Total bytes: 64
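e_lfanew holds the file offset of the PE Header. It is needed because the DOS Stub that follows the DOS header can be as small or as large as you (or your linker) want, so the offset of the PE Header varies from file to file. The field itself always sits at file offset 0x3C, the last 4 bytes of the 64-byte DOS header.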
Couldn't find any MS.com link :S
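
As a rough sketch of how e_lfanew is used in practice, the snippet below checks the 'MZ' magic, follows e_lfanew and verifies the PE signature in a file that has already been read into memory. The names find_pe_header, read_le16 and read_le32 are made up for this example (they are not Windows APIs), and the bytes are read one at a time so the code does not depend on struct packing or host endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PE files are little-endian; assemble values byte by byte. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the file offset of the PE signature, or -1 if the buffer
   does not look like a PE file. Hypothetical helper for illustration. */
int64_t find_pe_header(const uint8_t *buf, size_t len) {
    assert(buf != NULL);
    if (len < 64) return -1;                      /* IMAGE_DOS_HEADER is 64 bytes */
    if (read_le16(buf) != 0x5A4D) return -1;      /* e_magic must be "MZ" */

    uint32_t e_lfanew = read_le32(buf + 0x3C);    /* last field of the DOS header */
    if (e_lfanew > len || len - e_lfanew < 4) return -1;

    if (read_le32(buf + e_lfanew) != 0x00004550)  /* "PE\0\0" */
        return -1;
    return (int64_t)e_lfanew;
}
```

The File Header then starts 4 bytes after the returned offset.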


PE Header
Code:
typedef struct _IMAGE_NT_HEADERS {        // Includes the PE signature, File Header and Optional Header
  DWORD                 Signature;        // The PE signature                                               // 4 bytes
  IMAGE_FILE_HEADER     FileHeader;        // The File Header structure                                      // 20 bytes
  IMAGE_OPTIONAL_HEADER OptionalHeader;        // The Optional Header structure                            // 224 bytes
} IMAGE_NT_HEADERS, *PIMAGE_NT_HEADERS;
Total bytes: 248 for a 32-bit (PE32) file, including the inner structures (4 + 20 + 224). The Optional Header part can be smaller if NumberOfRvaAndSizes is less than 16. A 64-bit (PE32+) file uses IMAGE_OPTIONAL_HEADER64 instead, which is 240 bytes, for a total of 264.
http://msdn.microsoft.com/en-us/library/ms680336%28v=vs.85%29.aspx


File Header
Code:
typedef struct _IMAGE_FILE_HEADER {
  WORD  Machine;                // Machine type (0x014C == I386, 0x8664 == AMD64, ...)        // 2 bytes
  WORD  NumberOfSections;            // Number of sections in the file                      // 2 bytes
  DWORD TimeDateStamp;                // The time that the file was created                  // 4 bytes
  DWORD PointerToSymbolTable;            // The file offset of the COFF symbol table         // 4 bytes
                                                // This field is only used in OBJ files and PE files
                                                // with COFF debug info
  DWORD NumberOfSymbols;            // The number of symbols in the COFF symbol table         // 4 bytes
  WORD  SizeOfOptionalHeader;            // Size of the OptionalHeader structure           // 2 bytes
  WORD  Characteristics;            // Flags of the file (EXE/DLL/System file/...)             // 2 bytes
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;
Total bytes: 20
http://msdn.microsoft.com/en-us/library/ms680313%28v=vs.85%29.aspx
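
To show why SizeOfOptionalHeader matters: the section table starts right after the Optional Header, so its file offset follows from e_lfanew plus the sizes listed above. A minimal sketch, where section_table_offset and machine_name are hypothetical helper names and only two Machine values are decoded:

```c
#include <assert.h>
#include <stdint.h>

enum {
    PE_SIGNATURE_SIZE   = 4,   /* "PE\0\0" */
    PE_FILE_HEADER_SIZE = 20   /* sizeof(IMAGE_FILE_HEADER) */
};

/* File offset of the first IMAGE_SECTION_HEADER. */
uint32_t section_table_offset(uint32_t e_lfanew, uint16_t size_of_optional_header) {
    /* Guard against wrap-around on a corrupt e_lfanew. */
    assert(e_lfanew <= UINT32_MAX - PE_SIGNATURE_SIZE - PE_FILE_HEADER_SIZE
                       - size_of_optional_header);
    return e_lfanew + PE_SIGNATURE_SIZE + PE_FILE_HEADER_SIZE + size_of_optional_header;
}

/* Decode a couple of common Machine values; everything else is "other". */
const char *machine_name(uint16_t machine) {
    switch (machine) {
    case 0x014C: return "i386";   /* IMAGE_FILE_MACHINE_I386 */
    case 0x8664: return "amd64";  /* IMAGE_FILE_MACHINE_AMD64 */
    default:     return "other";
    }
}
```

With e_lfanew = 0x80 and a 224-byte Optional Header, the section table would start at 0x80 + 4 + 20 + 224 = 0x178.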


Optional Header
Code:
typedef struct _IMAGE_OPTIONAL_HEADER {                      // 32-bit (PE32) layout
  WORD                 Magic;                        // Image type: 0x10B = PE32, 0x20B = PE32+ (64-bit), 0x107 = ROM image.                  // 2 bytes
  BYTE                 MajorLinkerVersion;           // Linker major version number.                                                          // 1 byte
  BYTE                 MinorLinkerVersion;           // Linker minor version number.                                                          // 1 byte
  DWORD                SizeOfCode;                   // Size of the code section, or the sum of all code sections if there are several.      // 4 bytes
  DWORD                SizeOfInitializedData;        // Size of the initialized data section, or the sum of all such sections.               // 4 bytes
  DWORD                SizeOfUninitializedData;      // Size of the uninitialized data section (BSS), or the sum of all BSS sections.        // 4 bytes
  DWORD                AddressOfEntryPoint;          // Address of entry point (RVA of the 1st instruction to be executed).                  // 4 bytes
  DWORD                BaseOfCode;                   // Address (RVA) of beginning of code section.                                           // 4 bytes
  DWORD                BaseOfData;                   // Address (RVA) of beginning of data section. (PE32 only; not present in PE32+.)       // 4 bytes
  DWORD                ImageBase;                    // The *preferred* load address of the file (default is 0x00400000 for EXEs).           // 4 bytes
  DWORD                SectionAlignment;             // Alignment (in bytes) of sections when loaded into memory. Must be >= FileAlignment.  // 4 bytes
  DWORD                FileAlignment;                // Alignment (in bytes) of section data in the file. Power of 2 from 512 to 64K.        // 4 bytes
  WORD                 MajorOperatingSystemVersion;  // Major version number of required OS.                                                  // 2 bytes
  WORD                 MinorOperatingSystemVersion;  // Minor version number of required OS.                                                  // 2 bytes
  WORD                 MajorImageVersion;            // Major version number of image.                                                        // 2 bytes
  WORD                 MinorImageVersion;            // Minor version number of image.                                                        // 2 bytes
  WORD                 MajorSubsystemVersion;        // Major version number of subsystem.                                                    // 2 bytes
  WORD                 MinorSubsystemVersion;        // Minor version number of subsystem.                                                    // 2 bytes
  DWORD                Win32VersionValue;            // Reserved, must be zero.                                                               // 4 bytes
  DWORD                SizeOfImage;                  // Total size of the PE image in memory (headers & sections, multiple of SectionAlignment). // 4 bytes
  DWORD                SizeOfHeaders;                // Size of all headers & section table, rounded up to FileAlignment.                    // 4 bytes
  DWORD                CheckSum;                     // Image file checksum. (The computing algorithm is in IMAGEHLP.DLL.)                   // 4 bytes
  WORD                 Subsystem;                    // Target subsystem of the PE file. (Mostly GUI = 2 & CUI = 3.)                         // 2 bytes
  WORD                 DllCharacteristics;           // Flags such as DYNAMIC_BASE (0x0040, ASLR) and NX_COMPAT (0x0100, DEP); also used by EXEs. // 2 bytes
  DWORD                SizeOfStackReserve;           // Size of stack to reserve.                                                             // 4 bytes
  DWORD                SizeOfStackCommit;            // Size of stack to commit.                                                              // 4 bytes
  DWORD                SizeOfHeapReserve;            // Size of local heap space to reserve.                                                  // 4 bytes
  DWORD                SizeOfHeapCommit;             // Size of local heap space to commit.                                                   // 4 bytes
  DWORD                LoaderFlags;                  // Reserved, must be zero. (Obsolete break/debug-on-load flags.)                        // 4 bytes
  DWORD                NumberOfRvaAndSizes;          // The number of entries in the DataDirectory array that follows.                       // 4 bytes
  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];  // An array of IMAGE_DATA_DIRECTORY, length = 16                    // 8 bytes * 16
} IMAGE_OPTIONAL_HEADER, *PIMAGE_OPTIONAL_HEADER;
Total max bytes: 224 (96 bytes of fixed fields + 16 * 8 bytes of DataDirectory). In the 64-bit (PE32+) version BaseOfData is gone and ImageBase plus the four stack/heap sizes are 8 bytes each, which gives 240 bytes.
http://msdn.microsoft.com/en-us/library/ms680339%28v=vs.85%29.aspx
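
Two bits of arithmetic come up constantly with these fields: rounding a size up to FileAlignment or SectionAlignment, and turning the AddressOfEntryPoint RVA into a virtual address. A small sketch, assuming a 32-bit image loaded at its preferred ImageBase; align_up and entry_point_va are names invented for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of alignment.
   alignment must be a power of two, as PE alignments are. */
uint32_t align_up(uint32_t value, uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Virtual address of the entry point, assuming the image was loaded
   at its preferred ImageBase (i.e. it was not relocated). */
uint32_t entry_point_va(uint32_t image_base, uint32_t address_of_entry_point) {
    return image_base + address_of_entry_point;
}
```

For example, 0x1234 bytes of code take 0x1400 bytes in a file with FileAlignment 0x200 and 0x2000 bytes in memory with SectionAlignment 0x1000.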


DataDirectory
Code:
typedef struct _IMAGE_DATA_DIRECTORY {
  DWORD VirtualAddress;                               // RVA of the location of the directory                 // 4 bytes
  DWORD Size;                                               // Size of the directory                           // 4 bytes
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
Total bytes: 8 each, times IMAGE_NUMBEROF_DIRECTORY_ENTRIES (16) = 128
Additionally I am guessing there are structures for each of the directories: export, import, resource, exception, security, basereloc, debug, copyright, etc.
http://msdn.microsoft.com/en-us/library/ms680305%28v=vs.85%29.aspx
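
For reference, each slot in the DataDirectory array has a fixed meaning given by its index. The table below is a sketch of that mapping; the index constants in the comments are the ones from winnt.h, while the array and the directory_name function are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { DIRECTORY_COUNT = 16 };  /* IMAGE_NUMBEROF_DIRECTORY_ENTRIES */

static const char *const kDirectoryNames[DIRECTORY_COUNT] = {
    "export",         /*  0 IMAGE_DIRECTORY_ENTRY_EXPORT */
    "import",         /*  1 IMAGE_DIRECTORY_ENTRY_IMPORT */
    "resource",       /*  2 IMAGE_DIRECTORY_ENTRY_RESOURCE */
    "exception",      /*  3 IMAGE_DIRECTORY_ENTRY_EXCEPTION */
    "security",       /*  4 IMAGE_DIRECTORY_ENTRY_SECURITY */
    "basereloc",      /*  5 IMAGE_DIRECTORY_ENTRY_BASERELOC */
    "debug",          /*  6 IMAGE_DIRECTORY_ENTRY_DEBUG */
    "architecture",   /*  7 IMAGE_DIRECTORY_ENTRY_ARCHITECTURE (formerly copyright) */
    "globalptr",      /*  8 IMAGE_DIRECTORY_ENTRY_GLOBALPTR */
    "tls",            /*  9 IMAGE_DIRECTORY_ENTRY_TLS */
    "load_config",    /* 10 IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG */
    "bound_import",   /* 11 IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT */
    "iat",            /* 12 IMAGE_DIRECTORY_ENTRY_IAT */
    "delay_import",   /* 13 IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT */
    "com_descriptor", /* 14 IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR */
    "reserved"        /* 15 unused */
};

/* Name of the data directory stored at DataDirectory[index]. */
const char *directory_name(size_t index) {
    assert(index < DIRECTORY_COUNT);
    return kDirectoryNames[index];
}
```

An entry whose VirtualAddress and Size are both zero simply means the file has no such directory.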


Section Header(s)
Code:
typedef struct _IMAGE_SECTION_HEADER {
  BYTE  Name[IMAGE_SIZEOF_SHORT_NAME];  // Name of the section, null-padded (NOT null-terminated if exactly 8 characters)    // 8 bytes
  union {
    DWORD PhysicalAddress;              // Legacy name for the same 4 bytes; not used in PE images
    DWORD VirtualSize;                  // Actual size of the section in memory, before alignment (need not be a multiple of 4096) // 4 bytes
  } Misc;
  DWORD VirtualAddress;                 // RVA at which the section is mapped in memory (a multiple of SectionAlignment)     // 4 bytes
  DWORD SizeOfRawData;                  // Size of the section in the file, rounded up to the next multiple of FileAlignment // 4 bytes
  DWORD PointerToRawData;               // The offset in the file of the beginning of the section                            // 4 bytes
  DWORD PointerToRelocations;           // Only used in OBJ files; zero in executables                                       // 4 bytes
  DWORD PointerToLinenumbers;           // File offset of the COFF line number table (deprecated, normally zero)            // 4 bytes
  WORD  NumberOfRelocations;            // Only used in OBJ files; zero in executables                                       // 2 bytes
  WORD  NumberOfLinenumbers;            // Number of COFF line number entries (deprecated, normally zero)                    // 2 bytes
  DWORD Characteristics;                // Section flags, e.g. 0x20000000 = executable, 0x40000000 = readable, 0x80000000 = writable // 4 bytes
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
Total bytes: 40
There is one of these structures for each section in the PE.
http://msdn.microsoft.com/en-us/library/ms680341%28v=vs.85%29.aspx
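
The "rules" of a section are just bits in Characteristics. As a small illustration, the sketch below turns the three memory-permission bits into an "rwx"-style string. The SCN_* macros mirror the values of IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ and IMAGE_SCN_MEM_WRITE from winnt.h; section_perms is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCN_MEM_EXECUTE 0x20000000u  /* IMAGE_SCN_MEM_EXECUTE */
#define SCN_MEM_READ    0x40000000u  /* IMAGE_SCN_MEM_READ */
#define SCN_MEM_WRITE   0x80000000u  /* IMAGE_SCN_MEM_WRITE */

/* Writes a 3-character permission string plus terminator into out,
   e.g. "r-x" for a typical code section. */
void section_perms(uint32_t characteristics, char out[4]) {
    assert(out != NULL);
    out[0] = (characteristics & SCN_MEM_READ)    ? 'r' : '-';
    out[1] = (characteristics & SCN_MEM_WRITE)   ? 'w' : '-';
    out[2] = (characteristics & SCN_MEM_EXECUTE) ? 'x' : '-';
    out[3] = '\0';
}
```

A typical .text section has Characteristics 0x60000020 (code, executable, readable), and a typical .data section has 0xC0000040 (initialized data, readable, writable).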



Now, I believe these are all the headers and whatnot, except the directory structures (import, export and so on). Am I right?
« Last Edit: May 28, 2011, 04:18:56 am by seci »

iMorg

  • Guest
Re: The PE(EXE) Structure
« Reply #1 on: May 28, 2011, 04:37:10 am »
Great post, it's good to have the major structure definitions for something in one place when you're writing a program.

This is pretty much the de facto paper on the PE file format if anyone's interested. It's an update of an older article. It's from 2002 but is still widely used. http://msdn.microsoft.com/en-us/magazine/cc301805.aspx
« Last Edit: May 28, 2011, 04:38:59 am by iMorg »

Offline ca0s

Re: The PE(EXE) Structure
« Reply #2 on: May 28, 2011, 01:54:43 pm »
I think everything is correct. Nice summary.
Quote
Additionally I am guessing there is structures for each of the directories, export, import, resource, exception, security, basereloc, debug, copyright. etc. etc
That's it. You have an array of those structures at OptionalHeader->DataDirectory. It has 16 members, in order: export, import, resource, exception, security, basereloc, debug, copyright, globalptr, tls, load_config, bound_import, iat, delay_import, com_descriptor, plus one reserved entry at the end.
DataDirectory, being that structure you said, has Size (its size, obviously) and VirtualAddress, which is an RVA. If you want the real physical address (the file offset) of that data in the file, you have to do it like this:
Physical_Address = RVA - [Container section].VirtualAddress + [Container section].PointerToRawData
This RVA thing is a pain; when you are exploring a PE file in C you always have to do this conversion.
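
To make the formula concrete, here is a minimal sketch for the case where you already know which section contains the RVA. SectionInfo and rva_to_file_offset are names invented for this example; the three fields correspond to VirtualAddress, Misc.VirtualSize and PointerToRawData of IMAGE_SECTION_HEADER, and the numbers in the note below are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the section header fields needed for the conversion. */
typedef struct {
    uint32_t virtual_address;      /* RVA where the section starts in memory */
    uint32_t virtual_size;         /* size of the section in memory */
    uint32_t pointer_to_raw_data;  /* file offset where the section starts */
} SectionInfo;

/* Physical_Address = RVA - VirtualAddress + PointerToRawData.
   The caller must pass the section that actually contains the RVA. */
uint32_t rva_to_file_offset(uint32_t rva, const SectionInfo *container) {
    assert(container != NULL);
    assert(rva >= container->virtual_address);
    assert(rva - container->virtual_address < container->virtual_size);
    return rva - container->virtual_address + container->pointer_to_raw_data;
}
```

With a section mapped at RVA 0x2000 whose raw data starts at file offset 0x600, the RVA 0x2010 ends up at 0x2010 - 0x2000 + 0x600 = 0x610 in the file.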

Also, I was thinking about writing some texts about PE as I am learning. Would you like to do it together?

Offline seci

Re: The PE(EXE) Structure
« Reply #3 on: May 28, 2011, 03:15:49 pm »
Quote
This RVA thing is a pain, when you are exploring a PE file in C you have to be always doing this conversion.

Yes, I am aware of the RVA issue. But I wrote a function for it that can convert any RVA to physical offset.


Quote
Also, I was thinking about writing some texts about PE as I am learning. Would you like to do it together?

I could possibly help out, double-checking facts and such. I am still reading up on opcodes, how the CPU instructions are organized, and a bit more of the PE structure.

Offline Huntondoom

Re: The PE(EXE) Structure
« Reply #4 on: May 28, 2011, 07:27:30 pm »
Quote
Yes, I am aware of the RVA issue. But I wrote a function for it that can convert any RVA to physical offset.
please post it!

On topic: I have a question. I understand the structure you explained so far, but how do you read the headers?
I made my own program that loads the bytes of a file into a DataGridView.
This works for every type of file, but I didn't see any of those section names in there.
Is there a certain byte that represents them?
Or is a conversion needed?
or is there a conversion needed?

Offline ca0s

Re: The PE(EXE) Structure
« Reply #5 on: May 28, 2011, 08:30:20 pm »
Quote
please post it!

On topic: I have a question. I understand the structure you explained so far, but how do you read the headers?
I made my own program that loads the bytes of a file into a DataGridView.
This works for every type of file, but I didn't see any of those section names in there.
Is there a certain byte that represents them?
Or is a conversion needed?
http://evilzone.org/projects-and-discussion/calculating-entrypoint-and-mapping-it-to-byte-array/msg4611/#msg4611
And in that topic I uploaded some PDFs with info.
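
In short, no conversion is needed: the section names are plain ASCII bytes, stored in the first 8 bytes of each 40-byte section header. A rough sketch of reading them from a file that is already loaded into a byte array, assuming you have worked out the file offset of the section table (e_lfanew + 4 + 20 + SizeOfOptionalHeader); read_section_name is a made-up helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    SECTION_HEADER_SIZE = 40,  /* sizeof(IMAGE_SECTION_HEADER) */
    SECTION_NAME_SIZE   = 8    /* IMAGE_SIZEOF_SHORT_NAME */
};

/* Copies the name of section number `index` into out (9 bytes, always
   terminated). table_offset is the file offset of the first section
   header. Returns 1 on success, 0 if the header lies outside the file. */
int read_section_name(const uint8_t *file, size_t file_len,
                      size_t table_offset, size_t index, char out[9]) {
    assert(file != NULL && out != NULL);
    size_t pos = table_offset + index * SECTION_HEADER_SIZE;
    if (pos > file_len || file_len - pos < SECTION_HEADER_SIZE)
        return 0;
    memcpy(out, file + pos, SECTION_NAME_SIZE);
    out[SECTION_NAME_SIZE] = '\0';  /* names of exactly 8 chars have no terminator */
    return 1;
}
```

Calling it for index 0 up to NumberOfSections - 1 should give names like ".text" and ".data".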

Offline seci

Re: The PE(EXE) Structure
« Reply #6 on: July 16, 2011, 12:42:18 am »
Quote
please post it!

Hello again, everyone. Sorry I have been away for a while. Been busy.

Anyway, I don't have the function on this laptop. But I found something close to it with a quick Google search:
Code:
/*----------------------------------------------------------*
 * RVAToOffset: Convert value from RVA to file offset.
 * Returns (DWORD)-1 if no section contains the RVA. RVAs that
 * fall inside the headers (before the first section) are not
 * handled here.
 *----------------------------------------------------------*/
DWORD CPE::RVAToOffset(DWORD dwRVA)
{
    /* Map first section */
    PIMAGE_SECTION_HEADER pSectionHdr = IMAGE_FIRST_SECTION(m_pNtHdr);
    WORD wSections = GetNumberOfSections();

    for (WORD i = 0; i < wSections; i++, pSectionHdr++)
    {
        /* Is the RVA inside this section's range in memory? */
        if (pSectionHdr->VirtualAddress <= dwRVA &&
            dwRVA - pSectionHdr->VirtualAddress < pSectionHdr->Misc.VirtualSize)
        {
            return dwRVA - pSectionHdr->VirtualAddress + pSectionHdr->PointerToRawData;
        }
    }
    return (DWORD)-1;
}

/*----------------------------------------------------------*
 * OffsetToRVA: Convert value from file offset to RVA.
 * Returns (DWORD)-1 if no section contains the offset.
 *----------------------------------------------------------*/
DWORD CPE::OffsetToRVA(DWORD dwOffset)
{
    /* Map first section */
    PIMAGE_SECTION_HEADER pSectionHdr = IMAGE_FIRST_SECTION(m_pNtHdr);
    WORD wSections = GetNumberOfSections();

    for (WORD i = 0; i < wSections; i++, pSectionHdr++)
    {
        /* Is the offset inside this section's raw data in the file? */
        if (pSectionHdr->PointerToRawData <= dwOffset &&
            dwOffset - pSectionHdr->PointerToRawData < pSectionHdr->SizeOfRawData)
        {
            return dwOffset - pSectionHdr->PointerToRawData + pSectionHdr->VirtualAddress;
        }
    }
    return (DWORD)-1;
}

If you run into any issues, let me know :)