Author Topic: [Tutorial] Analyzing Malware by Example---Sample 2  (Read 2562 times)


Offline Deque

  • P.I.N.N.
  • Global Moderator
  • Overlord
  • *
  • Posts: 1203
  • Cookies: 518
  • Programmer, Malware Analyst
    • View Profile
[Tutorial] Analyzing Malware by Example---Sample 2
« on: February 23, 2015, 08:51:17 am »
Analyzing Malware by Example---Sample 2

Goal:

Our goal is again to find out what kind of malware it is with regard to its behaviour. The last sample was a downloader. Let's see what we have this time.

Download the sample:

download sample.zip

The password is "infected".

Caution! This is live malware!

Your first step is again to check the file type, so you know which tools are appropriate for further analysis. You already learned some techniques in the last preparatory exercise. Find out the file type first, then go on reading here.

You probably used TrID, the file command on Linux or your hex editor. If you found out that this is a Word document, you got it right.
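
If you want to see what such tools look at, here is a minimal sketch that guesses a coarse file type from the first bytes of a file (you can run it once Python is set up, see the next section). The function names sniff and sniff_file are made up for this example, and the three signatures are only a tiny subset of what TrID or file know.

```python
# Minimal file type sniffing by magic bytes (illustrative, not a TrID replacement).
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE compound file (old .doc/.xls/.ppt)

def sniff(header):
    """Guess a coarse file type from the first bytes of a file."""
    if header.startswith(OLE_MAGIC):
        return "OLE compound file (e.g. Word 97-2003 document)"
    if header.startswith(b"PK\x03\x04"):
        return "ZIP container (e.g. .docx, .zip)"
    if header.startswith(b"MZ"):
        return "DOS/Windows executable"
    return "unknown"

def sniff_file(path):
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as f:
        return sniff(f.read(8))
```

Reading only the first bytes never executes or renders the sample, so this is a static check.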

Install Python

First install Python 2.7 on your static analysis system. You will need it for a bunch of analysis tools. If you have Linux, Python is probably preinstalled. You can check if it is preinstalled by typing python on the command line. If the interactive interpreter opens, you are done. If you get a message saying that the python command could not be found, you need to set it up.

Download Python 2.7 from here: scroll to the Download section and choose the download for your operating system. E.g. if you use Windows 32-bit, choose Windows x86 MSI Installer (2.7); for 64-bit choose Windows X86-64 MSI Installer (2.7). Run the installer and follow the prompts. After that, check again whether the python command works. If it doesn't, you need to add Python to the PATH variable.

  • Right click on My Computer
  • Click on Properties
  • Click Advanced Settings, then Environment Variables
  • A list of variables will appear. In the upper list choose PATH and modify it.
  • At the end of the path, add the following: ;C:\WINDOWS\system32;C:\WINDOWS;C:\Python27
  • Click OK
Now open your command prompt. If you had one open previously, you need to close it and open again.

Type python to verify that everything is correct. Type exit() to leave the interactive interpreter.

Extracting Metadata

Before you move on to Macro code extraction, get an overview of the file first. Most file types have so-called metadata. That is additional information saved in the file, which is usually used by programs that process files of this type.

Download the most recent ZIP of oletools from here: https://bitbucket.org/decalage/oletools/downloads
These are Python tools that you use from the command line. A description of each tool is here: http://www.decalage.info/en/book/export/html/79
Install oletools on your system. If you are on Windows, execute install.bat.

Open your command prompt and verify that we have identified the correct file type of our sample with:

Code: [Select]
python oleid.py <path-to-sample>
The program will print the following information:

Code: [Select]
+-------------------------------+-----------------------+
| Indicator                     | Value                 |
+-------------------------------+-----------------------+
| OLE format                    | True                  |
| Has SummaryInformation stream | True                  |
| Application name              | Microsoft Office Word |
| Encrypted                     | False                 |
| Word Document                 | True                  |
| VBA Macros                    | True                  |
| Excel Workbook                | False                 |
| PowerPoint Presentation       | False                 |
| Visio Drawing                 | False                 |
| ObjectPool                    | False                 |
| Flash objects                 | 0                     |
+-------------------------------+-----------------------+

Oleid confirms that this was created by MS Word. It also tells us that the file has the OLE format.
Oleid also tells us that the document is not encrypted and that it contains VBA Macros. So extracting the Macros is one possible way to analyse this file, but we should do the easiest things first.

If you are on Linux, execute the file command. It is able to show metadata for some file formats, including MS Office documents.

Code: [Select]
file <path-to-sample>
Your output will look like this:

Quote
238bd6216c533984173a80c5675bd76f18100ec2c0cf462e24fe82d28305a674: Composite Document File V2 Document, Little Endian, Os: Windows, Version 5.1, Code page: 1251, Author: Admin, Template: Normal.dotm, Last Saved By: Normal.d aka punsh, Revision Number: 21, Name of Creating Application: Microsoft Office Word, Total Editing Time: 19:00, Create Time/Date: Fri Jul 25 17:42:00 2014, Last Saved Time/Date: Tue Sep  2 17:41:00 2014, Number of Pages: 44, Number of Words: 36773, Number of Characters: 209612, Security: 0

On Windows and Linux use olemeta.py.

Code: [Select]
python olemeta.py <path-to-sample> > metadata.txt
The part > metadata.txt will write the output of the command into a textfile, so you can handle it better.
The output is quite long in this case.
I will shorten it here, so we can discuss some parts of the data first.

Code: [Select]
Properties from SummaryInformation stream:
- codepage: 1251
- title: ''
- subject: ''
- author: 'Admin'
- keywords: ''
- comments: None
- template: 'Normal.dotm'
- last_saved_by: '\xc2\xeb\xe0\xe4\xe8\xec\xe8\xf0 aka punsh'
- revision_number: '21'
- total_edit_time: 1140L
- last_printed: None
- create_time: datetime.datetime(2014, 7, 25, 16, 42)
- last_saved_time: datetime.datetime(2014, 9, 2, 16, 41)
- num_pages: 44
- num_words: 36773
- num_chars: 209612
- creating_application: 'Microsoft Office Word'
- security: 0
Properties from DocumentSummaryInformation stream:
- codepage_doc: 1251
- lines: 1746
- paragraphs: 491
- scale_crop: False
- company: 'Microsoft'
- links_dirty: False
- chars_with_spaces: 245894
- shared_doc: False
- hlinks: None
- hlinks_changed: False
- version: 786432

Research is the Key, Interpreting the Metadata

The very first part shown by the Linux file command, "Composite Document File V2 Document", is actually the file type. You can find more about it here: https://msdn.microsoft.com/en-us/library/dd942138.aspx

When you do any research like this, prefer reading the documentation written by the organization or person who created the file type over reading third-party explanations. It is usually more accurate. Sometimes there are exceptions, though.
Very detailed information about the file format can be found in its specification: download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/WindowsCompoundBinaryFileFormatSpecification.pdf
This is usually only interesting for people who develop software that has to parse the format. But since you are on the way to becoming a reverse engineer, file format specifications have useful information for you too.

Little Endian (only shown by file) is a byte order in which the least significant byte of a multi-byte value is stored at the lowest address. It is common for Microsoft Windows formats.
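
To make the byte order concrete, here is a small sketch using Python's struct module. The value 0x12345678 is an arbitrary example and does not come from the sample.

```python
import binascii
import struct

value = 0x12345678
little = struct.pack("<I", value)  # "<" = little endian: least significant byte first
big = struct.pack(">I", value)     # ">" = big endian: most significant byte first

print(binascii.hexlify(little).decode("ascii"))  # 78563412
print(binascii.hexlify(big).decode("ascii"))     # 12345678
```

So when you read a multi-byte number in a hex dump of a little endian format, you have to read its bytes from right to left.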

The Windows version (only shown by file) is given as 5.1. Microsoft's pages are again a good place to look for more information. See: https://msdn.microsoft.com/en-us/library/windows/desktop/ms724832%28v=vs.85%29.aspx
Version 5.1 is Windows XP, so this document was most likely created on Windows XP. This gives you a hint about which operating systems might be targeted by the malware.
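
If you look up version numbers often, a small lookup table saves time. This is only a sketch with a subset of the well-known Windows NT version numbers; the names WINDOWS_VERSIONS and windows_name are made up for this example.

```python
# Subset of well-known Windows NT version numbers and their marketing names.
WINDOWS_VERSIONS = {
    "5.0": "Windows 2000",
    "5.1": "Windows XP",
    "5.2": "Windows XP 64-bit / Server 2003",
    "6.0": "Windows Vista / Server 2008",
    "6.1": "Windows 7 / Server 2008 R2",
}

def windows_name(version):
    """Translate a version string such as '5.1' into a product name."""
    return WINDOWS_VERSIONS.get(version, "unknown version " + version)

print(windows_name("5.1"))  # Windows XP
```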

The code page (codepage and codepage_doc in olemeta) is listed as 1251. Again, use Google and you will find that 1251 is Windows-1251, a character encoding for Cyrillic script, which is used for Russian, Bulgarian, Serbian Cyrillic and others.
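
Knowing the code page also lets you read fields that olemeta prints as raw bytes. The following sketch takes the last_saved_by bytes from the olemeta output above and decodes them, assuming they really are encoded in code page 1251 as the metadata states.

```python
# Raw last_saved_by value as printed by olemeta (bytes, not yet decoded).
raw = b"\xc2\xeb\xe0\xe4\xe8\xec\xe8\xf0 aka punsh"

# "cp1251" is Python's codec name for Windows-1251.
decoded = raw.decode("cp1251")
print(decoded)
```

With this codec the first eight bytes come out as the Cyrillic name "Владимир" (Vladimir). On Python 2.7 in a Windows console the print may fail if the console cannot display Cyrillic characters; writing the string to a UTF-8 text file is a workaround.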

The author is listed as "Admin", which is probably the user name. This does not tell us much in this case, nor does the template "Normal.dotm".

The Last Saved By value "Normal.d aka punsh", however, is useful because it is a very specific name. You can use this detail to connect the sample with other malware. E.g. if you find another document with the entry "Last Saved By: Normal.d aka punsh", it was most probably written by the same author. Google the string and you will find several hash entries for files that contain the same string. How is that useful? If you write signatures for antivirus products, you want to kill all birds with one stone. You want a signature that covers lots of variants of the same type to keep the antivirus database small and the performance good. And of course a smart and lazy person will prefer to have less work to do. So this information is already enough to collect variants.

Revision, editing time and creation time help to check if that is a newer or older version of the same malware. If you have several samples, you can create a time-line of changes.

Last but not least there are some statistics about the file, which are probably self-explanatory. It gives you an idea of the document's size. If you get a file like this and recognize that it has 44 MS Word pages with a total editing time of 19 minutes, it is for sure suspicious.
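
A quick plausibility check of these numbers, as a sketch. It assumes that total_edit_time in the olemeta output is given in seconds, which matches the 19:00 shown by the file command.

```python
# Values taken from the olemeta output above.
total_edit_seconds = 1140
num_words = 36773

minutes = total_edit_seconds / 60.0
words_per_minute = num_words / minutes

print("%.0f minutes, %.0f words per minute" % (minutes, words_per_minute))
```

For this sample that works out to 19 minutes and roughly 1935 words per minute. Nobody types that fast, so the content was pasted or generated.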

This is actually a lot of information we got with so little effort.
Keep in mind that you won't have to google the basics so much anymore once you are familiar with the output.

Easy Things First

Download a hex editor, if you haven't already. It does not really matter which one, e.g. HxD is nice.

Open the sample in your hex editor and browse through it. This is one of the first things I always do. Often you will find some very interesting things there, which save a lot of additional work in the long run.

At offset 0xc6c you will find the start of an area that looks like this:



Firstly: the upper part shows part of the message that the user also sees if he or she opens the document. It is written in German. You could use Google Translate, but since this happens to be my mother tongue, let me translate it for you: "Click on Macro, choose security and click on Low"
These are the instructions for the user to turn on VBA Macros. The malware makes the user curious by claiming that he or she can only see the full content of the document after doing so. This social engineering trick enables the document to execute its malicious code.

-----------------------------------------------------------------------------------------------------------------------------------------------
Note: If you ever encounter this in a sample, don't feel tempted to open the document in an unsafe environment. You might think it is safe if you don't turn on Macros. But malware authors are bitches and they will bite reverse engineers who don't take proper care. Sometimes they throw in strings just to lure malware analysts into doing stupid things or to keep them on the wrong track.
At this point you only have assumptions, which you will verify with each additional step. Never let your guard down. Always assume the worst, which is in this case that the malware might work even with Macros disabled.
-----------------------------------------------------------------------------------------------------------------------------------------------

Now look at the lower part.
There is a pattern of '&H??' with '?' being 0-9 or A-F. These are hex values! Let's decode them to text. Copy the whole area with this pattern to a text editor. I recommend Notepad++ if you are on Windows. Use the replace function of your editor to remove the "&H" parts (by replacing with nothing). Now go ahead and either write a decoder for the hex string or use a decoding tool. E.g. Notepad++ has a plugin for hex to ASCII conversion that you can use. Save the resulting file.
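
If you prefer to write the decoder yourself, here is a minimal sketch in Python. The function name decode_amp_h and the test fragment are made up for this example. The regular expression simply picks up every &H?? token and ignores whatever sits between them, because the exact separators in the sample are not shown here.

```python
import binascii
import re

def decode_amp_h(text):
    """Collect every &H?? token and turn the hex digits into raw bytes."""
    hex_digits = "".join(re.findall(r"&H([0-9A-Fa-f]{2})", text))
    return binascii.unhexlify(hex_digits.encode("ascii"))

# Made-up fragment for illustration, not copied from the sample.
fragment = "&H4D&H5A&H90&H00"
decoded_bytes = decode_amp_h(fragment)
```

To use it on the real data, paste the copied area into a text file, read it, and write the returned bytes to a new file opened in binary mode ("wb").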

You probably already noticed that the beginning of the dumped file looked like this:



Once you are a bit familiar with reverse engineering, you will quickly recognize this as the beginning of a Portable Executable (PE) file. A PE file has a full MS-DOS application at its beginning that prints the standard message "This program cannot be run in DOS mode." This was done for compatibility reasons. The 'MZ' is the magic number for a DOS executable.
The actual magic number for the PE ('PE\0\0') comes later; its offset is given within the MS-DOS header.
Of course you can also run the "file" command or TrID on the dumped file.
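
The same check can be sketched in a few lines of Python. It relies on the PE layout: the 4-byte little endian field at offset 0x3C of the MS-DOS header (known as e_lfanew) holds the offset of the 'PE\0\0' signature. The function name looks_like_pe is made up for this example.

```python
import struct

def looks_like_pe(data):
    """Return True if data starts with 'MZ' and the header points at 'PE\\0\\0'."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: 4-byte little endian offset of the PE signature, stored at 0x3C
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

Call it with the bytes of the dumped file, e.g. the result of open(path, "rb").read(). This only checks the two signatures, it does not validate the rest of the headers.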

The malware type of our Word sample is a dropper. In contrast to a downloader, it already carries the malware with it, writes it to disk and runs it upon execution. It might also perform steps to persist the dropped file in the system, e.g., adding it to an autostart entry in the registry. This dropper carries its PE malware as a hex string.

If you look at the code with olevba.py, you can find out more about the inner workings of the Word file. Now that we have a good idea about the file's behaviour it will be much easier to interpret the code. I leave that to you as an exercise. Verify the things we already found. Try to find the location where it drops the file and the part of the VBA code that extracts and decodes the hex string. How does the malware locate the beginning of the hex string?

The dropped file was a PE file. We will dive into PE analysis within the next tutorials.
Meanwhile you can look up the SHA256 of the dumped file on VirusTotal or other analysis sites; or upload the file to a sandbox analysis system, e.g.:
https://www.hybrid-analysis.com/
https://malwr.com/
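
If you have no hashing tool at hand, here is a minimal sketch for computing the SHA256 with Python's hashlib. The function name sha256_of_file is made up for this example.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large files don't have to fit into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned hex string is what you paste into the search field of the analysis site. Searching by hash first avoids uploading the file at all.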

Stay safe!
« Last Edit: September 26, 2015, 09:58:30 pm by Deque »

Offline Deque

Re: [Tutorial] Analyzing Malware by Example---Sample 2
« Reply #1 on: September 26, 2015, 10:00:49 pm »
Updated the tutorial.
oletools introduction is now here instead of part 1.
Also a small section about setting up python (mainly aimed at Windows users).

Offline Racheltjie de Beer

  • Serf
  • *
  • Posts: 26
  • Cookies: -1
  • Everything Zen
    • View Profile
Re: [Tutorial] Analyzing Malware by Example---Sample 2
« Reply #2 on: January 29, 2016, 11:11:39 am »
For our Ubuntu (and maybe other *Nix) listeners, the default unzip (or Archive Manager) will not work to unzip the sample.zip file:
Code: [Select]
Archive:  sample.zip
   skipping: 238bd6216c533984173a80c5675bd76f18100ec2c0cf462e24fe82d28305a674  need PK compat. v5.1 (can do v4.6)

I found this Ubuntu Forums thread

I did the following:
Code: [Select]
sudo apt-get install -y p7zip-full
7z x sample.zip

(Although I see one could have just installed 7z)

Hope this helps