Targeted attacks and so-called APTs (advanced persistent threats) come in many forms and colors. Very often, in-house malware analysis teams want to go beyond the detection information offered by traditional analysis systems (which often only say whether a program looks malicious or not). The Lastline High-Resolution analysis engine exposes many details describing the malware’s behavior, such as file-system modifications, changes to the Windows registry, and interesting network communication, and it even highlights sophisticated evasion attempts completely automatically.
But sometimes, an analyst wants to go even beyond that and get a deeper look into the program binary. This can be useful for research purposes, finding a more effective remediation process, or just because some people need to know it all.
Sadly, in-depth static analysis of APT malware is far from easy: even the most powerful tools, such as the IDA Pro disassembler or the WinDbg debugger, are sometimes not enough. Malware uses polymorphic packers or so-called “protectors”, as well as other sophisticated tricks, to prevent analysts from getting their hands on, or understanding, the payload functionality.
The good news is that modern dynamic malware analysis systems are very often immune to many of the obfuscation tricks that hinder static analysis in tools like the IDA Pro disassembler. To bridge the gap between dynamic and static analysis, Lastline now provides an efficient and universal unpacker as an integral part of the engine performing the advanced dynamic analysis. This gives an analyst the ability to look into a malware sample at various stages during the dynamic analysis, eliminating barriers to static analysis.
So, do you know how APTs are attacking your company in detail? Do you want to know all possible functionality of the malware used in a targeted attack against you? In a series of blog posts, we will show you how easily you can load a fully-unpacked snapshot of a malware sample taken by the Lastline analysis engine.
LLama versus PlugX
One component of the Lastline analysis engine is a full-system emulator that we internally refer to as LLama (short for Lastline Advanced Malware Analysis - we really like words starting with Ls ;) ). In addition to exposing a sample’s behavior to the Lastline Analyst, LLama also acts as a universal unpacker by running the sample inside a guest operating system.
LLama fights many different forms of evasion attempts present in advanced malware, as we described in various previous blog posts. Therefore, it is much more powerful than (and goes far beyond) launching a program inside a virtual machine with an attached debugger.
In this blog post, we will demonstrate the system in action, using a recent variant of PlugX (described in a previous blog post) as an example.
Malware Family: PlugX
VT link: https://www.virustotal.com/en/file/6d579c3ab1a31719120da90e7b7aa639df65d45b9af666addd0ab0e573a6e9e1/analysis/
Full analysis result link: https://user.lastline.com/malscape#/task/f7b5c2293e574d069e0a48bcd7691b16 (accessible to Lastline customers only, sign-up now)
The internal structure of this PlugX sample makes static analysis quite complex. The chain of events below gives a short overview of the infection process - for the full details, refer to the PlugX post (Section Process of Infection):
The rarsfx archive first drops three files into the %TEMP% directory:
EmpPrx.exe - a benign file with valid digital signature.
EmPrxRes.dll - an auxiliary DLL imported by EmpPrx.exe; it contains (fake) exports identical to those of the legitimate DLL that EmpPrx.exe is expecting.
EmPrxRes.dll.dat - not a PE file, but a file containing position-independent code, consisting of a decryptor and an encrypted malicious image. It also contains encrypted settings.
The archive then starts EmpPrx.exe.
During the EmpPrx.exe loading process, the Windows loader looks for “EmPrxRes.dll” in the current directory, finds it, and loads the DLL that was dropped by the rarsfx archive. This technique is known as DLL load-order hijacking, where a local DLL imitates - and is loaded instead of - a legitimate library.
EmPrxRes.dll (in the DllMain function) patches the entry point of the EmpPrx.exe image in memory (which has not started execution at this point).
The original entry point of EmpPrx.exe is replaced with a jump instruction that transfers execution to a function loading the position-independent code from the file EmPrxRes.dll.dat.
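The entry-point patch in the last step can be illustrated with a minimal sketch. It assumes 32-bit x86 and a near jump (opcode E9 followed by a 32-bit relative displacement); the post does not state which jump encoding the sample actually uses. The entry-point address is the one reported for EmpPrx.exe later in this post, while LOADER_FUNC is a purely hypothetical target address chosen for illustration.

```python
import struct


def make_jmp_rel32(patch_addr: int, target_addr: int) -> bytes:
    """Encode an x86 near jump (E9 rel32) placed at patch_addr that lands on target_addr."""
    # The displacement is relative to the address of the next instruction,
    # i.e. patch_addr + 5 (one opcode byte plus four displacement bytes).
    rel = (target_addr - (patch_addr + 5)) & 0xFFFFFFFF
    return b"\xE9" + struct.pack("<I", rel)


def apply_patch(image: bytearray, image_base: int, patch_addr: int, patch: bytes) -> None:
    """Overwrite bytes of an in-memory image copy at the given virtual address."""
    offset = patch_addr - image_base
    image[offset:offset + len(patch)] = patch


ENTRY_POINT = 0x00434066  # entry point of EmpPrx.exe as reported in this post
LOADER_FUNC = 0x10001200  # hypothetical address of the loader function (assumption)

patch = make_jmp_rel32(ENTRY_POINT, LOADER_FUNC)
print(patch.hex())
```

Because the displacement is relative, the same five bytes are only valid at this exact patch address, which is one reason such patches are easier to understand in a memory snapshot than in the files on disk.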
To do an in-depth static analysis of these components and events, one needs to have all three components in memory, and do a step-by-step analysis inside a debugger. Further, since this malware uses position-independent code, simple dumping of memory does not expose any import tables. As a result, recognizing API functions correctly becomes very difficult.
The Lastline analysis engine already provides the analyst with an overview of behavior exhibited by malware. Additionally, the reports contain in-depth results for each interesting behavior observed (but omitted in this post).
PlugX behavior overview
Behavior overview of PlugX malware variant
As described earlier, in addition to the exposed behavior, the LLama engine exposes multiple process dumps for further analysis by an analyst.
Snapshots taken during dynamic analysis to ease static analysis
Each analysis subject has a few process dumps (or snapshots) taken at different stages of the analysis. A snapshot is taken whenever the LLama engine considers an observed functionality (or the memory content) to be interesting.
For example, one of the snapshots above was triggered after observing a call to a critical API function from an allocated, untrusted memory region, which typically means code was first unpacked and then executed.
Bridging the Gap Between Static and Dynamic Malware Analysis
Each exported process dump is a full PE image in which each section represents a loaded code module or a memory block allocated by the program. This allows the exported dumps to be opened by a wide range of analysis tools. Even more interesting, the dumps contain only the memory regions considered interesting by our engine. This means that an analyst does not need to analyze several megabytes of unrelated process memory and can focus on relevant code and data regions right from the start: a process dump usually doesn’t exceed a couple hundred kilobytes in size.
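Since a dump is a regular PE image, its section table can be listed with a few lines of standard-library Python. The following is a minimal sketch that reads only the documented DOS, COFF, and section headers; the section names and sizes in build_demo_image are made up for illustration and do not reflect the naming used in real Lastline dumps.

```python
import struct
from typing import List, NamedTuple


class Section(NamedTuple):
    name: str
    virtual_address: int
    virtual_size: int
    raw_offset: int
    raw_size: int


def list_sections(pe: bytes) -> List[Section]:
    """Return the section table of a PE image using only the standard headers."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)  # file offset of the PE header
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = e_lfanew + 4
    # COFF header: Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics (20 bytes in total).
    _, num_sections, _, _, _, opt_size, _ = struct.unpack_from("<HHIIIHH", pe, coff)
    table = coff + 20 + opt_size  # the section table follows the optional header
    sections = []
    for i in range(num_sections):
        # Each section header is 40 bytes; only the first five fields are needed.
        name, vsize, vaddr, rsize, roff = struct.unpack_from("<8sIIII", pe, table + 40 * i)
        label = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append(Section(label, vaddr, vsize, roff, rsize))
    return sections


def build_demo_image() -> bytes:
    """Build a tiny synthetic header blob (not a real dump) for demonstration."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x14C, 2, 0, 0, 0, 0, 0)  # i386, two sections

    def sec(name: bytes, vaddr: int, vsize: int, roff: int, rsize: int) -> bytes:
        return struct.pack("<8sIIIIIIHHI", name, vsize, vaddr, rsize, roff, 0, 0, 0, 0, 0)

    return (bytes(dos) + b"PE\0\0" + coff
            + sec(b"module0", 0x1000, 0x2000, 0x400, 0x2000)
            + sec(b"alloc0", 0x3000, 0x1000, 0x2400, 0x1000))


for s in list_sections(build_demo_image()):
    print(f"{s.name:8} va=0x{s.virtual_address:08x} size=0x{s.virtual_size:x}")
```

To inspect a real dump, the same function can be called on the bytes read from the exported file instead of the synthetic image.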
Without doubt, one of the most popular tools for analyzing PEs is IDA Pro (usually in combination with a decompiler plugin). To simplify analysis in IDA even further, we provide a Python script that post-processes a PE snapshot and reconstructs the program imports. Additionally, this processing step adds bookmarks to several points of interest, highlighting interesting code execution / entry points. This script is integrated in the Lastline Analyst help and report web-interface (it can be found by clicking on the question mark next to the “Process Dumps” report section).
Direct process snapshot integration in analysis report
Process snapshot integration in Lastline Analyst documentation
After loading a process dump and running the Python script, IDA Pro displays two new tabs highlighting additional analysis metadata, for example the reconstructed API import table (even for packed malware).
Reconstructed PE import table
In addition to the standard PE image import tables, LLama also reconstructs other custom tables containing virtual addresses, which are often used by packed malware variants. These links are fully interactive and allow the analyst to navigate to the highlighted functions or code regions.
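One common way to recover such custom tables from a memory snapshot is to scan for runs of pointer-sized values that match known API addresses. The sketch below illustrates that general idea only; it is not the algorithm used by LLama or by the provided IDA script. It assumes 32-bit little-endian pointers at 4-byte alignment, and all addresses in EXPORTS and BUFFER_BASE are hypothetical.

```python
import struct
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (address of the table slot, resolved API name)


def find_pointer_tables(memory: bytes, base: int, exports: Dict[int, str],
                        min_entries: int = 2) -> List[List[Entry]]:
    """Find runs of consecutive 32-bit values that all resolve to known exports."""
    tables: List[List[Entry]] = []
    current: List[Entry] = []
    for off in range(0, len(memory) - 3, 4):
        (value,) = struct.unpack_from("<I", memory, off)
        name = exports.get(value)
        if name is not None:
            current.append((base + off, name))
            continue
        # A non-matching value ends the current run; keep it if it is long enough.
        if len(current) >= min_entries:
            tables.append(current)
        current = []
    if len(current) >= min_entries:
        tables.append(current)
    return tables


# Hypothetical export addresses, for illustration only.
EXPORTS = {
    0x7C810800: "kernel32!CreateFileW",
    0x7C8217AC: "kernel32!CreateDirectoryW",
    0x7C80AE40: "kernel32!GetProcAddress",
}
BUFFER_BASE = 0x003D0000  # hypothetical base address of an allocated buffer

memory = struct.pack("<6I", 0xDEADBEEF, 0x7C810800, 0x7C8217AC, 0x7C80AE40, 0, 0x7C810800)
for table in find_pointer_tables(memory, BUFFER_BASE, EXPORTS):
    for slot, name in table:
        print(f"0x{slot:08x} -> {name}")
```

The min_entries threshold is a design choice in this sketch: a single matching value is often a coincidence, while several adjacent matches are much more likely to form a real call table.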
Reconstructed custom import table
In combination with the Hex-Rays decompiler, the process memory dumps enable users to analyze decompiled code of the unpacked program at different stages of execution. Clearly, this vastly simplifies the in-depth analysis of the malicious program and provides a powerful tool to get a fast understanding of a malware’s functionality.
For demonstration purposes, let’s open and analyze a process snapshot of analysis subject 2 (triggered by an API call as described earlier). Above, one can already see the fully reconstructed PE import tables, as well as custom call tables used by the malware.
Now, let’s look at the list of code regions the LLama engine considered to be of interest:
IDA table showing “points of interest”, such as code execution after unpacking
As one can see, LLama highlights a total of four different code regions (and gives a short description of each). The first two regions are return addresses from system calls that were considered “interesting” and we can immediately jump to these code regions:
0x003df9bb return address after a call to CreateDirectoryW
0x003d3124 return address after a call to CreateFileW
Both API functions use the NtCreateFile system call to create a directory and a file, respectively. By analyzing the call stack, the LLama engine finds return addresses pointing to untrusted code regions. In our case, they point to position-independent code, which makes them even more relevant for analysis. These points can be used as a starting point for a payload analysis.
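The call-stack analysis described above can be sketched as a walk over return addresses until one falls outside all trusted modules. This is a simplified illustration, not LLama’s implementation: the Region type, the module ranges, and the intermediate return addresses are hypothetical, and only 0x003df9bb is taken from the report shown in this post.

```python
from typing import List, NamedTuple, Optional


class Region(NamedTuple):
    start: int   # inclusive
    end: int     # exclusive
    name: str
    trusted: bool


def region_for(addr: int, regions: List[Region]) -> Optional[Region]:
    """Return the region containing addr, or None if the address is unmapped."""
    for region in regions:
        if region.start <= addr < region.end:
            return region
    return None


def first_untrusted_return(call_stack: List[int], regions: List[Region]) -> Optional[int]:
    """Walk return addresses (innermost frame first) and return the first untrusted one."""
    for ret in call_stack:
        region = region_for(ret, regions)
        # Addresses outside every known region are treated as untrusted as well.
        if region is None or not region.trusted:
            return ret
    return None


# Hypothetical memory layout, for illustration only.
REGIONS = [
    Region(0x7C900000, 0x7C9B0000, "ntdll.dll", True),
    Region(0x7C800000, 0x7C8F6000, "kernel32.dll", True),
    Region(0x003D0000, 0x003F0000, "allocated buffer", False),
]

# NtCreateFile stub -> CreateDirectoryW -> caller in position-independent code.
CALL_STACK = [0x7C90D0BA, 0x7C8107F0, 0x003DF9BB]

hit = first_untrusted_return(CALL_STACK, REGIONS)
print(hex(hit) if hit is not None else "no untrusted caller")
```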
The remaining two points of interest highlighted in IDA Pro are the original entry points to the PE images described in the infection chain earlier:
0x00434066 is the entry point to EmpPrx.exe, and
0x10001160 is the entry point to EmPrxRes.dll
To find out what the malware does in these entry points, we decompile the code region, which reveals two very interesting behaviors:
First, the program uses a time-trigger to hide parts of its functionality based on the local system time (it only executes this code after 2014/03/03). Second, if this check is passed, the function patches memory located at an address relative to the image base - more concretely, it patches data at location base(EmpPrx.exe) + 0xc178.
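The two behaviors can be summarized in a small sketch. It only models the logic described above and is not decompiler output: the strict “later than” comparison is one reading of “after 2014/03/03”, and ASSUMED_BASE is the common default image base for 32-bit executables, which is an assumption rather than a value stated in the report.

```python
from datetime import date

TRIGGER_DATE = date(2014, 3, 3)  # date mentioned in the analysis above
PATCH_OFFSET = 0xC178            # offset relative to the EmpPrx.exe image base


def trigger_passed(today: date) -> bool:
    """Model the time-trigger: the hidden code only runs after the trigger date."""
    # Assumption: "after" is read as strictly later than the trigger date.
    return today > TRIGGER_DATE


def patch_target(image_base: int) -> int:
    """Compute the patched location, base(EmpPrx.exe) + 0xc178."""
    return image_base + PATCH_OFFSET


ASSUMED_BASE = 0x00400000  # assumption: default image base of a 32-bit EXE

if trigger_passed(date.today()):
    print(f"patch would be applied at 0x{patch_target(ASSUMED_BASE):08x}")
```

A check like this is the reason a sample can look harmless when analyzed with a system clock set before the trigger date.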
Since our process dump contains all memory buffers allocated by untrusted code, this patched code is also available in IDA:
As one can see, the process dump includes all stages of the infection chain. Additionally, every interesting component of the infection is detected and highlighted by the analysis engine, allowing straight-forward, in-depth analysis of this malware sample.
Code packing and other code obfuscation techniques make the static analysis of practically all modern malware variants very difficult. A great way to bypass these tricks is running malware in a dynamic malware analysis environment or debugger. Sadly, many malware variants detect these analysis environments and refuse to execute (correctly).
The Lastline high-resolution analysis engine combines the best of both worlds by bypassing dynamic evasion attempts to enforce execution inside the LLama full-system analysis engine. At the same time, it exposes unpacked process dumps that can be used with a wide range of analysis tools.
For analyzing these dumps in IDA Pro, we provide means to load a process dump as if no code obfuscation technique were present: We reconstruct import tables, strip uninteresting code regions, and expose all allocated memory for analysis. This allows analysts to get an in-depth look into sophisticated APT malware in basically no time.