Over recent years, we have seen a rapid evolution of security products. Whenever a new technology is introduced, it tackles shortcomings of its predecessor, but also faces new challenges as attackers adapt to the changing security landscape. Just to give a few examples: firewalls were (and still are) great at blocking incoming attacks, but can trivially be bypassed by client-initiated, drive-by-download exploits; signature-based Anti-Virus systems (AVs) tackle first-generation malware families, but are helpless against sophisticated code packers; code emulators and unpackers help with the latter, but cannot deal with most previously unseen packing algorithms.
The introduction of first-generation sandboxes defeated most code packing techniques, gradually turning the attackers’ attention to sandbox evasion. While modern sandboxes can deal with evasion in many ways, they have to deal with another fundamental problem, namely, limited code coverage. In other words, a sandbox might be perfect at detecting behaviors that are exhibited by malware. However, what if a malware sample is capable of a certain behavior that the sandbox is not executing during the analysis of the program?
In this post, we introduce the concept of dormant functionality, which refers to behavior that is not executed - that is, the functionality is dormant at the time of analysis. We show how this functionality can be used to classify malware, provide examples for different types of dormant code, and discuss how the Lastline solution tackles each of these types.
Dormant Functionality Versus Evasion
Traditional sandbox-based analysis systems rely on observing and classifying behavior exhibited by malware. Thus, they only cover behavior and code paths that are executed during the analysis. Dormant functionality is code that exists in a malware program, and that could be executed under certain conditions, but, for some reason, the execution path invoking the behavior is not taken during analysis.
In a way, the problem of executing additional, interesting code paths is related to tackling evasive malware, but it is an orthogonal problem: evasive code tries to identify the presence of an analysis system, and deliberately skips malicious behavior to avoid being detected. To tackle evasive malware, a sandbox has to conceal itself as well as possible, and, at the same time, mimic an environment that the attacker targets.
Dormant functionality, on the other hand, is an independent problem and exists even when malware is not evasive (or when the analysis system is able to bypass the evasive checks). As a matter of fact, malware often contains dormant code that is not even executed on a real machine, as we will see below. Thus, it is a problem that also exists when observing malware on a real victim machine.
Types of Dormant Functionality
There is a wide range of reasons why a sandbox might not be able to trigger some of the behavior of a malicious program - that is, to turn dormant functionality into actually observed behavior - but the most frequently observed patterns fall into one of the following four categories:
Dependency on External Infrastructure: One of the most fundamental problems when analyzing malware is that the malicious code frequently relies on some sort of external infrastructure, such as a command-and-control (C&C) server, for receiving commands. If this server is not reachable, or if the server does not send specific commands to the executing malware, much of the malware’s functionality will remain dormant.
Dependency on C&C server: without commands from the server, the malicious behavior remains dormant.
There are many reasons for unreachable C&C servers. Sometimes, malware variants are released with hard-coded C&C addresses, and the attackers continuously release new variants when changing their infrastructure. Thus, if a sample is not analyzed within a relatively short amount of time after its release, the C&C server’s address might have changed, and no interesting behavior can be extracted any longer from this specific variant. Interestingly, classifying such a program as benign is - to some extent - correct, as running it at this point in time will do no harm on a real machine.
Nevertheless, analysis systems need to be able to deal with unavailable C&C servers: for example, as a report by Mandiant from 2014 suggests, the attackers known as PLA Unit 61398 would not make use of their C&C infrastructure during lunch breaks and weekends, meaning that programs analyzed at those times might not receive commands from the attackers. Furthermore, some users may have very stringent privacy restrictions and, thus, decide to deploy their analysis systems without any connectivity to the Internet.
Internal Component Dependency: A large fraction of malware families used in APT attacks today are made of multiple components. While each component is designed for a specific task, it might depend on another component to be loaded or running in order to work correctly.
For example, some malware families, such as PlugX, consist of different components for downloading new payload code from the Internet and decrypting it, for injecting the decrypted payload into a remote process, and for the actual payload. Thus, when a network-based sensor captures the download of the payload update, the subsequent analysis of this component may run without the decryption and injection components.
Component dependency: analyzing the download component might reveal the payload, but without injection and decryption components, the malicious behavior remains dormant.
Component dependency can even exist between benign and malicious code: for example, some exploits download malicious code that is saved to disk and injected into another process running on the victim machine. Since this code, usually stored as a library/DLL, is always executed in the context of a specific process, the code might rely on information available only in this environment, such as other libraries being loaded, certain data being stored in memory, or functions being located at specific addresses.
In case the infection is later detected, and all files on the infected machine are sent to the analysis system, the dropped DLL might not execute its malicious functionality correctly, because it is not executed in the expected context.
Missing/Expected Input: As already described above, many malicious samples have a modular structure with multiple components. In many cases, these components require a specific set of arguments to trigger the execution of their behavior.
Re-using the example above, when a library is injected into a running process as part of a successful exploit, this library might be a generic loader providing various behaviors available to the attacker. Thus, the attacker will invoke the desired functionality within the library after the injection by invoking a dedicated function with specific arguments.
Another very frequent pattern that we observe with malware in the wild is that attackers install remote access tools (RATs) on an infected computer. RATs are pieces of software that attackers employ to remotely access or control the infected machine. These RATs can be used as utilities during the attack, for example, to inject libraries into existing processes or drivers into the operating-system kernel, but they do not implement any of the malicious functionality per se.
RAT functionality remains dormant unless explicitly invoked
Very often, RATs are user-friendly command-line programs, but it can be challenging for an automated analysis system to understand which command-line parameters need to be passed to the program to trigger the interesting behavior.
Broken Packers: Most malware found in the wild today is packed with some sort of custom packer to bypass signature-based anti-malware solutions. Interestingly, we frequently see samples that do not work, because they have a buggy packer or use a packer that is not compatible with the malware code.
Packed code remains dormant until packer (successfully) decodes payload
While malware with a broken packer is not a threat to users per se, it is nevertheless useful to flag such a (non-functional) program under analysis as malicious, because it indicates that an attack has been attempted.
Detecting Dormant Functionality in Wild Neutron
A great example to illustrate the different types of dormant functionality is Wild Neutron: this malware family makes heavy use of different components, as described in a recent technical analysis by Kaspersky Lab.
As outlined above, when analyzing individual components or even multiple components without the necessary commands from the attacker, much of the functionality remains dormant. The Lastline solution uses full system emulation, which means that it can see all code that is loaded as part of a subject under analysis - including those code regions that are not executed. As a result, the system can include any functionality with the potential of being executed in its analysis.
Now, you might ask: isn’t “looking at code” exactly the approach that legacy anti-virus (AV) solutions have used for many years? Well, yes and no. Of course, AVs have statically examined code in an attempt to determine if it is good or bad. But the Lastline solution does things a bit differently.
First, our system doesn’t simply look at blocks of code in isolation. Instead, it analyzes code and puts it into the context of the current execution of the program. Even more powerful, it can put a code block into the context of other sandbox runs, where possibly different malware programs did in fact execute this code block (or a similar one). By connecting a block of code with the API functions and system calls that it invoked, the system can learn and extract the expected behavior of that code. That is, our analysis system can identify (malicious) functionality that has been observed in samples analyzed in the past, and, thus, categorize new programs without seeing the actual behavior being executed.
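The idea of linking a code block to the API calls it made in earlier sandbox runs can be illustrated with a small Python sketch. This is not how the Lastline system is implemented, which the post does not describe; the `BehaviorKnowledgeBase` class, the exact-hash `fingerprint` function, and the sample instruction lists are all assumptions made for illustration.

```python
from collections import defaultdict
from hashlib import sha256


def fingerprint(block):
    """Hash a code block given as a list of instruction strings (assumed format)."""
    return sha256("\n".join(block).encode()).hexdigest()


class BehaviorKnowledgeBase:
    """Hypothetical store mapping block fingerprints to API calls seen in past runs."""

    def __init__(self):
        self._apis_by_block = defaultdict(set)

    def record_execution(self, block, api_calls):
        """Remember which API calls an executed block made in an earlier analysis."""
        self._apis_by_block[fingerprint(block)].update(api_calls)

    def expected_behavior(self, block):
        """Return the API calls previously observed for this block, if any."""
        return set(self._apis_by_block.get(fingerprint(block), set()))


# Earlier run: some other sample actually executed this block.
kb = BehaviorKnowledgeBase()
persist_block = ["push ebp", "mov ebp, esp", "call RegSetValueExW", "ret"]
kb.record_execution(persist_block, {"RegSetValueExW"})

# New sample: both blocks are loaded in memory, but neither is executed.
dormant_blocks = [persist_block, ["xor eax, eax", "ret"]]
predicted = [sorted(kb.expected_behavior(b)) for b in dormant_blocks]
```

Here the first dormant block inherits the behavior recorded in the earlier run, while the unknown block yields no prediction. An exact hash is used only to keep the sketch short.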
As a second difference to traditional AV signatures, when checking for dormant functionality, we are not trying to precisely match sequences of bytes or instructions. Instead, we characterize and compare code at a higher, more abstract level. This “fuzzy” matching makes our analysis much more robust against simple obfuscation tricks that malware authors have long used to bypass AV signatures.
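To make the contrast with byte-level signatures concrete, here is a minimal Python sketch of one generic way to compare code at a more abstract level: operands are dropped, and the remaining mnemonic sequences are compared as n-gram sets using Jaccard similarity. The technique, the function names, and the sample blocks are assumptions for illustration only; the post does not specify which abstraction the Lastline system uses.

```python
def normalize(instructions):
    """Keep only the mnemonic of each instruction, dropping registers and constants."""
    return [ins.split()[0].lower() for ins in instructions if ins.strip()]


def ngrams(tokens, n=3):
    """Return the set of consecutive n-token windows."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def similarity(block_a, block_b, n=3):
    """Jaccard similarity of the mnemonic n-gram sets of two code blocks."""
    a = ngrams(normalize(block_a), n)
    b = ngrams(normalize(block_b), n)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


known = ["push ebp", "mov ebp, esp", "xor eax, eax",
         "call CreateFileW", "pop ebp", "ret"]
# Same logic with a different register: a byte-exact signature would miss it.
variant = ["push ebp", "mov ebp, esp", "xor ecx, ecx",
           "call CreateFileW", "pop ebp", "ret"]
unrelated = ["mov eax, 1", "add eax, ebx", "imul eax, eax", "ret"]

score_variant = similarity(known, variant)
score_unrelated = similarity(known, unrelated)
```

The register swap changes the raw bytes but not the normalized form, so the variant still matches the known block, while the unrelated block shares no n-grams with it.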
In the remainder of this post, we provide examples of how the analysis system finds dormant functionality when analyzing Wild Neutron components either independently, or when the malware sample is unable to contact its C&C infrastructure.
Example: Wild Neutron Remote Access Tool
The main payload of the Wild Neutron malware is a RAT, which is dropped and executed after infecting a host. What’s particularly interesting about this tool is the fact that it requires the C&C URL to be passed on the command line.
If the RAT is able to connect to the C&C server, it waits for the attacker to send commands to be performed on the victim machine. Otherwise, or if no valid URL is specified, the RAT simply enters an infinite loop without showing any malicious behavior:
The code above shows how the program validates the provided C&C URL. It checks if a parameter was passed and verifies that it is a valid URL. For this, it decrypts the string “http”, which is stored in encrypted form in the program’s memory, and searches for it in the provided URL. If the check is successful, the C&C information is encrypted using RC4 (with key “Intel Integrated Graphics”) and CryptProtectData, and stored in the registry for later use.
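The validation and encryption steps can be approximated in Python as follows. This is a sketch under stated assumptions, not a reconstruction of the malware's code: the key is assumed to be the ASCII bytes of the string given above, `looks_like_url` and `protect_c2_url` are hypothetical names, and the Windows-only CryptProtectData step and the registry write are omitted.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4; encryption and decryption are the same operation."""
    # Key-scheduling algorithm (KSA).
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA), XORed with the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


# Key string from the analysis; its byte encoding is an assumption.
RC4_KEY = b"Intel Integrated Graphics"


def looks_like_url(argument: str) -> bool:
    """Mirror the described check: search for "http" in the argument."""
    return "http" in argument


def protect_c2_url(argv):
    """Return the RC4-encrypted URL, or None if no valid URL was passed."""
    if len(argv) < 2 or not looks_like_url(argv[1]):
        return None  # the real sample would idle in its loop at this point
    return rc4(RC4_KEY, argv[1].encode())


blob = protect_c2_url(["rat.exe", "http://c2.example/"])
```

Because RC4 is symmetric, an analyst who knows the key can recover the stored URL by applying `rc4` to the blob again, once the CryptProtectData layer has been removed on the infected machine.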
At this point, the malware registers its handlers for the different commands available to the attacker: it decrypts the command names stored in memory (for example "update"), and assigns the callback-function for each command inside a global handler structure:
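As a rough Python analogue of this registration pattern (not the malware's actual code), the sketch below keeps command names only in encrypted form and fills a global handler table at run time. The single-byte XOR scheme, the key, the `exit` command, and the handler bodies are assumptions for illustration; only the `update` command name comes from the analysis.

```python
from typing import Callable, Dict, Optional

XOR_KEY = 0x5A  # hypothetical key; the real sample's string encryption differs


def decrypt(blob: bytes) -> str:
    """Recover a command name from its encrypted form."""
    return bytes(b ^ XOR_KEY for b in blob).decode("ascii")


def encrypt(name: str) -> bytes:
    """Helper used only to build the sample table below."""
    return bytes(b ^ XOR_KEY for b in name.encode("ascii"))


def handle_update(argument: str) -> str:
    return f"update requested: {argument}"


def handle_exit(argument: str) -> str:
    return "exit requested"


# Only encrypted names are stored next to their callbacks.
ENCRYPTED_COMMANDS = [
    (encrypt("update"), handle_update),
    (encrypt("exit"), handle_exit),
]

HANDLERS: Dict[str, Callable[[str], str]] = {}


def register_handlers() -> None:
    """Decrypt each name at run time and store its callback in the global table."""
    for blob, callback in ENCRYPTED_COMMANDS:
        HANDLERS[decrypt(blob)] = callback


def dispatch(command: str, argument: str = "") -> Optional[str]:
    """Look up a received command by name and invoke its callback, if registered."""
    callback = HANDLERS.get(command)
    return callback(argument) if callback else None


register_handlers()
```

Until `dispatch` is called with a command from the server, none of the handlers run, which is why their behavior stays dormant in a standalone analysis.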
Later, when a command is received, it finds the correct callback using its name, and invokes it. This way, the available commands are never stored in plain-text inside the malware binary. Fortunately, the Lastline analysis system automatically provides an analyst with full-process snapshots of the running program, making these decrypted command names easy to find:
Clearly, when performing a standalone analysis of the RAT, a conventional analysis sandbox (just like a human analyst) cannot know a valid, currently active C&C URL, and is thus unable to trigger any of the interesting behavior during the analysis.
On the other hand, by exposing the dormant functionality from within the malware sample, the Lastline solution is able to dive much deeper into the behavior of the program under analysis. This helps classify the malware much more accurately, as can be seen in the Lastline analysis report:
Component MD5: 3E373D39198A2025665AFA87D2D75E7F
Lastline Analysis report: https://user.lastline.com/malscape#/task/ebeaf80d2da84b329b6cec71d3d2b002
Lastline analysis system identifying dormant C&C behavior in Wild Neutron
Example: Wild Neutron Configuration Utility
Wild Neutron contains a utility that allows an attacker to configure the different malware components running on the infected machine. This program takes a number of command-line arguments, and if no such arguments are provided, the tool simply exits, not revealing any suspicious behavior.
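The effect of argument gating on a sandbox run can be shown with a harmless Python sketch. It is not the tool's code: the flag names, function names, and returned strings are placeholders chosen for illustration, and the actions only return descriptions instead of doing anything.

```python
def stop_component() -> str:
    """Placeholder for the action that terminates another component."""
    return "terminate companion component"


def install_provider() -> str:
    """Placeholder for the action that installs a library on the host."""
    return "install security support provider"


# Illustrative flag names mapped to the behavior they unlock.
ACTIONS = {
    "-stop": stop_component,
    "-install": install_provider,
}


def run(argv):
    """Perform only the actions whose flags appear on the command line."""
    return [ACTIONS[arg]() for arg in argv if arg in ACTIONS]


observed_without_args = run([])
observed_with_flag = run(["-stop"])
```

A sandbox that launches the program without arguments corresponds to `run([])`: every action is present in the code, yet none is observed.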
For example, if the argument "-stop" is provided, the tool will terminate one of the other malware components running on the host. A different command-line option allows the attacker to install a malicious wdigestEx.dll library as a Security Support Provider using the AddSecurityPackage function. This allows the malware to intercept critical security operations such as user authentication:
If this program is analyzed independently of this other running process, or without the required command-line arguments, very little behavior can be observed to classify the program as malicious. Nevertheless, the analysis report overview shows that the Lastline solution is able to see the non-executed code-paths inside the malware program, and analyze – and, in turn, detect – the dormant functionality of the malware:
Component MD5: 48319E9166CDA8F605F9DCE36F115BC8
Analysis report: https://user.lastline.com/malscape#/task/cd4bc6ef901c48ca952c40fac6a8696d
Lastline analysis system identifying dormant behavior in Wild Neutron RAT
Security solutions based on dynamic analysis of malware are a powerful means to classify malicious programs, but they have to tackle the problem of limited code coverage. Evasive malware might try to deliberately hide malicious behavior, and dormant functionality might not be triggered at the time of analysis.
The Lastline analysis solution is able to tackle evasive behavior and dormant functionality alike, significantly increasing code coverage and, thus, the amount of behavior it can use to classify a malware program.