Every day, our Lastline sensors observe millions of files that our customers download from the Internet or receive as email attachments. These files are analyzed and, in many cases, executed or opened inside our sandbox. The sandbox is a secure, instrumented analysis environment where we can safely look for interesting behaviors that indicate bad intentions and outright malice.
Every once in a while, we take a step back and look at the malicious behaviors that we have seen. Malware authors always look for new ways to make money, get access to sensitive data, and evade detection. They introduce new behaviors, refine ideas that they have tried in the past, and add tricks to bypass security controls. By looking over the data collected over the last year, we discovered a few interesting trends that show some of the directions that malware authors take. In this research note, we discuss three findings that struck us as interesting and worth reporting. As a fourth bonus item, we also revisit evasive behaviors, something that we have been tracking for many years.
Malware authors want to avoid detection. One possible way to achieve this goal is to make malware programs appear legitimate. And one trick to make an unwanted program appear legitimate is to digitally sign it. Many security products consider signed code to be more trustworthy. Also, when signed applications are run in Windows, certain security warnings are suppressed that could deter victims from running unwanted software.
Digital signatures are used in many different domains to establish trust between entities. In Windows, digital signatures are used within the Authenticode framework, and they associate a program with the identity of its author (or publisher). Digital signatures also guarantee that the code has not been modified or corrupted since it was signed. To sign programs, code authors (publishers) need to obtain a valid certificate from a Certification Authority (CA), which is a trusted third party. This need to obtain a certificate from a trusted third party should make it more difficult for malware authors to sign their code. First, the CA should verify the identity of the publisher, and presumably, it is difficult for malware authors to convince the CA that they represent a genuine business. Second, certificates are not free, and once a certificate is found to be abused for malware, it will be revoked. Revocation removes the benefits for all programs signed with this certificate and, in fact, makes these programs even more suspicious.
Although the certificate process should make it difficult for malware authors to sign their programs, we have seen a noticeable increase in signed programs in our dataset that perform malicious activity. For the results, see Figure 1. The graph shows a significant spike at the beginning of the year and, after a drop back, a gradual increase (overall, the gray trendline points upwards).
Figure 1: Fraction of signed malicious and suspicious samples
We should make clear that most of the observed increase is due to adware or other types of unwanted software (often referred to as PUP – potentially unwanted programs). We don’t find significant use of digital signatures in programs that are flat-out malicious. In our system, we assign a score of 70 or higher to programs that are clearly malicious, while a score between 30 and 70 indicates that the program is suspicious. We saw a significant increase in digital signatures for programs in the range between 30 and 50, much less so for programs with scores closer to 100. This makes intuitive sense, as it might be possible for companies that produce adware or PUP to convince (or trick) CAs to provide them with certificates. There have even been cases in the past where such companies successfully forced AV vendors to remove detections for their software, claiming that their creations are legal. (You can find an index of some of these claims and their dispositions here.) Of course, most users do not want to install PUP. In the best case, such programs are annoying. In the worst case, such software can serve as a gateway to malware; a slippery slope where each “greyware” program in turn installs a slightly more aggressive application, until the victim machine is infected with “proper” malware.
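The score bands mentioned above can be written down as a small sketch. It assumes scores run from 0 to 100, that a score of exactly 70 counts as malicious, and that anything below 30 is labeled "benign"; the function name and that last label are illustrative assumptions, not part of the scoring system itself.

```python
def classify_score(score: int) -> str:
    """Map an analysis score to the bands described in the text.

    Assumptions (hypothetical, for illustration only): scores range
    from 0 to 100, and scores below 30 are labeled "benign".
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 70:
        return "malicious"   # clearly malicious
    if score >= 30:
        return "suspicious"  # includes the 30-50 range where signing increased
    return "benign"
```

Under these assumptions, most of the growth in signed samples falls into the lower half of the "suspicious" band.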
Of course, we asked ourselves whether specific Certification Authorities (CAs) are less strict than others when vetting software publishers. In other words, is there a specific CA that is responsible for most of the signed malware? We found that this was not the case. All popular, well-known CAs are affected, and there is not one specific player that can be singled out.
Modifications of Browser Settings
Our sandbox tracks a wide variety of behaviors that could be interpreted as suspicious or unwanted. These behaviors are typically generic: we are less interested in how a specific malware sample achieves a goal; rather, we want to capture what the malware is doing. This approach is what fulfills the promise of zero-day detection capabilities: we don’t need to have seen a particular piece of code in the past. When it shows unwanted behaviors in our sandbox, we will flag it as bad.
When looking at these behaviors, we observed an interesting increase in the number and fraction of samples that modified browser settings. We track modifications for many important and security-relevant browser configuration files, for all major browsers (Internet Explorer, Chrome, Firefox, Opera, Safari). We also track changes to relevant Windows registry keys that influence the behaviors of these browsers. The increase in the number of samples that change browser settings (and – in gray – the trend) can be seen in Figure 2.
Figure 2: Fraction of samples that modify browser settings
Digging deeper into the data, we found that much of the increase in the observed behaviors was because malware samples changed the browsers’ proxy settings. For Internet Explorer, these modifications can be done by changing the AutoConfigURL registry setting. Looking around a bit, one can find reports of malware that targeted browser proxy settings (and AutoConfigURL in particular) many years ago. (Here's an example from late 2013.)
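To make the AutoConfigURL case concrete, here is a minimal sketch of how a monitor might decide whether an observed registry write touches Internet Explorer's proxy configuration. The event shape (a key path plus a value name) and the function name are hypothetical. Only AutoConfigURL is named in the text; ProxyServer and ProxyEnable are added here as related values under the same Internet Settings key, which is an assumption of this example rather than a description of what the sandbox tracks.

```python
# Registry key (lowercased) that holds per-user proxy configuration.
INTERNET_SETTINGS_KEY = (
    r"software\microsoft\windows\currentversion\internet settings"
)

# AutoConfigURL comes from the text; the other two are assumptions
# added for illustration.
PROXY_VALUE_NAMES = {"autoconfigurl", "proxyserver", "proxyenable"}


def is_proxy_setting_write(key_path: str, value_name: str) -> bool:
    """Return True if a registry write targets a proxy-related value."""
    # Normalize separators and case; registry paths are case-insensitive.
    normalized = key_path.replace("/", "\\").strip("\\").lower()
    return (
        normalized.endswith(INTERNET_SETTINGS_KEY)
        and value_name.lower() in PROXY_VALUE_NAMES
    )
```

A write of AutoConfigURL under `HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings` would match, while the same value name under an unrelated key would not.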
By changing proxy settings, an attacker can redirect all browser traffic, or traffic for selected URLs only, to go through a machine that the attacker controls. By putting himself into the path between the victim and a legitimate website (such as a bank), the attacker can launch man-in-the-middle attacks and crack open traffic even when it is encrypted. This allows miscreants to steal bank credentials or interfere with financial transactions (even when two-factor authentication is used, since the attacker is manipulating an actual transaction that a user carries out). In fact, Brazilian bank Trojans were the first malware to popularize the trick of tampering with browser proxy settings.
While the basic technique of tampering with proxy settings is certainly not novel, we found the significant increase that we observed in our data troublesome. It seems that man-in-the-middle attacks are not necessarily limited to banking malware anymore. Instead, as encrypted web traffic (HTTPS) becomes ubiquitous, malware authors increasingly include components into their programs that allow them to hijack browser traffic.
There are two main approaches that attackers use to gain unauthorized access to systems and data. One approach abuses the fact that many systems have vulnerabilities. These vulnerabilities can range from configuration flaws to software bugs. When an attacker finds such a vulnerability, he can write an exploit that takes advantage of the problem and might gain direct access to the data or can force the victim system to execute commands or code that are not intended. The second approach does not exploit a weakness in the system but targets users instead. Specifically, attackers can trick users into revealing their credentials (e.g., in a phishing attack), or they can rely on the fact that users choose poor passwords. In the latter case, credentials might simply be guessed.
Launching a password guessing attack is a tried and true technique to break into remote systems or escalate local privileges. The malware repeatedly invokes a local authentication procedure or repeatedly sends requests to a remote login function, using well-known username and password combinations. When a target uses a weak combination of well-known, easy-to-guess username and password, then the malicious code gains access.
In our sandbox, we use a combination of techniques to recognize password guessing (or brute forcing) attempts. One set of techniques looks at the behavior of the program. Does it repeatedly connect to services, sending similar requests that only vary in what looks like credential sets? Does the program repeatedly invoke certain Windows authentication APIs? Does the program connect to network-local services that are used to manage and elevate privileges?
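The first behavioral question above (repeated requests that differ only in the credentials they carry) can be sketched as a simple counting heuristic. This is an illustration under stated assumptions, not the sandbox's detection logic: the `LoginAttempt` record, the function name, and the threshold of five distinct credential pairs are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class LoginAttempt(NamedTuple):
    """One observed authentication attempt (hypothetical record shape)."""
    target: str    # service the sample talked to, e.g. "10.0.0.5:22"
    username: str
    password: str


def find_guessing_targets(
    attempts: Iterable[LoginAttempt], min_distinct: int = 5
) -> Set[str]:
    """Return targets that received many distinct credential pairs.

    A legitimate client typically retries with one or two credential
    sets; a guessing attack cycles through many. The threshold is an
    arbitrary value chosen for illustration.
    """
    creds_per_target: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for attempt in attempts:
        creds_per_target[attempt.target].add(
            (attempt.username, attempt.password)
        )
    return {
        target
        for target, creds in creds_per_target.items()
        if len(creds) >= min_distinct
    }
```

Counting distinct pairs, rather than raw attempts, keeps a client that retries the same password a few times from being flagged.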
Another complementary set of techniques to detect password guessing focuses on in-memory analysis. Specifically, our sandbox statically examines the memory snapshot of a program at different times. When our analysis finds code blocks or data blocks that indicate password guessing capabilities, an appropriate detection event is generated. This second set of techniques is interesting because it separates our detection capabilities from those of many other sandboxes that are limited to the observation of interactions between the program code and the operating system, or program code and library functions (using API or system call hooks). With our full system emulation (FUSE) approach, we can not only see every individual instruction that the malware executes, we also have full visibility into the memory of the process. This enables us to peek at code and data blocks that are not executed, broadening our range of malicious capabilities that we can discover.
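As a simplified illustration of spotting data blocks that indicate password guessing, the sketch below scans a raw memory snapshot for several entries of a small list of commonly guessed passwords, in both ASCII and UTF-16LE encodings. The word list, the function names, and the threshold are assumptions made for this example; the actual analysis described above is not specified at this level of detail.

```python
from typing import Sequence

# Tiny illustrative list of commonly guessed passwords (an assumption).
COMMON_PASSWORDS = (
    b"123456", b"password", b"admin", b"qwerty", b"letmein", b"root",
)


def count_wordlist_hits(
    snapshot: bytes, wordlist: Sequence[bytes] = COMMON_PASSWORDS
) -> int:
    """Count how many distinct wordlist entries occur in the snapshot."""
    hits = 0
    for word in wordlist:
        # Windows programs often store strings as UTF-16LE, so check both.
        wide = word.decode("ascii").encode("utf-16-le")
        if word in snapshot or wide in snapshot:
            hits += 1
    return hits


def looks_like_password_list(snapshot: bytes, min_hits: int = 4) -> bool:
    """Flag snapshots that contain several common passwords at once."""
    return count_wordlist_hits(snapshot) >= min_hits
```

Requiring several distinct hits reduces false positives from a single common word such as "admin" appearing in ordinary program strings. Because the scan works on memory contents rather than on observed calls, it can flag an embedded password list even if the guessing code never runs during analysis.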
Figure 3: Fraction of samples that perform (password) brute force activity
Password brute force attacks are certainly not new. However, the growth of password guessing capabilities in malware samples is quite relevant, as shown in Figure 3 (notice also the gray trendline). We can speculate about the reasons for this increase. One possible explanation is that services (such as Windows itself, but also important applications) get increasingly hardened against exploits. That is, it becomes harder and harder for attackers to exploit vulnerabilities in important software components. Hence, it is necessary to rely more on social engineering and password guessing. Another explanation is that malware increasingly targets sets of devices that are known to have weak passwords and that have not been scrutinized as carefully in the past. By these devices, we mean what is often described as the Internet-of-Things. This includes home routers and web cameras, but also devices that were not connected to the Internet at all in the past (such as thermostats or fridges). Many of these devices are built with little security in mind, and they have weak default passwords. Malware has started to target these IoT devices, and one simple way is via password guessing attacks. Finally, a third reason is that certain prominent malware families spread by injecting malicious code into websites that run old, unmaintained versions of content management system (CMS) software; WordPress has been a particularly popular target.
As dynamic analysis systems (sandboxes) have become more popular, malware authors have responded by devising evasive techniques to ensure that their programs do not reveal any malicious activity when executed in such an automated analysis environment. Clearly, when malware does not show any unwanted activity during analysis, no detection is possible. Malware authors have utilized simple evasion tricks for many years. Last year, we reported that evasive malware has become mainstream. We found that a majority of samples used evasive techniques, and many samples employed several such techniques in parallel to increase their chances of remaining undetected in a sandbox. At this point, we cannot expect substantial increases anymore; evasive behaviors have simply become too common, and it is no longer possible for them to double yet again. Nonetheless, we wanted to understand whether widespread evasion was here to stay. Not surprisingly, the answer is yes. In 2015, we saw the continuation of the trend that we explored in a previous report for 2014.
Figure 4: Fraction of samples with evasive capabilities
Figure 4 shows the fraction of malware samples in our dataset that use one or more evasive techniques. As expected, we can see that this fraction stays high throughout the year. There are some swings in this curve, which are caused by different waves of popular malware samples. In some cases, a particular family does not have aggressive evasion capabilities, so the graph dips down. However, such malware samples are quickly replaced by waves of families that do make use of evasion.
We have seen the full range of evasion behaviors. Malware samples check for the presence of virtualized environments and look for artifacts (files, processes, library hooks) that reveal specific sandboxes. More advanced APT actors have continued to push code into the kernel, a space that traditional sandboxes cannot cover. And finally, stalling code has long replaced invocations of the sleep API function as a weapon to trick a sandbox into timing out execution before any malicious activity is observed.