The pyramid of pain in threat hunting

Posted by Felix Weyne, October 2016.
Author contact: Twitter | LinkedIn
Tags: threat hunting, pyramid of pain, indicator of compromise, network signature, host signature, locky

Threat hunting is the activity in which a security analyst actively goes looking for threats, rather than waiting for a defense product to pick up on them. Threat hunters combine threat intelligence, analytics and human insight while scouting for and chasing down bad actors and malicious activity on the corporate network they need to defend. Threat intelligence comes in various forms; the usefulness of each form can be visualised in the so-called "threat intelligence pyramid of pain".

Image 1: threat intelligence pyramid of pain. Source: David Bianco's personal blog.

The higher you are on the pyramid, the more potential your intel has, and the more resources your adversaries have to expend. For example, changing the hash of a file is trivial for an attacker: flipping a single bit or appending a single byte is enough. Changing the way malware communicates with its command and control (C2) server is more annoying: the malware has to be partially rewritten, which comes at a cost. In this blog I'll discuss four layers of the pyramid using a real-world scenario: Locky ransomware attacks in the summer of 2016. I'll discuss the usefulness of Indicators of Compromise (IOCs) (hashes, IP addresses and domain names) and the usefulness of network and host intrusion detection signatures against three Locky ransomware waves.
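To make the "trivial to change" claim concrete, here is a minimal Python sketch. The byte string is a made-up stand-in for a payload, not a real sample, and the helper names are my own. Flipping a bit inside a real executable may break it, which is why appending data is the more common trick.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def flip_lowest_bit_of_last_byte(data: bytes) -> bytes:
    """Return a copy of data in which exactly one bit (in the last byte) is flipped."""
    if not data:
        raise ValueError("need at least one byte to flip a bit")
    return data[:-1] + bytes([data[-1] ^ 0x01])


# Stand-in bytes for a payload (assumption: any byte string illustrates the point).
original = b"MZ" + b"\x00" * 62
appended = original + b"\x00"                       # one extra byte at the end
flipped = flip_lowest_bit_of_last_byte(original)    # same length, one bit differs

for label, blob in (("original", original), ("appended", appended), ("flipped", flipped)):
    print(f"{label:9s} {sha256_hex(blob)}")
```

All three digests differ, so a hash IOC for the original no longer matches either variant.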

The Locky samples were collected from a mail honeypot: I ran a batch of the malicious mail attachments it had received in a sandbox environment (image two). I then entered the IPs and domains contacted from within the sandbox into threatcrowd.org to find additional samples related to the same spam wave (image three).

Image 2: running malicious mail attachments in sandbox to quickly grab some dropped files, contacted domains and IPs

Image 3: using threatcrowd to find additional samples

In total I collected as many samples as possible from three Locky spam email waves:
-Wave 1: 25th July 2016 - phishing mail with subject "photo", attachment is a JS file that drops a Locky executable.
-Wave 2: 15th August 2016 - phishing mail with subject "order confirmation", attachment is a DOCM file that drops a Locky executable.
-Wave 3: 16th August 2016 - phishing mail with no subject, attachment is a WSF file that drops a Locky executable.

Hashes, domains and IPs

An overview of the payload domains, contacted IPs and hashes of the three Locky malware spam waves can be found on this pastebin; a summary is displayed in image four. Samples can be found here (password=infected).

The three spam waves all had different dropper types: a malicious Office document, a malicious JScript file and a malicious Windows Script File. Each dropper had a unique hash, mainly because only a few characters in the scripts were altered (i.e. no different methods or obfuscation techniques were used inside the same spam wave). Each spam wave used on average about twenty-five unique, new domains to host the payload. In wave one, eight unique C&C IPs were identified. Waves two and three were sent out on two consecutive days, and the C&C IPs didn't change much between those waves. In total, about one hundred unique IOCs can be linked to the three spam waves.

Image 4: subset of nearly 100 IOCs regarding Locky payloads (three SPAM waves)

A possible hunting strategy could be to throw all the domain, IP and hash IOCs on your sensors (proxy logs, workstation logs, ...) and see what sticks. A hit on a hash will probably give you a true positive with a high confidence rate. Hits on IPs may lead to a lot of false positives because of shared hosted websites.
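A minimal sketch of such a sweep in Python is shown below. The event dictionaries, the field names and the indicator values are all placeholders (reserved documentation domains and IPs, not real Locky IOCs). The "medium" confidence for domain hits is my own assumption; the high and low labels follow the reasoning above.

```python
# Confidence per indicator type. "high" for hashes and "low" for IPs follow the
# reasoning above; "medium" for domains is an assumption of this sketch.
CONFIDENCE = {"hash": "high", "domain": "medium", "ip": "low"}


def sweep(events, iocs):
    """Return (kind, value, confidence) for every event field found in the IOC sets."""
    hits = []
    for event in events:
        for kind in ("hash", "domain", "ip"):
            value = event.get(kind)
            if value is not None and value.lower() in iocs[kind]:
                hits.append((kind, value.lower(), CONFIDENCE[kind]))
    return hits


# Placeholder indicators (reserved documentation values, not real Locky IOCs).
EXAMPLE_IOCS = {
    "hash": {"0" * 64},
    "domain": {"payload.example.com"},
    "ip": {"192.0.2.10"},
}

# Hypothetical, already-parsed log entries.
EXAMPLE_EVENTS = [
    {"domain": "Payload.Example.com", "ip": "198.51.100.7"},  # proxy log entry
    {"ip": "192.0.2.10"},                                      # proxy log entry
    {"hash": "f" * 64},                                        # workstation log entry
]

for hit in sweep(EXAMPLE_EVENTS, EXAMPLE_IOCS):
    print(hit)
```

In practice the IOC sets would be loaded from a feed or a paste, and the events would come from your proxy and workstation logging.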

The advantage of the "wacky wall walker approach" is that it is quick and relatively easy to implement. The disadvantage of this method is that it gives your adversaries a good laugh: the hashes, domains and IPs related to a malware family change very rapidly. The perfect example of rapidly changing IOCs is the Locky use case: every wave had unique droppers, new payloads and new payload domains. It's interesting to notice that the payloads didn't have a lot of unique hashes, although it must be mentioned that there have been Locky spam waves where every payload had a unique hash, which leads to funny tweets:

the #locky exes on these affid=1 drops... all diff hashes.https://t.co/uNQWoSlNL3 pic.twitter.com/fTgareCfa9
- Techhelplist (@Techhelplistcom) 11 May 2016

Overall, hunting via domains, IPs and hashes is a good "pre-hunting warm-up routine", but to really combat threats effectively, you need to bring some heuristics to the table. The next section discusses some example heuristics.

Network and Host artifacts

The previous section showed that the hashes, domains and IPs related to the Locky malware family changed very rapidly. This finding is in line with the theory of the pyramid of pain: hashes, domains and IPs are relatively easy and inexpensive for an adversary to change. In this section we will try to find some threat intelligence (more specifically: network and host artifacts) that doesn't change as frequently. The goal is thus to find some hunting rules which apply to ALL three of the Locky spam waves, and which may indicate a Locky infection. The hunting rules will be grouped by host artifacts (artifacts you can find in workstation logging such as Sysmon) and network artifacts (artifacts you can find in web proxy logging or with a network intrusion detection system such as Snort). For each group, we will discuss both the droppers (WSF/JS script and weaponized Word document) and the payloads (Locky ransomware executable).

Host artifacts

If we run the samples of the three SPAM waves in a sandbox with Sysmon installed, we can see a pattern in the parent-child process trees of the droppers and payloads. Additionally, the command line arguments of the Locky payloads are almost identical throughout the three waves. Let's start by looking at the droppers sent via mail (highlighted in green, image five).

The workstation logging pattern for the droppers is pretty self-explanatory: either a zipped archive or a macro-enabled document is opened from the temporary Outlook directory (assuming that the user opens the malicious attachment straight from Outlook). After opening the attachment, either winword or wscript launches an executable in the temporary folder. (Note that in the case of the zipped WSF and JS droppers, an additional extraction process takes place.) A file sitting in the temporary Outlook folder launching an executable in the temporary Windows folder must raise at least some suspicion. This behaviour does not indicate a Locky infection with a high degree of certainty though, which is why we'll also take a look at the system logging regarding the called executable in the temp folder.

Image 5: parent-child process logging on workstation running Locky dropper and payload

Comparing the logging patterns of the Locky executable (highlighted in orange, image five), we can see that the Locky executable always deletes itself in the same way (cmd.exe /c del *parent process executable in temp folder*). Before Locky deletes itself, it opens the ransom note on the victim's desktop with the default browser. In each of the three spam waves the location and name of the ransom notes were identical. The following workstation threat rules can thus be defined for the droppers and payloads:

Host Locky dropper threat rule:
-Child process: 'WScript.exe %TEMP%\Rar*\*.js|*.wsf', Parent process: 'Winrar.exe %OUTLOOKTEMP%\*\*.zip' (AND)
Child process: '%TEMP%\*.exe', Parent process: 'WScript.exe %TEMP%\Rar*\*.js|*.wsf'
-Child process: '%TEMP%\*.exe', Parent process: 'WINWORD /n %OUTLOOKTEMP%\*\*.doc*'

Host Locky payload threat rule:
-Child process: 'cmd.exe /C del /Q /F %TEMP%\*.exe|*.tmp', Parent process: '%TEMP%\*.exe' (AND)
-Child process: 'iexplore.exe c:\users\victim\Desktop\_HELP_instructions.html', Parent process: '%TEMP%\*.exe'
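
As an illustration, the payload rule could be evaluated over parent/child command line pairs roughly as follows. This is a sketch under assumptions: the events are already extracted as (parent command line, child command line) tuples, %TEMP% is expanded to a wildcard path of my choosing, and any browser opening the ransom note counts, not only iexplore.exe. Real Sysmon command lines may contain quoting that needs normalising first.

```python
from fnmatch import fnmatchcase

# Assumed expansion of %TEMP%; adjust to the profile layout in your environment.
TEMP = r"*\appdata\local\temp"

TEMP_EXE = [TEMP + r"\*.exe"]
SELF_DELETE = [
    r"*cmd.exe /c del /q /f " + TEMP + r"\*.exe",
    r"*cmd.exe /c del /q /f " + TEMP + r"\*.tmp",
]
RANSOM_NOTE = [r"*\desktop\_help_instructions.html"]


def matches_any(command_line, patterns):
    """Case-insensitive wildcard match of a command line against a pattern list."""
    lowered = command_line.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in patterns)


def locky_payload_parents(process_events):
    """Return the parent command lines that satisfy both payload conditions.

    process_events is an iterable of (parent_command_line, child_command_line).
    """
    deleted_itself, opened_note = set(), set()
    for parent, child in process_events:
        if not matches_any(parent, TEMP_EXE):
            continue
        if matches_any(child, SELF_DELETE):
            deleted_itself.add(parent.lower())
        if matches_any(child, RANSOM_NOTE):
            opened_note.add(parent.lower())
    return deleted_itself & opened_note
```

Requiring both child processes under the same parent mirrors the (AND) in the rule and keeps a lone "cmd.exe /c del" from triggering a hit.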

Contrary to the rapidly changing hashes, domains and IPs, the above host artifacts did not change throughout the three Locky SPAM waves. This finding is also in line with the theory of the pyramid of pain: hashes of payloads can easily be changed by altering a bit or by using different packers (as explained in this blog). Changing logic component blocks takes more effort, and while malware authors certainly change these too, they do not change as fast as hashes, domains and IPs.

Network artifacts

The common characteristic of the traffic generated when the droppers download the payload (shown in image six) is the use of user agents which may not be valid in your environment. The user agent of the default browser in the environment (Internet Explorer) is never used by the droppers. Instead, either no user agent is explicitly defined (in which case the user agent of the script engine is used), or a very odd user agent is hardcoded in the dropper (one which seems to mimic Internet Explorer 6 in combination with Windows 2000). We can also establish that the URIs of the payload look random and do not possess an extension (e.g. "7h8gbiuomp", "HJ6bhGHV" and "nJHbj0266b"). Analytics on the URIs, e.g. a Markov language model, can flag this randomness as an additional indication of a possible Locky payload download.

Image 6: user agents used by the droppers downloading the Locky payload

Network Locky dropper threat rule:
-Invalid user agent (UA) in your environment (OR)
-UA: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) (OR)
-UA: Mozilla/4.0 (compatible; MSIE 7.0; ...; Media Center PC 6.0; ...) (AND)
-HTTP URI: [a-zA-Z0-9]{6,12} + Markov language model: suspicious word
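
A rough Python sketch of this rule follows. I read the rule as "(any of the user agent conditions) AND the URI condition", which is an interpretation. The allowlist is hypothetical, the elided parts of the MSIE 7.0 user agent are left unmatched rather than guessed, and the Markov scoring step is left out here.

```python
import re

# Hypothetical allowlist: replace with the user agents that are valid in your environment.
ALLOWED_USER_AGENTS = {
    "Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko",
}

OLD_IE6_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

# Extension-less, alphanumeric path of 6 to 12 characters after the leading slash.
URI_SHAPE = re.compile(r"/[a-zA-Z0-9]{6,12}")


def suspicious_user_agent(user_agent, allowed=ALLOWED_USER_AGENTS):
    """True for the two hardcoded dropper user agents or anything not on the allowlist."""
    if user_agent == OLD_IE6_UA:
        return True
    if (user_agent.startswith("Mozilla/4.0 (compatible; MSIE 7.0;")
            and "Media Center PC 6.0" in user_agent):
        return True
    return user_agent not in allowed


def matches_dropper_rule(user_agent, uri_path, allowed=ALLOWED_USER_AGENTS):
    """User agent condition AND URI shape condition (Markov scoring not included)."""
    return (suspicious_user_agent(user_agent, allowed)
            and URI_SHAPE.fullmatch(uri_path) is not None)


print(matches_dropper_rule(OLD_IE6_UA, "/HJ6bhGHV"))
print(matches_dropper_rule(OLD_IE6_UA, "/index.html"))
```

The URI shape alone matches plenty of benign paths, so it is only useful in combination with the user agent check and a randomness score.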

There is also a pattern in the C&C traffic of the three waves of Locky payloads. Locky first tries to send its encryption keys and infection statistics to a few hardcoded IPs, using an HTTP POST request. The request does not contact a domain; instead it posts directly to an IP (image 7). A direct POST to an IP, in combination with a short URI (e.g. "php/upload.php"), makes this kind of traffic distinguishable from "regular traffic". A similar traffic pattern is also defined in the following rule in Proofpoint's Emerging Threats ruleset: "ET TROJAN Win32/Necurs Common POST Header Structure (trojan.rules)". When Locky cannot contact the hardcoded IPs, it falls back on a domain generation algorithm (DGA). The domains generated by the DGA (e.g. vpicxnklv.biz, nyhxevjuevr.click) may also be discovered in proxy logging via a Markov language model.
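
To give an idea of what such a Markov language model looks like, here is a deliberately tiny character-bigram sketch in Python. The training word list, the floor probability for unseen transitions and the function names are all assumptions of mine. A usable model would be trained on a large corpus of legitimate words or domain names, and would score only the label in front of the TLD.

```python
import math
from collections import Counter, defaultdict

# Assumed floor probability for character transitions never seen in training.
UNSEEN_PROBABILITY = 1e-6

# Toy corpus, only to keep the sketch self-contained.
TRAINING_WORDS = ["mail", "login", "secure", "update",
                  "download", "service", "account", "online"]


def train_bigram_model(words):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(Counter)
    for word in words:
        word = word.lower()
        for current, following in zip(word, word[1:]):
            counts[current][following] += 1
    return counts


def mean_log_likelihood(model, text):
    """Average log-probability per character transition; lower means less word-like."""
    text = text.lower()
    pairs = list(zip(text, text[1:]))
    if not pairs:
        return 0.0
    total = 0.0
    for current, following in pairs:
        row = model.get(current)
        seen = row[following] if row else 0
        probability = seen / sum(row.values()) if seen else UNSEEN_PROBABILITY
        total += math.log(probability)
    return total / len(pairs)


model = train_bigram_model(TRAINING_WORDS)
for label in ("service", "vpicxnklv", "nyhxevjuevr"):
    print(label, round(mean_log_likelihood(model, label), 2))
```

With this toy corpus the two DGA labels score far lower than a word the model has seen. Where to put the alerting threshold depends entirely on the training data.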

Image 7: command and control traffic of Locky payloads (two waves)

Network Locky payload threat rule:
-HTTP host equals: (?:\d{1,3}\.){3}\d{1,3} (AND)
-HTTP method equals: POST (AND)
-HTTP URI equals: [a-zA-Z0-9\-\.\/_]{4,25}\.php
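
Expressed in Python, the three conditions could be checked as in the sketch below. The function name and the sample values (a reserved documentation IP) are placeholders; the dot in front of "php" is escaped so that it only matches a literal dot.

```python
import re

# Host header that is a bare dotted-quad IP address instead of a domain name.
IPV4_HOST = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")

# Short URI ending in a literal ".php".
PHP_URI = re.compile(r"[a-zA-Z0-9\-\./_]{4,25}\.php")


def matches_c2_rule(method, host, uri):
    """True when a request is a POST straight to an IP with a short .php URI."""
    return (method.upper() == "POST"
            and IPV4_HOST.fullmatch(host) is not None
            and PHP_URI.fullmatch(uri) is not None)


print(matches_c2_rule("POST", "192.0.2.1", "/php/upload.php"))
print(matches_c2_rule("POST", "example.com", "/php/upload.php"))
```

Using fullmatch rather than search keeps the "equals" semantics of the rule: the whole host and the whole URI must fit the pattern.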

Addendum

Edit: If we look back at ten additional Locky SPAM waves, taking place between late May and the end of August, it can be confirmed that the above threat rules remained valid for at least three months. This further supports the pyramid of pain theory in the sense that hashes, IPs and domains change every SPAM wave, but network and host artifacts do not. Samples of these waves can be found here.

Image 8: example network artifacts on ten additional Locky waves

References

Pyramid of Pain: Intel-Driven Detection/Response to Increase Adversary's Cost (D. Bianco)
Markov models: detecting malware through language recognition
Locky SPAM wave 1, 2 and 3: IOCs (pastebin)
dynamoo blog Locky SPAM wave 1
my online security blog Locky SPAM wave 2 | wave 3
