An Integrated Workflow: Investigating and Remediating a Mass-Malware Infection

Posted on 07.28.16 — by Jim Wojno

I recently had the opportunity to help a customer use Tanium to investigate and respond to an outbreak of the “Ponik” malware. Ponik is a downloader that can retrieve and install additional malware, as well as steal credentials, from an infected system. Though Ponik is an example of commodity mass-malware, it presented a good opportunity to demonstrate how Tanium provides an easy, integrated workflow for scoping, investigating, and remediating an incident. In this post, we’ll look specifically at the integration and hand-off points between the different components of the Tanium platform, including Trace, IOC Detect, and Incident Response content.

Why go to all this effort for mass malware? In truth, an analyst might use Tanium to immediately clean up a commodity infection rather than spend the extra effort on an investigation. However, we’ve also seen targeted attackers pay for access to mass-malware platforms, as well as other circumstances where the root cause of an infection or its attribution may be murky. Practicing on a known sample is also a great way to help get an IR team ready for the “real thing” should the need arise!

Initial Scoping and Quarantine

The customer started with the following basic information about their infection:

  • The malware dropper installs a randomly named executable file within the infected user’s “\AppData\Roaming\” directories.
  • The malware establishes persistence by adding a value with the same random name to the “HKEY_USERS\[SID]\SOFTWARE\Microsoft\Windows\CurrentVersion\Run” registry key.
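The two indicators above can be combined into a single check. The following is a minimal Python sketch, not Tanium sensor code: it assumes Run key entries have already been collected as (value name, value data) pairs, and the function names and sample rows are hypothetical.

```python
import ntpath
import re

ROAMING_MARKER = "\\appdata\\roaming\\"


def extract_exe_path(value_data: str) -> str:
    """Return the executable path from a Run value, dropping quotes and arguments."""
    data = value_data.strip()
    if data.startswith('"'):
        end = data.find('"', 1)
        return data[1:end] if end != -1 else data[1:]
    # Unquoted data: keep everything up to the first ".exe"
    match = re.match(r"(.+?\.exe)\b", data, re.IGNORECASE)
    return match.group(1) if match else data


def matches_infection_pattern(value_name: str, value_data: str) -> bool:
    """True when the executable lives under AppData\\Roaming and shares the value's name."""
    exe_path = extract_exe_path(value_data)
    in_roaming = ROAMING_MARKER in exe_path.lower()
    stem = ntpath.splitext(ntpath.basename(exe_path))[0]
    return in_roaming and stem.lower() == value_name.lower()


# Illustrative rows only; these are not the customer's data.
sample_entries = [
    ("ygwea", r"C:\Users\administrator\AppData\Roaming\ygwea.exe"),
    ("OneDrive", r'"C:\Users\bob\AppData\Local\Microsoft\OneDrive\OneDrive.exe" /background'),
]
suspicious = [name for name, data in sample_entries if matches_infection_pattern(name, data)]
```

Requiring both conditions keeps the check narrow. Many legitimate programs run from the Roaming profile, so matching on the directory alone would likely be noisy.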

The first objective is to use Tanium’s incident response content to quickly identify infected systems. In prior blog posts, we’ve highlighted sensors like “Get Autoruns”, but in this case we can use a simple “Get Run Keys” or “Get Registry Key Value Names with Data” question to compare the registry run keys across all machines. The output of this query is shown below:


That identified at least one variant of the malware running out of “C:\Users\administrator\AppData\Roaming\”. By using “Get Running Processes with MD5 Hash” we can enumerate every running process in the enterprise to find matches based on hash, path, or name.


We searched for the name “ygwea” referenced in one of the discovered autorun keys; however, we could just as easily have filtered based on the directory (for example, to compare what’s running out of paths containing “\appdata\roaming\”) or hash.
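To make the three filter options concrete, here is a small Python sketch of the same idea applied to process records that have already been collected. It does not use any Tanium API; the `ProcessRecord` type, the `find_matches` helper, the extra host names, and the placeholder hash values are all assumptions made for illustration.

```python
import ntpath
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ProcessRecord:
    host: str
    path: str
    md5: str


def find_matches(
    records: Iterable[ProcessRecord],
    name: Optional[str] = None,
    path_fragment: Optional[str] = None,
    md5: Optional[str] = None,
) -> List[ProcessRecord]:
    """Return records that satisfy ANY supplied criterion (case-insensitive)."""
    hits = []
    for rec in records:
        path = rec.path.lower()
        by_name = name is not None and name.lower() in ntpath.basename(path)
        by_path = path_fragment is not None and path_fragment.lower() in path
        by_hash = md5 is not None and md5.lower() == rec.md5.lower()
        if by_name or by_path or by_hash:
            hits.append(rec)
    return hits


# Placeholder records; the hash strings are dummies, not real sample hashes.
records = [
    ProcessRecord("win7x64", r"C:\Users\administrator\AppData\Roaming\ygwea.exe", "a" * 32),
    ProcessRecord("host-b", r"C:\Windows\System32\svchost.exe", "b" * 32),
    ProcessRecord("host-c", r"C:\Users\carol\AppData\Roaming\renamed.exe", "a" * 32),
]
by_name = find_matches(records, name="ygwea")
by_hash = find_matches(records, md5="a" * 32)
```

In this sketch the hash filter also catches the renamed copy on the hypothetical "host-c", which a name search alone would miss. That is one reason to try more than one criterion when scoping.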

We could go on with dozens of other sensor searches to look for other types of autorun entries, dormant files on disk, historical process or network activity, etc. But since we’ve already got at least one impacted system, let’s proceed to run a drill-down query to identify its computer name and IP address:


The resulting host name in our test lab is “win7x64”. To mitigate the risk of the system communicating with its command and control (C2) infrastructure during our analysis, we can click “Deploy Action” and allow Tanium to enforce a temporary network quarantine. This can prevent data loss, lateral movement and further damage to the enterprise, while still maintaining connectivity to Tanium.


Tanium’s Quarantine capability is an example of an Action that can be deployed to change the state of a system – based on any target criteria – and on a one-time or recurring basis. Later in this post, we’ll talk about additional Actions that can help with incident remediation. But first, let’s dig deeper and explore this infected system in Trace to gather additional evidence of compromise.

Analyzing the Infection in Trace

Everything discussed to this point can be used to investigate active intrusions in real time. Tanium Trace expands this capability by adding historical context and telemetry, allowing us to search and examine short-lived activity that would otherwise be difficult to recover from available forensic evidence. Trace can help identify Patient Zero by pinpointing the endpoint, time, user and other activity surrounding the initial infection. This can be helpful to understand how to remediate and avoid similar incidents moving forward.

We started our analysis of the infected host in Trace by searching for the historical execution of the malicious process “ygwea.exe”, as identified by our previous enterprise-wide searches. A screenshot of the resulting process details is shown below:

The process tree illustrates that the malware downloader, “dloader-ponik-sample.exe”, created two child processes: “ygwea.exe” and “cmd.exe”. Scrolling through the recorded events in the Detailed Process History below, we can see all file, network, registry, and child process operations performed by the downloader and its sub-processes during the initial stages of infection. The excerpt in the screenshot below shows the malware establishing its persistence mechanism by adding a value to the Run key within a user’s registry hive.


The question mark icon at the end of each row permits a quick pivot to an enterprise-wide search of the selected artifact. This allows analysts to easily switch between deep-dive analysis of an individual system and further attempts to scope the impact of an incident across an environment.


Analysts can also click the “Add Evidence” button to preserve any finding identified in Trace. Tanium retains saved evidence with context; for example, a network connection event will be preserved with the accompanying metadata of the process and user responsible for the activity. This can help team members collaborate on findings or easily produce investigation reports following the completion of their work. An example of Saved Evidence from this infected system is shown below:


Automating Detection with IOCs

The Trace Saved Evidence feature also provides the ability to generate Indicators of Compromise (IOCs). Rather than requiring that teams manually exchange findings in documents, spreadsheets, or through the use of cumbersome IOC editors, this feature lets users quickly turn the results of forensic analysis into indicators that Tanium IOC Detect can use for automated enterprise-wide searches. The screen capture below shows an example of a simple indicator generated from saved evidence.


Once this indicator is saved, it’s sent off to Tanium IOC Detect. Users can configure automated, recurring detections against any groups of systems for any sets of IOCs – be they from Trace analysis, external threat feeds, or manually imported indicators or Yara rules. In the example shown below, we’ve identified several hosts containing artifacts related to the malware.


Note that these hits include a Prefetch file, identified since one of the IOC terms matched part of its filename. The hit was prefixed with “[index]”, indicating the file was found via Tanium’s disk indexing feature. Index allows Tanium searches and IOC detections to find files anywhere on disk by name, path, hash, or header bytes within seconds – even if they aren’t running in memory or haven’t recently changed – and without the need for cumbersome and I/O intensive crawls through the hard drive.

When combined with sensors for live activity and the data in Trace, this means that Tanium can evaluate IOCs against current-state, historical, and at-rest evidence across hundreds of thousands of systems in a few minutes, with minimal endpoint impact. Specific to this scenario, that can help ensure that we find both active and dormant infections – even if they happened before Tanium was deployed to an environment.
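As a rough illustration of evaluating one set of IOC terms against several evidence sources, the Python sketch below tags each hit with the source it came from, in the spirit of the “[index]” prefix mentioned above. It is not how IOC Detect is implemented: the `evaluate_terms` helper, the “process” label, and the sample paths (including the Prefetch hash suffixes) are hypothetical.

```python
from typing import Dict, Iterable, List


def evaluate_terms(terms: Iterable[str], sources: Dict[str, Iterable[str]]) -> List[str]:
    """Return '[source] artifact' for each artifact containing any term (case-insensitive)."""
    lowered_terms = [t.lower() for t in terms]
    hits = []
    for label, artifacts in sources.items():
        for artifact in artifacts:
            text = artifact.lower()
            if any(term in text for term in lowered_terms):
                hits.append(f"[{label}] {artifact}")
    return hits


# Hypothetical evidence: live process paths plus file paths from a disk index.
evidence = {
    "process": [
        r"C:\Users\administrator\AppData\Roaming\ygwea.exe",
        r"C:\Windows\System32\svchost.exe",
    ],
    "index": [
        r"C:\Windows\Prefetch\YGWEA.EXE-1A2B3C4D.pf",
        r"C:\Windows\Prefetch\NOTEPAD.EXE-9F8E7D6C.pf",
    ],
}
hits = evaluate_terms(["ygwea"], evidence)
```

Because the match is a substring test on the whole path, the Prefetch entry is reported even though the term was written with the executable in mind, which mirrors the kind of hit described above.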

Remediating the Infection 

With our IOCs in place and a few additional infected systems identified, let’s switch to a remediation workflow so that Tanium can help clean up this infection. As mentioned earlier, Tanium Actions provide a simple, scalable mechanism to automatically act on systems based on user or system-defined criteria.

For example, we could target all systems running processes with a name, path, or hash matching the Ponik malware, and use Tanium’s Deploy Action to run a “Kill Process” action. On a one-time basis, this would simply kill the malware process wherever it was running in the environment. As a scheduled action, this could automatically kill the process wherever it showed up again in the future.


Similarly, we can deploy an action to remove the malware’s registry-based persistence mechanism. This ensures the malware does not restart once its process has been killed, and that previously infected hosts do not produce false positives during future investigations. The screenshot below shows the parameters for the “Registry – Delete Value” package.
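For readers curious about what removing such a value involves on a single Windows endpoint, here is a minimal sketch using Python's standard `winreg` module. It is not the Tanium package and says nothing about how that package works internally; the helper names are hypothetical, and the SID in the usage comment is a placeholder.

```python
import sys

RUN_SUBKEY = r"SOFTWARE\Microsoft\Windows\CurrentVersion\Run"


def run_key_path(sid: str) -> str:
    """Build the per-user Run key path beneath HKEY_USERS for a given SID."""
    return f"{sid}\\{RUN_SUBKEY}"


def delete_run_value(sid: str, value_name: str) -> bool:
    """Delete one value from a user's Run key; return False if the key or value is absent."""
    if sys.platform != "win32":
        raise OSError("registry access requires Windows")
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(
            winreg.HKEY_USERS, run_key_path(sid), 0, winreg.KEY_SET_VALUE
        ) as key:
            winreg.DeleteValue(key, value_name)
    except FileNotFoundError:
        return False
    return True


# Example call on an affected Windows host (placeholder SID):
# delete_run_value("S-1-5-21-1111-2222-3333-1001", "ygwea")
```

The function only touches the one named value, so other legitimate autorun entries under the same key are left in place. Returning False for an absent value lets the same call be repeated safely on hosts that were already cleaned.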


In both examples, we’re only remediating a handful of infected systems; however, Tanium can deploy these same actions against thousands or hundreds of thousands of systems in the same easy-to-use workflow. 

Finally, Tanium Protect can be used to craft policies to prevent the execution of the discovered malware based on a specific file name, hash, or even its directory path. Protect can also enforce endpoint network restrictions that block connections based on originating process name, path, or other criteria. For example, we could block programs that execute from “\AppData\Roaming\” from connecting to the Internet on targeted computer groups within the enterprise. But we’ll save details on that workflow for another blog post!


Using Tanium, both targeted intrusions and opportunistic malware infections can be thoroughly investigated, documented, and remediated quickly and easily. Tanium’s IR toolset supports thorough investigation of the real-time, current state of the environment. Trace adds historical context that can help identify additional infections, lateral movement, and broader threat actor tactics, techniques, and procedures (TTPs). IOC Detect completes this picture by allowing for fast, automated searches of structured threat data. Finally, Tanium’s integrations among these workflows allow users to efficiently pivot from investigating a compromise to generating IOCs, enacting remediation and prevention workflows, and validating their effectiveness over time.