Security Monitoring with Attack Behavior Based Signatures

Monday, May 25, 2015 Posted by Corey Harrell 7 comments
Coaches and athletes both gather intelligence against their upcoming opponent by watching game film. Based on what they learn, they adjust their strategies to account for their opponent’s strengthens, weaknesses, and tendencies. The analogy about watching game film does not translate well to information security. In sports, the film study is to identify a single opponent’s tendencies while in information security there is no film for the numerous threats a company is up against on a weekly basis. However, the concept of watching an opponent's techniques to identify tendencies does translate. These tendencies are how they compromise systems and by identifying tendencies enables a company to adjust their security monitoring program. Companies can have the visibility to detect threats in their environment even when those threats are new or unknown. Attaining this level of visibility is possible by leveraging attack behavior based signatures in security monitoring. 

Attack Vectors


SearchSecurity defines an attack vector as "a path or means by which a hacker (or cracker) can gain access to a computer or network server in order to deliver a payload or malicious outcome." Based on this definition, the attack vector is broken down into three separate components. The path or means is the exploit used, the payload is the outcome of the exploit, and the delivery mechanism is what delivers the exploit and/or the payload to the target. The definition combines the delivery mechanism and exploit together but in reality these are separated.

As defenders, exploring attack vectors enables us to better protect systems, detect when systems are under attack, and determine how systems are compromised. The Compromised Root Cause Analysis Model goes in to detail about identifying and understanding the artifacts left on a compromised system to determine the attack vector used in the attack. This post goes in to detail about how exploring attack vectors can be used to determine when systems are under attack.

When attacking a system the attacker is constrained to the environment they are targeting. In this environment, certain actions behave a certain way every time they are performed. The behavior is dictated by the operating system and the action performed results in the operating system behaving a certain way. This behavior occurs every time the action is performed making the activity detectable. The attacker controls the exploit and payload so these can be changed making detection harder. However, the delivery mechanism component of the attack vector is susceptible to security monitoring in the Windows operating system. The delivery mechanism is dependent on the operating system and this is the environment the attacker is constrained to. Each time the attacker uses that delivery mechanism results in the same activity occurring in the operating system. This is the activity detected when using attack behavior based signatures.

Delivery Mechanism: Malicious Word Document


To elaborate on delivery mechanisms resulting in the same activity an example is needed. Word documents are a mechanism used by attackers to compromise systems. The exploits in Word documents vary from macros to hyperlinks to Microsoft Word exploits. The payloads vary even more since an attacker can use anything. Case in point, some attackers use the Dridex malware as the payload while other attackers use the Dyre malware. Traditional signature detection mechanisms try to keep pace with the changes attackers introduce in the exploit or payload. Attack behavior based signatures instead focus on the delivery mechanism’s interaction with the operating system. The activity caused by the Word document executing in the Windows operating system to deliver malware. This behavior remains the same no matter how much obfuscation or encryption attackers use to conceal the exploit and/or payload. To demonstrate this behavior what follows is a walkthrough of the activity of a malicious Word document being used as a delivery mechanism and the activity is monitored with the Microsoft Sysmon software. The sampled used in this walkthrough is the malicious document MD5 d89c0affa2c1b5eff1bfe55b011bbaa8 obtained from Malwr.com.

To compromise a system the malicious document needs to be executed. The picture below shows what occurs when the document is executed by the user. Upon execution, the user's shell (Explorer.exe) creates a process for the program that is the default reader for Word documents (files ending in .doc). In this instance and similar to most systems in enterprise environments, the default reader is Microsoft Word and the program's executable is named WINWORD.EXE.


At some point after the WINWORD.EXE process creation the exploit in the document is ran. Again, the exploit varies from macros to links to Microsoft Word exploits to embedded executables. Regardless of the exploit used, the activity of using Word as a delivery mechanism is the same and is shown in the picture below. Microsoft Word creates a child process for the payload of the attack. In this instance, WINWORD.EXE creates the process for the file kiramin86.exe inside a user profile's temp folder.


This is the activity that is susceptible to security monitoring. Microsoft Word (WINWORD.EXE) being the parent of another process that is a Windows binary (i.e. exe, pif, dll). Depending on the organization, this activity is the anomaly since Microsoft Word may rarely be the parent process of another executable or try to execute another executable. This is the type of activity attack behavior based signatures can focus on to detect new and unknown threats. The signature can be narrowed down even further to make it more accurate - such as focusing on executables in user profiles - but in essence this is the activity being detected. The signature is able to detect attack vectors using Word documents delivery mechanisms even if the exploit and payloads are different.

Technique Detection: Bypassing UAC


Attack behavior based signatures are not only limited to detecting the attack vector’s delivery mechanism. The concept can be applied to other techniques used by the attacker. At times attackers leverage techniques to bypass Windows User Account Control (UAC). UAC is a feature in Windows where every application ran under an administrator user account only runs in the context of a standard user. Bypassing UAC is a way for attackers to elevate their standard user privileges to administrator privileges. (For more information on UAC see the post UAC Impact on Malware.) The malicious document used in the previous walkthrough delivers the Dridex malware and this malware has a UAC bypass.

The May 12, 2015 post A New UAC Bypass Method that Dridex Uses outlines the current technique Dridex uses to bypass UAC. The article stated the following about how Dridex uses the application compatibility database to bypass UAC:

"An application compatibility database is a file that configures execution rules for applications that have compatibility issues. These files have an extension of sdb. Dridex leverages this feature to bypass UAC."

".Dridex uses the sdbinst command to install/uninstall application compatibility databases to install $$$.sdb."

This UAC bypass technique is constrained to the Windows environment and results in the same activity occurring in the operating system. The picture below shows the activity. The sbdinst.exe process is created and the commandline used to create the process points to an application compatibility database file (.sdb) inside the user profile. It’s activity that occurs every time so it is susceptible to detection through security monitoring. The signature’s logic could be the image value containing “sdbinst.exe” and the commandline containing .sdb file in a user profile.


Leveraging Attack Behavior Based Signatures


To leverage attack behavior based signatures in security monitoring to detect new and unknown threats is achievable. The approach requires technology to provide visibility on enterprise end points and the backend needs technology for the collection and analysis of the logs from the endpoints. The technology on the end point needs to provide visibility involving the Windows process and files the process interact with. This article leveraged Microsoft Sysmon utility since it is freely available to anyone and was suggested to me by Harlan Carvey. Other options are available for the end points including possibly existing agents that may already be deployed in enterprises. The technology for the collection and analysis of logs from the end points need to support regex or wildcards for querying the logs to identify the attack patterns. Attack behavior based signatures tend to focus on characteristics of processes involved in the attack activity. For malicious documents with Microsoft Word installed on the endpoint, the focus is on WINWORD.EXE and not necessarily the entire file path since this executable can be located in different folders (i.e. 32bit versus 64bit Word programs). Regex or wildcards support in queries enables this type of flexibility when building detection signatures.

Another consideration is attack behavior based signatures need to be used in layers customized to an enterprise. The walkthrough only demonstrated one signature for Word documents but there are numerous other attack vectors to account for. Each attack vector needs to be customized to the enterprise to account for the software installed on their endpoints. Further customization is needed since y their nature attack behavior based signatures results in false positives. The signatures detect patterns in the activity involving Windows processes. This activity can be either malicious or normal behavior. To identify false positives triage processes need to evaluate the activity flagged by the signatures to determine if they are false positives or security events. For reoccurring false positives, the signatures need to be tuned to reduce the noise from normal behavior.

Despite the technology, process, and customization challenges, leveraging attack behavior based signatures in security monitoring can be an effective approach for detecting new and unknown threats. The techniques used by attackers are constrained to our environments and their techniques cause activity on our systems that may be susceptible to detection. It just requires us to identify the activity susceptible to detection, build signatures to detect it, and then share with others to help them improve their monitoring capabilities.
Labels:

Introducing the Active Threat Search

Monday, May 11, 2015 Posted by Corey Harrell 7 comments
Have you found yourself looking at a potential security event and wishing there was more context. The security event could be an IDS/IPS triggering on network activity, antivirus software flagging a file, or a SIEM rule alarming on activity in logs. By itself the security event may not provide a bigger picture about the activity. Has anyone else seen the same activity? Where did the file come from? Is the event part of a mass attack or is it unique? Being able to run queries on certain security event indicators can go a long way in providing context to what you are seeing. This post is the formal introduction of the Active Threat Search that can help you identify this context.

The Active Threat Search is another Custom Google search. Similar to the Digital Forensic Search, Vulnerability Search, and Malware Analysis Search (by Hooked on Mnemonics Worked for Me), the Active Threat Search harnesses the collective knowledge and research of the people/organizations who openly share intelligence information.

To demonstrate how context can be provided let’s say the IDS/IPS tripped on numerous connection attempts being made to a server running SSH. This security event could mean a few things. Someone may had forgotten their credentials and tried numerous times to log in or someone (or something) found the open SSH port on the server and tried numerous times to log in. The bigger picture may not be readily apparent so additional context is needed. A search on the source IP address that triggered the IDS/IPS alert in the Active Threat Search may show something similar to the image below:


The search on the source IP address provides a wealth of context for the security event. The same source IP address has attempted attacks against other systems. This means the security event was something trying to log in to the server and not someone forgetting their password. Context changes everything and the Active Threat Search at times can help provide this context.

The Active Threat Search can be found at the top of jIIr or directly at this link:
https://cse.google.com/cse/publicurl?cx=011905220571137173365:eo5wzloxe_m


**********Sites Last Updated on 05/24/2015**********

The following is the listing of sites indexed by the Active Threat Search and this section will be continuously updated.

Bambenek Consulting  http://osint.bambenekconsulting.com/feeds/
Binary Defense Systems  http://www.binarydefense.com/banlist.txt
Blocklist.de  http://www.blocklist.de
Cisco Threat Intelligence  http://tools.cisco.com/security/center/
Cyber Crime  http://cybercrime-tracker.net/
Dragon Research Group  http://dragonresearchgroup.org/insight/
Dshield  https://dshield.org/
Dynamoo's Blog  http://blog.dynamoo.com/
Emerging Threats  http://rules.emergingthreats.net/
Emerging Threats List  https://lists.emergingthreats.net/pipermail/emerging-sigs
Feodo Tracker  https://feodotracker.abuse.ch/
hpHosts  http://hosts-file.net/
Malc0de Database  http://malc0de.com/database/
Malware Domain List  http://www.malwaredomainlist.com
MalwareDomains  http://www.malwaredomains.com/
Malware-Traffic-Analysis  http://www.malware-traffic-analysis.net
McAfee Threat Intelligence  http://www.mcafee.com/threat-intelligence
Malware URLs http://malwareurls.joxeankoret.com/
Multiproxy  http://multiproxy.org
MX Lab  http://blog.mxlab.eu/
OpenBL  http://www.us.openbl.org/
OpenPhish  https://openphish.com/
Palevo Tracker  https://palevotracker.abuse.ch/
Phish Tank  http://www.phishtank.com
Project Honeypot  https://www.projecthoneypot.org
SPAM404  http://www.spam404.com/
SPAMHAUS  www.spamhaus.org
SSL Blacklist https://sslbl.abuse.ch/blacklist/
Tor Exit Addresses  https://check.torproject.org/exit-addresses
URLQuery  http://urlquery.net
VirusTotal  http://www.virustotal.com
VX Vault  http://vxvault.net/
Zeus Tracker  https://zeustracker.abuse.ch/
Labels: ,

Making Incident Response a Security Program Enabler

Sunday, April 26, 2015 Posted by Corey Harrell 11 comments
Incident response is frequently viewed as a reactive process. As soon as something bad happens that is when the incident response process is activated to respond to what occurred. This view is similar to insurance. Every month we spend money on buying insurance so it is available when we need it. It doesn’t matter if the insurance gets used once in a year or not at all; money is still spent on a monthly basis to buy it. In a way, it’s easy to see the similarity to the incident response process. Resources - such as staffing and technology - are invested in the incident response process. In some organizations there is a sizable investment while in others very little. The hope is something is available when the organizations need it. How can one change an organization’s view of incident response? How can you take a traditional reactive process and make it in to a proactive process that’s an enabler for the organization’s information security program? This post discusses one approach to make incident response a security enabler by addressing: continuous incident response, incident response metrics, root cause analysis, and data analytics.

 

Continuous Incident Response


The traditional incident response models resemble the incident response lifecycle illustrated below that was obtained from the NIST Computer Security Incident Handling Guide.


Image obtained from http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-61r2.pdf 

The first phase involves the organization preparing for future incidents. The next phase is when an incident is detected and analyzed. This is followed by the containment, eradication, and recovery activities. At times when trying to remediate an incident, activities cycle back to detection and analysis to determine if the incident was resolved. After the incident is eradicated and the organization returns to normal operations a post incident activity is performed to see what did and didn't work out as planned. The lifecycle represents the traditional approach to incident response: incident detected -> organization responds -> incident eradicated -> organization returns to normal operations. This is the tradition reactive incident response process where the assumption is that nothing is going on until an incident is detected and after the incident is resolved it goes back to assuming nothing is going on.

To take a traditional reactive incident response process and make it in to a proactive process requires incident response to be seen in a different light. Organizations are under constant attack from daily malware infections to daily probing to daily exploit attempts to daily potential unauthorized access attempts. The model is no longer linear where an organization is waiting to detect an incident and then returning to normal operations. The new normal is being under constant attack and being at different stages in the incident response process concurrently. Richard Bejtlich stated in his book The Practice of Network Security Monitoring on page 188 regarding the model below:

the workflow in Figure 9-2 appears orderly and linear, but that’s typically not the case in real life. In fact, all phases of the detection and response processes may occur at the same time. Sometimes, multiple incidents are occurring; other times, the same incident occupies all four stages at once.”

Image obtained from http://www.nostarch.com/nsm

Richard expressed incident response is not a linear process with a start and end; on the contrary the process can be at different phases at the same time dealing with different incidents. Anton Chuvakin also touched on the non-linear incident response process in his post Incident Response: The Death of a Straight Line. Not only did he say that the “ “normal -> incident -> back to normal” is no more” but he summed up the situation organizations find themselves in.

“While some will try to draw a clear line between monitoring (before/after the incident) and incident response (during the incident), the line is getting much blurrier than many think.  Ongoing indicator scans (based on external and internal sources), malware and artifact reversing, network forensics “hunting”, etc all blur the line and become continuous incident response activities.”

The light incident response needs to be seen in is that it is a continuous process instead of a linear one. Incident response is not something that starts and ends but is an ongoing cyclical process where an organization is constantly detecting and responding to incidents. A process similar to David Bianco's the Intel-Driven Operations Cycle model shown below and was obtained from his The Pyramid of Pain Intel-Driven Detection and Response to Increase Your Adversary's Cost of Operations presentation.

Image obtained from https://speakerdeck.com/davidjbianco/the-pyramid-of-pain-intel-driven-detection-and-response-to-increase-your-adversarys-cost-of-operations

Seeing incident response as a continuous process is one that everyone must see from security practitioners to incident responders to management. Changing people’s perspectives on incident response will take time and every opportunity to sell it will need to be seized (don’t sell FUD but layout the actual threat environment we find ourselves in.) In time the conversation will go from viewing incident response as insurance that may or may not be needed to viewing incident response as continuous where people are detecting and responding to the daily security incidents. The conversation will go from “do we really need to invest in this since we only had a few incidents last year” to “we are continuing seeing these incidents due to this security weakness so how can we address it since it’s an area of concern.”

Operationalize Incident Response Information


Changing the view of incident response from a linear process to a continuous one is not enough to make it a security program enabler. To be a security program enabler incident response needs to contribute to the organization’s security strategy to help influence where security resources are focused. Too often incident response tries to influence the security strategy in a reactive manner. The reactive process resembles the following: incident detected -> organization responds -> incident eradicated -> organization returns to normal operations -> incident response recommendations provided. The attempts to influence the security strategy is based on the most recent incident. In essence, recommendations are being made based on a single event instead being made based on trends from numerous events. Don’t get me wrong, there are times when recommendations from a single event do influence the security strategy but to make incident response a security program enabler there needs to be more.

To re-enforce this point, a story about a local credit union that happened years ago may help. The credit union happened to be located at a busy intersection; its location was very accessible from buses, cars, bikes, and walking. One day a person walked in to the credit union, handed the teller a note, and then walked out with money. As an outsider looking at this single event, there was nothing drastic implemented from any recommendations based on this single robbery. The next week a similar event occurred again with someone handing the teller a note and walking out with cash. This occurred a few more times and each robbery was very similar. The robbery involved a person handing a note to the teller without any visible weapons shown. The credit union looked at all the robberies and they must have seen this pattern. In response, the credit union implemented a compensating control and this control was double doors to trap any individual as they try to exit the bank. After this control was implemented the robberies stopped. This story shows how incident response can become a security program enabler. The first robbery was a single event and the recommendation may had been to install trap doors. However, installing trap doors takes essential resources from other areas and this may not be in the best interest of the organization. As more data is collected from different events it causes a pattern to emerge. Now taking essential resources from other areas is an easier decision since the data analysis shows installing trap doors is not addressing a single event but a re-occurring issue.

The continuous incident response process needs to move from only providing reactive recommendations to producing intelligence by operationalizing the information produced by enterprise incident response and detection processes. To accomplish this, data and information needs to be captured from the ongoing detection and response activities. Then this data and information is analyzed to produce intelligence to be used by the security program. Some intelligence is used by the response and detection processes themselves but other intelligence (especially ones developed through trend analysis) is reported to appropriate parties to influence the organization’s security strategy. Operationalizing incident response information results in creating intelligence at various levels in the intelligence pyramid.


The book Building an Intelligence-Led Security Program authored by Allan Liska describes the pyramid levels as follows:

"Strategic intelligence is concerned with long-term trends surrounding threats, or potential threats to an organization. Strategic intelligence is forward thinking and relies heavily on estimation – anticipating future behavior based on past actions or expected capabilities."

"Tactical intelligence is an assessment of the immediate capabilities of an adversary. It focuses on the weaknesses, strengths, and the intentions of an enemy. An honest tactical assessment of an adversary allows those in the field to allocate resources in the most effective manner and engage the adversary at the appropriate time and with the right battle plan."

"Operational intelligence is real time, or near real-time intelligence, often derived from technical means, and delivered to ground troops engaged in activity against the adversary. Operational intelligence is immediate, and has a short time to live (TTL). The immediacy of operational intelligence requires that analysts have instant access to the collection systems and be able to put together FINTEL in a high-pressure environment."

As it relates to making incident response process a security program enabler, the focus needs to be on making the process contribute to the organization’s security strategy by producing tactical and strategic intelligence. Tactical intelligence can highlight the organization’s weaknesses and strengths then show where security resources can be used more effectively. Strategic intelligence can influence the direction of the organization’s long term security strategy. Incident response starts to move from being viewed as a reactive process to a proactive one once it starts adding value to other areas in an organization’s security program.

Improve Root Cause Analysis Capabilities


Before one can start to operationalize incident response information to produce intelligence at various levels in the intelligence pyramid they must first improve their root cause analysis capabilities. Root cause analysis is trying to determine how an attacker went after another system or network by identifying and understanding the remnants they left on the systems involved during the attack. This is a necessary activity for one to discover information during a security incident that can be operationalized. The Verizon Data Breach Investigations Report is an excellent example about the type of information one can discover by performing root cause analysis. The report highlights trends from “time to incident discovery” to “time to compromise” to exploited vulnerabilities to frequency of attack types to hacking actions. None of this data would had been available for analysis if root cause analysis wasn’t completed on these incidents.

Take the hypothetical scenario of a malware infected system. Root cause analysis discovered the attacker compromised the system using a phishing email containing a malicious Word document. At this point there is various data one can then turn in to intelligence. At the operationally level, the email’s subject line, content, from address, and Word document attachment name can all be documented and then turned in to intelligence for response and detection activities. The same can occur for the URL inside the Word document and the malware it downloads. Doing root cause analysis on all infections can then make data available to do trend analysis. Is it a pattern for the organization employees to be socially engineered through Word documents? Can resources be applied in other areas such as security awareness training to combat this threat? In time, more and more data can be collected to reveal other trends to help drive security. Performing root cause analysis on each incident is needed to operationalize incident response information to produce intelligence in this manner. The Compromised Root Cause Analysis Model is one model to use and it is described in the post Compromised Root Cause Analysis Model Revisited.


Incident Response Metrics


The outcome from performing root cause analysis on each incident is discoverable information. It’s not enough to consistently do root cause analysis to discover information; the information needs to be documented and analyzed to make it into intelligence. Different options are available to document security incident information but in my opinion the best available schema is the VERIS Framework. The “Vocabulary for Event Recording and Incident Sharing (VERIS) is a set of metrics designed to provide a common language for describing security incidents in a structured and repeatable manner.” The VERIS Framework is open and can be modified to meet an organization’s needs.

The schema is well designed but to support an internal incident response process some modifications may be needed. This post won’t go in to great detail about the needed modifications but I will mention a few to make the schema better support internal incident response. In the Incident Tracking section, to make it easier to track security incidents the following can be added: Incident Severity (to match the incident response process severity for incidents), Hostname (of the targeted system), IP Address (of the targeted system), Username (involved in the incident), and Source IP Address (of the attacker’s system). In the Victim Demographics section, these may or may not apply for an internal incident response process. Personally, I don’t see the need for tracking this information if the incident response process supports the same entity. In the Incident Description section, the biggest change is outlining the expected values for the vectors and vulnerabilities. For example, for the vulnerabilities list out each possible vulnerable application - such as Java vulnerability - instead of allowing for specific CVEs. This reduces the amount of work needed on doing root cause analysis without losing too much on the metrics side. The last changes I’ll discuss are for the Discovery and Response section. In this section make sure to account for the various discovery methods the organization may use to detect incidents as well as the intelligence sources behind those methods. This slight change enables an organization to measure how they are detecting security incidents and to evaluate the return on investment for different intelligence sources.

Data Analysis


Information that is documented is only data and does not become intelligence until it is analyzed and refined so it is useful to others. There are different options available for organizations to produce intelligence from the information discovered during root cause analysis. The book Data-Driven Security: Analysis, Visualization and Dashboards goes in to detail about how one can do data analysis with free and/or open source tools. The route I initially took was to allow me to focus on the incident response process without getting bogged down trying to create visualizations to identify trends. At my company (this is the only item in this post directly tied to my employer and I only mention it in hopes it helps my readers) we went with a license for Tableau Desktop and I bought a personal copy of the book Tableau Your Data!: Fast and Easy Visual Analysis with Tableau Software. The combination of Tableau Desktop and the VERIS Framework makes it very effective at producing strategic and tactical intelligence that can be consumed by the security program. In minutes, you can create visualizations to highlight what departments in an organization is most susceptible to phishing attacks or to quickly identify the trends explaining how malware is entering the organization. The answers and intelligence one can gain from the incident response data is only limited by one’s creativity and the ability of those consuming the intelligence.

Making Incident Response a Security Program Enabler


The approach an organization can take to take incident response from a reactive process to proactive one involves the following steps:

      - Improving an organization's incident response capabilities
      - Improving an organization's root cause analysis capabilities
      - Improving an organization’s security monitoring capabilities
      - Influencing others to see incident response as a continuous process
      - Operationalizing incident response information
      - Collecting and documenting data for the organization’s incident response metrics
      - Analyzing the organization’s incident response metrics to produce intelligence
      - Presenting the intelligence to appropriate stakeholders

Making incident response a security program enabler is a gradual process requiring organization buy-in and resources to make it happen. As DFIR practitioners, we can only be the voice in the wilderness telling others incident response can be more than a reactive process. It can be more than an insurance policy. It can be a continuous process enabling an organization’s security strategy and helping guide how security resources are used. A voice hoping to influence others to make the right decision to better protect their organization.
Labels:

Python: print “Hello DFIR World!”

Wednesday, April 8, 2015 Posted by Corey Harrell 5 comments
Coursera's mission is to "provide universal access to the world's best education." Judging by their extensive course listing it appears as if they are delivering on their mission since the courses are free for anyone to take. I knew about Coursera for some time but only recently did I take one of their courses (Python Programming for Everybody.) In this post I'm sharing some thoughts about my Coursera experience, the course I took, and how I immediately used what I learned.

Why Python? Why Coursera?


Python is a language used often in information security and DFIR. Its usage is varied from simple scripts to extensive programs. My interest in Python was modest; I wanted to be able to modify (if needed) Python tools I use and to write automation scripts to make my job easier. Despite the wealth of resources available to learn Python, I wanted a more structured environment to learn the basics. An environment that leverages lectures, weekly readings, and weekly assignments to explore the topic. My plan was to learn the basics then proceed exploring how Python applies to information security using the books Black Hat Python and Violent Python. Browsing through the Cousera offerings I found the course Programming for Everybody (Python). The course “aims to teach everyone to learn the basics of programming computers using Python. The course has no pre-requisites and avoids all but the simplest mathematics.” Teaches the basics in a span of 10 weeks without the traditional learning to code by mathematics; the course was exactly what I was looking for.

Programming for Everybody (Python)


I’m not providing a full fledge course review but I did want to provide some thoughts on this course. The course itself is “designed to be a first programming course using the popular Python programming language.” This is important and worth repeating. The course is designed to be someone’s first programming course. If you already know how to code in a different language then this course isn’t for you. I didn’t necessary fit the target audience since I know how to script in both batch and Perl. However, I knew this was a beginner’s course going in so I expected things would move slowly. I could easily overlook this one aspect since my interest was to build a foundation in Python. The course leveraged some pretty cool technology for an online format. The recorded lectures used a split screen between the professor, his slides, and his ability to write on the slides as he taught. The assignments had an auto grader where students complete assignments by executing their programs and the grader confirms if the program was written correctly. The text book is Python for Informatics: Exploring Information, which focuses more on trying to solve data analysis problems instead of math problems like traditional programming texts. The basics covered include: variables, conditional code, functions, loops/iteration, strings, files, lists, dictionaries, tuples, and regular expressions.

Overall, spending the past 10 weeks completing this course was time well spent. Sure, at times I wish times moved faster but I did achieve what I wanted to. Exploring the basics of the Python language so I can have a foundation prior to exploring how the language applies to security work. The last thing I wanted to mention about the course, which I highly respect. The entire course from the textbook to the lecture videos is licensed under a Creative Common Attribution making it available for pretty much anyone to use.

Applying What I Learned


The way I tend to judge courses, trainings, and books is by how much of the content can be applied to my work. If the curriculum is not relevant to one’s work than what is the point in wasting time completing it? It’s just my opinion but judging courses and trainings in this manner has proven to be effective. To illustrate this point as it applies to the Python Programming for Everybody course I’m showing how the basics I learned solved a recent issue. One issue I was facing is how to automate parsing online content and consuming it in a SIEM. This is a typical issue for those wishing to use open source threat intelligence feeds. One approach is to manually parse it in to a machine readable form that your SIEM and tools can use. Another and a better approach is to automate as much as possible through scripting. I took the later approach by creating a simple script to automate this process. For those interested in Python usage in DFIR should check out David Cowen's Automating DFIR series or Tom Yarrish's Year of Python series.

There are various open source threat intelligence feeds one can incorporate in to their enterprise detection program. Kyle Maxwell’s presentation Open Source Threat Intelligence touched on some of them. For this post, I’m only discussing one and it was something I was interested in knowing how to do it. Tor is an anonymity service that enables people to hide where they are coming from as they surf the Internet. Tor has a lot of legitimate uses and just because someone is using it does not mean they are doing something wrong. Being able to flagged users connecting to your network from Tor can add context to other activity. Is the SQL injection IDS alert a false positive? Is the SQL injection IDS alert coming from someone who is also using Tor a false positive? See what I mean by adding context. This was an issue that needed a Python solution (or at least a solution where I could apply what I learned.)

To accomplish adding Tor context to activity in my SIEM I first had to identify the IP addresses for the Tor exit nodes. Users using the service will have the IP address of the exit node they are going through. The Tor Project FAQs provides an answer to the question "I want to ban the Tor network from my service." After trying to discourage people from blocking two options are presented by using either the Tor exit relay list or a DNS-based list. The Tor exit relay list webpage has a link to the current list of exit addresses. The screenshot below shows how this information is presented:


Now we’ll explore the script I wrote to parse the Tor exit node IP addresses into a form my SIEM can consume, which is a text file with one IP address per line. The first part –as shown in the image below - imports the urllib2 module that is used to open URLs. This part wasn’t covered in the course but wasn’t too difficult to figure out by Googling. The last line in the image creates a dictionary called urls. A dictionary associates a key with a value and in this case the key is tor-exit with the value being the URL to the Tor exit relay list. Leveraging a dictionary allows the script to be extended to support other feeds without having to make significant changes to the script.


The next portion of the script as shown below is where the first for loop occurs. The for loop will process each entry (key and value pair) in the urls dictionary. The try and except is a method to account for errors such as a URL not working. Inside the try section the URL is opened in to a variable named file and then it is read in to a variable named data using the urllib2 readlines() option. Lastly, a file is created to store the output using the key value and the file handle is named output.


The next part of the script –image below - is specific to each threat feed being parsed. This accounts for the differences in the way threat feeds present data. The if statement checks to see if the key matches “tor-exit” and if it does then the second for loop executes. This for loop reads each line in the data variable (hence the data listed at the URL.) As each line is read there is additional actions performed such as skipping blank lines and any line that doesn’t start with the string “ExitAddress.” For the lines that do start with this string, the line is broken up in to a list named words. Basically, it breaks the line up into different values by using the space as a separator. The IP address is the second value so it is contained in the second index location in the words list (words[1]). The IP address is then written to the output file and after each line is processed a message is displayed saying processing completed.


The screenshot below shows the script running.


The end result is a text file containing the Tor exit IP addresses with one address per line. This text file can then be automatically consumed by my SIEM or I can use it when analyzing web logs to flag any activity involving Tor.


It’s Basic but Works


Harlan recently said in his Blogging post “it doesn't matter how new you are to the industry, or if you've been in the industry for 15 years...there's always something new that can be shared, whether it's data, or even just a perspective.” My hope with this post is it would be useful to others who are not programmers but want to learn Python. Coursera is a good option that can teach you the basics. Even just learning the basics can extend your DFIR capabilities as demonstrated by my simple script.
Labels:

Compromised Root Cause Analysis Model Revisited

Wednesday, March 11, 2015 Posted by Corey Harrell 3 comments
How? The one question that is easy to ask but can be very difficult to answer. It's the question I kept asking myself over and over. Reading article after article where publicized breaches and compromises were discussed. Each article alluded to the answer about how the breach or compromise occurred in the first place but each one left something out. Every single one left out the details that influenced their conclusions. As a result, I was left wondering how they figure out how the attack occurred in the first place. It was the question everyone alluded to and everyone said to perform root cause to determine the answer. They didn’t elaborate on how to actually do root cause analysis though. Most incident response literature echoes the same sentiment; do root cause analysis while omitting the most critical piece explaining how to do it.  I asked my question to a supposed "incident responder" and their response was along the lines "you will know it when you see it." Their answer along with every other answer on the topic was not good enough. What was needed was a repeatable methodical process one can use to perform root cause analysis. The type of methodical process found in the Compromise Root Cause Analysis Model.

I developed the Compromise Root Cause Analysis Model three years ago to fulfill the need for a repeatable investigative process for doing root cause analysis. In this post I'm revisiting the model and demonstrating its usefulness by outlining the following:

        - Exploring Locard’s Exchange Principle
        - Exploring Temporal Context
        - Exploring Attack Vectors
        - Exploring the Compromise Root Cause Analysis Model
        - The Model is Cyclical
        - Applying the Compromise Root Cause Analysis Model
                * Webserver Compromise

Exploring Locard’s Exchange Principle


The essential principle in the Compromise Root Cause Analysis Model is Locard’s Exchange Principle. This principle states “when two objects come into contact, something is exchanged from one to the other.” Locard’s Exchange Principle is typically explained using examples from the physical world. When one object – such as someone’s hand – comes in to contact with another object – such as a glass – something is exchanged from one to the other. In this example, on the glass are traces of oils from the person’s hand, skin flakes, and even fingerprints.

The principle is not only limited to the physical world; it applies to the digital world as well. Harlan Carvey’s example demonstrated the principle in the digital world as follows: “well, in essence, whenever two computers come "into contact" and interact, they exchange something from each other.” The principle is not only limited to computers; it applies to everything such as routers, switches, firewalls, or mobile devices. The essence of this principle for the Compromise Root Cause Analysis Model is:

When an attacker goes after another system; the exchange will leave remnants of the attack on the systems involved. There is a transfer between the attacker’s system(s), the targeted system(s), and the networking devices connecting them together.

The transfer between the systems and networks involved in the attack will indicate the actual attack used. By identifying and exploring the remnants left by the transfer is what enables the question of “how did the attack occur in the first place” to be answered.

Exploring Temporal Context


The second principle and one that supports the Compromise Root Cause Analysis Model is the psychology principle of proximity. The principle of proximity is one of the Gestalt laws of grouping and states that “when we see stimuli or objects that are in close proximity to each other, we tend to perceive them as being grouped together.” As it relates to the Compromise Root Cause Analysis Model, the grouping is based on the temporal relationship between each object. Temporal proximity impacts the model in two ways.

The first way temporal proximity impacts the Compromise Root Cause Analysis Model is by enabling the grouping of remnants related to an attack. When an attacker goes after another system, remnants are left is various places within system and the network the system is a part of. Networking devices logs showing the network activity, application logs showing what the intruder was doing, and remnants on the system showing what the intruder accomplished are only a few of the places where these artifacts could be located. The attacker’s actions are not the only remnants left within the network and system. The organization and its employees are creating remnants every day from their activity as well as remnants left by the normal operation of the information technology devices themselves. Temporal proximity enables the grouping of the remnants left by an attacker throughout a network and system by their temporal relationship to each other. Remnants that occur within a short timeframe of each other can be grouped together while remnants outside of this timeframe are excluded. Other factors are involved to identify the attacker’s remnants amongst normal activity but temporal proximity is one of the most significant factors.

The second way temporal proximity impacts the Compromise Root Cause Analysis Model is the time that lapses between when an attacker attacks the system and an investigation is conducted affects the ability to identify and group the remnants left by the attacker. The reason for this impact is that “time is what permits other forces to have an effect on the persistence of data.” The remnants left by the attacker is in the form of data on information technology devices. The more time that goes by after these remnants are left the more opportunity there is for them to be changed and/or removed. Logs can be overwritten, files modified, or files deleted through activities of the organization and its employees along with the normal operation of the information technology devices. The more time that lapses between when the attack occurred and when the investigation begins the greater the opportunity for remnants to disappear and the inability to group the remaining remnants together. The Compromise Root Cause Analysis Model can still be used to identify these remnants and group them but it is much more difficult as more time lapses between the initial attack and investigation.

Exploring Attack Vectors


Root cause analysis is trying to determine how an attacker went after another system or network by identifying and understanding the remnants they left on the systems involved during the attack. In essence, the analysis is identifying the attack vector used to compromise the system. It is crucial to explore what an attack vector is to see how it applies to the Compromise Root Cause Analysis Model.

SearchSecurity defines an attack vector as "a path or means by which a hacker (or cracker) can gain access to a computer or network server in order to deliver a payload or malicious outcome." Based on this definition, the attack vector can be broken down into three separate components. The path or means is the exploit used, the payload is the outcome of the exploit, and the delivery mechanism is what delivers the exploit and/or the payload to the target. The definition combines the delivery mechanism and exploit together but in reality these are separated. The exploit, payload, and delivery mechanism can all leave remnants (or artifacts) on the compromised system and network and these artifacts are used to identify the attack vector used.

Exploit


An exploit is defined as "a piece of software, a chunk of data, or a sequence of commands that takes advantage of a bug, glitch or vulnerability in order to cause unintended or unanticipated behavior to occur on computer software, hardware, or something electronic (usually computerized).." An exploit takes advantage of a weakness in a system to cause a desirable activity on that system for the attacker. Exploits can target vulnerabilities in either operating systems, applications, or the people using the system. In accordance with Locard’s Exchange Principle, when the exploit comes in contact with the system containing the weakness remnants are left by the attacker. Identifying these exploit artifacts left on a system are one piece of the puzzle for identifying the attack vector.

Payload


A payload is defined (in security) as “the cargo of a data transmission.” A payload is the desirable activity on a system for the attacker that was caused by an exploit taking advantage of a weakness. In accordance with Locard’s Exchange Principle, when the payload comes in contact with the system remnants are left by the attacker. Identifying these payload artifacts left on a system are another piece of the puzzle for identifying the attack vector.

Delivery Mechanism


A delivery mechanism is defined as “a method by which malicious software places a payload into a target computer or computer network.” The delivery mechanism is what delivers the exploit and/or payload to the system to enable the desirable activity to occur on the system for the attacker. Similar to the exploit and payload, when the delivery mechanisms come in contact with the system remnants are left by the attacker. Identifying these delivery mechanisms artifacts left on a system are the last piece of the puzzle for identifying the attack vector.

Exploring the Compromise Root Cause Analysis Model


When an attacker goes after another system; the exchange leaves artifacts of the attack on the systems involved. These artifacts are identified during an investigation and grouped together based on their temporal proximity to one another. Root cause analysis identifies the attack vector used by determining what of the identified artifacts are related to the exploit, payload, and delivery mechanism(s). The Compromise Root Cause Analysis Model is a methodical process for organizing information and identified artifacts during an investigation to make it easier to answer the question of how did a compromise occur. The model is a not a replacement for any existing models; rather it’s a complimentary model to help discover information related to a system compromise. The Compromise Root Cause Analysis Model organizes the artifacts left on a network and/or system after being attacked into the following categories: source, delivery mechanism, exploit, payload, and indicators. The relationship between the categories are shown below.



Source


At the core of the model is the source of the attack. The source is where the attack originated from. Attacks can originate from outside or within an organization’s network; it all depends on who the attacker is. An external source is anything residing outside the control of an organization or person. An example is attacks against a web application coming from the Internet. Attacks can also be internal, which is within the network and under the control of an organization or person. An example is an employee who is stealing data from a company file server.

The artifacts left behind by the attacker on the system is used to determine where the attack came from. For example, if the attack originated from the Internet then the data left on the systems indicate this. Firewall logs, web application logs, proxy server logs, authentication logs, and email logs all will point to the attacker’s location outside of the organization’s network.

Delivery Mechanism


Proceeding to the next layer is the first delivery mechanism. This is the mechanism used to send the exploit to the system. The mechanism used is dependent on the attacker’s location. Attackers external to the organization may use avenues such as email, network services (i.e. HTTP, SSH, FTP, etc..), or removable media. Attackers internal to the organization may use avenues such as physical access or file sharing protocols.

The artifacts left behind by the attacker on the system is used to determine how they sent the exploit to the system. Where and what the artifacts are is solely dependent on the method used. If the method was HTTP then either web proxy, web browser histories, or web application logs will contain the remnants from the attacker. If the method was email then the email gateway logs, client email storage file, or user activity involving email will contain the remnants from the attacker.

Exploit


Continuing outward to the next layer is the exploit. The exploit is what was sent to take advantage of a vulnerability. As mentioned previously, vulnerabilities can be present in a range of items: from operating systems to applications to databases to network services to the person using the computer.
When vulnerabilities are exploited it leaves specific artifacts on the system and these artifacts can identify the weakness targeted by the attacker. Where and what the artifacts are is solely dependent on what weakness is targeted. The Applying the Model section illustrates this artifact for one vulnerability.

Delivery Mechanism


The next layer is the second delivery mechanism. A successful exploit may result in a payload being sent to the system. This is what the outer delivery mechanism is for. If the payload has to be sent to then there may be artifacts showing this activity. This is the one layer that may not always be present. There are times when the payload is bundled with the exploit or the payload just provides access to the system. Similar to the exploit, where and what the artifacts are present solely dependent on what the exploit was.

Payload


The next layer outlines the desired end result in any attack; to deliver a payload or malicious outcome to the system. The payload can include a number of actions ranging from unauthorized access to denial of service to remote code execution to escalation of privileges. The payload artifacts left behind will be dependent on what action was taken.

Indicators


The last layer in the model is the indicators layer. The layer is not only where the information and artifacts about how the attack was detected would go but it also encompasses all of the artifacts showing the post compromise activity. The reason for organizing all the other remnants left by the attacker into this layer is to make it easier to identify the attack vector artifacts (exploit, payload, and delivery mechanisms.) This results in the layer being broad since it contains all of the post compromise artifacts such as downloading files, malware executing, network traversal, or data exfiltration.

The Model is Cyclical


The Compromise Root Cause Analysis Model is a way to organize information and artifacts to make it easier to answer questions about an attack. More specifically to answer: how and when did the compromise occur? Information or artifacts about the compromise are discovered by completing examination steps against any relevant systems involved with the attack. The model is cyclical; as each new system is discovered the model is used to determine how the system was compromised. This ongoing process continues until each system involved with an attack is examined to confirm if it truly was a part of the attack.

To illustrate, take the hypothetical scenario of an IDS alert indicating an organization’s employee laptop is infected with malware. The IDS signature that flagged the network traffic is shown below (signature was obtained from the Emerging Threats emerging-botcc.rules.) As can be seen in the rule, the laptop was flagged for visiting an IP address associated with the Zeus Trojan.

 
The network packet captured in the IDS alert indicates the employee is a remote user connected through the organization’s VPN. The network diagram below shows the organization’s network layout and where this employee’s laptop is located.


The investigation into the employee’s laptop - remotely over the network – located the Zeus Trojan on the laptop. The examination continued by doing root cause analysis to determine how the laptop became infected in the first place. The employee was surfing the Internet prior to connecting to the organization’s network through the VPN. A drive-by attack successfully compromised the laptop when the employee visited the organization’s website. The IDS alerted on the infection once the laptop connected through the VPN. The investigation now uncovered another system involved with the attack (organization’s web server) and its location is shown below.


The organization’s main website is compromised and serving malware to its visitors. The investigation continues by moving to the compromised web server. The Root Cause Analysis Model is applied to the server to determine how it became compromised in the first place. The answer was an attacker found the webserver was running an outdated Joomla plug-in and exploited it. The attacker eventually leveraged the compromised web server to deliver malware to its visitors.

In this hypothetical scenario, the Compromise Root Cause Analysis Model was initially applied to a compromised laptop. The source of the attack pointed to another system under the control of the organization. The investigation continued to the newly discovered system by applying the Compromise Root Cause Analysis Model against it. The attack vector pointed to an attacker from the Internet so at this point all of the systems involved in the attack have been investigated and the root cause identified. If there were more systems involved then the cyclical process continues until all systems are investigated. The Compromise Root Cause Analysis Model enabled the attack vector for each system to be determined and the incident information discovered can then be further organized using other models. For example, the overall attack can be described using the Lockheed Martin's Cyber Kill Chain model.

Applying the Compromise Root Cause Analysis Model


The Compromise Root Cause Analysis Model is a way to organize information and artifacts to make it easier to answer questions about a compromise. The model can be applied to systems to either confirm how they were compromised or to determine if they were compromised. The article Malware Root Cause Analysis goes in to detail about how to apply the model for a malicious code incident involving a single system. However, the model is not limited to only malicious code incidents. It can be applied to any type of security incident including: unauthorized access, denial of service, malicious network traffic, phishing, and compromised user accounts. To demonstrate the model’s versatility, it will be applied to a hypothetical security incident using data from a published article. The incident is for a compromised Coldfusion webserver as described in the An Eye on Forensics's article A Cold Day in E-Commerce - Guest Post. The data referenced below either was influenced/borrowed from either the previously mentioned article, the Coldfusion for Pentesters presentation, or made up to appear realistic.

Webserver Compromise


An IDS alert flags some suspicious network traffic for an external system trying to connect to an organization’s Coldfusion web server located in their DMZ. The organization monitors for access attempts to the Coldfusion administrator web panel including access to features such as scheduling tasks. The external system triggered the IDS signature shown below because it accessed the Coldfusion’s scheduleedit located at hxxp://www.fake_site.com/CFIDE/administrator/scheduler/scheduleedit.cfm on an established session.


The reason the IDS alert is concerning is because what accessing scheduleedit means. One method an attacker can use to upload code on to a compromised Coldfusion server is by leveraging the scheduled tasks. The attacker can schedule a task, point it to their program’s location on a different server, and then have the task save it locally to the Coldfusion server for them to use (see page 85 in this presentation.) Accessing the interface to edit scheduled tasks is reflected by “scheduleedit” appearing in the URL. The IDS alert is triaged to determine if the Coldfusion server was successfully compromised and if an attacker was uploading anything to the server using the scheduled tasks feature.

The Coldfusion instance is running on a Windows 2008 server with IIS and its IP address is 192.168.0.1. The IIS log was reviewed for the time in question to see the activity around the time the IDS alert triggered.

2015-03-10 22:09:00 192.168.0.1 GET /CFIDE/Administrator/scheduler/scheduletasks.cfm - 80 – X.X.X.X fake-useragent  200 0 0 5353

2015-03-10 22:09:10 192.168.0.1 GET /CFIDE/Administrator/scheduler/scheduleedit.cfm submit=Schedule+New+Task 80 - X.X.X.X fake-useragent 200 0 0 5432

2015-03-10 22:09:15 192.168.0.1 GET /CFIDE/Administrator/scheduler/scheduletasks.cfm runtask=z&timeout=0 80 – X.X.X.X fake-useragent 200 0 0 1000

2015-03-10 22:11:15 192.168.0.1 GET /CFIDE/shell.cfm - 80 – X.X.X.X fake-useragent 200 0 0 432

The IIS logs showed the activity that tripped the IDS sensor occurred at 2015-03-10 22:09:10 when the external system with IP address X.X.X.X scheduled a new task successfully. Notice the 200 HTTP status code indicating the request completed successfully. This single entry answers one of the questions. The attacker did compromise the Coldfusion server and has administrator rights to the Coldfusion instance because they were able to access the schedule tasks area within the administrator panel. The next log entry shows the scheduled task named “z” executed at 2015-03-10 22:09:15 and shortly thereafter the attacker accessed a file named “shell.cfm”. Applying the Root Cause Analysis Model to this incident results in this activity along with the IDS alert being organized into the indicators layer. The activity is post compromise activity and the model is being used to identify the attack vector. The investigation continues to see what remnants the attacker left in the logs just prior to tripping the sensor while trying to upload their web shell.

The IIS log was reviewed to see what occurred prior to 2015-03-10 22:09:10 for the attackers IP address X.X.X.X. The few records are listed below:

2015-03-10 22:08:30 192.168.0.1 GET /CFIDE/adminapi/administrator.cfc method=login&adminpassword=&rdsPasswordAllowed=true 80 – X.X.X.X fake-useragent 200 0 0 432

2015-03-10 22:08:40 192.168.0.1 GET /CFIDE/administrator/images/icon.jpg 80 – X.X.X.X fake-useragent 200 0 0 432

The prior activity shows the attacker requesting a strange URL followed by successfully accessing the icon.jpg image file. Searching on the strange URL reveals it’s an Adobe ColdFusion Administrative Login Bypass exploit and when successful it provides access to the admin panel. This remnant is organized into the exploit layer. The payload of this exploit is direct access to the admin panel. There is no delivery mechanism for the payload. When the admin panel is accessed certain files are loaded such as images. In this scenario one of the images loaded by default is the file icon.jpg. This remnant indicates the attacker successfully accessed the admin panel so it means the exploit worked and the payload was admin access. The access to the icon.jpg file is organized into the payload layer. At this point the following layers in the Root Cause Analysis have been completed: indicators, payload, deliver mechanism, and exploit. The remaining layers are the delivery mechanism for the exploit and source. The attacker used a tool or web browser to attack the server so the delivery mechanism for the exploit is HTTP and the source of the attack is somewhere from the Internet.

The Compromise Root Cause Analysis Model was applied to the hypothetical web compromise security incident and it made it easier to review the remnants left by the attacker to identify the attack vector they used.

Root Cause Analysis Is Easier with a Methodical Process


The Compromise Root Cause Analysis Model is a cyclical methodical process one can use to perform root cause analysis. The model is way to organize information and artifacts discovered during investigations for each system involved in the attack. The model is a repeatable investigation process enabling the questions of how and when did the compromise occur to be answered.



References

Carvey, H. (2005). Locard's Exchange Principle in the Digital World. Retrieved from http://windowsir.blogspot.com/2005/01/locards-exchange-principle-in-digital.html

Harrell, C. (2010). Attack Vector Artifacts. Retrieved from http://journeyintoir.blogspot.com/2010/11/attack-vector-artifacts.html

Harrell, C. (2012). Compromise Root Cause Analysis Model. Retrieved from http://journeyintoir.blogspot.com/2012/06/compromise-root-cause-analysis-model.html

Harrell, C. (2012). Malware Root Cause Analysis. Retrieved from http://journeyintoir.blogspot.com/2012/07/malware-root-cause-analysis.html

Harrell, C. (2014). Malware Root Cause Analysis Dont Be a Bone Head Slide Deck. Retrieved from http://journeyintoir.blogspot.com/2014/06/malware-root-cause-analysis-dont-be.html

Hogfly. (2008). Footprints in the snow. Retrieved from http://forensicir.blogspot.com/2008/12/footprints-in-snow.html