Now that the holidays are out of the way, I can get back to posting. This post will be a two-part series. I tried to get it out in one part; however, time and commitments are not allowing it.
As the title suggests, this topic is on Rapid Response & Assessment (RR&A). When a suspect device comes your way, how do you respond to get the answers you need quickly? The two posts will discuss an approach that has evolved within my environment based upon the influences I am exposed to.
I hope you enjoy the read and I encourage feedback. I want to know how your organization deals with RR&A.
PART 1
Preface
Rapid Response & Assessment (RR&A) is exactly that: a quick response by IT Security personnel to assess a potential concern they have been alerted to. Get it right by implementing the correct program, and you will be in a good position to fend off the bad guys and minimize the potential damage that will undoubtedly unfold. Get it wrong and, well, no need to explain that outcome!
How an organization responds to suspect devices that appear on its radar depends on many influences: tools, resources (head count and money), talent, company culture, management support, Corporate Security and HR support, priorities, policies, processes, internal politics, significance in the global economy, regulations, and the sense of urgency that individuals within the organization place on Rapid Response & Assessment. Some or all of these may come into play. No two organizations are alike, and so goes their RR&A approach. My organization is no different. We execute our existing Rapid Response and Assessment procedure based upon the vast majority of the above-mentioned items. Is it perfect? No. Is it driven by a person with a passion for doing this stuff who is continuously tweaking and fine-tuning it, always trying to do the right thing for his employer? Absolutely.
In no particular order, let’s take a cursory look at the above-mentioned items.
RR&A – Recipe for Success
1. Tools – Do you own tools that can give you rapid unfettered access to remote hosts to perform assessments?
2. Resources – Do you have the people and money to operate?
3. Regulations – Are there any regulations setting minimum requirements for on-site, full-time Incident Response and Digital Forensic personnel?
4. Talent – Do you have the right people with the required skill sets?
5. Culture – What is the cultural attitude within the organization on this issue?
6. Management Support – Do the correct management within the organization support your efforts?
7. Priorities – Where does RR&A stack up on the priority list?
8. Policies – Do you have policies in place to guide your response?
9. Processes – Do you have tested & mature processes to execute RR&A?
10. Politics – Are there any intergroup politics inhibiting your RR&A program?
11. Significance – What is your company’s significance within the global economy?
12. Urgency – Does your organization have an inherent sense of urgency to deal with these matters?
13. Corporate Security & HR Support – This one is an absolute must; however, you need to have all your ducks lined up to leverage ongoing support for your program. Do you have support from the folks in Corporate Security and Human Resources when you send them reports that require them to issue a slap on the wrist because employee misconduct is the cause? After all, do you want to see your report detailing a virus outbreak caused by an employee policy violation disappear into a black hole on the HR Manager’s desk after you have spent many hours putting it together? You must partner with these folks and make them clearly understand what it is you are bringing to them. Your concerns and recommendations must emphasize the seriousness of the matter and lay out the potentially devastating consequences of the user’s actions. For example: this could have been much worse, and here’s why.
Any one of the above items can impact the level of success you enjoy with your RR&A program.
The purpose of this post is to detail the RR&A approach and methods used by my organization, and in particular my group. I hope that others who may be struggling in this area get some ideas on how to address RR&A in their own organizations, and that I get some feedback and suggestions on how to improve my own program. Even if this approach cannot be replicated due to specific constraints (see the Recipe for Success), there may be pieces of it that can be utilized. It boils down to sharing knowledge, ideas, and varying methods and approaches for achieving one goal: responding to an incident as quickly and efficiently as possible to determine whether there is a problem. I would like to hear how other organizations perform RR&A.
Shown below is a high-level view of my environment. The end node environment is made up of end user devices, such as desktops and laptops, and servers of various types. In this particular environment, Rapid Response & Assessment resides with the Forensic Response Team.
The Environment
30,000 Windows desktop devices (PCs/laptops) spread across multiple continents
4,000 Windows servers spread across multiple data centers
The Monitoring and Response Teams
Security Operations Center – First Alert, Triage & Incident Response Coordinators
Threat Management Team – NetFlow Analysis, Packet Analysis, Malware RE
Forensic Response Team – Host Assessment and Disk Forensics
The whole point of RR&A is to be in a confident position (see the Recipe for Success) with mature and tested procedures (test and trust your procedures) that can quickly confirm or deny whether a problem exists. You will need to determine very quickly whether a reported incident requires additional follow-up, or whether you can stand down because it was a false alarm. My group typically responds to devices where the SOC has detected an anomaly based upon output in the SIEM (Security Information & Event Management). To respond effectively and efficiently, you need the right tools deployed into the environment. For a large enterprise that is spread over multiple continents, an agent-based tool is best suited to give the responder instant access to the end node in question.
As the saying goes, there is more than one way (and tool) to skin a cat. The information here is one example of how a given organization performs RR&A. Again, see the Recipe for Success noted above for the factors that can influence your program and approach.
Tools & Procedures
Today’s response tools need to be as capable and versatile as the malware we are encountering. What does this mean? Well, the first thing that comes to mind is a tool that can perform remote RAM dumps and bring back the volatile data for analysis. Additionally, you need to be in a position to cherry-pick the files of interest that will be reviewed during your assessment. For example, during incident response you need to be able to perform a RAM dump and start processing it with a tool such as Mandiant’s Redline or Volatility, while at the same time collecting and processing other parts of the system activity. The particular tool used by my group to achieve the remote collection is EnCase Enterprise.
Analysis Tip: Use the 32-bit version of EnCase to acquire RAM dumps on 32-bit target devices.
At a high level, RR&A goes something like this: Alert, Respond, Pull Data, Analyze, and Decide.
Alert
The Alert will come from the SOC. They have seen something suspicious, deemed it a concern, and opened up an incident ticket.
Respond
Once the incident ticket hits us, RR&A personnel are engaged to act.
Pull Data
The device is identified and attached to with our enterprise tool, and a procedure is followed to start pulling volatile data and select files. The data of interest is listed below in the order it is acquired.
Step 1 – Attach to target machine with EnCase Enterprise
Step 2 – Perform a Sweep Enterprise (a very useful EnCase feature)
Step 3 – Perform a RAM dump
Step 4 – Collect Prefetch files
Step 5 – Collect the $MFT
Step 6 – Collect the $UsnJrnl file
Step 7 – Collect NTUSER.DAT and UsrClass.dat for the logged-on user
Step 8 – Collect SYSTEM, SOFTWARE, SAM, and SECURITY hives
Step 9 – Collect Application, Security, and System event logs
The collection of the above data does not actually take that long. The forensic platform design and the placement of equipment throughout the environment facilitate the expeditious collection of the data and its delivery to the examiner for analysis.
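To make the collection order easy to follow under pressure, it can be written down as a simple manifest. The Python sketch below is only an illustration and is not part of EnCase or of my group's tooling. The names `Step`, `win_path`, and `build_manifest` are hypothetical, the user name `jdoe` is made up, and the paths assume a default Windows Vista-or-later layout on the C: drive (older systems keep event logs elsewhere).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    action: str
    paths: tuple = ()  # empty for steps that are tool actions, not file pulls


def win_path(*parts: str) -> str:
    """Join path components with backslashes (no filesystem access)."""
    return "\\".join(parts)


def build_manifest(user: str, drive: str = "C:") -> list:
    """Return the nine collection steps in acquisition order.

    Assumes a default Windows Vista-or-later layout; 'user' is the
    logged-on account name supplied by the caller.
    """
    windows = win_path(drive, "Windows")
    config = win_path(windows, "System32", "config")
    logs = win_path(windows, "System32", "winevt", "Logs")
    profile = win_path(drive, "Users", user)
    return [
        Step(1, "Attach to target machine"),
        Step(2, "Run Sweep Enterprise"),
        Step(3, "Dump RAM"),
        Step(4, "Prefetch files", (win_path(windows, "Prefetch", "*.pf"),)),
        Step(5, "$MFT", (win_path(drive, "$MFT"),)),
        Step(6, "$UsnJrnl", (win_path(drive, "$Extend", "$UsnJrnl:$J"),)),
        Step(7, "User hives", (
            win_path(profile, "NTUSER.DAT"),
            win_path(profile, "AppData", "Local", "Microsoft", "Windows", "UsrClass.dat"),
        )),
        Step(8, "System hives", tuple(
            win_path(config, hive) for hive in ("SYSTEM", "SOFTWARE", "SAM", "SECURITY")
        )),
        Step(9, "Event logs", tuple(
            win_path(logs, name + ".evtx") for name in ("Application", "Security", "System")
        )),
    ]


if __name__ == "__main__":
    # Print the checklist for a hypothetical logged-on user.
    for step in build_manifest("jdoe"):
        print(f"Step {step.number}: {step.action}")
        for path in step.paths:
            print(f"    {path}")
```

Keeping the order in data rather than in someone's head means the same list can drive a checklist, a ticket template, or a collection script.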
Analyze
After pulling the data outlined above, we need to quickly parse it with our tools in an attempt to identify any notable concerns. Of note, the Sweep Enterprise feature built into EnCase is incredibly useful. By entering the IP address of the target machine into Sweep Enterprise and executing it, we gain visibility into the device activity on a number of levels. For example, it allows you to see running processes, network connections, open files, etc.
Decide
A decision is made on whether to pursue the device further. False alarms do happen, and this feedback allows the SOC to fine-tune their alerts.
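The five phases described above can be sketched as a small ordered workflow. This is a minimal Python illustration, not how any ticketing system or EnCase models it. `Phase`, `next_phase`, and `decide` are hypothetical names, and the decision rule (any notable finding means follow up) is a simplifying assumption.

```python
from enum import Enum


class Phase(Enum):
    ALERT = 1      # SOC opens an incident ticket
    RESPOND = 2    # RR&A personnel are engaged
    PULL_DATA = 3  # volatile data and select files are collected
    ANALYZE = 4    # collected data is parsed for notable concerns
    DECIDE = 5     # pursue further or stand down


def next_phase(phase):
    """Return the phase that follows, or None once DECIDE is reached."""
    members = list(Phase)  # Enum members iterate in definition order
    idx = members.index(phase)
    return members[idx + 1] if idx + 1 < len(members) else None


def decide(findings):
    """Assumed rule: any notable finding means follow up; none means stand down."""
    if findings:
        return "follow up"
    return "stand down (false alarm; feed back to the SOC for alert tuning)"


if __name__ == "__main__":
    phase = Phase.ALERT
    while phase is not None:
        print(phase.name)
        phase = next_phase(phase)
    print(decide([]))
```

In practice the Decide step involves judgment rather than a single rule, but writing the phases down this way makes it clear where a ticket sits at any moment.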
In part two of this post I will be discussing the actual procedures used to parse the collected data looking for indicators of compromise. The post will detail the tools and approach based upon the data points we want to analyze.