Using logs for forensics after a data breach

08.11.2010
Despite the best precautions, it is impossible to protect your network against every attack. When the inevitable happens, your log data can be critical for identifying the cause of the breach and collecting evidence for use in the legal system. That is, if your logs were properly configured before the breach happened.

Log files are generated by virtually all data processing equipment every time an activity takes place. Each entry is an electronic fingerprint with an added element: a timestamp, so we know when that fingerprint was created and can reconstruct what happened and in what order. Analyzing logs is the primary way of doing forensics, and properly managed logs can also be used as evidence in a court of law for prosecution purposes.

When you enable logs you can typically specify: 1) the severity level, which essentially defines how severe an event needs to be to generate a log message, and 2) the level of detail captured in the log message, the so-called verbose level.

There are eight standard severity levels, from high-severity level 0 (called emergency, in which only emergency and extremely critical events are logged) to low-severity level 7 (called debug, in which almost any minute event is logged).
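
For reference, these eight severities follow the standard syslog numbering (RFC 5424), where a lower number means a more severe event and a configured threshold captures everything at or above that severity. The short Python sketch below illustrates the convention; the threshold value is just an example.

```python
# The eight standard syslog severity levels (RFC 5424), from most to least severe.
SEVERITY = {
    0: "Emergency",      # system is unusable
    1: "Alert",          # action must be taken immediately
    2: "Critical",
    3: "Error",
    4: "Warning",
    5: "Notice",         # normal but significant condition
    6: "Informational",
    7: "Debug",          # very detailed, use with caution
}

def should_log(event_severity: int, threshold: int) -> bool:
    """An event is logged when its severity is at or above the threshold
    (numerically lower means more severe)."""
    return event_severity <= threshold

# Example: with the threshold at 4 (Warning), an Error (3) is logged, a Notice (5) is not.
print(should_log(3, 4), should_log(5, 4))  # True False
```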

Verbose levels are less standardized and vary by vendor, make and model of equipment.

The tradeoffs are obvious:

High severity, low verbose level:

* Few messages; each message is short.

* Little storage requirement, but you won't know much about what happened.

Low severity, low verbose level:

* Many messages; each message is short.

* Medium storage requirement and you will know when something happens, but you won't know much about what happened.

High severity, high verbose level:

* Few messages; each message is long.

* Medium storage requirement and you will be able to tell a lot about critical events, but there will be many events into which you have no visibility at all.

Low severity, high verbose level:

* Many messages; each message is long.

* High storage requirements but you'll know a lot about any event happening.

The right approach is to apply a risk-management method to your logs. That is, you identify the set of systems from which it is important for you to keep logs.

Indeed, it is not necessary to have a one-size-fits-all approach to severity/verbose level; instead, you want to crank up the volume and verbosity of logs for important systems and dial them down for non-important systems.

We recommend creating four groups of systems, each corresponding to one of the severity/verbose categories described above, and applying a different level of logging to each group.

Remember the rule of thumb: in case of doubt, go ahead and log it, because you never know when you'll need a log. It is tempting to use debug-level logging; however, it typically generates so much information that it will slow systems down, so use it with caution. A typical setting is severity level 6 -- informational -- which generates lots of information without a performance penalty.
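
As a minimal sketch of that rule of thumb -- assuming an application that uses Python's standard logging module, with a hypothetical file name and messages -- logging.INFO plays the role of severity 6 (informational) and logging.DEBUG the role of severity 7:

```python
import logging

# Capture informational events and above; debug-level detail is suppressed.
logging.basicConfig(
    filename="myapp.log",   # hypothetical log destination
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)

log = logging.getLogger("myapp")
log.info("user 'alice' logged in from 10.0.0.5")   # captured
log.debug("session token cache refreshed")          # suppressed at INFO level
```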

Once you know the level of severity and verbose level of the logs you want, you are ready to answer the second question: "Where do I keep the logs?"

This question is important because some systems allow you to either store the logs locally or send them in real-time to a remote server. In fact, one of the first things the bad guys will do when attacking a system is try to tamper with the local log file to hide their tracks or plant fake evidence to send you running the wrong way.

Again, there are pros and cons for each of these methods:

Local storage:

* No need for the logs to be transported, but introduces operational complexity to properly manage rights and permissions on the directories containing the logs.

* Window of opportunity for bad guys to manipulate the logs in case a system gets hacked -- logs cannot necessarily be trusted.

* Operational complexity when doing forensics, because you have to scour system after system, each with its own local logs.

UDP syslog to send logs to central repository:

* Unreliable transport mechanism with no guarantee of delivery but no need to manage each local system's log storage directories.

* Little possibility for bad guys to manipulate logs as they are being sent in real-time.

* Centralization of all logs providing a unique window into separate sources of logs.

Dedicated agent to send logs to central repository:

* Operational cost to deploy agents to every single source server from which we need to collect logs.

* Reliable transport mechanism.

* Little possibility for bad guys to manipulate logs as they are being sent in real-time.

* Centralization of all logs providing a unique window into separate sources of logs.

The right approach:  Use a risk-management method to assess which makes the most sense for your environment.  In high-security environments, you may want to deploy agents in each system you want to collect from, although the operational cost could be high if your scope contains many systems.

The middle ground, and probably the easiest method to get you up and running, is to send logs via the syslog protocol to a remote server. Sending them to a dedicated log management solution is even better. This will ensure all your logs are centralized, which will help you get the most value out of them.
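
As a minimal sketch of the syslog route -- assuming a central collector listening on UDP port 514 at the hypothetical address 10.0.0.50 -- Python's standard SysLogHandler can forward events as they happen; keep in mind that UDP offers no delivery guarantee:

```python
import logging
from logging.handlers import SysLogHandler

# Forward logs over UDP syslog to a central collector (address is hypothetical).
# UDP is fire-and-forget: simple to deploy, but delivery is not guaranteed.
handler = SysLogHandler(address=("10.0.0.50", 514))
handler.setFormatter(logging.Formatter("myapp: %(levelname)s %(message)s"))

log = logging.getLogger("myapp")
log.setLevel(logging.INFO)
log.addHandler(handler)

log.warning("3 failed logins for user 'root' from 192.0.2.7")
```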

Another decision point is how long to store the logs.  This needs to be carefully considered, especially as there are legal requirements you may be subject to or industry-specific rules that apply to you.

For example, PCI-DSS requires you to store PCI-scoped logs for a year. Country-specific rules may require you to delete logs after a certain period of time in order to respect privacy.

The typical tradeoffs:

Keep the data for a long time:

* Higher likelihood that you'll have the log required to solve the crime.

* Higher storage requirement and performance impact to manipulate lots of data.

Keep the data for a short time:

* Maybe you'll have the log if you need it, but maybe not.

* Easy on your storage, and better performance.

The right approach: As I've mentioned before, the right approach is to apply a risk-management method. You first need to identify the legal and industry constraints that apply, which will give you a minimum/maximum range, and then you need to understand how far back in time you want to go for your forensics. Again, there is no one-size-fits-all solution for this.

As a rule of thumb, keep the logs as long as possible while respecting legal and industry requirements and respecting privacy issues.
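
Purely as an illustration of a retention job for file-based logs -- the directory and the 365-day figure below are assumptions to be replaced by your own legal and industry constraints -- the pruning logic can be as simple as:

```python
import time
from pathlib import Path

RETENTION_DAYS = 365                    # assumption -- set from your legal/industry constraints
ARCHIVE_DIR = Path("/var/log/archive")  # hypothetical archive of compressed daily log files

cutoff = time.time() - RETENTION_DAYS * 86400
for logfile in ARCHIVE_DIR.glob("*.log.gz"):
    if logfile.stat().st_mtime < cutoff:
        logfile.unlink()                # or move to cheaper cold storage instead of deleting
```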

You need to trust the logs that you are using for forensics and in case of prosecution you also need to prove to a court of law that the logs are genuine and that nobody has tampered with them. How?

There are actually two types of integrity you need to prove.

* Integrity of each raw log. This will prove no log has been tampered with or manipulated.

* Integrity of the log sequence. This will prove no log has been added and no log has been deleted.

This is not easy to do.  For example, signing each log will guarantee log integrity but not the integrity of the log sequence. 
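
One common technique that covers both properties is a hash chain, in which each record's hash also covers the hash of the previous record, so tampering with, inserting or deleting any record breaks every later hash. The sketch below illustrates the idea with SHA-256; it is a sketch of the concept, not a complete solution -- in practice the head of the chain should itself be protected, for example by periodically signing or externally time-stamping the latest hash.

```python
import hashlib

def chain_logs(raw_logs, previous_hash="0" * 64):
    """Attach a SHA-256 hash to each log line that also covers the previous hash.
    Tampering with, inserting, or deleting any record breaks every later hash."""
    chained = []
    for line in raw_logs:
        digest = hashlib.sha256((previous_hash + line).encode("utf-8")).hexdigest()
        chained.append((line, digest))
        previous_hash = digest
    return chained

def verify_chain(chained, previous_hash="0" * 64):
    for line, digest in chained:
        expected = hashlib.sha256((previous_hash + line).encode("utf-8")).hexdigest()
        if expected != digest:
            return False
        previous_hash = digest
    return True

records = chain_logs(["login ok for alice", "sudo by bob", "firewall rule changed"])
print(verify_chain(records))   # True -- any modification would make this print False
```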

At this point you can rule out most homegrown log management solutions, and you can throw away most open source solutions; in fact, there are not many solutions that are capable of providing both of these.

One way of proving both log integrity as well as log sequence integrity is to store raw logs in flat files that are digitally signed or at least hashed using a strong hashing mechanism.  Pros and cons are as follows.

Store raw logs as is:

* Logs are available in original format for most flexibility in subsequent use.

* Difficult/impossible to guarantee their integrity.

* Storage requirement tracks the size and complexity of the raw logs.

Store normalized logs in database:

* Logs are available in "intelligent" form for easy reporting.

* Difficult/impossible to guarantee their integrity.

* Storage space wasted because of empty fields in database.

Store raw logs in signed flat files:

* Logs are available in original format for most flexibility in subsequent use.

* Integrity of each log and integrity of log sequence can be proven.

* Efficient storage with possibility to compress flat files.
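
As a sketch of the flat-file approach just described -- the paths are hypothetical, and a plain hash is the weakest variant; a real deployment would prefer a digital signature or an external time-stamping service on top of it:

```python
import gzip
import hashlib
from pathlib import Path

def archive_and_hash(raw_log_path, archive_dir):
    """Compress a raw log file and record a SHA-256 digest of the compressed archive
    alongside it, so later modification of the file can be detected."""
    src = Path(raw_log_path)
    dest = Path(archive_dir) / (src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        fout.write(fin.read())
    digest = hashlib.sha256(dest.read_bytes()).hexdigest()
    (dest.parent / (dest.name + ".sha256")).write_text(digest + "\n")
    return dest, digest

# Hypothetical usage:
# archive_and_hash("/var/log/myapp/2010-11-08.log", "/var/log/archive")
```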

The right approach: No matter what you use logs for, you need to ensure that you are working off of legitimate logs, that the logs you have stored have not changed since they were received, and that no log has been added or deleted. So you need to store them with some sort of proof of integrity.

When choosing between raw logs and normalized logs, make sure you understand what you want to do with them. If you want maximum flexibility, work off of raw logs and apply processing later. If you want logs that are immediately usable for reporting or correlation, normalized storage is fine. But once the logs are normalized, it can be difficult or even impossible to reconstruct the original message and prove its integrity.

Now you are in an ideal situation to perform forensics: you are working from a clear stream of data, with fresh, unaltered and pure information at your fingertips, and in case of misbehavior you have all the elements of information that will lead to the criminal.

If you have ever been under attack you can certainly understand the pressure to react fast.  And the first step is to understand what happened, who did it, how, what systems were affected and what needs to be done to stop the damage and prevent this from happening again.

Logs represent a gold mine for this task if you know how to leverage them, and if you have the proper tools to do so.  You know that the proof of the misbehavior is somewhere in there, somewhere mixed with billions of other logs, buried in terabytes or even petabytes of data.

The process of doing forensics on a log management solution is similar to using an Internet search engine. Sometimes you know exactly what you're looking for; other times, it's a trial-and-error process. Start with keywords and refine or modify them until you zoom in on the log or logs that explain what happened.
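
For flat-file logs, even a crude filter illustrates the workflow: start wide, then narrow by adding keywords. The sketch below is a stand-in for the search box of a real log management solution; the directory and keywords are hypothetical.

```python
from pathlib import Path

def search_logs(log_dir, keywords, must_contain_all=True):
    """Very simple keyword filter over flat log files -- a stand-in for the
    search box of a real log management solution."""
    for path in Path(log_dir).glob("*.log"):
        with open(path, errors="replace") as fh:
            for line in fh:
                hits = [k.lower() in line.lower() for k in keywords]
                if all(hits) if must_contain_all else any(hits):
                    yield path.name, line.rstrip()

# First pass: everything mentioning the compromised account.
# Refined pass: the same account, but only failed authentications.
for name, line in search_logs("/var/log/archive", ["alice", "authentication failure"]):
    print(name, line)
```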

Once you have zoomed in on the specific log or logs, you can follow the trail of the crime and understand how the breach spread from system to system, how and why the attack was successful, and which systems were affected. Each log becomes a piece of the puzzle as you answer questions such as: was the attack successful because security patches were missing, because passwords were sent in the clear while a system was in promiscuous mode, or because the firewall was misconfigured?

Since you have the trail of evidence, and you can prove that this evidence is clean thanks to the different integrity mechanisms addressed above, it will make it easier for you or for law enforcement agencies to prove the case in court if you decide to prosecute.

But don't wait for a crime before you think about your logs. Your forensics process will be excruciatingly painful if you have not switched on the logs, if they have been deleted, if they do not contain the right level of information, or if you can't rely on them. You may even end up in a situation where you know a crime was committed and who did it, but you can't prosecute or even involve HR because you have no formal evidence against the perpetrator.

The log management process is a critical part of your forensics posture, and it is important to select a tool to automate and facilitate the management of your logs.

Disclaimer: I am not a lawyer and this does not represent legal advice; always check with your local lawyer for legal matters.

Originally published in Network World's Wide Area Network section.