Finding the needles in a log file haystack


Now select Create Log Preset and choose the type of file you will be examining (Event Logs, IPTables, Apache Log, IIS Log). If your log doesn't fit into these categories, choose Simple or Advanced.

If you use Simple or Advanced, you will have to fiddle around with the formatting so the log is read properly, but the process is fairly self-explanatory. The really cool feature that you can use once your file is in the database is the "Group By" button, which asks PyFLAG to group the messages by a column of your choice. Grouping gives you invaluable insight into the log you are processing, because it helps you pinpoint outlying data points quickly.
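To illustrate why grouping by a column surfaces outliers, here is a minimal Python sketch of the same idea. It is not PyFLAG's implementation. The function name group_by_column, the field names, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def group_by_column(rows, column):
    """Count how many rows share each value in the given column."""
    return Counter(row[column] for row in rows)


# Hypothetical parsed log rows; the field names and values are illustrative only.
rows = [
    {"src_ip": "10.0.0.5", "status": "200"},
    {"src_ip": "10.0.0.5", "status": "200"},
    {"src_ip": "10.0.0.5", "status": "200"},
    {"src_ip": "10.0.0.9", "status": "200"},
    {"src_ip": "192.0.2.77", "status": "404"},
]

counts = group_by_column(rows, "src_ip")

# Sort rarest values first: outliers tend to sit at either end of the list.
rarest_first = sorted(counts.items(), key=lambda pair: pair[1])
for value, count in rarest_first:
    print(f"{count:>5}  {value}")
```

Values that appear only once or twice, or far more often than everything else, are usually the ones worth a closer look.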

Another tool that helps narrow down the search is Splunk. The latest version makes it easy to import files in numerous formats and quickly presents the data for more complex searches.

For a quick search, Excel 2007 and later can open large files for review (up to 1,048,576 rows per worksheet). When you find something of interest, simply use the "Find All" option to pull all of the instances of that nugget out of the file at once. Then you can click on any result in the list to jump directly to the cell containing the full entry.

Another quick analysis tool for log files is grep. Grep is a command line tool, typically found in a Linux/Unix environment, that searches text for lines matching a pattern. If you don't have it available on Windows, you can install Cygwin, which provides a Linux/Unix-style command line environment on Windows. Grep is especially useful on large log files that don't open easily in other tools.
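If grep is not at hand, the core of what it does can be approximated in a few lines of Python. This is a rough sketch and not a replacement for grep. The function name grep_file is made up for this example, and the file name and search phrase in the usage note are placeholders.

```python
import re


def grep_file(path, pattern, ignore_case=False):
    """Yield (line_number, line) for each line that matches the regex pattern."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    # Reading one line at a time keeps memory use flat, even for very large files.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                yield number, line.rstrip("\n")
```

Calling grep_file("access.log", r"failed login", ignore_case=True) and looping over the results does roughly what grep -n -i "failed login" access.log does on the command line. Because the file is streamed line by line, this works on logs that are too large to open in an editor or spreadsheet.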