https://cyberdefenders.org/labs/42

When we first unzip the archive, we get a large number of files. The challenge description lists only five (apache2 is a folder containing three files, so seven in total). However, I found that some answers are not in those seven files, so we need to consider every file in the archive.

Ideally we would write some tools and scripts to help parse the data, or even import it into a SIEM, which would make things a lot quicker. For my own learning, though, I want to look through the logs manually.

Initial Analysis

Manipulating the Logs

As part of my analysis, I need to extract some information from the logs. Without using a script or a tool, the easiest way is to use regex find-and-replace to reorder and filter the data. It's possible these substitutions will lose some data, but they worked for me!

I love regex.

auth.log - sorted by command then time

  1. Move date to between command and PID: find ^(.{15}).{7}(\w+)(:|\[) and replace with $2 | $1 | $3
  2. Change month names to month numbers: Mar → 03, Apr → 04, May → 05
  3. Sort Ascending (built into VSCodium; you have to select the lines first)

102,164 lines → 102,164 lines
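
For anyone who would rather script it, the three steps above can be sketched in Python. This is only a rough equivalent, not what I actually ran. It assumes every line starts with a 15-character timestamp followed by 7 characters of hostname and padding, as the regex does. The function names and the sample lines in the usage note are made up for illustration. Unlike a global find-and-replace, it only converts the month inside the timestamp, and Python's plain string sort may order mixed-case lines differently from the editor's sort.

```python
import re

# Same pattern as step 1: 15-char timestamp, 7 chars skipped (assumed to be
# " hostname "), then the command name followed by ":" or "[".
LINE_RE = re.compile(r"^(.{15}).{7}(\w+)(:|\[)")

# Step 2: only the months handled in the manual replace.
MONTHS = {"Mar": "03", "Apr": "04", "May": "05"}


def reorder(line: str) -> str:
    """Return the line as 'command | timestamp | rest', with a numeric month."""
    def repl(match: re.Match) -> str:
        stamp = match.group(1)
        # Swap the leading month name for its number; leave unknown months alone.
        stamp = MONTHS.get(stamp[:3], stamp[:3]) + stamp[3:]
        return f"{match.group(2)} | {stamp} | {match.group(3)}"

    # Lines that do not match are returned unchanged, so the line count is kept.
    return LINE_RE.sub(repl, line, count=1)


def sort_by_command_then_time(lines):
    """Step 3: reorder every line, then sort the results as plain text."""
    return sorted(reorder(line) for line in lines)
```

Usage would look something like this, with invented sample lines:

```python
sample = [
    "Mar 16 08:12:04 app-1 sshd[4884]: Server listening on :: port 22.",
    "Apr 19 05:41:44 app-1 CRON[8950]: session opened for user root",
]
for line in sort_by_command_then_time(sample):
    print(line)
```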