https://cyberdefenders.org/labs/42
When we first unzip the archive, we get a large number of files. The challenge description says there are only five files (although apache2 is a folder containing three files, so seven in total); however, I found some answers are not in those seven files, so we need to consider all the files in the archive.
Ideally we would write some tools and scripts to help parse the data, or even import it into a SIEM, which would make this a lot quicker, but for my own learning I want to look through the logs manually.
auth.log for logins
daemon.log is mainly dhclient, mysqld, ntpd, collectd
www-access.log and www-media.log make lots of references to WordPress
kern.log
www-media.log
As part of my analysis, I need to extract some information from the logs. Without using a script or a tool, the easiest way is to use regex find-and-replace to reorder and filter the data. It's possible these patterns will lose some data, but they worked for me!
I love regex.
^(.{15}).{7}(\w+)(:|\[) → $2 | $1 | $3
Mar → 03, Apr → 04, May → 05
102,164 lines → 102,164 lines
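For anyone who would rather script it than run the replacements by hand, here is a rough Python sketch of the same reordering. It is not what I actually used. The function name `reorder` and the sample log line are made up for illustration, and unlike a plain find-and-replace it only swaps the month name inside the timestamp, not anywhere else on the line.

```python
import re

# Same pattern as the find-and-replace above: a 15-character timestamp,
# 7 skipped characters, the process name, then ':' or '['.
LINE_RE = re.compile(r"^(.{15}).{7}(\w+)(:|\[)")

# Month names that appear in the replacements above.
MONTHS = {"Mar": "03", "Apr": "04", "May": "05"}


def reorder(line: str) -> str:
    """Put the process name first; lines that do not match are returned unchanged."""

    def repl(match: re.Match) -> str:
        stamp = match.group(1)
        # Swap the leading month name for its number where we know it.
        stamp = MONTHS.get(stamp[:3], stamp[:3]) + stamp[3:]
        return f"{match.group(2)} | {stamp} | {match.group(3)}"

    return LINE_RE.sub(repl, line, count=1)


if __name__ == "__main__":
    # Invented sample line, not taken from the challenge logs.
    sample = "Mar 16 08:12:04 host1 sshd[4659]: Accepted password for user1"
    print(reorder(sample))
```

Because every input line produces exactly one output line, the line count before and after should stay the same, which is the same sanity check as the line counts above.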