
Week 12: Investigating and Analyzing Logs

Introduction

Every alert, every incident timeline, and every forensic report depends on one thing: logs. Packet captures show what happened on the wire, but logs record what happened at the system, application, and device level. They tell you who authenticated, what was allowed or denied, which URLs were requested, and when services started or stopped. A packet capture might reveal that a TCP connection was established between two hosts on port 22, but the SSH authentication log on the target tells you that someone tried the root password 327 times in two minutes and eventually got in.

For a network forensic analyst, the ability to open a raw log file and extract meaningful indicators is a non-negotiable skill. Logs exist everywhere in a production environment: endpoints generate authentication records, proxies record every outbound web request, firewalls log every policy decision, DNS servers record every query, and DHCP servers track every IP assignment. Each of these sources uses a different format, a different timestamp convention, and a different set of fields. An analyst who cannot read these formats is limited to what automated tools surface for them.

This chapter introduces the foundational concepts of log analysis for network forensics. You will examine three distinct log sources that represent the most common formats encountered in production environments: SSH authentication logs in syslog format, web proxy logs in Squid native format, and firewall logs in Fortinet key-value format. For each log type, the chapter breaks down the anatomy of individual entries, explains the timestamp format, identifies the forensically significant fields, and walks through parsing techniques using command-line tools.

This chapter builds directly on three earlier topics. Week 7 introduced web attack techniques (XSS, SQL injection, directory fuzzing) at the packet level; the proxy and firewall logs in this chapter show what those attacks look like from the defender's perspective. Week 10 covered IDS/IPS detection methodologies and alert output; the firewall IPS entries in this chapter are exactly the kind of alert data that triggers an investigation. Week 11 examined C2 frameworks including Metasploit; the proxy log section reveals what a Meterpreter tunneling attempt looks like in Squid access records.

Learning Objectives

By the end of this chapter, you should be able to:

  1. Explain the role of log data in network forensic investigations and how logs complement packet capture analysis.
  2. Identify and differentiate between common log formats (syslog, Squid native, Fortinet key-value) by examining raw log entries.
  3. Parse SSH authentication logs to identify brute-force attack patterns, including rapid authentication failures, parallel connections, and successful compromises.
  4. Analyze Squid web proxy logs to detect suspicious outbound activity, including denied connections on non-standard ports and internal server requests.
  5. Interpret Fortinet firewall logs across multiple event subtypes (traffic, VPN, IPS, authentication, UTM) and extract forensically relevant indicators.

12.1 Foundations of Log Analysis for Network Forensics

Before examining specific log types, it helps to understand what logs are and how they relate to the packet-level analysis you have been doing throughout this course. Packets and logs are complementary evidence sources. Packets capture the raw data exchange between hosts: every byte, every header, every payload. Logs capture the decisions that systems made about that traffic and the events that occurred as a result: authentication succeeded, connection denied, file downloaded, signature matched.

Consider a simple example. A firewall log entry might show that a connection from an external IP to an internal web server on port 80 was allowed by policy 49 and that the IPS engine detected a ZmEu vulnerability scanner signature in the traffic. That single log entry tells you the firewall's decision (allow), the detection result (vulnerability scanner), and the policy that governed it. But it does not tell you what the scanner actually sent. For that, you need the packet capture. Effective investigations require pivoting between both data sources.

Where Logs Come From

In a typical enterprise network, log sources fall into four categories:

  • Endpoint logs include system authentication records (auth.log on Linux, Security Event Log on Windows), application logs, and service-specific logs. These record what happened on individual hosts.
  • Network device logs come from firewalls, routers, switches, and load balancers. These record traffic decisions, routing events, and device health at the network infrastructure layer.
  • Server logs are generated by web servers (Apache, Nginx), proxy servers (Squid), DNS servers, DHCP servers, and database servers. These record application-level transactions.
  • Security appliance logs come from IDS/IPS engines, web application firewalls (WAFs), and SIEM platforms. These aggregate, correlate, and alert on activity from multiple sources.

Each category produces logs in different formats. This chapter covers three of the most common format families: syslog, native application formats, and key-value structured logs. Week 13 adds a fourth, Zeek tab-separated logs, and introduces cross-source correlation techniques.

Log Format Families

Format                 | Structure                            | Timestamp Convention                          | Example Source
---------------------- | ------------------------------------ | --------------------------------------------- | ----------------------------------------------
Syslog (BSD, RFC 3164) | Space-delimited, positional fields   | MMM DD HH:MM:SS (no year, no timezone)        | SSH auth.log, Linux system logs
Native Application     | Application-specific layout          | Unix epoch with milliseconds                  | Squid access.log
Key-Value              | field=value pairs, self-documenting  | Varies (often date=YYYY-MM-DD time=HH:MM:SS)  | Fortinet, Palo Alto, many enterprise appliances
Zeek TSV               | Tab-separated, header-defined columns| Unix epoch with microseconds                  | dns.log, dhcp.log, conn.log

The timestamp column in this table is worth studying. Each format represents time differently, and correlating events across sources requires converting all timestamps to a common format (typically UTC). A mismatch of even a few minutes can cause you to miss the connection between two related events. Each section in this chapter covers its log type's timestamp format in detail.

Introduction to Command-Line Log Parsing

Throughout this chapter, you will use four core command-line tools to extract data from log files. These are the same tools that working analysts use daily in SOC environments and incident response engagements.

Terminal output showing four core command-line log parsing tools: grep for filtering lines by pattern, awk for extracting fields by position, sort and uniq for frequency counting, and wc -l for line counting.
Figure 12.1: Core command-line tools for log analysis. These four commands form the foundation of manual log parsing.

grep filters lines that match a pattern. When you run grep 'Failed password' auth.log, grep returns only the lines containing that exact string, allowing you to isolate relevant entries from a file that may contain thousands of unrelated records.

awk extracts specific fields by position. Most log formats use a consistent delimiter (space, tab, or custom character), and awk can pull individual fields by number. Running awk '{print $1, $2, $3}' auth.log extracts the first three fields (the timestamp in syslog format) from each line.

sort and uniq work together to produce frequency counts. The pipeline sort | uniq -c | sort -rn takes a list of values, groups identical entries, counts them, and sorts by count in descending order. This is the standard technique for answering questions like "which IP address generated the most failed login attempts?"

wc -l counts lines. When piped after a grep filter, it answers "how many times did this event occur?" This provides volume context that helps distinguish a handful of mistyped passwords from a sustained brute-force attack.
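
As a minimal sketch, the four tools chain together end to end. The log lines below are synthetic entries modeled on the auth.log format examined in Section 12.2, not the course sample file:

```shell
# Synthetic auth.log-style lines (illustrative; not the course sample file).
printf '%s\n' \
  'Mar 24 11:59:15 kali sshd[27294]: Failed password for root from 192.168.153.130 port 53030 ssh2' \
  'Mar 24 11:59:15 kali sshd[27296]: Failed password for root from 192.168.153.130 port 53034 ssh2' \
  'Mar 24 11:59:16 kali sshd[27300]: Accepted password for root from 192.168.153.130 port 53116 ssh2' \
  > /tmp/sample_auth.log

# grep + wc -l: how many failures occurred?
grep 'Failed password' /tmp/sample_auth.log | wc -l        # -> 2

# grep + awk + sort | uniq -c | sort -rn: failures per source IP.
grep 'Failed password' /tmp/sample_auth.log \
  | awk '{print $11}' | sort | uniq -c | sort -rn          # -> 2 192.168.153.130
```

The same pipeline scales unchanged from three lines to three million; only the input file grows.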

Analyst Perspective

In a real SOC, you will rarely have a single perfect log source. Most investigations require correlating 3-5 different log types to build a complete picture. This chapter teaches you to read each source individually. Week 13 ties them together into a unified correlation methodology.


12.2 SSH Authentication Logs

SSH authentication logs are among the first log sources an analyst examines when investigating unauthorized access to Linux and Unix systems. These logs are generated by the PAM (Pluggable Authentication Modules) framework and the SSH daemon, and they are written to /var/log/auth.log on Debian-based systems (including Ubuntu and Kali) or /var/log/secure on Red Hat-based systems (including CentOS and Fedora). They use the standard syslog format.

Anatomy of a Syslog Entry

Every entry in auth.log follows the syslog structure. Understanding this structure is the first step toward extracting useful data from any syslog-formatted file.

Annotated syslog entry from auth.log with color-coded fields: timestamp (Mar 24 11:59:15), hostname (kali), process and PID (sshd[27294]), and the free-text message body describing an authentication failure.
Figure 12.2: Anatomy of a syslog entry from auth.log. Each field is color-coded by forensic function.

The format breaks down as follows:

  • Timestamp (e.g., Mar 24 11:59:15): The date and time the event was recorded, in the format Month Day Hour:Minute:Second. Syslog timestamps do not include the year or timezone, a significant forensic limitation discussed below.
  • Hostname (e.g., kali): The system that generated the log entry. In centralized logging environments, this field identifies which host the event occurred on.
  • Process[PID] (e.g., sshd[27294]): The service name and process ID. The PID is useful for tracking a specific connection across multiple related log entries.
  • Message: The free-text body describing the event. The content and format of this field vary by service and event type.

Warning

Syslog timestamps in auth.log do not include the year or timezone. If your investigation spans a year boundary (December to January) or involves systems in different time zones, the raw timestamps alone are insufficient for accurate timeline construction. Always verify the system clock configuration and timezone setting as part of your evidence collection. On a live Linux system, timedatectl displays the current timezone and NTP synchronization status.
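
Once the year and timezone have been established during evidence collection, the bare syslog timestamp can be anchored to a real point in time. A minimal sketch with GNU date; the year 2019 and a UTC system clock are assumptions supplied here for illustration:

```shell
# "Mar 24 11:59:15" from auth.log carries no year and no timezone. Rebuild an
# unambiguous date string using values confirmed during evidence collection.
# Assumed for this sketch: year 2019, system clock set to UTC.
epoch=$(date -u -d '2019-03-24 11:59:15' +%s)
echo "Mar 24 11:59:15 (assumed 2019, UTC) = epoch $epoch"
# -> Mar 24 11:59:15 (assumed 2019, UTC) = epoch 1553428755
```

With every source reduced to a UTC epoch, cross-log comparison becomes simple integer arithmetic.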

Key Log Messages for SSH Forensics

The SSH daemon produces several distinct message types, each carrying different forensic significance. The following table serves as a reference for the most commonly encountered messages.

Log Message                                             | Meaning                                  | Forensic Significance
------------------------------------------------------- | ---------------------------------------- | ----------------------------------------------------------------------
pam_unix(sshd:auth): authentication failure             | PAM rejected the credentials             | First failure indicator; includes rhost (source IP) and target user
Failed password for [user] from [IP] port [port] ssh2   | SSH-level password rejection             | Includes port number for connection tracking; high volume indicates brute force
Accepted password for [user] from [IP] port [port] ssh2 | Successful authentication                | Critical: if preceded by brute-force attempts, indicates compromise
Received disconnect from [IP]: Bye Bye [preauth]        | Connection closed before auth completed  | Scanner or tool probe; client disconnected without finishing the handshake
session opened for user [user] by (uid=0)               | New interactive session established      | Confirms active session; verify whether preceded by legitimate or forced auth

Identifying Brute-Force Patterns

The course resources include an SSH authentication log (ssh_auth.log, 718 lines) captured during an active brute-force attack. Opening this file reveals a pattern that is unmistakable once you know what to look for.

Raw SSH authentication log entries showing rapid-fire Failed password messages from source IP 192.168.153.130 targeting the root account, with sequential source port numbers and identical timestamps indicating an automated brute-force attack.
Figure 12.3: Raw entries from ssh_auth.log showing a brute-force attack in progress. Note the identical timestamp, single source IP, and sequential port numbers across all entries.

Four indicators distinguish an automated brute-force attack from legitimate failed logins:

  1. High-frequency failures from a single source IP. The log shows dozens of authentication failure and Failed password entries from 192.168.153.130, all arriving within the same second. Legitimate users mistype passwords occasionally; they do not generate 16 failures per second.

  2. Sequential or near-sequential source ports. The source ports in the log entries (53030, 53034, 53036, 53038, 53040, 53042...) increment in a pattern consistent with a tool opening many parallel connections simultaneously. Each TCP connection uses a different ephemeral source port, and the operating system assigns them sequentially.

  3. Single target account. Every attempt in this log targets the root account exclusively. This is a targeted brute-force attack, not a password-spraying attack (which would cycle through multiple usernames). Attacking root is a common strategy because a successful root compromise provides immediate full system control.

  4. Inhuman speed. Sixteen parallel connections per second, each trying a different password, is beyond what any human could produce manually. This pattern is consistent with automated tools such as Hydra, Medusa, or Ncrack.
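
Indicator 4 can be quantified directly: bucket the failures by second and source IP, then look for counts no human could produce. A sketch against synthetic lines (the field positions follow the syslog layout shown in Figure 12.2):

```shell
# Synthetic burst: three failures in the same second from one IP (illustrative).
printf '%s\n' \
  'Mar 24 11:59:15 kali sshd[101]: Failed password for root from 192.168.153.130 port 53030 ssh2' \
  'Mar 24 11:59:15 kali sshd[102]: Failed password for root from 192.168.153.130 port 53034 ssh2' \
  'Mar 24 11:59:15 kali sshd[103]: Failed password for root from 192.168.153.130 port 53036 ssh2' \
  'Mar 24 12:14:02 kali sshd[188]: Failed password for alice from 10.0.0.7 port 40112 ssh2' \
  > /tmp/burst_auth.log

# $3 = HH:MM:SS, $11 = source IP. High per-second counts flag automation.
grep 'Failed password' /tmp/burst_auth.log \
  | awk '{print $3, $11}' | sort | uniq -c | sort -rn
# ->  3 11:59:15 192.168.153.130
#     1 12:14:02 10.0.0.7
```

Any bucket with more than a handful of failures in a single second warrants a closer look.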

Parsing the SSH Log

With the indicators identified visually, the next step is using command-line tools to extract quantitative data from the full 718-line log.

Terminal session demonstrating four grep and awk commands against ssh_auth.log: counting failed attempts per source IP (327 from 192.168.153.130), checking for successful authentication, listing targeted usernames (root only), and extracting the attack time window.
Figure 12.4: Command-line parsing of ssh_auth.log. Each command extracts a different investigative data point from the raw log.

The parsing exercise demonstrates four investigative questions and the commands that answer them:

How many failed attempts came from each source IP? The command grep 'Failed password' ssh_auth.log | awk '{print $11}' | sort | uniq -c | sort -rn extracts the source IP field (field 11 in the "Failed password" message format), counts occurrences, and sorts by frequency. The result shows 327 failed attempts from 192.168.153.130.

Did any attempt succeed? The command grep 'Accepted password' ssh_auth.log searches for successful authentication. In this sample, the result reveals that one attempt succeeded: Accepted password for root from 192.168.153.130. This is the most critical finding. The brute-force attack was successful.

Which accounts were targeted? The command grep 'Failed password' ssh_auth.log | awk '{print $9}' | sort -u extracts the username field and deduplicates. Only root was targeted.

What was the attack time window? Extracting the first and last timestamps from the failed password entries shows the attack spanned from 11:59:15 to 12:01:47.
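
Field numbers like $9 and $11 are the most common source of parsing mistakes, so verify them on one line before trusting bulk counts. A quick sanity check against an illustrative line:

```shell
# Print every field of one representative line with its position number.
line='Mar 24 11:59:15 kali sshd[27294]: Failed password for root from 192.168.153.130 port 53030 ssh2'
echo "$line" | awk '{for (i = 1; i <= NF; i++) print i, $i}'
# Confirms $9 = root (username) and $11 = 192.168.153.130 (source IP).
# Caution: "Failed password for invalid user bob from ..." inserts two extra
# words, shifting the username to $11 and the source IP to $13.
```

Running this check takes seconds and prevents an entire frequency analysis built on the wrong column.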

Putting It Together

Consider a scenario: you are an analyst who receives a SIEM alert for "Excessive SSH Authentication Failures" on a Linux server. Your first step is to pull the auth.log from the affected system. You run the frequency analysis and confirm 327 failures from a single IP in roughly two and a half minutes. You check for successful authentication and find one. The brute-force worked.

Your next investigative steps would be to check the DHCP log (Week 13) to determine which physical device held that IP at the time of the attack, search the DNS log for any domain resolution activity from the attacker IP, and examine the firewall log to see what other connections were allowed or denied from that source. If packet captures are available, you would filter for the attacker IP in Wireshark and examine the SSH traffic. The companion pcap files (ssh_adjusted.pcap and ssh_cap.pcap) in the course resources provide this exact opportunity.


12.3 Web Proxy Logs (Squid)

Web proxy servers sit between internal users and the internet, intercepting and logging every outbound HTTP and HTTPS request. In an enterprise network, web proxies serve multiple purposes: content filtering, bandwidth management, caching, and most importantly for forensic analysts, visibility into outbound traffic. Squid is one of the most widely deployed open-source proxy servers, and its access.log uses a native format that differs fundamentally from syslog.

Anatomy of a Squid Access Log Entry

Squid logs each proxied request as a single line containing ten space-delimited fields. The format is consistent but the timestamp is not human-readable without conversion.

Annotated Squid proxy access.log entry with labeled fields: Unix epoch timestamp with milliseconds, request duration, client IP address, result code and HTTP status (TCP_DENIED/403), response size in bytes, HTTP method (CONNECT), and destination URL with port number.
Figure 12.5: Anatomy of a Squid access.log entry. The epoch timestamp, result code, and destination URL are the primary fields for forensic analysis.

The key fields are:

  • Epoch Timestamp (e.g., 1553457580.022): A Unix epoch timestamp with millisecond precision, counting seconds since January 1, 1970 (UTC). To convert to a human-readable UTC value on Linux, use date -u -d @1553457580. In Python, use datetime.fromtimestamp(1553457580, tz=timezone.utc); without an explicit tzinfo argument, fromtimestamp returns local time, which can silently skew a timeline. The value 1553457580 converts to March 24, 2019 at 19:59:40 UTC.
  • Duration (e.g., 0): The time in milliseconds that the request took to process. A value of 0 indicates the proxy denied the request before establishing a connection.
  • Client IP (e.g., 192.168.153.1): The internal host that made the request.
  • Result Code/HTTP Status (e.g., TCP_DENIED/403): The Squid action combined with the HTTP status code. This is the most forensically significant field in many investigations.
  • Bytes (e.g., 3974): The response size.
  • Method (e.g., CONNECT): The HTTP method. CONNECT requests create tunnels for HTTPS traffic; GET and POST requests handle standard HTTP.
  • URL (e.g., www.google.com:4444): The destination. For CONNECT requests, this includes the port number.
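
A sketch of pulling the key fields from a single entry. The line is illustrative, with the trailing ident, hierarchy, and content-type fields filled in to complete the ten-field layout; the conversion uses GNU date:

```shell
# One illustrative access.log line in Squid native format.
line='1553457580.022      0 192.168.153.1 TCP_DENIED/403 3974 CONNECT www.google.com:4444 - HIER_NONE/- text/html'

# Field 1: epoch timestamp. Strip the milliseconds, then render in UTC.
epoch=$(echo "$line" | awk '{split($1, t, "."); print t[1]}')
date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S UTC'   # -> 2019-03-24 19:59:40 UTC

# Fields 4, 6, 7: result code, method, destination.
echo "$line" | awk '{print $4, $6, $7}'         # -> TCP_DENIED/403 CONNECT www.google.com:4444
```

Three fields and one timestamp conversion are enough to triage most proxy entries.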

HTTP Methods in Proxy Context

Two HTTP methods appear most frequently in proxy logs, and their forensic implications differ:

CONNECT is used when a client requests an encrypted tunnel through the proxy, typically for HTTPS traffic. The proxy establishes a TCP connection to the destination and relays encrypted bytes without inspecting the content. Forensically, a CONNECT request to a standard HTTPS port (443) is normal. A CONNECT request to a non-standard port (4444, 8080, 8443) warrants immediate investigation.

GET is used for standard unencrypted HTTP requests. The proxy can inspect, log, and cache the full content. GET requests to internal IP addresses through an external-facing proxy are abnormal and may indicate tunneling or misconfiguration.

Detecting Suspicious Proxy Activity

The course resources include a Squid access log (prox_access.log, 2,071 lines) containing normal browsing traffic mixed with indicators of malicious activity.

Squid proxy log entries highlighting two suspicious patterns: TCP_DENIED/403 responses for CONNECT requests to www.google.com on port 4444 indicating a blocked Meterpreter C2 tunneling attempt, and TCP_MISS_ABORTED/000 responses for GET requests to internal IP 192.168.153.146 on port 8080 suggesting internal scanning activity.
Figure 12.6: Suspicious entries in prox_access.log. TCP_DENIED entries for port 4444 indicate a blocked Meterpreter tunneling attempt. TCP_MISS_ABORTED entries to an internal IP suggest scanning or misconfigured malware.

Two suspicious patterns are visible in this log:

Pattern 1: CONNECT to port 4444. Multiple entries show TCP_DENIED/403 responses for CONNECT www.google.com:4444. Port 4444 is the default listener port for Metasploit's Meterpreter reverse_tcp handler (covered in Week 11, Section 11.2). The attacker used a legitimate hostname (www.google.com) as the CONNECT destination, likely attempting to evade casual inspection or keyword-based filtering. The proxy's policy correctly denied the connection because port 4444 is not a standard web port, but the attempt itself is evidence that an internal host was compromised and actively trying to establish a C2 channel.

Pattern 2: Requests to an internal IP through the proxy. Entries show TCP_MISS_ABORTED/000 for GET requests to http://192.168.153.146:8080/. Internal-to-internal traffic should not transit an external web proxy. These requests indicate either a misconfigured application on the compromised host or malware attempting to reach an internal staging server through the proxy.
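
Pattern 1 generalizes into a reusable hunt: list every CONNECT tunnel whose destination port is not 443. A sketch against two synthetic lines; the second models a normal HTTPS tunnel, and the peer IP is an illustrative documentation address:

```shell
# Two synthetic access.log lines: one blocked tunnel to 4444, one normal 443 tunnel.
printf '%s\n' \
  '1553457580.022 0 192.168.153.1 TCP_DENIED/403 3974 CONNECT www.google.com:4444 - HIER_NONE/- text/html' \
  '1553457601.310 842 192.168.153.20 TCP_TUNNEL/200 11204 CONNECT www.example.com:443 - HIER_DIRECT/203.0.113.10 -' \
  > /tmp/sample_access.log

# $3 = client, $4 = result code, $6 = method, $7 = host:port.
# Split the destination on ":" and flag any CONNECT whose last piece is not 443.
awk '$6 == "CONNECT" { n = split($7, a, ":"); if (a[n] != "443") print $3, $4, $7 }' \
  /tmp/sample_access.log
# -> 192.168.153.1 TCP_DENIED/403 www.google.com:4444
```

Note that the filter surfaces the port 4444 attempt even though the hostname itself (www.google.com) looks benign.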

Squid Result Code    | Meaning                                   | Forensic Significance
-------------------- | ----------------------------------------- | --------------------------------------------------------------------
TCP_TUNNEL/200       | CONNECT tunnel established successfully   | Normal for HTTPS on port 443; suspicious on non-standard ports
TCP_DENIED/403       | Proxy policy blocked the request          | Connection blocked, but the attempt is logged evidence
TCP_MISS/200         | Cache miss, fetched directly from origin  | Normal first-time request
TCP_MISS_ABORTED/000 | Connection started but failed or aborted  | Target unavailable or connection reset; repeated attempts suggest scanning
TCP_HIT/200          | Served from proxy cache                   | Normal cached response

Parsing the Proxy Log

Terminal session demonstrating Squid proxy log parsing: converting epoch timestamps to human-readable dates using awk strftime, filtering TCP_DENIED entries to isolate blocked requests, and running frequency analysis on destination URLs to identify top-contacted domains including the suspicious port 4444 and internal IP targets.
Figure 12.7: Command-line parsing of prox_access.log. Epoch-to-readable conversion, denied destination extraction, and frequency analysis.

Key parsing techniques for Squid logs include epoch timestamp conversion using awk's strftime function, filtering by result code to isolate denied or aborted requests, and frequency analysis of destination URLs to identify top-contacted domains. In this sample, the frequency analysis reveals that www.google.com:4444 and http://192.168.153.146:8080/ both appear in the top results alongside legitimate traffic to Google, ad networks, and CDN providers.
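
The strftime conversion mentioned above is specific to GNU awk (gawk); mawk and busybox awk do not provide it. A sketch, with the third strftime argument set to 1 so the timestamp is rendered in UTC rather than local time:

```shell
# In-stream epoch conversion with gawk's strftime (GNU awk only; the input
# line is illustrative).
echo '1553457580.022 0 192.168.153.1 TCP_DENIED/403 3974 CONNECT www.google.com:4444 - HIER_NONE/- text/html' \
  | gawk '{ print strftime("%Y-%m-%d %H:%M:%S", $1, 1), $3, $4, $7 }'
# -> 2019-03-24 19:59:40 192.168.153.1 TCP_DENIED/403 www.google.com:4444
```

Converting in-stream like this keeps the readable timestamp attached to each record, which makes the output directly usable in a timeline.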

Analyst Perspective

Proxy logs are one of the first places to look when investigating data exfiltration or C2 activity. The proxy sees all outbound web traffic from the internal network. If an attacker's malware uses HTTP or HTTPS for command and control, and most modern malware does, the proxy log records the connection attempts even if the proxy blocks them. A TCP_DENIED entry is not just a policy enforcement record; it is evidence of malicious intent.

Warning

Epoch timestamps require conversion to be human-readable. When correlating Squid logs with syslog-format logs (like SSH auth.log), you must convert both to the same timezone and format before comparing event times. Use UTC as your common reference to avoid timezone-related errors.

Putting It Together

An analyst investigating a suspected C2 compromise receives a tip from the IDS team that a host may be communicating with an external controller. The analyst opens the proxy log and filters for TCP_DENIED entries. The port 4444 CONNECT attempts immediately stand out. The analyst traces the client IP (192.168.153.1), checks the timestamp window, and begins correlating with other log sources: was this IP involved in the SSH brute-force from Section 12.2? What does the firewall log show for this host? The investigation is expanding from a single log source into a multi-source timeline, which is exactly the methodology covered in Week 13.


12.4 Firewall Logs (Fortinet)

Enterprise firewalls generate some of the richest log data available to analysts. Unlike endpoint logs that record events on a single system, or proxy logs that capture web traffic, firewalls sit at enforcement points where all traffic between network zones must pass. A single firewall log file can contain traffic decisions, VPN negotiations, IPS detections, admin access records, web filtering actions, and antivirus events, all in one source.

Fortinet FortiGate devices use a key-value log format where each field is explicitly labeled (e.g., srcip=192.168.100.72, dstport=53, status=accept). This makes Fortinet logs more self-documenting than positional formats like syslog, but it also means they require different parsing techniques.

Anatomy of a Fortinet Log Entry

The course resources include a Fortinet log (fortinet_log.log, 20 entries) spanning multiple event types. Despite the small sample size, these 20 entries cover traffic forwards, admin logins, VPN negotiations, IPS alerts, web filter actions, and UTM virus events, demonstrating the breadth of data a single firewall produces.

Annotated Fortinet FortiGate log entry showing key-value pair structure with six highlighted field groups: date and time timestamps, type and subtype classification, source and destination IP addresses, action status, policy ID governing the connection, and sent and received byte counts.
Figure 12.8: Anatomy of a Fortinet FortiGate log entry. Key-value pairs are self-documenting, with each field explicitly labeled. The six highlighted field groups are the most forensically significant.

Each Fortinet log entry begins with a syslog header (for forwarded logs) followed by a series of key=value pairs. The most forensically relevant fields include:

  • date= / time=: The device-local timestamp when the event occurred. This is the authoritative timestamp for timeline construction (see "Dual Timestamps" below).
  • type= / subtype=: The log category. type=traffic records network forwarding decisions. type=event records system, VPN, and user events. type=utm records unified threat management detections (IPS, antivirus, web filter).
  • srcip= / dstip=: Source and destination IP addresses.
  • status=: The action taken: accept, close, deny, or detected.
  • policyid=: The firewall rule number that governed this connection. This field allows you to audit whether the correct policy was applied.
  • sentbyte= / rcvdbyte=: The data volume transferred in each direction. Unusually large values on outbound connections can indicate data exfiltration.

Fortinet Log Types and Subtypes

The following table catalogs the log types present in the sample file and their key forensic fields.

Type / Subtype    | Description                                      | Key Forensic Fields
----------------- | ------------------------------------------------ | --------------------------------------------------------
traffic / forward | Network traffic forwarding decisions             | srcip, dstip, service, policyid, status, sentbyte, rcvdbyte
event / system    | Device-level events (admin, memory, maintenance) | user, ui (source IP), action, status, reason
event / vpn       | VPN tunnel negotiations                          | vpntunnel, remip, locip, esptransform, espauth, status
event / user      | User authentication events (FSSO, guest)         | user, group, action, status, reason
utm / ips         | IPS signature-based detections                   | attackname, severity, srcip, dstip, sensor, attackid
utm / webfilter   | URL filtering decisions                          | hostname, catdesc, profile, status, user
utm / virus       | Antivirus events (oversized file, malware)       | msg, status, service, url, user

Reading an IPS Alert Entry

The IPS entries in the sample log are particularly valuable for forensic analysis. They show the firewall's intrusion prevention engine detecting an active vulnerability scan.

Fortinet UTM IPS log entry with type=utm subtype=ips level=alert showing detection of a ZmEu Vulnerability Scanner from external IP 61.19.246.69 targeting internal web server 192.168.100.55 on port 80, with attackname, severity, and sensor fields highlighted.
Figure 12.9: Fortinet IPS alert detecting a ZmEu vulnerability scanner. The attackname, source IP, and destination IP are highlighted.

The entry shows type=utm subtype=ips eventtype=signature level=alert with attackname="ZmEu.Vulnerability.Scanner". ZmEu is a Romanian-origin automated scanner that probes web servers for phpMyAdmin installations, MySQL administrative interfaces, and other common web application weaknesses. The external IP 61.19.246.69 was scanning the internal web server 192.168.100.55 on port 80. The status field reads detected, meaning the IPS identified the traffic but did not block it (the IPS was likely running in detection-only mode for this policy). The sensor field identifies which IPS profile was active, and the attackid field (30024) provides a lookup reference for the Fortinet threat encyclopedia.

VPN and Admin Event Entries

Beyond traffic and IPS logs, Fortinet records VPN tunnel negotiations and administrative access events that are essential for security auditing.

Two Fortinet log entries: a VPN IPSec phase 2 negotiation showing tunnel name Etek_Soc_2 with legacy ESP_3DES encryption and HMAC_MD5 authentication, and an admin login event showing user ringo successfully authenticating via HTTPS from IP 200.31.77.163.
Figure 12.10: Fortinet VPN negotiation and admin login events. The VPN entry reveals encryption strength (3DES + MD5, both legacy), and the admin entry provides a full audit trail of management access.

VPN entries record IPSec phase 2 negotiations, including the tunnel name (vpntunnel="Etek_Soc_2"), the negotiation status, and the cryptographic parameters. In this sample, the VPN uses esptransform=ESP_3DES and espauth="HMAC_MD5", both of which are legacy algorithms that modern security standards consider weak. An analyst reviewing VPN logs should flag any tunnel using 3DES or MD5, as these indicate configurations that have not been updated to current cryptographic standards.

Admin login entries record who accessed the firewall management interface, from where, and whether the attempt succeeded. The sample shows user ringo logging in via HTTPS from 200.31.77.163 with status=success. During an incident investigation, reviewing admin login logs helps rule out (or confirm) the possibility that an attacker gained access to the firewall itself.

The Dual Timestamp Problem

Fortinet logs forwarded via syslog contain two timestamps, and they do not always match.

Side-by-side comparison of two timestamps in a forwarded Fortinet log entry: the syslog header timestamp of May 14 06:25:07 when the relay server received the message, and the device timestamp of date=2014-05-14 time=06:25:42 when the event occurred on the FortiGate, showing a 35-second relay delay.
Figure 12.11: The dual timestamp discrepancy in forwarded Fortinet logs. The syslog header and device timestamps differ by 35 seconds due to network relay delay.

The syslog header timestamp (May 14 06:25:07) reflects when the syslog relay server received the message. The device timestamp in the key-value fields (date=2014-05-14 time=06:25:42) reflects when the event actually occurred on the FortiGate appliance. In this sample, the discrepancy is 35 seconds, the time it took for the log message to travel from the firewall to the syslog collector.

During timeline construction, always use the device timestamp (date= and time= fields) as authoritative. The syslog header timestamp includes variable network relay delay and is less precise. Document the typical delta between the two timestamps in your case notes so that your timeline methodology is transparent and defensible.
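
The delta itself is easy to compute once both stamps are rebuilt as epochs. A sketch with GNU date, borrowing the year from the device's date= field for the syslog header stamp (an assumption that must be re-validated if the investigation spans a year boundary):

```shell
# Syslog header stamp: "May 14 06:25:07" (year assumed from the device fields).
header_epoch=$(date -u -d '2014-05-14 06:25:07' +%s)
# Device stamp: date=2014-05-14 time=06:25:42 (authoritative).
device_epoch=$(date -u -d '2014-05-14 06:25:42' +%s)

echo "relay delay: $((device_epoch - header_epoch)) seconds"   # -> relay delay: 35 seconds
```

Computing this delta across many entries reveals whether the relay delay is stable (normal) or drifting (a clock problem worth documenting).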

Parsing Key-Value Logs

Fortinet's key-value format requires different parsing techniques than positional formats. Instead of extracting fields by position number (as with awk and syslog), you extract fields by name:

  • Extract a specific field: grep -oP 'srcip=[^ ]+' fortinet_log.log uses Perl-compatible regex to extract the srcip field and its value from each line.
  • Filter by log type: grep 'type=utm subtype=ips' fortinet_log.log isolates IPS detection entries.
  • Extract multiple fields: grep -oP '(srcip|dstip|attackname)=[^ ]+' fortinet_log.log pulls source IP, destination IP, and attack name from each matching line.

For more complex analysis involving multiple fields per line, Python or a dedicated log parser is often more practical than chaining grep and awk commands. Many organizations ingest Fortinet logs into a SIEM platform (Splunk, Elastic Security, Microsoft Sentinel) where the key-value format is automatically parsed into searchable fields.
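
A sketch of field extraction against a single illustrative entry, with its fields modeled on the IPS alert discussed above. grep -oP requires GNU grep built with PCRE support; the awk loop is a portable fallback:

```shell
# Illustrative key-value entry modeled on the IPS alert above.
line='date=2014-05-14 time=06:25:42 type=utm subtype=ips level=alert srcip=61.19.246.69 dstip=192.168.100.55 attackname="ZmEu.Vulnerability.Scanner" attackid=30024 status=detected'

# GNU grep with PCRE: extract fields by name, regardless of position.
echo "$line" | grep -oP '(srcip|dstip|attackname)=[^ ]+'

# Portable fallback: walk the whitespace-separated tokens in awk.
echo "$line" | awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^(srcip|dstip|attackname)=/) print $i }'
# Both print:
#   srcip=61.19.246.69
#   dstip=192.168.100.55
#   attackname="ZmEu.Vulnerability.Scanner"
```

Because fields are matched by name rather than position, the same command works across every Fortinet subtype, which is exactly what positional awk cannot do.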

Putting It Together

An analyst receives an IPS alert for a ZmEu vulnerability scanner detection. Starting from the IPS log entry, the analyst identifies the attacker IP (61.19.246.69) and the target (192.168.100.55). The next step is to query the traffic logs for any connections from that attacker IP that were allowed by policy, since the IPS detected the scan but may not have blocked it. The analyst also reviews the VPN log to confirm no unauthorized tunnel was established from the attacker's network, and checks the admin login events to verify that no management access occurred from an unexpected source. If the traffic logs show successful HTTP connections from the attacker IP to the internal web server, the analyst pivots to the web server's access log (covered in Week 13, Section 13.1) to determine exactly what the attacker accessed.
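The traffic-log pivot in that walkthrough can be sketched as a simple filter: keep only type=traffic lines where the source is the attacker IP and the action is accept. The sample lines below are hypothetical, constructed to match the IPs from the scenario:

```python
ATTACKER = "61.19.246.69"

# Hypothetical raw log lines mirroring the scenario above.
lines = [
    'date=2014-05-14 time=06:25:42 type=utm subtype=ips srcip=61.19.246.69 '
    'dstip=192.168.100.55 attackname="ZmEu.Vulnerability.Scanner"',
    'date=2014-05-14 time=06:25:50 type=traffic srcip=61.19.246.69 '
    'dstip=192.168.100.55 dstport=80 action=accept policyid=7',
    'date=2014-05-14 time=06:25:55 type=traffic srcip=61.19.246.69 '
    'dstip=192.168.100.55 dstport=443 action=deny policyid=2',
]

# Connections from the attacker that policy allowed -- these drive the
# pivot to the web server's own access log.
allowed = [l for l in lines
           if f'srcip={ATTACKER}' in l
           and 'type=traffic' in l
           and 'action=accept' in l]
for l in allowed:
    print(l)
```

Here only the port-80 connection survives the filter, and its policyid tells you which rule permitted it, the same field the Analyst Perspective below highlights for policy auditing.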

Analyst Perspective

Firewall logs often contain the most complete picture of network-level activity because firewalls sit at enforcement points where all traffic must pass. The policyid field is especially valuable: it tells you exactly which security rule governed the connection, allowing you to audit whether the policy is performing as intended. If an attacker's traffic was allowed by a policy that should have blocked it, that is a finding that needs to be escalated to the firewall administration team.


Log Format Comparison

The following table summarizes the three log formats covered in this chapter and previews the two additional formats introduced in Week 13.

Attribute           | SSH auth.log            | Squid proxy               | Fortinet firewall             | Apache CLF (Week 13)      | Zeek TSV (Week 13)
Format              | Syslog (positional)     | Squid native (positional) | Key-value pairs               | Combined Log Format       | Tab-separated
Timestamp           | MMM DD HH:MM:SS         | Unix epoch (s.ms)         | date=YYYY-MM-DD time=HH:MM:SS | DD/Mon/YYYY:HH:MM:SS +TZ  | Unix epoch (s.us)
Year included?      | No                      | Yes (via epoch)           | Yes                           | Yes                       | Yes (via epoch)
Timezone included?  | No                      | UTC (epoch)               | Device-local                  | Yes (offset)              | UTC (epoch)
Parse with          | grep + awk (positional) | awk + epoch conversion    | grep -oP for named fields     | grep + awk (positional)   | awk with tab delimiter

This table is designed as a quick-reference resource for investigations. When you encounter an unfamiliar log file, comparing its structure against these format families will help you identify the format and select the right parsing approach.
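The structural cues in the table can be turned into a rough identification heuristic. The sketch below is an illustrative first-line classifier, not a definitive detector; real files deserve inspection of more than one line, and the test patterns are assumptions based on the formats described in this chapter:

```python
import re

def guess_format(line):
    """Rough guess at a log format family from a single line."""
    if re.match(r'\d{9,10}\.\d{3}\s', line):
        return 'squid'      # leading Unix epoch with ms precision
    if '\t' in line and re.match(r'\d{9,10}\.\d{6}', line):
        return 'zeek'       # tab-separated, epoch with us precision
    if 'date=' in line and 'time=' in line:
        return 'fortinet'   # key-value pairs with device timestamp
    if re.search(r'\[\d{2}/\w{3}/\d{4}:', line):
        return 'apache'     # [DD/Mon/YYYY:HH:MM:SS +TZ]
    if re.match(r'\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} ', line):
        return 'syslog'     # MMM DD HH:MM:SS, no year
    return 'unknown'

print(guess_format('May 14 06:25:07 host sshd[1234]: Failed password'))
print(guess_format('1400048707.312    245 10.0.0.8 TCP_DENIED/403 0 CONNECT'))
```

Note the ordering: the Fortinet check precedes the syslog check because a forwarded Fortinet entry carries a syslog header in front of its key-value fields.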


Chapter Summary

  • Logs and packets are complementary evidence sources. Packets capture the raw data exchange on the wire; logs capture system-level decisions (allowed, denied, authenticated, failed) and application-level events. Effective forensic investigations require both.
  • Three log formats were covered: syslog (SSH auth.log), Squid native (proxy access.log), and Fortinet key-value (firewall log). Each uses a different structure, different timestamp convention, and different parsing approach.
  • SSH brute-force attacks produce four recognizable indicators in auth.log: high-frequency failures from a single source IP, sequential source ports, single target account, and inhuman speed. Always check for Accepted password entries after identifying a brute-force pattern, as a successful authentication indicates the attack worked.
  • Proxy logs reveal C2 tunneling attempts and suspicious outbound traffic. CONNECT requests on non-standard ports (especially 4444, the Metasploit default) and requests to internal IPs through the proxy are strong indicators of compromise. TCP_DENIED entries can be evidence of attempted malicious activity, not merely routine policy enforcement records: a blocked request still documents what the client tried to reach.
  • Firewall logs provide the broadest visibility across log types. A single Fortinet log contains traffic decisions, VPN negotiations, IPS detections, admin access records, and web filter actions. The key-value format is self-documenting but requires named-field extraction rather than positional parsing.
  • Timestamp normalization is a recurring challenge. Syslog timestamps lack year and timezone. Squid uses epoch time. Fortinet logs contain two timestamps with a relay delay between them. Week 13 introduces a formal timestamp normalization methodology for cross-source correlation.
  • Week 13 builds on these individual log reading skills by adding web server access logs, DNS query logs, and DHCP lease logs, then demonstrating how to correlate events across all six sources to construct unified incident timelines. The companion lab introduces NetworkMiner as an automated analysis tool.