Flow Logs Deep Dive

VPC Flow Logs are an AWS feature that enables you to capture information about the IP traffic going to and from network interfaces in your VPC. These logs can help with many tasks, such as:

  • Application and network debugging
  • Debugging network ACLs, security groups, and routing
  • Reviewing network costs
  • Audit and compliance requirements

However, there are some key points to consider when working with flow logs. We work through a few examples highlighting these points and dive deep into the behavior of flow logs.

Flow Logs Review

Flow logs can operate at three separate levels within a VPC:

  • VPC – Monitors all activity within a VPC. Review the pricing before enabling this across VPCs with heavy network activity.
  • Subnet – Monitors a single subnet. A common use case is targeting specific subnets, such as database or DMZ subnets.
  • Network Interface – Monitors a single ENI.

Flow logs can be created in two formats and stored in CloudWatch Logs or S3:

  • Version 2 – the original default format. It is the only format available when using CloudWatch Logs as a target, and it can also be used when sending flows to S3.
  • Version 3 – the custom format. It allows users to select the fields and their order within the flow log, and is only available when sending flows to S3.

AWS Global Accelerator has an additional [flow log format](https://docs.aws.amazon.com/global-accelerator/latest/dg/monitoring-global-accelerator.flow-logs.html). These can only be sent to S3.

The default format contains the following fixed number of fields:

  • version, 2 for default, 3 for custom
  • account-id
  • interface-id – the ENI ID
  • srcaddr and dstaddr – the source and destination IPs
  • srcport and dstport – the source and destination ports, where applicable (TCP and UDP traffic)
  • protocol – the protocol number. 6 for TCP, 17 for UDP
  • packets and bytes – the number transferred during the flow
  • start and end times – the start and end timestamps of the flow window
  • action – ACCEPT or REJECT
  • log-status – OK, NODATA (no traffic during the capture window) or SKIPDATA (a capacity or AWS error occurred)

The custom format adds additional fields that we review with an example further on.
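As a sketch of how these fixed fields line up, a default-format record can be split positionally into named fields. The record below is illustrative, not captured output, and the field list reflects the default format described above.

```python
# Field names of the default (version 2) flow log format, in record order.
DEFAULT_FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log-status",
]

def parse_default_record(line):
    """Split a space-separated default-format record into named fields."""
    return dict(zip(DEFAULT_FIELDS, line.split()))

# Illustrative record: a short TCP (protocol 6) flow that was accepted.
record = parse_default_record(
    "2 123456789012 eni-0123456789abcdef0 10.0.87.10 10.0.87.192 "
    "49152 10000 6 8 591 1565898930 1565898985 ACCEPT OK"
)
print(record["protocol"], record["action"])  # 6 ACCEPT
```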

Flows vs. Sessions

When working with flow logs, it’s essential to understand the difference between a session and a flow. A flow does not map directly to a session; for example, a TCP session captured in a tool like Wireshark includes all packets moving in both directions, whereas a flow covers only a small section of a session and describes packets moving in one direction only.



Pricing varies per region and storage type. Storage costs now fall under the ‘vended’ pricing model described in the logs section of [CloudWatch pricing](https://aws.amazon.com/cloudwatch/pricing/).

Default Format Flow Logs

Example 1

Also known as version 2 format logs, this example has a VPC-level flow log configured to send to CloudWatch Logs.


To start, we’ll use netcat to create a single TCP connection between two EC2 instances in the same VPC.

Set up the server to listen on a port:

server$ nc -l 10000

Use the client to send a short message:

client$ echo "hello_flow_logs" | nc 10.0.87.192 10000

Confirm the server received the message:

server$ nc -l 10000
hello_flow_logs

Review the flow logs generated from the single connection.


These results can be confusing without prior knowledge of other flow-based monitoring such as NetFlow or IPFIX.

  • A small single TCP connection created four flows
  • They appear out of order compared to the natural network sequence
  • In this case, they are displayed 25 seconds apart

First, let’s review how a single TCP connection creates at least four flows. Recall flow logs collect flows from an ENI perspective across the selected VPC or subnet. The four flows are:

  1. A to B outbound from A’s ENI
  2. A to B inbound to B’s ENI
  3. B to A outbound from B’s ENI
  4. B to A inbound to A’s ENI

This is important to note when counting or graphing bytes or packets where source and destination are within the same VPC to avoid double counting. If we fix the ordering, the expected sequence would be:


How did the flows end up 25 seconds apart and out of order?

The 25 second gap requires looking more into timestamps and how flow logs are generated.

Flow logs have a capture window of up to 10 minutes, followed by processing and publishing. The total time of capture+processing+publishing is the aggregation period, which is up to 15 minutes. The capture windows and aggregation periods are per ENI and not synced at a VPC level. This is how the flows for the server ENI ended up recorded 25 seconds after the client flows.


The two timestamp fields within the flow log, start and end, can also be misunderstood. In the example above, they are recorded as:


These are the start and end of the capture window. They have the same values on both flows on each ENI as they were captured inside the same capture window.
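The double counting noted earlier can be corrected when totalling intra-VPC traffic. A minimal sketch, assuming (as in this example) that both copies of a one-directional flow carry the same counters; the records below are hypothetical:

```python
# Each one-directional flow between two ENIs in the monitored scope is
# recorded twice: once outbound from the sender's ENI and once inbound to
# the receiver's ENI. Keeping one record per 5-tuple avoids double counting.
def total_bytes_deduped(records):
    seen = {}
    for r in records:
        key = (r["srcaddr"], r["dstaddr"],
               r["srcport"], r["dstport"], r["protocol"])
        seen.setdefault(key, r)  # keep the first record per direction
    return sum(int(r["bytes"]) for r in seen.values())

flows = [  # the four hypothetical records of a single connection
    {"srcaddr": "A", "dstaddr": "B", "srcport": "49152", "dstport": "10000",
     "protocol": "6", "bytes": "591"},
    {"srcaddr": "A", "dstaddr": "B", "srcport": "49152", "dstport": "10000",
     "protocol": "6", "bytes": "591"},
    {"srcaddr": "B", "dstaddr": "A", "srcport": "10000", "dstport": "49152",
     "protocol": "6", "bytes": "340"},
    {"srcaddr": "B", "dstaddr": "A", "srcport": "10000", "dstport": "49152",
     "protocol": "6", "bytes": "340"},
]
print(total_bytes_deduped(flows))  # 931 rather than the doubled 1862
```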

Example 2

We will use iperf3 to generate 100 Mbit/s for 300 seconds between two instances in different AZs within the same VPC.

server$ iperf3 -s
client$ iperf3 -c {target} -b 100M -t 300
Connecting to host {target}, port 5201
[  4] local {source}  port 42790 connected to {target} port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  10.8 MBytes  90.2 Mbits/sec    0    673 KBytes
[  4]   1.00-2.00   sec  12.0 MBytes   101 Mbits/sec    0   1.13 MBytes

The client generated, on average, 12.0 MBytes/s for the duration. Let’s generate a graph to confirm the bandwidth. [IP address functions](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html#CWL_QuerySyntax-operations-functions) were recently added to CloudWatch Logs Insights filters, making network-based graphing much easier.

Using a CloudWatch Logs Insights query, we can graph the bandwidth in the client-to-server direction. Note we divide by 60 to get a per-second value, and the graph displays as expected.

filter isIpInSubnet(srcAddr,'{client}/32') and isIpInSubnet(dstAddr,'{server}/32') | stats avg(bytes/60) as MBytes_per_second by bin(1m)


Use the following type of query to view all the flow logs created by the test, both client to server and server to client.

filter (isIpInSubnet(srcAddr,'{source}/32') and isIpInSubnet(dstAddr,'{dest}/32')) or (isIpInSubnet(srcAddr,'{dest}/32') and isIpInSubnet(dstAddr,'{source}/32')) | fields interfaceId, start, end, (end - start) as duration, bytes, srcPort, dstPort


  • The test sent 3.5 GBytes of data over 5 minutes, resulting in 32 flow log records.
  • Reviewing the clusters of timestamps, we can see this was captured over 14 capture windows.
  • The durations of the capture windows (end – start) do not add up to the expected time, as they are only approximations of the session timestamps.
  • In this case, our capture windows were much smaller than the documented maximum capture window of 10 minutes mentioned earlier.
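To see how a single long-running connection is split across windows, records can be grouped by ENI and capture-window start time, since records sharing a start timestamp on the same ENI were captured in the same window. A sketch with hypothetical records and field names:

```python
from collections import defaultdict

# Group flow records by (interface, capture-window start) and total the
# bytes, revealing how many capture windows a connection spanned.
def bytes_per_window(records):
    totals = defaultdict(int)
    for r in records:
        totals[(r["interfaceId"], r["start"])] += int(r["bytes"])
    return dict(totals)

windows = bytes_per_window([
    {"interfaceId": "eni-a", "start": 1000, "bytes": "100"},
    {"interfaceId": "eni-a", "start": 1000, "bytes": "50"},
    {"interfaceId": "eni-a", "start": 1060, "bytes": "75"},
])
print(len(windows))  # 2 capture windows
```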

Custom Format Flow Logs

Example 3

The custom flow format adds many useful additional fields to the flow log records.

  • vpc-id, subnet-id, and instance-id – for easier querying, filtering, and graphing.
  • tcp-flags – SYN, SYN-ACK, FIN, and RST, which can be useful for debugging and determining flow direction.
  • type – IPv4, IPv6, or Elastic Fabric Adapter (EFA).
  • pkt-srcaddr and pkt-dstaddr – when using ENIs with multiple IP addresses or interacting with NAT gateways, the default log format can show unexpected IPs. These fields identify the original (expected) IP addresses for these types of flows.
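Since tcp-flags is recorded as a bitmask OR-ed across the capture window, a small decoder makes it easier to read. A sketch; the bit values follow the TCP header (FIN=1, SYN=2, RST=4, ACK=16, so a value of 18 indicates a SYN-ACK):

```python
# Decode the tcp-flags bitmask. Flags are OR-ed together across the capture
# window, so a single record can show several flags at once.
TCP_FLAG_BITS = [(1, "FIN"), (2, "SYN"), (4, "RST"), (16, "ACK")]

def decode_tcp_flags(value):
    return [name for bit, name in TCP_FLAG_BITS if value & bit]

print(decode_tcp_flags(18))  # ['SYN', 'ACK'] - a SYN-ACK was seen
print(decode_tcp_flags(3))   # ['FIN', 'SYN'] - both seen during the window
```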

This example uses the custom flow format with S3 as a target.


When creating a custom format flow log, specify the additional fields required.

${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${log-status} ${instance-id} ${subnet-id} ${vpc-id} ${pkt-srcaddr} ${pkt-dstaddr} ${tcp-flags} ${type}
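Because the custom format puts fields in whatever order you configure, a parser can derive the field names directly from the format string rather than hard-coding them. A sketch, using the format configured above:

```python
import re

# The configured log-format string; field names are derived from it so the
# space-separated records can be parsed positionally.
LOG_FORMAT = (
    "${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} "
    "${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} "
    "${action} ${log-status} ${instance-id} ${subnet-id} ${vpc-id} "
    "${pkt-srcaddr} ${pkt-dstaddr} ${tcp-flags} ${type}"
)
CUSTOM_FIELDS = re.findall(r"\$\{([a-z0-9-]+)\}", LOG_FORMAT)

def parse_custom_record(line):
    return dict(zip(CUSTOM_FIELDS, line.split()))

print(len(CUSTOM_FIELDS))  # 21 fields, in the configured order
```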

To make use of the pkt-srcaddr and pkt-dstaddr fields, we have added additional IP addresses to the existing ENIs on the test instances.

server$ iperf3 -s -B {secondary_ip}
client$ iperf3 -c {server_secondary_ip} -b 100M -t 300 -B {client_secondary_ip}

Create an Athena table and partition to query the logs from S3.

Create table

CREATE EXTERNAL TABLE IF NOT EXISTS vpc_flow_logs (
         version int,
         account string,
         interfaceid string,
         sourceaddress string,
         destinationaddress string,
         sourceport int,
         destinationport int,
         protocol int,
         numpackets int,
         numbytes bigint,
         starttime int,
         endtime int,
         action string,
         logstatus string,
         instanceid string,
         subnetid string,
         vpcid string,
         pktsrcaddr string,
         pktdstaddr string,
         tcpflags int,
         type string
) PARTITIONED BY (
         dt string
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
LOCATION '{s3location}'
TBLPROPERTIES ("skip.header.line.count"="1");

And a partition for today’s date:

ALTER TABLE sampledb.vpc_flow_logs
ADD PARTITION (dt='{year}-{month}-{date}')
LOCATION '{s3location}/{year}/{month}/{date}/';


We can now see the benefit of the pkt-srcaddr and pkt-dstaddr fields, showing the expected IP addresses instead of the primary interface IP addresses. This is also useful when working with flows that interact with NAT gateways, EKS and other AWS services.
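One way to surface such flows is to compare the interface-level and packet-level addresses on each record. A sketch with hypothetical parsed records (dicts keyed by field name):

```python
# Records where the ENI-level address differs from the packet-level address
# involve a secondary IP, NAT gateway, or similar translation along the path.
def translated_flows(records):
    return [r for r in records
            if r["srcaddr"] != r["pkt-srcaddr"]
            or r["dstaddr"] != r["pkt-dstaddr"]]

sample = [  # hypothetical parsed custom-format records
    {"srcaddr": "10.0.1.10", "pkt-srcaddr": "10.0.1.10",
     "dstaddr": "10.0.2.20", "pkt-dstaddr": "10.0.2.20"},
    {"srcaddr": "10.0.1.10", "pkt-srcaddr": "10.0.1.99",  # secondary IP
     "dstaddr": "10.0.2.20", "pkt-dstaddr": "10.0.2.20"},
]
print(len(translated_flows(sample)))  # 1
```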


There are a few important takeaways:

  • The displayed flow log sequence differs from the real network sequence, which can confuse fine-grained debugging.
  • Flow logs can be delayed by up to 15 minutes. If you need real-time packet capture, run a packet-capture tool like tcpdump.
  • The start and end timestamps can only be used as approximate times of the represented connection.
  • A single long-running connection can be captured over multiple windows and hence recorded across multiple flow log records.
  • Consider the costs of enabling VPC-wide flow logs on very active VPCs; targeting select subnets and using S3 as the destination can reduce costs.
  • The custom format flow logs provide valuable additional fields, especially when working with instances that have multiple ENIs or examining flows through components such as NAT gateways.