VPC flow log NODATA records should be skipped silently, not treated as errors #277

@e-gineer

Problem

When collecting VPC flow logs, records with log_status = 'NODATA' are treated as errors because they have no start_time or end_time values (these fields contain "-", the placeholder used for a missing value).

NODATA is a valid, documented VPC flow log status that indicates "no data was recorded during the aggregation interval." These records don't have timestamps by design - there was simply no traffic during that window.

Current behavior

  1. NODATA records fail validation because tp_timestamp is never set
  2. These are counted as errors
  3. Errors prevent the collection state from being updated
  4. This can cause the same files to be reprocessed repeatedly ("stuck" collecting the same data)

Expected behavior

NODATA records should be silently skipped (not treated as errors) since they:

  • Are a normal part of VPC flow logs
  • Don't contain traffic data by design
  • Should not block collection progress

Proposed Solution

In EnrichRow, check for log_status == 'NODATA' before processing timestamps and skip these records without treating them as errors.

Optionally, count skipped NODATA records separately so users have visibility into how many were skipped.
