feat: add SQS queue support for failed events storage #38

Merged
LorisFriedel merged 2 commits into main from loris.friedel/add-sqs-queue-support-for-failed-events
Mar 18, 2026
LorisFriedel commented Mar 16, 2026

Member

Summary

The Datadog Forwarder Lambda (v5.3.0+, layer >= 97) supports SQS as an alternative to S3 for storing failed events via DD_SQS_QUEUE_URL. This wires that capability through the Terraform module with a new dd_sqs_queue_url variable.

When dd_sqs_queue_url is set, SQS takes priority over S3 for retry storage, DD_STORE_FAILED_EVENTS is enabled automatically, and the S3 bucket is no longer created solely for failed events (it is still created when tag caching needs it). The SQS queue is user-managed: the module only configures IAM permissions and passes the URL to the Lambda.
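
For illustration, a caller might opt in to SQS-backed failed-event storage roughly as shown here. This is a sketch under assumptions: the module `source` path, the module label, and the queue URL (region, account ID, queue name) are placeholders, and other required inputs are omitted. Only `dd_sqs_queue_url` comes from this PR.

```hcl
module "datadog_forwarder" {
  # Placeholder: point this at wherever the module actually lives.
  source = "../.."

  # Other required inputs (API key, etc.) are omitted for brevity.

  # URL of an existing, user-managed SQS queue (placeholder value).
  # The module does not create the queue. Setting this enables
  # DD_STORE_FAILED_EVENTS and requires a Forwarder layer version >= 97.
  dd_sqs_queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/forwarder-failed-events"
}
```

A literal URL is used here so that the value is known at plan time.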

Changes

  • variables.tf — New dd_sqs_queue_url variable with URL format validation and layer version >= 97 guard
  • data.tf — Updated create_s3_bucket to skip S3 when SQS handles failed events; new sqs_queue_arn (derived from URL) and store_failed_events_enabled locals
  • main.tf — Pass sqs_queue_arn to IAM module; set DD_STORE_FAILED_EVENTS and DD_SQS_QUEUE_URL env vars; update scheduled retry conditions to use store_failed_events_enabled; fix pre-existing bug where dd_forwarder_buckets_access_logs_target could fail when no S3 bucket exists
  • modules/iam/main.tf + variables.tf — Add SQS IAM policy statement (sqs:SendMessage, sqs:ReceiveMessage, sqs:DeleteMessage, sqs:ChangeMessageVisibility)
  • outputs.tf — Mark dd_api_key_secret_arn as sensitive (pre-existing fix, was blocking all mock_provider tests)
  • tests/sqs_failed_events.tftest.hcl — 9 test scenarios: auto-enable, tag fetching + SQS, existing IAM role, scheduled retry, URL validation, layer version validation (old + new), S3 fallback, IAM permissions
  • README.md — Document new variable in inputs table
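
The variable validation, the URL-to-ARN derivation, and the IAM statement listed above could look roughly like the following. This is a hedged sketch, not the code merged in this PR: the regex, the `aws` partition, the `sqs_url_parts` local, and the `sqs_failed_events` data source name are assumptions made for illustration. Only `dd_sqs_queue_url`, `sqs_queue_arn`, and the four SQS actions are taken from the description.

```hcl
variable "dd_sqs_queue_url" {
  type        = string
  default     = null
  description = "URL of a user-managed SQS queue used to store failed events."

  validation {
    # Assumed shape: https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>
    condition = var.dd_sqs_queue_url == null || can(regex(
      "^https://sqs\\.[a-z0-9-]+\\.amazonaws\\.com/[0-9]{12}/[A-Za-z0-9_.-]+$",
      var.dd_sqs_queue_url
    ))
    error_message = "The dd_sqs_queue_url value must look like https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>."
  }
}

locals {
  # Hypothetical helper. Capture groups: [0] region, [1] account ID, [2] queue name.
  sqs_url_parts = var.dd_sqs_queue_url == null ? null : regex(
    "^https://sqs\\.([a-z0-9-]+)\\.amazonaws\\.com/([0-9]{12})/(.+)$",
    var.dd_sqs_queue_url
  )

  # Assumes the standard "aws" partition; other partitions would need a different prefix.
  sqs_queue_arn = local.sqs_url_parts == null ? null : "arn:aws:sqs:${local.sqs_url_parts[0]}:${local.sqs_url_parts[1]}:${local.sqs_url_parts[2]}"
}

# Hypothetical policy document granting the four actions named in this PR.
data "aws_iam_policy_document" "sqs_failed_events" {
  count = local.sqs_queue_arn == null ? 0 : 1

  statement {
    effect = "Allow"
    actions = [
      "sqs:SendMessage",
      "sqs:ReceiveMessage",
      "sqs:DeleteMessage",
      "sqs:ChangeMessageVisibility",
    ]
    resources = [local.sqs_queue_arn]
  }
}
```

Deriving the ARN from the URL means callers supply a single input, at the cost of assuming the standard SQS URL layout.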

OBSPLTF-1040

The Datadog Forwarder Lambda (v5.3.0+, layer >= 97) supports SQS as an
alternative to S3 for storing failed events. This wires the new
DD_SQS_QUEUE_URL env var through the Terraform module via a new
dd_sqs_queue_url variable.

When set, SQS takes priority over S3 for retry storage and
DD_STORE_FAILED_EVENTS is automatically enabled. The SQS queue must be
user-managed (not created by the module). IAM permissions for
sqs:SendMessage, sqs:ReceiveMessage, sqs:DeleteMessage, and
sqs:ChangeMessageVisibility are granted on the derived queue ARN.

Also fixes a pre-existing bug where dd_forwarder_buckets_access_logs_target
could fail when no S3 bucket was created, and marks the dd_api_key_secret_arn
output as sensitive.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
LorisFriedel requested a review from a team as a code owner March 16, 2026 11:22
Remove conflicting dd_api_key from sqs_with_existing_iam_role test that
already uses dd_api_key_ssm_parameter_name, and fix terraform fmt alignment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
LorisFriedel merged commit 96c6d1b into main Mar 18, 2026
4 checks passed
LorisFriedel deleted the loris.friedel/add-sqs-queue-support-for-failed-events branch March 18, 2026 13:50

2 participants