From 9219162b2a7697cc567df4f76119616bfb139437 Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Fri, 5 Jun 2026 17:10:07 -0400 Subject: [PATCH 1/9] Add CloudFormation template for Lambda Durable Function event forwarder MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Captures AWS Lambda Durable Function execution status change events from EventBridge and delivers them to the Datadog HTTP intake via Amazon Data Firehose. Records arrive at Datadog as the raw EventBridge envelope; reshaping (ARN qualifier stripping, detail.* flattening, ISO timestamp parsing) is configured on the Datadog side via a logs processing pipeline rather than inside the stack. Resources created (9): S3 backup bucket + policy, Firehose delivery stream + role + log group + 2 log streams, EventBridge rule + role. When DdApiKeyKmsCiphertext is set, four additional resources are provisioned to decrypt the API key at deploy time via a custom resource (IAM role, Lambda, log group, Custom::DatadogApiKeyKmsDecrypt). Four mutually-acceptable API-key options: - DdApiKey (plaintext, NoEcho) - DdApiKeySecretArn (Secrets Manager dynamic reference) - DdApiKeySsmParameterName (SSM SecureString dynamic reference) - DdApiKeyKmsCiphertext + DdApiKeyKmsKeyArn (deploy-time decrypter) Five independent function-name filter slots (FunctionNameFilter1..5), each producing two matchers — unqualified ARN + version/alias-qualified ARN — so events for any qualifier form are captured. Empty slots are stripped from the EventBridge rule at deploy time. 
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../README.md | 312 +++++++++ .../template.yaml | 615 ++++++++++++++++++ 2 files changed, 927 insertions(+) create mode 100644 aws/durable_function_event_forwarder/README.md create mode 100644 aws/durable_function_event_forwarder/template.yaml diff --git a/aws/durable_function_event_forwarder/README.md b/aws/durable_function_event_forwarder/README.md new file mode 100644 index 000000000..dee9e6b0c --- /dev/null +++ b/aws/durable_function_event_forwarder/README.md @@ -0,0 +1,312 @@ +# Datadog Lambda Durable Function Event Forwarder + +A self-contained CloudFormation template that captures AWS Lambda Durable +Function execution status change events and delivers them to the Datadog +HTTP intake via Amazon Data Firehose. Records arrive at Datadog as the +raw EventBridge envelope; any reshaping (field renaming, ARN qualifier +stripping, timestamp parsing) is configured on the Datadog side via a +logs processing pipeline. + +## Architecture + +``` +EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) + \ + -> S3 backup bucket (failed records only) +``` + +- The EventBridge rule subscribes to `aws.lambda` source events with + detail-type `Durable Execution Status Change` and routes them to + Firehose. +- Firehose forwards each record unchanged to + `https://aws-kinesis-http-intake.logs./v1/input` using the + Datadog API key as the `X-Amz-Firehose-Access-Key` header. The stack + does **not** attach any custom metadata to Firehose's outbound + requests; Datadog's AWS integration auto-tags incoming logs from the + EventBridge envelope (`source:lambda`, `service:lambda`, + `region:`, `aws_account:`, + `sourcecategory:aws`) and from the Firehose ARN. +- Records the endpoint rejects are written to the S3 backup bucket + (`S3BackupMode: FailedDataOnly`); under normal operation the bucket + stays empty. 
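As a small illustration of the second bullet above, the destination URL can be derived from the `DdSite` parameter the same way the template's `!Sub "https://aws-kinesis-http-intake.logs.${DdSite}/v1/input"` does. This is only a sketch: the function name `firehoseIntakeUrl` is hypothetical and does not appear in the stack.

```javascript
'use strict';

// Hypothetical helper (not part of the stack): mirrors the template's
// !Sub "https://aws-kinesis-http-intake.logs.${DdSite}/v1/input".
function firehoseIntakeUrl(ddSite = 'datadoghq.com') {
  // The template rejects an empty DdSite (AllowedPattern: .+), so do the same here.
  if (!ddSite) {
    throw new Error('DdSite cannot be empty');
  }
  return `https://aws-kinesis-http-intake.logs.${ddSite}/v1/input`;
}

console.log(firehoseIntakeUrl('datadoghq.eu'));
// prints "https://aws-kinesis-http-intake.logs.datadoghq.eu/v1/input"
```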
+ +## Parameters + +| Parameter | Required | Default | Description | +| --- | --- | --- | --- | +| `DdApiKey` | one of four | "" | Plaintext Datadog API key (`NoEcho`). | +| `DdApiKeySecretArn` | one of four | "" | ARN of a Secrets Manager secret whose `SecretString` is the API key. Resolved via `{{resolve:secretsmanager:...}}`. | +| `DdApiKeySsmParameterName` | one of four | "" | Name of an SSM SecureString parameter holding the API key. Resolved via `{{resolve:ssm-secure:...}}`. | +| `DdApiKeyKmsCiphertext` | one of four | "" | Base64-encoded KMS ciphertext of the API key. A deploy-time Lambda decrypts it via `kms:Decrypt` and hands the plaintext to Firehose as a `NoEcho` custom-resource attribute. See [API key from KMS ciphertext](#api-key-from-kms-ciphertext). | +| `DdApiKeyKmsKeyArn` | when ciphertext set | "" | ARN of the KMS key that encrypted `DdApiKeyKmsCiphertext`. Used to scope the decrypter Lambda's `kms:Decrypt` permission. Required when `DdApiKeyKmsCiphertext` is set (enforced by a `Rules` assertion). | +| `DdSite` | no | `datadoghq.com` | Datadog site; used to build the Firehose destination URL. | +| `DdService` | no | `datadog-durable-function-event-forwarder` | Datadog `service` tag applied to every forwarded event. Override to match your existing service taxonomy. | +| `DdEnv` | no | "" | Datadog `env` tag. | +| `DdVersion` | no | "" | Datadog `version` tag. | +| `DdTags` | no | "" | Comma-delimited extra tags (for example `team:durable,owner:platform`). | +| `Statuses` | no | `TIMED_OUT,STOPPED` | EventBridge `detail.status` values to forward. Must be uppercase, comma-delimited. | +| `FunctionNameFilter1` … `FunctionNameFilter5` | no | "" | Up to 5 independent function-name filters. Each accepts a name or EventBridge wildcard (for example `prod-*-orders`). All five empty matches all functions in the region. 
Each populated slot adds matchers for both unqualified and version/alias-qualified ARNs — see [Filtering multiple functions](#filtering-multiple-functions). | +| `BufferIntervalSeconds` | no | `60` | Firehose buffer interval (60–900). | + +`Rules.ApiKeyRequired` asserts at least one of the four API key parameters +is set and fails the stack action with a clear message otherwise. + +## Outputs + +| Output | Description | +| --- | --- | +| `DeliveryStreamArn` | Firehose delivery stream ARN. | +| `BackupBucketName` | S3 bucket name for failed records. | +| `EventRuleArn` | EventBridge rule ARN. | +| `ForwarderVersion` | Template version (from `Mappings.Constants`). | + +## Forwarded log shape + +The stack does **no transformation in AWS**. Firehose forwards each +EventBridge record to Datadog verbatim, so Datadog receives the raw +envelope: + +```json +{ + "version": "0", + "id": "...", + "detail-type": "Durable Execution Status Change", + "source": "aws.lambda", + "account": "123456789012", + "time": "", + "region": "us-east-1", + "resources": [], + "detail": { + "functionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-fn:$LATEST", + "executionName": "...", + "executionStartTime": "", + "executionEndTime": "", + "status": "TIMED_OUT" + } +} +``` + +The stack itself does not attach metadata to the Firehose request. +Datadog's AWS integration auto-derives these tags from the envelope and +the Firehose ARN: + +- `source:lambda` and `service:lambda` (from `source:aws.lambda`) +- `region:` +- `aws_account:` +- `sourcecategory:aws` + +Anything beyond these — a service override +(`service:my-orders-service`), `env`/`version`, custom tags, attribute +flattening, ARN qualifier stripping, timestamp parsing for relative-time +tooltips — is the Datadog log processing pipeline's responsibility (see +below). 
The `DdService`/`DdEnv`/`DdVersion`/`DdTags` parameters remain +on the stack for forward compatibility but are not currently propagated; +configure their equivalents in the pipeline instead. + +### Datadog-side processing pipeline + +Configure a Datadog logs processing pipeline (Logs → Configuration → +Pipelines → New Pipeline, filter `source:lambda` + +`@detail-type:"Durable Execution Status Change"`) with these +processors: + +1. **Date Remapper** on `time` so EventBridge's `time` becomes the log's + official date. +2. **Attribute Remapper** to flatten `detail.*` to top-level attributes — + for example `detail.functionArn` → `function_arn`, + `detail.executionName` → `lambda.durable_function.execution_name`, + etc. (Use snake_case names so they match the rest of the Lambda + namespace.) +3. **Grok / String Builder** to strip the `:` suffix + (`:$LATEST`, `:prod`, `:1`, …) from `function_arn`, so all events for + the same function share a single ARN value regardless of how it was + invoked. +4. **Arithmetic Processor** on `detail.executionStartTime` / + `detail.executionEndTime` (parse to epoch ms) if you want numeric + range facets and the relative-time tooltip on those fields. +5. **Message Remapper** if you want a human-readable message like + `Durable execution is `. + +These are all UI-configurable; no template changes needed. The benefit +of doing this in Datadog rather than in a transformer Lambda is that +pipeline tweaks ship instantly without redeploying the stack, and you +get to test against a sample log via Datadog's pipeline preview. + +## Publishing the template (Datadog operators) + +Once the template is hosted at a public S3 URL, customers can reference it +directly — no zip artifact, region replication, or layer publish is needed. +The template is the only thing to ship. 
+ +The convention mirrors `aws/logs_monitoring/release.sh` in this repo: + +- **Public bucket**: `datadog-cloudformation-template` (same bucket the + Lambda Forwarder already uses). +- **Versioned object key**: + `aws/durable_function_event_forwarder/.yaml` +- **Floating "latest" key**: + `aws/durable_function_event_forwarder/latest.yaml` +- **Sandbox / staging**: separate path + `aws/durable_function_event_forwarder-staging/.yaml` against the + per-environment sandbox bucket. + +Upload commands per release (run from this directory): + +```bash +VERSION=$(yq '.Mappings.Constants.DdDurableEventForwarder.Version' template.yaml | tr -d '"') +BUCKET=datadog-cloudformation-template + +aws cloudformation validate-template --template-body file://template.yaml + +aws s3 cp template.yaml \ + "s3://${BUCKET}/aws/durable_function_event_forwarder/${VERSION}.yaml" \ + --content-type "application/yaml" + +aws s3 cp template.yaml \ + "s3://${BUCKET}/aws/durable_function_event_forwarder/latest.yaml" \ + --content-type "application/yaml" +``` + +Customer-facing URLs after publish: + +- Versioned (recommended for nested stacks): + `https://datadog-cloudformation-template.s3.amazonaws.com/aws/durable_function_event_forwarder/.yaml` +- Latest (convenient for one-off console deploys; not pinned, will change): + `https://datadog-cloudformation-template.s3.amazonaws.com/aws/durable_function_event_forwarder/latest.yaml` + +### Quick-create link (give this to customers) + +Drop the URL into a CloudFormation quick-create deeplink so customers can +launch the stack with one click. Anything not pre-filled defaults to the +parameter's `Default:` value. + +``` +https://console.aws.amazon.com/cloudformation/home#/stacks/quickcreate + ?stackName=datadog-durable-function-event-forwarder + &templateURL=https://datadog-cloudformation-template.s3.amazonaws.com/aws/durable_function_event_forwarder/latest.yaml + &param_DdService= + &param_DdEnv= +``` + +(Remove the line breaks and paste as a single URL.) 
The customer fills in +the API key parameter on the console form; never pre-fill `DdApiKey` in a +link. + +### Release checklist + +1. Bump `Mappings.Constants.DdDurableEventForwarder.Version` in + `template.yaml`. +2. `aws cloudformation validate-template ...` +3. `aws s3 cp` to both `.yaml` and `latest.yaml` keys. +4. Tag the release (`git tag durable-function-event-forwarder/`). + +A `release.sh` modeled on `aws/logs_monitoring/release.sh` can automate +steps 1–4. It is intentionally not committed yet — the template ships +self-contained, so a one-line `aws s3 cp` is sufficient until cross-env +sandbox/prod automation is needed. + +## Deploying directly + +```bash +aws cloudformation deploy \ + --template-file template.yaml \ + --stack-name datadog-durable-function-event-forwarder \ + --capabilities CAPABILITY_IAM \ + --parameter-overrides \ + DdApiKeySecretArn=arn:aws:secretsmanager:us-east-1:123456789012:secret:datadog/api-key-AbCdEf \ + DdService=my-service \ + DdEnv=prod +``` + +## Consuming as a nested stack + +```yaml +DurableFunctionEvents: + Type: AWS::CloudFormation::Stack + Properties: + TemplateURL: https:///aws/durable_function_event_forwarder//template.yaml + Parameters: + DdApiKeySecretArn: !Ref DatadogApiKeySecret + DdService: my-service + DdEnv: prod +``` + +The template is fully self-contained — no Lambda zip artifact, no region +replication, no `ZipCopier` custom resource. Firehose forwards +EventBridge records to Datadog directly; all reshaping happens in +Datadog's logs processing pipeline. + +## Filtering multiple functions + +Up to 5 independent function-name filters are exposed as separate +parameters (`FunctionNameFilter1` … `FunctionNameFilter5`). Each +populated slot contributes two matchers to the EventBridge rule — one +for unqualified ARNs and one for version/alias-qualified ARNs. Empty +slots are stripped from the rendered list at deploy time, so leaving +gaps (e.g., populating slots 1, 3, 5) is fine. 
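The slot behaviour described above can be sketched in a few lines of JavaScript. This is an illustration under stated assumptions, not the template's actual rendering: the helper names are hypothetical, the ARN prefix `arn:aws:lambda:*:*:function:` is assumed (the exact patterns in `template.yaml` may differ), and `wildcardMatches` is a simplified stand-in for EventBridge's `wildcard` operator.

```javascript
'use strict';

// Assumed ARN prefix; the exact patterns in template.yaml may differ.
const ARN_PREFIX = 'arn:aws:lambda:*:*:function:';

// Hypothetical helper: turn the five filter slots into EventBridge matchers.
// Empty slots are dropped; each populated slot yields two wildcard entries.
function buildFunctionArnMatchers(slots) {
  return slots
    .filter((name) => name !== '')
    .flatMap((name) => [
      { wildcard: `${ARN_PREFIX}${name}` },   // unqualified ARN
      { wildcard: `${ARN_PREFIX}${name}:*` }, // version/alias-qualified ARN
    ]);
}

// Simplified stand-in for EventBridge wildcard matching:
// "*" matches any run of characters, everything else is literal.
function wildcardMatches(pattern, value) {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
    .replace(/\*/g, '.*');
  return new RegExp(`^${escaped}$`).test(value);
}

// True when any matcher in the list accepts the ARN.
function arnMatches(matchers, arn) {
  return matchers.some((m) => wildcardMatches(m.wildcard, arn));
}

// Slots 1 and 3 populated, the rest left empty.
const matchers = buildFunctionArnMatchers(['my-fn', '', 'prod-*-orders', '', '']);
const base = 'arn:aws:lambda:us-east-1:123456789012:function:';

console.log(matchers.length);                            // prints 4
console.log(arnMatches(matchers, `${base}my-fn`));       // prints true
console.log(arnMatches(matchers, `${base}my-fn:prod`));  // prints true
console.log(arnMatches(matchers, `${base}other-fn`));    // prints false
```

Dropping the second matcher from each pair makes the `my-fn:prod` check fail, which is why every slot contributes two entries.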
+ +Why five separate parameters instead of one comma-separated list: +`AWS::Events::Rule.EventPattern` is typed `Json` (an arbitrary blob), so +CloudFormation does not auto-convert `Fn::ForEach` Map output to a list +the way it does for schema-typed list properties. The only ways to +build a dynamic-length list inside an `EventPattern` are (a) a +custom-resource macro, (b) `CommaDelimitedList` with `!Select` plus +inline comma-padding tricks repeated per slot, or (c) fixed N slots +exposed as individual parameters. We chose (c) because each slot is +locally simple to read in the template. + +If you need more than 5 filters in one region, either widen one of the +slots with a wildcard (`prod-*` covers every function starting with +`prod-`) or deploy a second stack — they're independent. + +## API key from KMS ciphertext + +`DdApiKeyKmsCiphertext` accepts the base64 output of: + +```bash +aws kms encrypt \ + --key-id arn:aws:kms:us-east-1:123456789012:key/abcd... \ + --plaintext "$DATADOG_API_KEY" \ + --query CiphertextBlob \ + --output text +``` + +At stack create/update time, a short-lived Lambda decrypts the ciphertext +once via `kms:Decrypt` and returns the plaintext to CloudFormation as a +`NoEcho` custom-resource attribute. The Firehose `AccessKey` then +references the value via `!GetAtt DecryptedApiKey.ApiKey`. + +The decrypter Lambda's IAM role is scoped to `kms:Decrypt` on the single +key ARN you pass as `DdApiKeyKmsKeyArn`. A `Rules` assertion fails the +stack action if the ciphertext is set without the key ARN. + +**Security trade-off.** The plaintext is materialized in two places +during deploy: the custom-resource response body (suppressed from stack +events by `NoEcho: true`) and the rendered Firehose resource properties. +This is weaker than `DdApiKeySecretArn` or `DdApiKeySsmParameterName`, +which AWS treats specially and never logs. 
Use the KMS-ciphertext option +when you already have an encrypted ciphertext blob in your deployment +flow (e.g., from `serverless-plugin-kms` or an internal config store) +and don't want to add a Secrets Manager / SSM secret. + +The decrypter resources (`ApiKeyKmsDecrypterFunction`, +`ApiKeyKmsDecrypterRole`, `ApiKeyKmsDecrypterLogGroup`, `DecryptedApiKey`) +are conditional on `UseApiKeyKms` and only exist when +`DdApiKeyKmsCiphertext` is set; there is no overhead for the other three API-key +methods. + +## Files + +| File | Purpose | +| --- | --- | +| `template.yaml` | Canonical CloudFormation template. | + +## Notes + +- The function-name filter emits two `wildcard` patterns per value so it + matches both unqualified ARNs and version/alias-qualified ARNs. A single + `suffix(":function:")` matcher would miss qualified ARNs. +- `BufferingHints` is set explicitly even at its default value: omitting + it has historically caused CloudFormation drift on subsequent updates. +- The backup bucket is retained on stack deletion + (`DeletionPolicy: Retain`) so failed records survive teardown. diff --git a/aws/durable_function_event_forwarder/template.yaml b/aws/durable_function_event_forwarder/template.yaml new file mode 100644 index 000000000..61d0675cf --- /dev/null +++ b/aws/durable_function_event_forwarder/template.yaml @@ -0,0 +1,615 @@ +AWSTemplateFormatVersion: "2010-09-09" +Description: >- + Captures AWS Lambda Durable Function execution status change events from + EventBridge and forwards them unchanged, as the raw EventBridge envelope, + to the Datadog HTTP intake via Amazon Data Firehose. + +Metadata: + # W1030 is suppressed at the template level because cfn-lint can't model + # the KmsCiphertextRequiresKeyArn Rule: it sees DdApiKeyKmsKeyArn's + # empty-string default and warns that the resulting Resource arn might + # not match the IAM ARN regex, even though that resource only exists + # when the parameter is required to be a valid ARN. 
+ cfn-lint: + config: + ignore_checks: + - W1030 + AWS::CloudFormation::Interface: + ParameterGroups: + - Label: + default: Datadog API key (one required) + Parameters: + - DdApiKey + - DdApiKeySecretArn + - DdApiKeySsmParameterName + - DdApiKeyKmsCiphertext + - DdApiKeyKmsKeyArn + - Label: + default: Datadog routing and tagging + Parameters: + - DdSite + - DdService + - DdEnv + - DdVersion + - DdTags + - Label: + default: Event filters + Parameters: + - Statuses + - FunctionNameFilter1 + - FunctionNameFilter2 + - FunctionNameFilter3 + - FunctionNameFilter4 + - FunctionNameFilter5 + - Label: + default: Tuning + Parameters: + - BufferIntervalSeconds + ParameterLabels: + DdApiKey: { default: API key (plaintext) } + DdApiKeySecretArn: { default: Secrets Manager secret ARN } + DdApiKeySsmParameterName: { default: SSM SecureString parameter name } + DdApiKeyKmsCiphertext: { default: KMS ciphertext (base64) } + DdApiKeyKmsKeyArn: { default: KMS key ARN (for ciphertext) } + DdSite: { default: Datadog site } + DdService: { default: Service tag } + DdEnv: { default: Env tag } + DdVersion: { default: Version tag } + DdTags: { default: Additional tags } + Statuses: { default: Statuses to forward } + FunctionNameFilter1: { default: Function name filter 1 (optional) } + FunctionNameFilter2: { default: Function name filter 2 (optional) } + FunctionNameFilter3: { default: Function name filter 3 (optional) } + FunctionNameFilter4: { default: Function name filter 4 (optional) } + FunctionNameFilter5: { default: Function name filter 5 (optional) } + BufferIntervalSeconds: { default: Firehose buffer interval (seconds) } + +Mappings: + Constants: + DdDurableEventForwarder: + Version: "0.1.0" + +Parameters: + # ---- Datadog API key (at least one of the four is required) ---- + DdApiKey: + Type: String + NoEcho: true + Default: "" + Description: >- + Datadog API key. Provide a plaintext value here OR set DdApiKeySecretArn, + DdApiKeySsmParameterName, or DdApiKeyKmsCiphertext instead. 
The value is passed to Firehose as + the X-Amz-Firehose-Access-Key header and stored opaquely by Firehose; it + is not written to CloudWatch logs by this stack. + DdApiKeySecretArn: + Type: String + Default: "" + AllowedPattern: "^$|^arn:.*:secretsmanager:.*" + Description: >- + ARN of a Secrets Manager secret whose SecretString is the Datadog API + key. Resolved at deploy time via a CloudFormation `resolve:secretsmanager` + dynamic reference, so the secret value never appears in the template. + DdApiKeySsmParameterName: + Type: String + Default: "" + AllowedPattern: "^$|^/[a-zA-Z0-9/_.-]+$" + Description: >- + Name (not ARN) of an SSM Parameter Store SecureString parameter that + holds the Datadog API key. Resolved at deploy time via a CloudFormation + `resolve:ssm-secure` dynamic reference. + DdApiKeyKmsCiphertext: + Type: String + NoEcho: true + Default: "" + Description: >- + Base64-encoded KMS ciphertext of the Datadog API key (the output of + `aws kms encrypt --plaintext --key-id --query + CiphertextBlob --output text`). A short-lived deploy-time Lambda + decrypts the value and hands the plaintext to Firehose. Requires + DdApiKeyKmsKeyArn so the decrypter's IAM role can be scoped to the + specific key. + DdApiKeyKmsKeyArn: + Type: String + Default: "" + AllowedPattern: "^$|^arn:.*:kms:.*" + Description: >- + ARN of the KMS key that encrypted DdApiKeyKmsCiphertext. Used to + scope the decrypter Lambda's `kms:Decrypt` permission. Required when + DdApiKeyKmsCiphertext is set; ignored otherwise. + + # ---- Routing ---- + DdSite: + Type: String + Default: datadoghq.com + AllowedPattern: .+ + Description: >- + Datadog site to deliver events to. The Firehose destination URL is + derived as https://aws-kinesis-http-intake.logs./v1/input. + + # ---- Tagging ---- + DdService: + Type: String + Default: datadog-durable-function-event-forwarder + AllowedPattern: .+ + ConstraintDescription: DdService cannot be empty. 
+ Description: >- + Datadog service tag applied to every forwarded event. Defaults to + "datadog-durable-function-event-forwarder"; override to match your + existing service taxonomy if you already have Datadog conventions for + this category of events. + DdEnv: + Type: String + Default: "" + Description: Datadog env tag. Optional. + DdVersion: + Type: String + Default: "" + Description: Datadog version tag. Optional. + DdTags: + Type: String + Default: "" + Description: >- + Extra comma-delimited tags appended to every forwarded event (for + example team:durable,owner:platform). Optional. + + # ---- Event filters ---- + Statuses: + Type: CommaDelimitedList + Default: "TIMED_OUT,STOPPED" + Description: >- + Comma-separated list of execution status values to forward. Values must + match the EventBridge detail.status field exactly (uppercase). Default + forwards only terminal failure-like statuses. + # Up to 5 independent function-name filters. CloudFormation has no + # native iteration that fits AWS::Events::Rule.EventPattern (a Json blob, + # not a schema-typed list), so each slot is exposed as its own optional + # parameter. Each populated slot emits two wildcard matchers — one for + # unqualified ARNs and one for version/alias-qualified ARNs. Slots left + # empty are removed from the EventPattern via AWS::NoValue, so they have + # no effect on the rendered rule. + FunctionNameFilter1: + Type: String + Default: "" + Description: >- + Optional Lambda function name or EventBridge wildcard pattern (for + example "my-fn" or "prod-*-orders") used to restrict which functions' + events are captured. If all five FunctionNameFilterN parameters are + empty, the rule matches every function in this region. + FunctionNameFilter2: + Type: String + Default: "" + Description: Optional additional function name or wildcard pattern. + FunctionNameFilter3: + Type: String + Default: "" + Description: Optional additional function name or wildcard pattern. 
+ FunctionNameFilter4: + Type: String + Default: "" + Description: Optional additional function name or wildcard pattern. + FunctionNameFilter5: + Type: String + Default: "" + Description: Optional additional function name or wildcard pattern. + + # ---- Tuning ---- + BufferIntervalSeconds: + Type: Number + Default: 60 + MinValue: 60 + MaxValue: 900 + Description: >- + Firehose buffer interval in seconds. Increasing this trades freshness + for fewer outbound requests; the maximum (900) is fine for low-volume + durable-execution streams. + + +Conditions: + UseApiKey: !Not [!Equals [!Ref DdApiKey, ""]] + UseApiKeySecret: !Not [!Equals [!Ref DdApiKeySecretArn, ""]] + UseApiKeySsm: !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] + UseApiKeyKms: !Not [!Equals [!Ref DdApiKeyKmsCiphertext, ""]] + HasEnv: !Not [!Equals [!Ref DdEnv, ""]] + HasVersion: !Not [!Equals [!Ref DdVersion, ""]] + HasTags: !Not [!Equals [!Ref DdTags, ""]] + HasFilter1: !Not [!Equals [!Ref FunctionNameFilter1, ""]] + HasFilter2: !Not [!Equals [!Ref FunctionNameFilter2, ""]] + HasFilter3: !Not [!Equals [!Ref FunctionNameFilter3, ""]] + HasFilter4: !Not [!Equals [!Ref FunctionNameFilter4, ""]] + HasFilter5: !Not [!Equals [!Ref FunctionNameFilter5, ""]] + HasFunctionFilter: !Or + - !Condition HasFilter1 + - !Condition HasFilter2 + - !Condition HasFilter3 + - !Condition HasFilter4 + - !Condition HasFilter5 + +Rules: + ApiKeyRequired: + Assertions: + - Assert: !Or + - !Not [!Equals [!Ref DdApiKey, ""]] + - !Not [!Equals [!Ref DdApiKeySecretArn, ""]] + - !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] + - !Not [!Equals [!Ref DdApiKeyKmsCiphertext, ""]] + AssertDescription: >- + One of DdApiKey, DdApiKeySecretArn, DdApiKeySsmParameterName, or + DdApiKeyKmsCiphertext must be set. 
+ KmsCiphertextRequiresKeyArn: + RuleCondition: !Not [!Equals [!Ref DdApiKeyKmsCiphertext, ""]] + Assertions: + - Assert: !Not [!Equals [!Ref DdApiKeyKmsKeyArn, ""]] + AssertDescription: >- + DdApiKeyKmsKeyArn is required when DdApiKeyKmsCiphertext is set. + The ARN is used to scope the decrypter Lambda's kms:Decrypt + permission. + +Resources: + # --------------------------------------------------------------------------- + # Firehose backup bucket. Receives only records that fail to deliver to the + # Datadog endpoint (S3BackupMode: FailedDataOnly), so it stays empty under + # normal operation. Retained on stack deletion to preserve any failed + # records the operator may need to inspect or replay. + # --------------------------------------------------------------------------- + BackupBucket: + Type: AWS::S3::Bucket + DeletionPolicy: Retain + UpdateReplacePolicy: Retain + Properties: + BucketEncryption: + ServerSideEncryptionConfiguration: + - ServerSideEncryptionByDefault: + SSEAlgorithm: AES256 + PublicAccessBlockConfiguration: + BlockPublicAcls: true + BlockPublicPolicy: true + IgnorePublicAcls: true + RestrictPublicBuckets: true + OwnershipControls: + Rules: + - ObjectOwnership: BucketOwnerEnforced + + BackupBucketPolicy: + Type: AWS::S3::BucketPolicy + Properties: + Bucket: !Ref BackupBucket + PolicyDocument: + Version: "2012-10-17" + Statement: + - Sid: EnforceSSL + Effect: Deny + Principal: "*" + Action: s3:* + Resource: + - !GetAtt BackupBucket.Arn + - !Sub "${BackupBucket.Arn}/*" + Condition: + Bool: + aws:SecureTransport: "false" + + # --------------------------------------------------------------------------- + # Firehose delivery stream. HTTP endpoint destination targets the Datadog + # Firehose-specific intake (which speaks the Firehose protocol — do not use + # the standard /api/v2/logs endpoint here). Backup mode is FailedDataOnly so + # the bucket only receives records the endpoint rejected. 
+ # --------------------------------------------------------------------------- + FirehoseLogGroup: + Type: AWS::Logs::LogGroup + Properties: + LogGroupName: !Sub "/aws/kinesisfirehose/${AWS::StackName}" + RetentionInDays: 7 + + FirehoseHttpLogStream: + Type: AWS::Logs::LogStream + Properties: + LogGroupName: !Ref FirehoseLogGroup + LogStreamName: HttpEndpointDelivery + + FirehoseS3LogStream: + Type: AWS::Logs::LogStream + Properties: + LogGroupName: !Ref FirehoseLogGroup + LogStreamName: S3Backup + + FirehoseRole: + Type: AWS::IAM::Role + Properties: + AssumeRolePolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Principal: + Service: firehose.amazonaws.com + Action: sts:AssumeRole + Policies: + - PolicyName: FirehoseDelivery + PolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Action: + - s3:AbortMultipartUpload + - s3:GetBucketLocation + - s3:GetObject + - s3:ListBucket + - s3:ListBucketMultipartUploads + - s3:PutObject + Resource: + - !GetAtt BackupBucket.Arn + - !Sub "${BackupBucket.Arn}/*" + - Effect: Allow + Action: + - logs:PutLogEvents + Resource: + - !GetAtt FirehoseLogGroup.Arn + + # --------------------------------------------------------------------------- + # Deploy-time API key decrypter. Only created when DdApiKeyKmsCiphertext is + # set. The custom resource invokes the Lambda once per stack action; the + # Lambda calls KMS:Decrypt on the ciphertext blob and returns the plaintext + # API key in a NoEcho response. The Firehose AccessKey then references the + # returned value via !GetAtt. The plaintext is briefly present in the + # custom resource's response and in the Firehose resource's rendered + # properties — a security trade-off that customers using Secrets Manager + # or SSM SecureString dynamic references can avoid. 
+ # --------------------------------------------------------------------------- + ApiKeyKmsDecrypterRole: + Type: AWS::IAM::Role + Condition: UseApiKeyKms + Properties: + AssumeRolePolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Principal: + Service: lambda.amazonaws.com + Action: sts:AssumeRole + ManagedPolicyArns: + - !Sub "arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole" + Policies: + - PolicyName: DecryptApiKey + PolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Action: kms:Decrypt + Resource: !Ref DdApiKeyKmsKeyArn + + ApiKeyKmsDecrypterFunction: + Type: AWS::Lambda::Function + Condition: UseApiKeyKms + Properties: + Description: >- + Decrypts the KMS-encrypted Datadog API key at deploy time and + returns the plaintext to CloudFormation as a NoEcho custom-resource + attribute. + Runtime: nodejs22.x + Architectures: [arm64] + Handler: index.handler + Role: !GetAtt ApiKeyKmsDecrypterRole.Arn + Timeout: 30 + MemorySize: 128 + Code: + ZipFile: | + 'use strict'; + const https = require('https'); + const { URL } = require('url'); + const { KMSClient, DecryptCommand } = require('@aws-sdk/client-kms'); + + const respond = (event, context, status, data, reason) => + new Promise((resolve, reject) => { + const body = JSON.stringify({ + Status: status, + Reason: reason || `See ${context.logGroupName}/${context.logStreamName}`, + PhysicalResourceId: event.PhysicalResourceId || context.logStreamName, + StackId: event.StackId, + RequestId: event.RequestId, + LogicalResourceId: event.LogicalResourceId, + NoEcho: true, + Data: data || {}, + }); + const u = new URL(event.ResponseURL); + const req = https.request({ + hostname: u.hostname, + port: 443, + path: u.pathname + u.search, + method: 'PUT', + headers: { 'content-type': '', 'content-length': Buffer.byteLength(body) }, + }, (res) => { res.on('data', () => {}); res.on('end', resolve); }); + req.on('error', reject); + req.write(body); + req.end(); 
+ }); + + exports.handler = async (event, context) => { + try { + if (event.RequestType === 'Delete') { + return respond(event, context, 'SUCCESS', {}); + } + const ciphertext = event.ResourceProperties && event.ResourceProperties.Ciphertext; + if (!ciphertext) { + return respond(event, context, 'FAILED', {}, 'Ciphertext property is required'); + } + const kms = new KMSClient({}); + const out = await kms.send(new DecryptCommand({ + CiphertextBlob: Buffer.from(ciphertext, 'base64'), + })); + const apiKey = Buffer.from(out.Plaintext).toString('utf8'); + return respond(event, context, 'SUCCESS', { ApiKey: apiKey }); + } catch (err) { + // Do not log err.message: KMS errors can be informative, + // but plaintext should never reach CloudWatch. + console.error('KMS decrypt failed:', err.name || 'Error'); + return respond(event, context, 'FAILED', {}, `Decrypt failed: ${err.name || 'Error'}`); + } + }; + + ApiKeyKmsDecrypterLogGroup: + Type: AWS::Logs::LogGroup + Condition: UseApiKeyKms + Properties: + LogGroupName: !Sub "/aws/lambda/${ApiKeyKmsDecrypterFunction}" + RetentionInDays: 7 + + DecryptedApiKey: + Type: Custom::DatadogApiKeyKmsDecrypt + Condition: UseApiKeyKms + Properties: + ServiceToken: !GetAtt ApiKeyKmsDecrypterFunction.Arn + # Including the ciphertext in the properties is what makes + # CloudFormation re-invoke the custom resource on update if the + # ciphertext changes. + Ciphertext: !Ref DdApiKeyKmsCiphertext + + DeliveryStream: + Type: AWS::KinesisFirehose::DeliveryStream + Properties: + DeliveryStreamType: DirectPut + HttpEndpointDestinationConfiguration: + EndpointConfiguration: + Name: Datadog + # Firehose's Url field accepts only https://[/path], no + # query string, so metadata cannot ride on the URL. This stack + # also sets no CommonAttributes (see RequestConfiguration + # below); tagging is handled by Datadog's AWS integration and a + # logs processing pipeline on the Datadog side. 
+ Url: !Sub "https://aws-kinesis-http-intake.logs.${DdSite}/v1/input" + AccessKey: !If + - UseApiKey + - !Ref DdApiKey + - !If + - UseApiKeySecret + - !Sub "{{resolve:secretsmanager:${DdApiKeySecretArn}:SecretString}}" + - !If + - UseApiKeySsm + - !Sub "{{resolve:ssm-secure:${DdApiKeySsmParameterName}}}" + - !If + - UseApiKeyKms + - !GetAtt DecryptedApiKey.ApiKey + - !Ref AWS::NoValue + BufferingHints: + IntervalInSeconds: !Ref BufferIntervalSeconds + SizeInMBs: 4 + RetryOptions: + DurationInSeconds: 60 + # Datadog's Firehose intake does not interpret common-attributes + # header keys (dd-service / dd-source / dd-tags) as log metadata — + # it surfaces each as a raw tag with the literal key, and tag + # values can't contain commas so a joined dd-tags value would be + # mangled. We explicitly set CommonAttributes: [] (instead of + # omitting RequestConfiguration entirely) because CloudFormation + # does not push outright property removals to Firehose — omission + # would leave previously-configured attributes live on the stream. + # Datadog's AWS integration auto-tags service/source/region/ + # aws_account from the raw envelope's source field and the + # Firehose ARN, so we get those for free. Any extra metadata + # (service override, env, version, custom tags) is set by a + # Datadog log processing pipeline against these events. 
+ RequestConfiguration: + CommonAttributes: [] + CloudWatchLoggingOptions: + Enabled: true + LogGroupName: !Ref FirehoseLogGroup + LogStreamName: !Ref FirehoseHttpLogStream + RoleARN: !GetAtt FirehoseRole.Arn + S3BackupMode: FailedDataOnly + S3Configuration: + BucketARN: !GetAtt BackupBucket.Arn + RoleARN: !GetAtt FirehoseRole.Arn + BufferingHints: + IntervalInSeconds: 300 + SizeInMBs: 5 + CompressionFormat: GZIP + CloudWatchLoggingOptions: + Enabled: true + LogGroupName: !Ref FirehoseLogGroup + LogStreamName: !Ref FirehoseS3LogStream + # Firehose forwards each EventBridge envelope to Datadog unchanged; + # all reshaping (function ARN qualifier stripping, detail.* + # flattening, ISO timestamp parsing) is configured on the Datadog + # side via a logs processing pipeline. We explicitly set + # Enabled: false instead of omitting ProcessingConfiguration — + # CloudFormation does not push outright property removals to + # Firehose, so omitting it would leave a previously-attached Lambda + # processor live on the stream. + ProcessingConfiguration: + Enabled: false + + # --------------------------------------------------------------------------- + # EventBridge rule. Captures aws.lambda "Durable Execution Status Change" + # events and routes them to Firehose. The functionArn matcher uses two + # wildcard entries so it covers both unqualified ARNs and ARNs with a + # version or alias suffix; a single suffix() match would miss qualified + # ARNs. + # --------------------------------------------------------------------------- + EventRule: + Type: AWS::Events::Rule + Properties: + Description: >- + Routes Lambda Durable Function execution status-change events to the + Datadog Firehose delivery stream. + State: ENABLED + EventPattern: + source: + - aws.lambda + detail-type: + - Durable Execution Status Change + detail: + status: !Ref Statuses + # Two wildcard matchers per populated filter slot: one for + # unqualified ARNs and one for version/alias-qualified ARNs. 
+ # Empty slots resolve to AWS::NoValue and are stripped from the + # rendered list by CloudFormation, so the EventPattern ends up + # with exactly 2N matchers where N is the count of populated + # FunctionNameFilterN slots. + functionArn: !If + - HasFunctionFilter + - - !If [HasFilter1, {wildcard: !Sub "*:function:${FunctionNameFilter1}"}, !Ref AWS::NoValue] + - !If [HasFilter1, {wildcard: !Sub "*:function:${FunctionNameFilter1}:*"}, !Ref AWS::NoValue] + - !If [HasFilter2, {wildcard: !Sub "*:function:${FunctionNameFilter2}"}, !Ref AWS::NoValue] + - !If [HasFilter2, {wildcard: !Sub "*:function:${FunctionNameFilter2}:*"}, !Ref AWS::NoValue] + - !If [HasFilter3, {wildcard: !Sub "*:function:${FunctionNameFilter3}"}, !Ref AWS::NoValue] + - !If [HasFilter3, {wildcard: !Sub "*:function:${FunctionNameFilter3}:*"}, !Ref AWS::NoValue] + - !If [HasFilter4, {wildcard: !Sub "*:function:${FunctionNameFilter4}"}, !Ref AWS::NoValue] + - !If [HasFilter4, {wildcard: !Sub "*:function:${FunctionNameFilter4}:*"}, !Ref AWS::NoValue] + - !If [HasFilter5, {wildcard: !Sub "*:function:${FunctionNameFilter5}"}, !Ref AWS::NoValue] + - !If [HasFilter5, {wildcard: !Sub "*:function:${FunctionNameFilter5}:*"}, !Ref AWS::NoValue] + - !Ref AWS::NoValue + Targets: + - Id: FirehoseTarget + Arn: !GetAtt DeliveryStream.Arn + RoleArn: !GetAtt EventBridgeRole.Arn + + EventBridgeRole: + Type: AWS::IAM::Role + Properties: + AssumeRolePolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Principal: + Service: events.amazonaws.com + Action: sts:AssumeRole + Policies: + - PolicyName: PutToFirehose + PolicyDocument: + Version: "2012-10-17" + Statement: + - Effect: Allow + Action: + - firehose:PutRecord + - firehose:PutRecordBatch + Resource: !GetAtt DeliveryStream.Arn + +Outputs: + DeliveryStreamArn: + Description: ARN of the Firehose delivery stream. + Value: !GetAtt DeliveryStream.Arn + BackupBucketName: + Description: S3 bucket that captures records the Datadog intake rejected. 
+ Value: !Ref BackupBucket + EventRuleArn: + Description: ARN of the EventBridge rule capturing durable execution events. + Value: !GetAtt EventRule.Arn + ForwarderVersion: + Description: Version of this forwarder template. + Value: !FindInMap [Constants, DdDurableEventForwarder, Version] From 7ced82c8da67325692eb68452aad65e33c8a899e Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Thu, 11 Jun 2026 14:20:26 -0400 Subject: [PATCH 2/9] Add release.sh to publish the durable-function forwarder template to S3 Publishes template.yaml to the public datadog-cloudformation-template bucket at aws/lambda-durable-function-event-forwarder/.yaml (+ latest.yaml), authenticating to the Datadog Prod account (464622532012) via the prod-engineering role. Requires a semver version arg, validates the template, and refuses to overwrite an already-published version. README publishing section updated to match. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../README.md | 66 ++++---- .../release.sh | 143 ++++++++++++++++++ 2 files changed, 171 insertions(+), 38 deletions(-) create mode 100755 aws/durable_function_event_forwarder/release.sh diff --git a/aws/durable_function_event_forwarder/README.md b/aws/durable_function_event_forwarder/README.md index dee9e6b0c..268eaa877 100644 --- a/aws/durable_function_event_forwarder/README.md +++ b/aws/durable_function_event_forwarder/README.md @@ -138,41 +138,34 @@ Once the template is hosted at a public S3 URL, customers can reference it directly — no zip artifact, region replication, or layer publish is needed. The template is the only thing to ship. -The convention mirrors `aws/logs_monitoring/release.sh` in this repo: - -- **Public bucket**: `datadog-cloudformation-template` (same bucket the - Lambda Forwarder already uses). 
-- **Versioned object key**: - `aws/durable_function_event_forwarder/.yaml` -- **Floating "latest" key**: - `aws/durable_function_event_forwarder/latest.yaml` -- **Sandbox / staging**: separate path - `aws/durable_function_event_forwarder-staging/.yaml` against the - per-environment sandbox bucket. - -Upload commands per release (run from this directory): +Use `release.sh` (modeled on `aws/logs_monitoring/release.sh`), run from this +directory with a required semantic version: ```bash -VERSION=$(yq '.Mappings.Constants.DdDurableEventForwarder.Version' template.yaml | tr -d '"') -BUCKET=datadog-cloudformation-template - -aws cloudformation validate-template --template-body file://template.yaml - -aws s3 cp template.yaml \ - "s3://${BUCKET}/aws/durable_function_event_forwarder/${VERSION}.yaml" \ - --content-type "application/yaml" - -aws s3 cp template.yaml \ - "s3://${BUCKET}/aws/durable_function_event_forwarder/latest.yaml" \ - --content-type "application/yaml" +./release.sh 0.1.0 ``` +The script: + +1. Authenticates to the Datadog Prod account (`464622532012`, which owns the + bucket) using the `prod-engineering` role via + `aws-vault exec sso-prod-engineering` (override with `AWS_VAULT_PROFILE`), + and aborts unless the resolved account is the prod account. +2. Validates `template.yaml` with `aws cloudformation validate-template`. +3. **Refuses to overwrite an already-published `.yaml`** — released + versions are immutable, so bump the version to republish. +4. 
After a confirmation prompt, uploads `template.yaml` to both keys in the + public bucket `datadog-cloudformation-template`: + - `aws/lambda-durable-function-event-forwarder/.yaml` (new, immutable) + - `aws/lambda-durable-function-event-forwarder/latest.yaml` (floating + pointer, always overwritten to point at the version just published) + Customer-facing URLs after publish: - Versioned (recommended for nested stacks): - `https://datadog-cloudformation-template.s3.amazonaws.com/aws/durable_function_event_forwarder/.yaml` + `https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/.yaml` - Latest (convenient for one-off console deploys; not pinned, will change): - `https://datadog-cloudformation-template.s3.amazonaws.com/aws/durable_function_event_forwarder/latest.yaml` + `https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/latest.yaml` ### Quick-create link (give this to customers) @@ -183,7 +176,7 @@ parameter's `Default:` value. ``` https://console.aws.amazon.com/cloudformation/home#/stacks/quickcreate ?stackName=datadog-durable-function-event-forwarder - &templateURL=https://datadog-cloudformation-template.s3.amazonaws.com/aws/durable_function_event_forwarder/latest.yaml + &templateURL=https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/latest.yaml &param_DdService= &param_DdEnv= ``` @@ -195,15 +188,11 @@ link. ### Release checklist 1. Bump `Mappings.Constants.DdDurableEventForwarder.Version` in - `template.yaml`. -2. `aws cloudformation validate-template ...` -3. `aws s3 cp` to both `.yaml` and `latest.yaml` keys. -4. Tag the release (`git tag durable-function-event-forwarder/`). - -A `release.sh` modeled on `aws/logs_monitoring/release.sh` can automate -steps 1–4. It is intentionally not committed yet — the template ships -self-contained, so a one-line `aws s3 cp` is sufficient until cross-env -sandbox/prod automation is needed. 
+ `template.yaml` to the new version and merge to `master`. +2. Run `./release.sh ` with the same version — it validates the + template, refuses to overwrite an existing `.yaml`, then uploads + `.yaml` and updates `latest.yaml`. +3. Tag the release (`git tag durable-function-event-forwarder/`). ## Deploying directly @@ -224,7 +213,7 @@ aws cloudformation deploy \ DurableFunctionEvents: Type: AWS::CloudFormation::Stack Properties: - TemplateURL: https:///aws/durable_function_event_forwarder//template.yaml + TemplateURL: https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/.yaml Parameters: DdApiKeySecretArn: !Ref DatadogApiKeySecret DdService: my-service @@ -300,6 +289,7 @@ methods. | File | Purpose | | --- | --- | | `template.yaml` | Canonical CloudFormation template. | +| `release.sh` | Publishes `template.yaml` to the public `datadog-cloudformation-template` bucket (`aws/lambda-durable-function-event-forwarder/.yaml` + `latest.yaml`). | ## Notes diff --git a/aws/durable_function_event_forwarder/release.sh b/aws/durable_function_event_forwarder/release.sh new file mode 100755 index 000000000..c6bdfcaa0 --- /dev/null +++ b/aws/durable_function_event_forwarder/release.sh @@ -0,0 +1,143 @@ +#!/usr/bin/env bash + +# Publishes the Lambda Durable Function event-forwarder CloudFormation template +# to the public Datadog CloudFormation template bucket: +# +# https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/.yaml +# https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/latest.yaml +# +# The bucket lives in the Datadog Prod account (464622532012); the script +# authenticates with that account's prod-engineering role via aws-vault. +# +# Usage: +# ./release.sh # e.g. ./release.sh 0.1.0 +# +# A semantic version is REQUIRED. 
The script refuses to overwrite an existing +# .yaml (immutable released versions); latest.yaml is always updated to +# point at the version being published. +# +# Override the aws-vault profile with AWS_VAULT_PROFILE if yours differs. + +set -o nounset -o pipefail -o errexit + +log_info() { + local BLUE='\033[0;34m' + local RESET='\033[0m' + printf -- "%b%b%b\n" "${BLUE}" "${*}" "${RESET}" 1>&2 +} + +log_success() { + local GREEN='\033[0;32m' + local RESET='\033[0m' + printf -- "%b%b%b\n" "${GREEN}" "${*}" "${RESET}" 1>&2 +} + +log_error() { + local RED='\033[0;31m' + local RESET='\033[0m' + printf -- "%b%b%b\n" "${RED}" "${*}" "${RESET}" 1>&2 + exit 1 +} + +user_confirm() { + local input + read -r -p "${1:-Are you sure}? [y/N] " input (e.g. ${0} 0.1.0)" +fi + +VERSION="${1}" +if [[ ! ${VERSION} =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then + log_error "You must use a semantic version (e.g. 0.1.0), got: '${VERSION}'" +fi + +if ! command -v aws >/dev/null 2>&1; then + log_error "aws CLI not found; please install it first" +fi + +if ! command -v aws-vault >/dev/null 2>&1; then + log_error "aws-vault not found; please install it first" +fi + +if [[ ! -f ${TEMPLATE_PATH} ]]; then + log_error "Template not found at ${TEMPLATE_PATH}" +fi + +readonly VERSION_KEY="${KEY_PREFIX}/${VERSION}.yaml" +readonly LATEST_KEY="${KEY_PREFIX}/latest.yaml" +readonly VERSION_URL="https://${BUCKET}.s3.amazonaws.com/${VERSION_KEY}" +readonly LATEST_URL="https://${BUCKET}.s3.amazonaws.com/${LATEST_KEY}" + +# --- Validate identity ----------------------------------------------------- # + +log_info "Authenticating with aws-vault profile '${AWS_VAULT_PROFILE}'..." +CURRENT_ACCOUNT="$(aws_login aws sts get-caller-identity --query Account --output text)" + +if [[ ${CURRENT_ACCOUNT} != "${PROD_ACCOUNT_ID}" ]]; then + log_error "Expected to be in the Datadog Prod account (${PROD_ACCOUNT_ID}) but got ${CURRENT_ACCOUNT}. Check AWS_VAULT_PROFILE." 
+fi +log_success "Authenticated to account ${CURRENT_ACCOUNT}." + +# --- Validate the template ------------------------------------------------- # + +log_info "Validating ${TEMPLATE_PATH}..." +aws_login aws cloudformation validate-template --template-body "file://${TEMPLATE_PATH}" >/dev/null +log_success "Template is valid." + +# --- Refuse to overwrite an already-published version ---------------------- # + +log_info "Checking whether s3://${BUCKET}/${VERSION_KEY} already exists..." +if aws_login aws s3api head-object --bucket "${BUCKET}" --key "${VERSION_KEY}" >/dev/null 2>&1; then + log_error "s3://${BUCKET}/${VERSION_KEY} already exists. Released versions are immutable; bump the version." +fi +log_success "Version ${VERSION} is not published yet." + +# --- Confirm and publish --------------------------------------------------- # + +log_info "About to publish:" +log_info " ${TEMPLATE_PATH}" +log_info " -> s3://${BUCKET}/${VERSION_KEY} (new, immutable)" +log_info " -> s3://${BUCKET}/${LATEST_KEY} (overwrite: latest -> ${VERSION})" + +if ! user_confirm "Continue"; then + log_error "Aborting..." +fi + +log_info "Uploading versioned template..." +aws_login aws s3 cp "${TEMPLATE_PATH}" "s3://${BUCKET}/${VERSION_KEY}" --content-type "text/yaml" + +log_info "Updating latest.yaml..." +aws_login aws s3 cp "${TEMPLATE_PATH}" "s3://${BUCKET}/${LATEST_KEY}" --content-type "text/yaml" + +log_success "Published version ${VERSION}!" 
+log_info "Versioned URL: ${VERSION_URL}" +log_info "Latest URL: ${LATEST_URL}" +log_info "" +log_info "CloudFormation quick-create URL (latest):" +log_info "https://console.aws.amazon.com/cloudformation/home#/stacks/create/review?templateURL=${LATEST_URL}&stackName=datadog-durable-function-event-forwarder" From e62e263ddd278056a4a35b8272d0d36f40f87b4b Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Thu, 11 Jun 2026 14:48:18 -0400 Subject: [PATCH 3/9] Rename function-name filter to ARN filter; default to all statuses - FunctionNameFilter{1..5} -> FunctionArnFilter{1..5}: users now supply an unqualified function ARN (or wildcard over one) instead of a bare name. We append ":*" since detail.functionArn is always version/alias-qualified, and an AllowedPattern rejects a pasted qualified ARN at deploy time. - Statuses now defaults to "" (forward all). The status key is dropped from the EventPattern when empty, and the whole detail block is omitted when neither a status nor a function filter is set (empty detail:{} is invalid). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../README.md | 35 +++-- .../template.yaml | 146 ++++++++++-------- 2 files changed, 107 insertions(+), 74 deletions(-) diff --git a/aws/durable_function_event_forwarder/README.md b/aws/durable_function_event_forwarder/README.md index 268eaa877..ace0ddb99 100644 --- a/aws/durable_function_event_forwarder/README.md +++ b/aws/durable_function_event_forwarder/README.md @@ -44,8 +44,8 @@ EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) | `DdEnv` | no | "" | Datadog `env` tag. | | `DdVersion` | no | "" | Datadog `version` tag. | | `DdTags` | no | "" | Comma-delimited extra tags (for example `team:durable,owner:platform`). | -| `Statuses` | no | `TIMED_OUT,STOPPED` | EventBridge `detail.status` values to forward. Must be uppercase, comma-delimited. 
| -| `FunctionNameFilter1` … `FunctionNameFilter5` | no | "" | Up to 5 independent function-name filters. Each accepts a name or EventBridge wildcard (for example `prod-*-orders`). All five empty matches all functions in the region. Each populated slot adds matchers for both unqualified and version/alias-qualified ARNs — see [Filtering multiple functions](#filtering-multiple-functions). | +| `Statuses` | no | "" | EventBridge `detail.status` values to forward (uppercase, comma-delimited). Empty (the default) forwards **all** statuses — no status filter is added to the rule. | +| `FunctionArnFilter1` … `FunctionArnFilter5` | no | "" | Up to 5 independent function-ARN filters. Each accepts an **unqualified** function ARN or an EventBridge wildcard over one (for example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); do not add a version/alias suffix — `:*` is appended automatically. All five empty matches all functions in the region. See [Filtering multiple functions](#filtering-multiple-functions). | | `BufferIntervalSeconds` | no | `60` | Firehose buffer interval (60–900). | `Rules.ApiKeyRequired` asserts at least one of the four API key parameters @@ -227,12 +227,20 @@ Datadog's logs processing pipeline. ## Filtering multiple functions -Up to 5 independent function-name filters are exposed as separate -parameters (`FunctionNameFilter1` … `FunctionNameFilter5`). Each -populated slot contributes two matchers to the EventBridge rule — one -for unqualified ARNs and one for version/alias-qualified ARNs. Empty -slots are stripped from the rendered list at deploy time, so leaving -gaps (e.g., populating slots 1, 3, 5) is fine. +Up to 5 independent function-ARN filters are exposed as separate +parameters (`FunctionArnFilter1` … `FunctionArnFilter5`). 
Each accepts an +**unqualified** function ARN or an EventBridge wildcard over one (for +example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); +scope by region and account by including them in the pattern. Each +populated slot contributes one matcher to the EventBridge rule: the +supplied ARN with `:*` appended. The durable-execution `detail.functionArn` +always carries a version/alias qualifier (per the AWS +[Monitoring durable functions](https://docs.aws.amazon.com/lambda/latest/dg/durable-monitoring.html#durable-monitoring-eventbridge) +docs), so `:*` is what matches — do not add a qualifier yourself. An +`AllowedPattern` rejects a trailing qualifier so a pasted qualified ARN +fails at deploy time rather than silently matching nothing. Empty slots +are stripped from the rendered list at deploy time, so leaving gaps (e.g., +populating slots 1, 3, 5) is fine. Why five separate parameters instead of one comma-separated list: `AWS::Events::Rule.EventPattern` is typed `Json` (an arbitrary blob), so @@ -245,8 +253,8 @@ exposed as individual parameters. We chose (c) because each slot is locally simple to read in the template. If you need more than 5 filters in one region, either widen one of the -slots with a wildcard (`prod-*` covers every function starting with -`prod-`) or deploy a second stack — they're independent. +slots with a wildcard (`...:function:prod-*` covers every function whose +name starts with `prod-`) or deploy a second stack — they're independent. ## API key from KMS ciphertext @@ -293,9 +301,10 @@ methods. ## Notes -- The function-name filter emits two `wildcard` patterns per value so it - matches both unqualified ARNs and version/alias-qualified ARNs. A single - `suffix(":function:")` matcher would miss qualified ARNs. +- The function-ARN filter emits one `wildcard` pattern per value: the + supplied unqualified ARN with `:*` appended. 
The event's + `detail.functionArn` always carries a version/alias qualifier, so the + `:*` is what matches; a bare-ARN matcher would never fire. - `BufferingHints` is set explicitly even at its default value: omitting it has historically caused CloudFormation drift on subsequent updates. - The backup bucket is retained on stack deletion diff --git a/aws/durable_function_event_forwarder/template.yaml b/aws/durable_function_event_forwarder/template.yaml index 61d0675cf..def60f8ad 100644 --- a/aws/durable_function_event_forwarder/template.yaml +++ b/aws/durable_function_event_forwarder/template.yaml @@ -36,11 +36,11 @@ Metadata: default: Event filters Parameters: - Statuses - - FunctionNameFilter1 - - FunctionNameFilter2 - - FunctionNameFilter3 - - FunctionNameFilter4 - - FunctionNameFilter5 + - FunctionArnFilter1 + - FunctionArnFilter2 + - FunctionArnFilter3 + - FunctionArnFilter4 + - FunctionArnFilter5 - Label: default: Tuning Parameters: @@ -57,11 +57,11 @@ Metadata: DdVersion: { default: Version tag } DdTags: { default: Additional tags } Statuses: { default: Statuses to forward } - FunctionNameFilter1: { default: Function name filter 1 (optional) } - FunctionNameFilter2: { default: Function name filter 2 (optional) } - FunctionNameFilter3: { default: Function name filter 3 (optional) } - FunctionNameFilter4: { default: Function name filter 4 (optional) } - FunctionNameFilter5: { default: Function name filter 5 (optional) } + FunctionArnFilter1: { default: Function ARN filter 1 (optional) } + FunctionArnFilter2: { default: Function ARN filter 2 (optional) } + FunctionArnFilter3: { default: Function ARN filter 3 (optional) } + FunctionArnFilter4: { default: Function ARN filter 4 (optional) } + FunctionArnFilter5: { default: Function ARN filter 5 (optional) } BufferIntervalSeconds: { default: Firehose buffer interval (seconds) } Mappings: @@ -154,42 +154,54 @@ Parameters: # ---- Event filters ---- Statuses: Type: CommaDelimitedList - Default: "TIMED_OUT,STOPPED" + 
Default: "" Description: >- Comma-separated list of execution status values to forward. Values must - match the EventBridge detail.status field exactly (uppercase). Default - forwards only terminal failure-like statuses. - # Up to 5 independent function-name filters. CloudFormation has no + match the EventBridge detail.status field exactly (uppercase). Leave + empty (the default) to forward every status — no status filter is added + to the EventBridge rule. + # Up to 5 independent function-ARN filters. CloudFormation has no # native iteration that fits AWS::Events::Rule.EventPattern (a Json blob, # not a schema-typed list), so each slot is exposed as its own optional - # parameter. Each populated slot emits two wildcard matchers — one for - # unqualified ARNs and one for version/alias-qualified ARNs. Slots left - # empty are removed from the EventPattern via AWS::NoValue, so they have - # no effect on the rendered rule. - FunctionNameFilter1: + # parameter. Each populated slot emits one wildcard matcher: the supplied + # UNqualified ARN with ":*" appended, since the event's functionArn always + # carries a version/alias qualifier. The AllowedPattern rejects a trailing + # qualifier so a pasted qualified ARN fails at deploy time instead of + # silently matching nothing. Slots left empty are removed from the + # EventPattern via AWS::NoValue, so they have no effect on the rendered rule. + FunctionArnFilter1: Type: String Default: "" + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" Description: >- - Optional Lambda function name or EventBridge wildcard pattern (for - example "my-fn" or "prod-*-orders") used to restrict which functions' - events are captured. If all five FunctionNameFilterN parameters are - empty, the rule matches every function in this region. 
- FunctionNameFilter2: + Optional UNqualified Lambda function ARN, or an EventBridge wildcard + pattern over one (for example + "arn:aws:lambda:us-east-2:123456789012:function:my-durable-*"), used to + restrict which functions' events are captured. Do not include a version + or alias suffix — ":*" is appended automatically to match any qualifier. + Scope by region and account by including them in the pattern. If all + five FunctionArnFilterN parameters are empty, the rule matches every + function in this region. + FunctionArnFilter2: Type: String Default: "" - Description: Optional additional function name or wildcard pattern. - FunctionNameFilter3: + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: Optional additional unqualified function ARN or wildcard pattern. + FunctionArnFilter3: Type: String Default: "" - Description: Optional additional function name or wildcard pattern. - FunctionNameFilter4: + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: Optional additional unqualified function ARN or wildcard pattern. + FunctionArnFilter4: Type: String Default: "" - Description: Optional additional function name or wildcard pattern. - FunctionNameFilter5: + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: Optional additional unqualified function ARN or wildcard pattern. + FunctionArnFilter5: Type: String Default: "" - Description: Optional additional function name or wildcard pattern. + AllowedPattern: "^$|^arn:aws[a-z-]*:lambda:[a-z0-9-*]*:[0-9*]*:function:[a-zA-Z0-9_*-]+$" + Description: Optional additional unqualified function ARN or wildcard pattern. 
# ---- Tuning ---- BufferIntervalSeconds: @@ -211,17 +223,25 @@ Conditions: HasEnv: !Not [!Equals [!Ref DdEnv, ""]] HasVersion: !Not [!Equals [!Ref DdVersion, ""]] HasTags: !Not [!Equals [!Ref DdTags, ""]] - HasFilter1: !Not [!Equals [!Ref FunctionNameFilter1, ""]] - HasFilter2: !Not [!Equals [!Ref FunctionNameFilter2, ""]] - HasFilter3: !Not [!Equals [!Ref FunctionNameFilter3, ""]] - HasFilter4: !Not [!Equals [!Ref FunctionNameFilter4, ""]] - HasFilter5: !Not [!Equals [!Ref FunctionNameFilter5, ""]] + # Statuses is a CommaDelimitedList; an empty default joins to "" so this is + # false, which drops the status key from the EventPattern (forward all). + HasStatusFilter: !Not [!Equals [!Join ["", !Ref Statuses], ""]] + HasFilter1: !Not [!Equals [!Ref FunctionArnFilter1, ""]] + HasFilter2: !Not [!Equals [!Ref FunctionArnFilter2, ""]] + HasFilter3: !Not [!Equals [!Ref FunctionArnFilter3, ""]] + HasFilter4: !Not [!Equals [!Ref FunctionArnFilter4, ""]] + HasFilter5: !Not [!Equals [!Ref FunctionArnFilter5, ""]] HasFunctionFilter: !Or - !Condition HasFilter1 - !Condition HasFilter2 - !Condition HasFilter3 - !Condition HasFilter4 - !Condition HasFilter5 + # When neither a status nor a function filter is set, the detail block is + # omitted entirely — an empty "detail: {}" is rejected by EventBridge. + HasDetailFilter: !Or + - !Condition HasStatusFilter + - !Condition HasFunctionFilter Rules: ApiKeyRequired: @@ -536,10 +556,9 @@ Resources: # --------------------------------------------------------------------------- # EventBridge rule. Captures aws.lambda "Durable Execution Status Change" - # events and routes them to Firehose. The functionArn matcher uses two - # wildcard entries so it covers both unqualified ARNs and ARNs with a - # version or alias suffix; a single suffix() match would miss qualified - # ARNs. + # events and routes them to Firehose. 
Each filter is an unqualified + # function ARN with ":*" appended, because the event's detail.functionArn + # always carries a version/alias qualifier. # --------------------------------------------------------------------------- EventRule: Type: AWS::Events::Rule @@ -553,27 +572,32 @@ Resources: - aws.lambda detail-type: - Durable Execution Status Change - detail: - status: !Ref Statuses - # Two wildcard matchers per populated filter slot: one for - # unqualified ARNs and one for version/alias-qualified ARNs. - # Empty slots resolve to AWS::NoValue and are stripped from the - # rendered list by CloudFormation, so the EventPattern ends up - # with exactly 2N matchers where N is the count of populated - # FunctionNameFilterN slots. - functionArn: !If - - HasFunctionFilter - - - !If [HasFilter1, {wildcard: !Sub "*:function:${FunctionNameFilter1}"}, !Ref AWS::NoValue] - - !If [HasFilter1, {wildcard: !Sub "*:function:${FunctionNameFilter1}:*"}, !Ref AWS::NoValue] - - !If [HasFilter2, {wildcard: !Sub "*:function:${FunctionNameFilter2}"}, !Ref AWS::NoValue] - - !If [HasFilter2, {wildcard: !Sub "*:function:${FunctionNameFilter2}:*"}, !Ref AWS::NoValue] - - !If [HasFilter3, {wildcard: !Sub "*:function:${FunctionNameFilter3}"}, !Ref AWS::NoValue] - - !If [HasFilter3, {wildcard: !Sub "*:function:${FunctionNameFilter3}:*"}, !Ref AWS::NoValue] - - !If [HasFilter4, {wildcard: !Sub "*:function:${FunctionNameFilter4}"}, !Ref AWS::NoValue] - - !If [HasFilter4, {wildcard: !Sub "*:function:${FunctionNameFilter4}:*"}, !Ref AWS::NoValue] - - !If [HasFilter5, {wildcard: !Sub "*:function:${FunctionNameFilter5}"}, !Ref AWS::NoValue] - - !If [HasFilter5, {wildcard: !Sub "*:function:${FunctionNameFilter5}:*"}, !Ref AWS::NoValue] - - !Ref AWS::NoValue + # detail is omitted entirely when neither a status nor a function + # filter is set (an empty "detail: {}" is rejected by EventBridge), + # so the default rule matches on source + detail-type alone. 
+ detail: !If + - HasDetailFilter + - # Status key omitted when Statuses is empty, so the rule forwards + # every status by default. + status: !If [HasStatusFilter, !Ref Statuses, !Ref AWS::NoValue] + # One wildcard matcher per populated filter slot. The user supplies + # an UNqualified function ARN (or a wildcard over one) and we append + # ":*" — the durable-execution detail.functionArn is always + # version/alias-qualified (see AWS "Monitoring durable functions" + # docs), so the ":*" is what actually matches and a bare-ARN matcher + # would never fire. Empty slots resolve to AWS::NoValue and are + # stripped from the rendered list by CloudFormation, so the + # EventPattern ends up with exactly N matchers where N is the count + # of populated FunctionArnFilterN slots. + functionArn: !If + - HasFunctionFilter + - - !If [HasFilter1, {wildcard: !Sub "${FunctionArnFilter1}:*"}, !Ref AWS::NoValue] + - !If [HasFilter2, {wildcard: !Sub "${FunctionArnFilter2}:*"}, !Ref AWS::NoValue] + - !If [HasFilter3, {wildcard: !Sub "${FunctionArnFilter3}:*"}, !Ref AWS::NoValue] + - !If [HasFilter4, {wildcard: !Sub "${FunctionArnFilter4}:*"}, !Ref AWS::NoValue] + - !If [HasFilter5, {wildcard: !Sub "${FunctionArnFilter5}:*"}, !Ref AWS::NoValue] + - !Ref AWS::NoValue + - !Ref AWS::NoValue Targets: - Id: FirehoseTarget Arn: !GetAtt DeliveryStream.Arn From a54f1a93e9f674c66c6a20cf4f3d9ad1f247fbbc Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Fri, 12 Jun 2026 13:00:32 -0400 Subject: [PATCH 4/9] Drop KMS-ciphertext API-key option; trim param docs - Remove the DdApiKeyKmsCiphertext / DdApiKeyKmsKeyArn key path and its four deploy-time decrypter resources (Lambda + role + log group + custom resource). It was carried over from the Lambda forwarder's DD_KMS_API_KEY pattern, but that forwarder already has a runtime Lambda; here it meant a whole custom-resource decrypter for the least-secure of the options. 
The two dynamic-reference paths (Secrets Manager, SSM SecureString) cover the "keep plaintext out of the template" need more securely at zero resource cost. API key is now one of three. Also drops the now-unneeded W1030 cfn-lint suppression. - Trim verbose parameter descriptions; list valid Statuses values (RUNNING/SUCCEEDED/FAILED/TIMED_OUT/STOPPED); replace em-dashes with ASCII so they render correctly in the CloudFormation console. - release.sh: validate with local cfn-lint instead of cloudformation:ValidateTemplate (the publishing role is scoped to S3). Co-Authored-By: Claude Opus 4.8 (1M context) --- .../README.md | 46 +--- .../release.sh | 23 +- .../template.yaml | 201 ++---------------- 3 files changed, 40 insertions(+), 230 deletions(-) diff --git a/aws/durable_function_event_forwarder/README.md b/aws/durable_function_event_forwarder/README.md index ace0ddb99..7cdb6d2f2 100644 --- a/aws/durable_function_event_forwarder/README.md +++ b/aws/durable_function_event_forwarder/README.md @@ -34,11 +34,9 @@ EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) | Parameter | Required | Default | Description | | --- | --- | --- | --- | -| `DdApiKey` | one of four | "" | Plaintext Datadog API key (`NoEcho`). | -| `DdApiKeySecretArn` | one of four | "" | ARN of a Secrets Manager secret whose `SecretString` is the API key. Resolved via `{{resolve:secretsmanager:...}}`. | -| `DdApiKeySsmParameterName` | one of four | "" | Name of an SSM SecureString parameter holding the API key. Resolved via `{{resolve:ssm-secure:...}}`. | -| `DdApiKeyKmsCiphertext` | one of four | "" | Base64-encoded KMS ciphertext of the API key. A deploy-time Lambda decrypts it via `kms:Decrypt` and hands the plaintext to Firehose as a `NoEcho` custom-resource attribute. See [API key from KMS ciphertext](#api-key-from-kms-ciphertext). | -| `DdApiKeyKmsKeyArn` | when ciphertext set | "" | ARN of the KMS key that encrypted `DdApiKeyKmsCiphertext`. 
Used to scope the decrypter Lambda's `kms:Decrypt` permission. Required when `DdApiKeyKmsCiphertext` is set (enforced by a `Rules` assertion). | +| `DdApiKey` | one of three | "" | Plaintext Datadog API key (`NoEcho`). | +| `DdApiKeySecretArn` | one of three | "" | ARN of a Secrets Manager secret whose `SecretString` is the API key. Resolved via `{{resolve:secretsmanager:...}}`. | +| `DdApiKeySsmParameterName` | one of three | "" | Name of an SSM SecureString parameter holding the API key. Resolved via `{{resolve:ssm-secure:...}}`. | | `DdSite` | no | `datadoghq.com` | Datadog site; used to build the Firehose destination URL. | | `DdService` | no | `datadog-durable-function-event-forwarder` | Datadog `service` tag applied to every forwarded event. Override to match your existing service taxonomy. | | `DdEnv` | no | "" | Datadog `env` tag. | @@ -48,7 +46,7 @@ EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) | `FunctionArnFilter1` … `FunctionArnFilter5` | no | "" | Up to 5 independent function-ARN filters. Each accepts an **unqualified** function ARN or an EventBridge wildcard over one (for example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); do not add a version/alias suffix — `:*` is appended automatically. All five empty matches all functions in the region. See [Filtering multiple functions](#filtering-multiple-functions). | | `BufferIntervalSeconds` | no | `60` | Firehose buffer interval (60–900). | -`Rules.ApiKeyRequired` asserts at least one of the four API key parameters +`Rules.ApiKeyRequired` asserts at least one of the three API key parameters is set and fails the stack action with a clear message otherwise. ## Outputs @@ -256,42 +254,6 @@ If you need more than 5 filters in one region, either widen one of the slots with a wildcard (`...:function:prod-*` covers every function whose name starts with `prod-`) or deploy a second stack — they're independent. 
-## API key from KMS ciphertext - -`DdApiKeyKmsCiphertext` accepts the base64 output of: - -```bash -aws kms encrypt \ - --key-id arn:aws:kms:us-east-1:123456789012:key/abcd... \ - --plaintext "$DATADOG_API_KEY" \ - --query CiphertextBlob \ - --output text -``` - -At stack create/update time, a short-lived Lambda decrypts the ciphertext -once via `kms:Decrypt` and returns the plaintext to CloudFormation as a -`NoEcho` custom-resource attribute. The Firehose `AccessKey` then -references the value via `!GetAtt DecryptedApiKey.ApiKey`. - -The decrypter Lambda's IAM role is scoped to `kms:Decrypt` on the single -key ARN you pass as `DdApiKeyKmsKeyArn`. A `Rules` assertion fails the -stack action if the ciphertext is set without the key ARN. - -**Security trade-off.** The plaintext is materialized in two places -during deploy: the custom-resource response body (suppressed from stack -events by `NoEcho: true`) and the rendered Firehose resource properties. -This is weaker than `DdApiKeySecretArn` or `DdApiKeySsmParameterName`, -which AWS treats specially and never logs. Use the KMS-ciphertext option -when you already have an encrypted ciphertext blob in your deployment -flow (e.g., from `serverless-plugin-kms` or an internal config store) -and don't want to add a Secrets Manager / SSM secret. - -The decrypter resources (`ApiKeyKmsDecrypterFunction`, -`ApiKeyKmsDecrypterRole`, `ApiKeyKmsDecrypterLogGroup`, `DecryptedApiKey`) -are conditional on `UseApiKeyKms` and only exist when -`DdApiKeyKmsCiphertext` is set — no overhead for the other three API-key -methods. - ## Files | File | Purpose | diff --git a/aws/durable_function_event_forwarder/release.sh b/aws/durable_function_event_forwarder/release.sh index c6bdfcaa0..1615b8db3 100755 --- a/aws/durable_function_event_forwarder/release.sh +++ b/aws/durable_function_event_forwarder/release.sh @@ -105,10 +105,25 @@ fi log_success "Authenticated to account ${CURRENT_ACCOUNT}." 
# --- Validate the template ------------------------------------------------- # - -log_info "Validating ${TEMPLATE_PATH}..." -aws_login aws cloudformation validate-template --template-body "file://${TEMPLATE_PATH}" >/dev/null -log_success "Template is valid." +# Validate locally with cfn-lint rather than cloudformation:ValidateTemplate — +# the publishing role is scoped to S3 and is not granted CloudFormation +# actions. cfn-lint needs no AWS permissions. + +if command -v cfn-lint >/dev/null 2>&1; then + log_info "Validating ${TEMPLATE_PATH} with cfn-lint..." + set +e + cfn-lint "${TEMPLATE_PATH}" + LINT_RC=$? + set -e + # cfn-lint exit codes are a bitmask: 2 = errors, 4 = warnings, 8 = info. + # Only error-level findings should block a release. + if ((LINT_RC & 2)); then + log_error "cfn-lint reported errors; aborting." + fi + log_success "Template passed cfn-lint (no error-level findings)." +else + log_info "cfn-lint not found; skipping local template validation." +fi # --- Refuse to overwrite an already-published version ---------------------- # diff --git a/aws/durable_function_event_forwarder/template.yaml b/aws/durable_function_event_forwarder/template.yaml index def60f8ad..a9d554f98 100644 --- a/aws/durable_function_event_forwarder/template.yaml +++ b/aws/durable_function_event_forwarder/template.yaml @@ -5,15 +5,6 @@ Description: >- to the Datadog HTTP intake via Amazon Data Firehose. Metadata: - # W1030 is suppressed at the template level because cfn-lint can't model - # the KmsCiphertextRequiresKeyArn Rule: it sees DdApiKeyKmsKeyArn's - # empty-string default and warns that the resulting Resource arn might - # not match the IAM ARN regex, even though that resource only exists - # when the parameter is required to be a valid ARN. 
- cfn-lint: - config: - ignore_checks: - - W1030 AWS::CloudFormation::Interface: ParameterGroups: - Label: @@ -22,8 +13,6 @@ Metadata: - DdApiKey - DdApiKeySecretArn - DdApiKeySsmParameterName - - DdApiKeyKmsCiphertext - - DdApiKeyKmsKeyArn - Label: default: Datadog routing and tagging Parameters: @@ -49,8 +38,6 @@ Metadata: DdApiKey: { default: API key (plaintext) } DdApiKeySecretArn: { default: Secrets Manager secret ARN } DdApiKeySsmParameterName: { default: SSM SecureString parameter name } - DdApiKeyKmsCiphertext: { default: KMS ciphertext (base64) } - DdApiKeyKmsKeyArn: { default: KMS key ARN (for ciphertext) } DdSite: { default: Datadog site } DdService: { default: Service tag } DdEnv: { default: Env tag } @@ -77,44 +64,21 @@ Parameters: Default: "" Description: >- Datadog API key. Provide a plaintext value here OR set DdApiKeySecretArn - OR DdApiKeySsmParameterName instead. The value is passed to Firehose as - the X-Amz-Firehose-Access-Key header and stored opaquely by Firehose; it - is not written to CloudWatch logs by this stack. + OR DdApiKeySsmParameterName instead. DdApiKeySecretArn: Type: String Default: "" AllowedPattern: "^$|^arn:.*:secretsmanager:.*" Description: >- ARN of a Secrets Manager secret whose SecretString is the Datadog API - key. Resolved at deploy time via a CloudFormation `resolve:secretsmanager` - dynamic reference, so the secret value never appears in the template. + key. DdApiKeySsmParameterName: Type: String Default: "" AllowedPattern: "^$|^/[a-zA-Z0-9/_.-]+$" Description: >- Name (not ARN) of an SSM Parameter Store SecureString parameter that - holds the Datadog API key. Resolved at deploy time via a CloudFormation - `resolve:ssm-secure` dynamic reference. - DdApiKeyKmsCiphertext: - Type: String - NoEcho: true - Default: "" - Description: >- - Base64-encoded KMS ciphertext of the Datadog API key (the output of - `aws kms encrypt --plaintext --key-id --query - CiphertextBlob --output text`). 
A short-lived deploy-time Lambda - decrypts the value and hands the plaintext to Firehose. Requires - DdApiKeyKmsKeyArn so the decrypter's IAM role can be scoped to the - specific key. - DdApiKeyKmsKeyArn: - Type: String - Default: "" - AllowedPattern: "^$|^arn:.*:kms:.*" - Description: >- - ARN of the KMS key that encrypted DdApiKeyKmsCiphertext. Used to - scope the decrypter Lambda's `kms:Decrypt` permission. Required when - DdApiKeyKmsCiphertext is set; ignored otherwise. + holds the Datadog API key. # ---- Routing ---- DdSite: @@ -156,10 +120,10 @@ Parameters: Type: CommaDelimitedList Default: "" Description: >- - Comma-separated list of execution status values to forward. Values must - match the EventBridge detail.status field exactly (uppercase). Leave - empty (the default) to forward every status — no status filter is added - to the EventBridge rule. + Comma-separated list of execution status values to forward. Valid values + are RUNNING, SUCCEEDED, FAILED, TIMED_OUT, and STOPPED (uppercase, matched + against detail.status exactly). Leave empty (the default) to forward every + status - no status filter is added to the EventBridge rule. # Up to 5 independent function-ARN filters. CloudFormation has no # native iteration that fits AWS::Events::Rule.EventPattern (a Json blob, # not a schema-typed list), so each slot is exposed as its own optional @@ -178,7 +142,7 @@ Parameters: pattern over one (for example "arn:aws:lambda:us-east-2:123456789012:function:my-durable-*"), used to restrict which functions' events are captured. Do not include a version - or alias suffix — ":*" is appended automatically to match any qualifier. + or alias suffix - ":*" is appended automatically to match any qualifier. Scope by region and account by including them in the pattern. If all five FunctionArnFilterN parameters are empty, the rule matches every function in this region. 
@@ -219,7 +183,6 @@ Conditions: UseApiKey: !Not [!Equals [!Ref DdApiKey, ""]] UseApiKeySecret: !Not [!Equals [!Ref DdApiKeySecretArn, ""]] UseApiKeySsm: !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] - UseApiKeyKms: !Not [!Equals [!Ref DdApiKeyKmsCiphertext, ""]] HasEnv: !Not [!Equals [!Ref DdEnv, ""]] HasVersion: !Not [!Equals [!Ref DdVersion, ""]] HasTags: !Not [!Equals [!Ref DdTags, ""]] @@ -238,7 +201,7 @@ Conditions: - !Condition HasFilter4 - !Condition HasFilter5 # When neither a status nor a function filter is set, the detail block is - # omitted entirely — an empty "detail: {}" is rejected by EventBridge. + # omitted entirely - an empty "detail: {}" is rejected by EventBridge. HasDetailFilter: !Or - !Condition HasStatusFilter - !Condition HasFunctionFilter @@ -250,18 +213,9 @@ Rules: - !Not [!Equals [!Ref DdApiKey, ""]] - !Not [!Equals [!Ref DdApiKeySecretArn, ""]] - !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] - - !Not [!Equals [!Ref DdApiKeyKmsCiphertext, ""]] - AssertDescription: >- - One of DdApiKey, DdApiKeySecretArn, DdApiKeySsmParameterName, or - DdApiKeyKmsCiphertext must be set. - KmsCiphertextRequiresKeyArn: - RuleCondition: !Not [!Equals [!Ref DdApiKeyKmsCiphertext, ""]] - Assertions: - - Assert: !Not [!Equals [!Ref DdApiKeyKmsKeyArn, ""]] AssertDescription: >- - DdApiKeyKmsKeyArn is required when DdApiKeyKmsCiphertext is set. - The ARN is used to scope the decrypter Lambda's kms:Decrypt - permission. + One of DdApiKey, DdApiKeySecretArn, or DdApiKeySsmParameterName + must be set. Resources: # --------------------------------------------------------------------------- @@ -308,7 +262,7 @@ Resources: # --------------------------------------------------------------------------- # Firehose delivery stream. 
HTTP endpoint destination targets the Datadog - # Firehose-specific intake (which speaks the Firehose protocol — do not use + # Firehose-specific intake (which speaks the Firehose protocol - do not use # the standard /api/v2/logs endpoint here). Backup mode is FailedDataOnly so # the bucket only receives records the endpoint rejected. # --------------------------------------------------------------------------- @@ -362,124 +316,6 @@ Resources: Resource: - !GetAtt FirehoseLogGroup.Arn - # --------------------------------------------------------------------------- - # Deploy-time API key decrypter. Only created when DdApiKeyKmsCiphertext is - # set. The custom resource invokes the Lambda once per stack action; the - # Lambda calls KMS:Decrypt on the ciphertext blob and returns the plaintext - # API key in a NoEcho response. The Firehose AccessKey then references the - # returned value via !GetAtt. The plaintext is briefly present in the - # custom resource's response and in the Firehose resource's rendered - # properties — a security trade-off that customers using Secrets Manager - # or SSM SecureString dynamic references can avoid. 
- # --------------------------------------------------------------------------- - ApiKeyKmsDecrypterRole: - Type: AWS::IAM::Role - Condition: UseApiKeyKms - Properties: - AssumeRolePolicyDocument: - Version: "2012-10-17" - Statement: - - Effect: Allow - Principal: - Service: lambda.amazonaws.com - Action: sts:AssumeRole - ManagedPolicyArns: - - !Sub "arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole" - Policies: - - PolicyName: DecryptApiKey - PolicyDocument: - Version: "2012-10-17" - Statement: - - Effect: Allow - Action: kms:Decrypt - Resource: !Ref DdApiKeyKmsKeyArn - - ApiKeyKmsDecrypterFunction: - Type: AWS::Lambda::Function - Condition: UseApiKeyKms - Properties: - Description: >- - Decrypts the KMS-encrypted Datadog API key at deploy time and - returns the plaintext to CloudFormation as a NoEcho custom-resource - attribute. - Runtime: nodejs22.x - Architectures: [arm64] - Handler: index.handler - Role: !GetAtt ApiKeyKmsDecrypterRole.Arn - Timeout: 30 - MemorySize: 128 - Code: - ZipFile: | - 'use strict'; - const https = require('https'); - const { URL } = require('url'); - const { KMSClient, DecryptCommand } = require('@aws-sdk/client-kms'); - - const respond = (event, context, status, data, reason) => - new Promise((resolve, reject) => { - const body = JSON.stringify({ - Status: status, - Reason: reason || `See ${context.logGroupName}/${context.logStreamName}`, - PhysicalResourceId: event.PhysicalResourceId || context.logStreamName, - StackId: event.StackId, - RequestId: event.RequestId, - LogicalResourceId: event.LogicalResourceId, - NoEcho: true, - Data: data || {}, - }); - const u = new URL(event.ResponseURL); - const req = https.request({ - hostname: u.hostname, - port: 443, - path: u.pathname + u.search, - method: 'PUT', - headers: { 'content-type': '', 'content-length': Buffer.byteLength(body) }, - }, (res) => { res.on('data', () => {}); res.on('end', resolve); }); - req.on('error', reject); - req.write(body); - req.end(); 
- }); - - exports.handler = async (event, context) => { - try { - if (event.RequestType === 'Delete') { - return respond(event, context, 'SUCCESS', {}); - } - const ciphertext = event.ResourceProperties && event.ResourceProperties.Ciphertext; - if (!ciphertext) { - return respond(event, context, 'FAILED', {}, 'Ciphertext property is required'); - } - const kms = new KMSClient({}); - const out = await kms.send(new DecryptCommand({ - CiphertextBlob: Buffer.from(ciphertext, 'base64'), - })); - const apiKey = Buffer.from(out.Plaintext).toString('utf8'); - return respond(event, context, 'SUCCESS', { ApiKey: apiKey }); - } catch (err) { - // Do not log err.message — KMS errors can be informative - // but plaintext should never reach CloudWatch. - console.error('KMS decrypt failed:', err.name || 'Error'); - return respond(event, context, 'FAILED', {}, `Decrypt failed: ${err.name || 'Error'}`); - } - }; - - ApiKeyKmsDecrypterLogGroup: - Type: AWS::Logs::LogGroup - Condition: UseApiKeyKms - Properties: - LogGroupName: !Sub "/aws/lambda/${ApiKeyKmsDecrypterFunction}" - RetentionInDays: 7 - - DecryptedApiKey: - Type: Custom::DatadogApiKeyKmsDecrypt - Condition: UseApiKeyKms - Properties: - ServiceToken: !GetAtt ApiKeyKmsDecrypterFunction.Arn - # Including the ciphertext in the properties is what makes - # CloudFormation re-invoke the custom resource on update if the - # ciphertext changes. 
- Ciphertext: !Ref DdApiKeyKmsCiphertext - DeliveryStream: Type: AWS::KinesisFirehose::DeliveryStream Properties: @@ -502,22 +338,19 @@ Resources: - !If - UseApiKeySsm - !Sub "{{resolve:ssm-secure:${DdApiKeySsmParameterName}}}" - - !If - - UseApiKeyKms - - !GetAtt DecryptedApiKey.ApiKey - - !Ref AWS::NoValue + - !Ref AWS::NoValue BufferingHints: IntervalInSeconds: !Ref BufferIntervalSeconds SizeInMBs: 4 RetryOptions: DurationInSeconds: 60 # Datadog's Firehose intake does not interpret common-attributes - # header keys (dd-service / dd-source / dd-tags) as log metadata — + # header keys (dd-service / dd-source / dd-tags) as log metadata - # it surfaces each as a raw tag with the literal key, and tag # values can't contain commas so a joined dd-tags value would be # mangled. We explicitly set CommonAttributes: [] (instead of # omitting RequestConfiguration entirely) because CloudFormation - # does not push outright property removals to Firehose — omission + # does not push outright property removals to Firehose - omission # would leave previously-configured attributes live on the stream. # Datadog's AWS integration auto-tags service/source/region/ # aws_account from the raw envelope's source field and the @@ -547,7 +380,7 @@ Resources: # all reshaping (function ARN qualifier stripping, detail.* # flattening, ISO timestamp parsing) is configured on the Datadog # side via a logs processing pipeline. We explicitly set - # Enabled: false instead of omitting ProcessingConfiguration — + # Enabled: false instead of omitting ProcessingConfiguration - # CloudFormation does not push outright property removals to # Firehose, so omitting it would leave a previously-attached Lambda # processor live on the stream. @@ -582,7 +415,7 @@ Resources: status: !If [HasStatusFilter, !Ref Statuses, !Ref AWS::NoValue] # One wildcard matcher per populated filter slot. 
The user supplies # an UNqualified function ARN (or a wildcard over one) and we append - # ":*" — the durable-execution detail.functionArn is always + # ":*" - the durable-execution detail.functionArn is always # version/alias-qualified (see AWS "Monitoring durable functions" # docs), so the ":*" is what actually matches and a bare-ARN matcher # would never fire. Empty slots resolve to AWS::NoValue and are From ad5d2fc1123ba984deaf73f03d16032a35da03a9 Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Fri, 12 Jun 2026 13:23:27 -0400 Subject: [PATCH 5/9] Trim parameter descriptions to reduce console noise Move implementation detail out of customer-facing parameter Descriptions: drop the Firehose URL derivation from DdSite, the service-taxonomy guidance from DdService, and the status-matching/EventBridge-rule notes from Statuses. Preserve the one fact not otherwise self-documenting (the API key becomes the X-Amz-Firehose-Access-Key header; dynamic references keep plaintext out of the template) as a comment at the AccessKey field. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../template.yaml | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/aws/durable_function_event_forwarder/template.yaml b/aws/durable_function_event_forwarder/template.yaml index a9d554f98..1a2db0440 100644 --- a/aws/durable_function_event_forwarder/template.yaml +++ b/aws/durable_function_event_forwarder/template.yaml @@ -85,9 +85,7 @@ Parameters: Type: String Default: datadoghq.com AllowedPattern: .+ - Description: >- - Datadog site to deliver events to. The Firehose destination URL is - derived as https://aws-kinesis-http-intake.logs./v1/input. + Description: Datadog site to deliver events to. # ---- Tagging ---- DdService: @@ -97,9 +95,7 @@ Parameters: ConstraintDescription: DdService cannot be empty. Description: >- Datadog service tag applied to every forwarded event. 
Defaults to - "datadog-durable-function-event-forwarder"; override to match your - existing service taxonomy if you already have Datadog conventions for - this category of events. + "datadog-durable-function-event-forwarder". DdEnv: Type: String Default: "" @@ -121,9 +117,8 @@ Parameters: Default: "" Description: >- Comma-separated list of execution status values to forward. Valid values - are RUNNING, SUCCEEDED, FAILED, TIMED_OUT, and STOPPED (uppercase, matched - against detail.status exactly). Leave empty (the default) to forward every - status - no status filter is added to the EventBridge rule. + are RUNNING, SUCCEEDED, FAILED, TIMED_OUT, and STOPPED. Leave empty (the + default) to forward every status. # Up to 5 independent function-ARN filters. CloudFormation has no # native iteration that fits AWS::Events::Rule.EventPattern (a Json blob, # not a schema-typed list), so each slot is exposed as its own optional @@ -329,6 +324,11 @@ Resources: # Attributes header on each request, which Datadog's Firehose # intake parses into log metadata / tags). Url: !Sub "https://aws-kinesis-http-intake.logs.${DdSite}/v1/input" + # The API key becomes the X-Amz-Firehose-Access-Key header on each + # request and is stored opaquely by Firehose. The two dynamic- + # reference paths resolve the value straight into this resource at + # deploy time, so the plaintext never appears in the template source, + # the stack parameters, or stack events. AccessKey: !If - UseApiKey - !Ref DdApiKey From cffbe63dfd54ac2af8e640f1a0955f58352482ed Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Fri, 12 Jun 2026 14:10:45 -0400 Subject: [PATCH 6/9] Mark Event filters / Statuses as optional in the console Append "(Optional)" to the Event filters parameter group and "(optional)" to the Statuses label, matching the Lambda forwarder's labeling and the sibling FunctionArnFilter labels, so the console makes clear nothing in the section is required. 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .../release.sh | 158 ------------------ .../template.yaml | 4 +- 2 files changed, 2 insertions(+), 160 deletions(-) delete mode 100755 aws/durable_function_event_forwarder/release.sh diff --git a/aws/durable_function_event_forwarder/release.sh b/aws/durable_function_event_forwarder/release.sh deleted file mode 100755 index 1615b8db3..000000000 --- a/aws/durable_function_event_forwarder/release.sh +++ /dev/null @@ -1,158 +0,0 @@ -#!/usr/bin/env bash - -# Publishes the Lambda Durable Function event-forwarder CloudFormation template -# to the public Datadog CloudFormation template bucket: -# -# https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/.yaml -# https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/latest.yaml -# -# The bucket lives in the Datadog Prod account (464622532012); the script -# authenticates with that account's prod-engineering role via aws-vault. -# -# Usage: -# ./release.sh # e.g. ./release.sh 0.1.0 -# -# A semantic version is REQUIRED. The script refuses to overwrite an existing -# .yaml (immutable released versions); latest.yaml is always updated to -# point at the version being published. -# -# Override the aws-vault profile with AWS_VAULT_PROFILE if yours differs. - -set -o nounset -o pipefail -o errexit - -log_info() { - local BLUE='\033[0;34m' - local RESET='\033[0m' - printf -- "%b%b%b\n" "${BLUE}" "${*}" "${RESET}" 1>&2 -} - -log_success() { - local GREEN='\033[0;32m' - local RESET='\033[0m' - printf -- "%b%b%b\n" "${GREEN}" "${*}" "${RESET}" 1>&2 -} - -log_error() { - local RED='\033[0;31m' - local RESET='\033[0m' - printf -- "%b%b%b\n" "${RED}" "${*}" "${RESET}" 1>&2 - exit 1 -} - -user_confirm() { - local input - read -r -p "${1:-Are you sure}? [y/N] " input (e.g. ${0} 0.1.0)" -fi - -VERSION="${1}" -if [[ ! 
${VERSION} =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then - log_error "You must use a semantic version (e.g. 0.1.0), got: '${VERSION}'" -fi - -if ! command -v aws >/dev/null 2>&1; then - log_error "aws CLI not found; please install it first" -fi - -if ! command -v aws-vault >/dev/null 2>&1; then - log_error "aws-vault not found; please install it first" -fi - -if [[ ! -f ${TEMPLATE_PATH} ]]; then - log_error "Template not found at ${TEMPLATE_PATH}" -fi - -readonly VERSION_KEY="${KEY_PREFIX}/${VERSION}.yaml" -readonly LATEST_KEY="${KEY_PREFIX}/latest.yaml" -readonly VERSION_URL="https://${BUCKET}.s3.amazonaws.com/${VERSION_KEY}" -readonly LATEST_URL="https://${BUCKET}.s3.amazonaws.com/${LATEST_KEY}" - -# --- Validate identity ----------------------------------------------------- # - -log_info "Authenticating with aws-vault profile '${AWS_VAULT_PROFILE}'..." -CURRENT_ACCOUNT="$(aws_login aws sts get-caller-identity --query Account --output text)" - -if [[ ${CURRENT_ACCOUNT} != "${PROD_ACCOUNT_ID}" ]]; then - log_error "Expected to be in the Datadog Prod account (${PROD_ACCOUNT_ID}) but got ${CURRENT_ACCOUNT}. Check AWS_VAULT_PROFILE." -fi -log_success "Authenticated to account ${CURRENT_ACCOUNT}." - -# --- Validate the template ------------------------------------------------- # -# Validate locally with cfn-lint rather than cloudformation:ValidateTemplate — -# the publishing role is scoped to S3 and is not granted CloudFormation -# actions. cfn-lint needs no AWS permissions. - -if command -v cfn-lint >/dev/null 2>&1; then - log_info "Validating ${TEMPLATE_PATH} with cfn-lint..." - set +e - cfn-lint "${TEMPLATE_PATH}" - LINT_RC=$? - set -e - # cfn-lint exit codes are a bitmask: 2 = errors, 4 = warnings, 8 = info. - # Only error-level findings should block a release. - if ((LINT_RC & 2)); then - log_error "cfn-lint reported errors; aborting." - fi - log_success "Template passed cfn-lint (no error-level findings)." 
-else - log_info "cfn-lint not found; skipping local template validation." -fi - -# --- Refuse to overwrite an already-published version ---------------------- # - -log_info "Checking whether s3://${BUCKET}/${VERSION_KEY} already exists..." -if aws_login aws s3api head-object --bucket "${BUCKET}" --key "${VERSION_KEY}" >/dev/null 2>&1; then - log_error "s3://${BUCKET}/${VERSION_KEY} already exists. Released versions are immutable; bump the version." -fi -log_success "Version ${VERSION} is not published yet." - -# --- Confirm and publish --------------------------------------------------- # - -log_info "About to publish:" -log_info " ${TEMPLATE_PATH}" -log_info " -> s3://${BUCKET}/${VERSION_KEY} (new, immutable)" -log_info " -> s3://${BUCKET}/${LATEST_KEY} (overwrite: latest -> ${VERSION})" - -if ! user_confirm "Continue"; then - log_error "Aborting..." -fi - -log_info "Uploading versioned template..." -aws_login aws s3 cp "${TEMPLATE_PATH}" "s3://${BUCKET}/${VERSION_KEY}" --content-type "text/yaml" - -log_info "Updating latest.yaml..." -aws_login aws s3 cp "${TEMPLATE_PATH}" "s3://${BUCKET}/${LATEST_KEY}" --content-type "text/yaml" - -log_success "Published version ${VERSION}!" 
-log_info "Versioned URL: ${VERSION_URL}" -log_info "Latest URL: ${LATEST_URL}" -log_info "" -log_info "CloudFormation quick-create URL (latest):" -log_info "https://console.aws.amazon.com/cloudformation/home#/stacks/create/review?templateURL=${LATEST_URL}&stackName=datadog-durable-function-event-forwarder" diff --git a/aws/durable_function_event_forwarder/template.yaml b/aws/durable_function_event_forwarder/template.yaml index 1a2db0440..cb87b110f 100644 --- a/aws/durable_function_event_forwarder/template.yaml +++ b/aws/durable_function_event_forwarder/template.yaml @@ -22,7 +22,7 @@ Metadata: - DdVersion - DdTags - Label: - default: Event filters + default: Event filters (Optional) Parameters: - Statuses - FunctionArnFilter1 @@ -43,7 +43,7 @@ Metadata: DdEnv: { default: Env tag } DdVersion: { default: Version tag } DdTags: { default: Additional tags } - Statuses: { default: Statuses to forward } + Statuses: { default: Statuses to forward (optional) } FunctionArnFilter1: { default: Function ARN filter 1 (optional) } FunctionArnFilter2: { default: Function ARN filter 2 (optional) } FunctionArnFilter3: { default: Function ARN filter 3 (optional) } From 4dfa70baba5cbb5d66d1ef3514039cad6c6c6c8a Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Fri, 12 Jun 2026 14:11:29 -0400 Subject: [PATCH 7/9] Scrub release.sh references from README Publishing is handled in a separate PR. Remove the release.sh usage steps and release checklist from the README and the release.sh row from the Files table, keeping the published-URL pattern and quick-create link. (The release.sh file itself was removed in the previous commit.) 
Co-Authored-By: Claude Opus 4.8 (1M context) --- .../README.md | 37 ++++--------------- 1 file changed, 7 insertions(+), 30 deletions(-) diff --git a/aws/durable_function_event_forwarder/README.md b/aws/durable_function_event_forwarder/README.md index 7cdb6d2f2..d569001f0 100644 --- a/aws/durable_function_event_forwarder/README.md +++ b/aws/durable_function_event_forwarder/README.md @@ -136,27 +136,14 @@ Once the template is hosted at a public S3 URL, customers can reference it directly — no zip artifact, region replication, or layer publish is needed. The template is the only thing to ship. -Use `release.sh` (modeled on `aws/logs_monitoring/release.sh`), run from this -directory with a required semantic version: +Publishing `template.yaml` to the public `datadog-cloudformation-template` +bucket is handled by separate release tooling, not by this PR. The published +keys are: -```bash -./release.sh 0.1.0 -``` - -The script: - -1. Authenticates to the Datadog Prod account (`464622532012`, which owns the - bucket) using the `prod-engineering` role via - `aws-vault exec sso-prod-engineering` (override with `AWS_VAULT_PROFILE`), - and aborts unless the resolved account is the prod account. -2. Validates `template.yaml` with `aws cloudformation validate-template`. -3. **Refuses to overwrite an already-published `.yaml`** — released - versions are immutable, so bump the version to republish. -4. 
After a confirmation prompt, uploads `template.yaml` to both keys in the - public bucket `datadog-cloudformation-template`: - - `aws/lambda-durable-function-event-forwarder/.yaml` (new, immutable) - - `aws/lambda-durable-function-event-forwarder/latest.yaml` (floating - pointer, always overwritten to point at the version just published) +- `aws/lambda-durable-function-event-forwarder/.yaml` (immutable; + recommended for nested stacks) +- `aws/lambda-durable-function-event-forwarder/latest.yaml` (floating pointer, + always updated to the latest published version) Customer-facing URLs after publish: @@ -183,15 +170,6 @@ https://console.aws.amazon.com/cloudformation/home#/stacks/quickcreate the API key parameter on the console form; never pre-fill `DdApiKey` in a link. -### Release checklist - -1. Bump `Mappings.Constants.DdDurableEventForwarder.Version` in - `template.yaml` to the new version and merge to `master`. -2. Run `./release.sh ` with the same version — it validates the - template, refuses to overwrite an existing `.yaml`, then uploads - `.yaml` and updates `latest.yaml`. -3. Tag the release (`git tag durable-function-event-forwarder/`). - ## Deploying directly ```bash @@ -259,7 +237,6 @@ name starts with `prod-`) or deploy a second stack — they're independent. | File | Purpose | | --- | --- | | `template.yaml` | Canonical CloudFormation template. | -| `release.sh` | Publishes `template.yaml` to the public `datadog-cloudformation-template` bucket (`aws/lambda-durable-function-event-forwarder/.yaml` + `latest.yaml`). | ## Notes From 6d7645cc079017fbcf107db339484bbf3fd4b9f9 Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Fri, 12 Jun 2026 14:38:06 -0400 Subject: [PATCH 8/9] Remove unused service/env/version/tags parameters DdService, DdEnv, DdVersion, and DdTags were declared but never wired to anything (the Firehose intake can't carry them as proper facets, so the stack transmitted nothing). 
They misled users into thinking they set tags. Remove them and the dead HasEnv/HasVersion/HasTags conditions; DdSite stays (it builds the Firehose URL). Service/env/version/tags are set in the Datadog log processing pipeline instead. cfn-lint is now warning-free. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../README.md | 18 ++------- .../template.yaml | 37 +------------------ 2 files changed, 5 insertions(+), 50 deletions(-) diff --git a/aws/durable_function_event_forwarder/README.md b/aws/durable_function_event_forwarder/README.md index d569001f0..69114177f 100644 --- a/aws/durable_function_event_forwarder/README.md +++ b/aws/durable_function_event_forwarder/README.md @@ -38,10 +38,6 @@ EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) | `DdApiKeySecretArn` | one of three | "" | ARN of a Secrets Manager secret whose `SecretString` is the API key. Resolved via `{{resolve:secretsmanager:...}}`. | | `DdApiKeySsmParameterName` | one of three | "" | Name of an SSM SecureString parameter holding the API key. Resolved via `{{resolve:ssm-secure:...}}`. | | `DdSite` | no | `datadoghq.com` | Datadog site; used to build the Firehose destination URL. | -| `DdService` | no | `datadog-durable-function-event-forwarder` | Datadog `service` tag applied to every forwarded event. Override to match your existing service taxonomy. | -| `DdEnv` | no | "" | Datadog `env` tag. | -| `DdVersion` | no | "" | Datadog `version` tag. | -| `DdTags` | no | "" | Comma-delimited extra tags (for example `team:durable,owner:platform`). | | `Statuses` | no | "" | EventBridge `detail.status` values to forward (uppercase, comma-delimited). Empty (the default) forwards **all** statuses — no status filter is added to the rule. | | `FunctionArnFilter1` … `FunctionArnFilter5` | no | "" | Up to 5 independent function-ARN filters. 
Each accepts an **unqualified** function ARN or an EventBridge wildcard over one (for example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); do not add a version/alias suffix — `:*` is appended automatically. All five empty matches all functions in the region. See [Filtering multiple functions](#filtering-multiple-functions). | | `BufferIntervalSeconds` | no | `60` | Firehose buffer interval (60–900). | @@ -97,9 +93,9 @@ Anything beyond these — a service override (`service:my-orders-service`), `env`/`version`, custom tags, attribute flattening, ARN qualifier stripping, timestamp parsing for relative-time tooltips — is the Datadog log processing pipeline's responsibility (see -below). The `DdService`/`DdEnv`/`DdVersion`/`DdTags` parameters remain -on the stack for forward compatibility but are not currently propagated; -configure their equivalents in the pipeline instead. +below). The stack intentionally exposes no service/env/version/tags +parameters: the Firehose intake can't carry them as proper facets, so +configure them in the pipeline instead. ### Datadog-side processing pipeline @@ -162,8 +158,6 @@ parameter's `Default:` value. https://console.aws.amazon.com/cloudformation/home#/stacks/quickcreate ?stackName=datadog-durable-function-event-forwarder &templateURL=https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/latest.yaml - &param_DdService= - &param_DdEnv= ``` (Removed the line breaks — paste as a single URL.)
The customer fills in @@ -178,9 +172,7 @@ aws cloudformation deploy \ --stack-name datadog-durable-function-event-forwarder \ --capabilities CAPABILITY_IAM \ --parameter-overrides \ - DdApiKeySecretArn=arn:aws:secretsmanager:us-east-1:123456789012:secret:datadog/api-key-AbCdEf \ - DdService=my-service \ - DdEnv=prod + DdApiKeySecretArn=arn:aws:secretsmanager:us-east-1:123456789012:secret:datadog/api-key-AbCdEf ``` ## Consuming as a nested stack @@ -192,8 +184,6 @@ DurableFunctionEvents: TemplateURL: https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/.yaml Parameters: DdApiKeySecretArn: !Ref DatadogApiKeySecret - DdService: my-service - DdEnv: prod ``` The template is fully self-contained — no Lambda zip artifact, no region diff --git a/aws/durable_function_event_forwarder/template.yaml b/aws/durable_function_event_forwarder/template.yaml index cb87b110f..5630c4562 100644 --- a/aws/durable_function_event_forwarder/template.yaml +++ b/aws/durable_function_event_forwarder/template.yaml @@ -14,13 +14,9 @@ Metadata: - DdApiKeySecretArn - DdApiKeySsmParameterName - Label: - default: Datadog routing and tagging + default: Datadog routing Parameters: - DdSite - - DdService - - DdEnv - - DdVersion - - DdTags - Label: default: Event filters (Optional) Parameters: @@ -39,10 +35,6 @@ Metadata: DdApiKeySecretArn: { default: Secrets Manager secret ARN } DdApiKeySsmParameterName: { default: SSM SecureString parameter name } DdSite: { default: Datadog site } - DdService: { default: Service tag } - DdEnv: { default: Env tag } - DdVersion: { default: Version tag } - DdTags: { default: Additional tags } Statuses: { default: Statuses to forward (optional) } FunctionArnFilter1: { default: Function ARN filter 1 (optional) } FunctionArnFilter2: { default: Function ARN filter 2 (optional) } @@ -87,30 +79,6 @@ Parameters: AllowedPattern: .+ Description: Datadog site to deliver events to. 
- # ---- Tagging ---- - DdService: - Type: String - Default: datadog-durable-function-event-forwarder - AllowedPattern: .+ - ConstraintDescription: DdService cannot be empty. - Description: >- - Datadog service tag applied to every forwarded event. Defaults to - "datadog-durable-function-event-forwarder". - DdEnv: - Type: String - Default: "" - Description: Datadog env tag. Optional. - DdVersion: - Type: String - Default: "" - Description: Datadog version tag. Optional. - DdTags: - Type: String - Default: "" - Description: >- - Extra comma-delimited tags appended to every forwarded event (for - example team:durable,owner:platform). Optional. - # ---- Event filters ---- Statuses: Type: CommaDelimitedList @@ -178,9 +146,6 @@ Conditions: UseApiKey: !Not [!Equals [!Ref DdApiKey, ""]] UseApiKeySecret: !Not [!Equals [!Ref DdApiKeySecretArn, ""]] UseApiKeySsm: !Not [!Equals [!Ref DdApiKeySsmParameterName, ""]] - HasEnv: !Not [!Equals [!Ref DdEnv, ""]] - HasVersion: !Not [!Equals [!Ref DdVersion, ""]] - HasTags: !Not [!Equals [!Ref DdTags, ""]] # Statuses is a CommaDelimitedList; an empty default joins to "" so this is # false, which drops the status key from the EventPattern (forward all). HasStatusFilter: !Not [!Equals [!Join ["", !Ref Statuses], ""]] From 0ace987e3b6b656e316f8f26107be6835bd1cdc5 Mon Sep 17 00:00:00 2001 From: Yiming Luo <10097700+lym953@users.noreply.github.com> Date: Fri, 12 Jun 2026 15:01:22 -0400 Subject: [PATCH 9/9] Trim README and fix the event payload example - Cut the Publishing/Deploying/nested-stack/Filtering/Files/Notes sections; the Datadog-side pipeline section now just says to install the AWS Lambda integration (its OOTB logs pipeline is provisioned automatically). - Correct the example payload to the field names from AWS's "Monitoring durable functions" doc (durableExecutionArn, durableExecutionName, functionArn, status, startTimestamp; endTimestamp for terminal states) and link the doc. 
The previous executionName/executionStartTime/executionEndTime fields were inaccurate. - Remove the speculative auto-tag enumeration / integration attribution; state only that tagging and reshaping happen on the Datadog side. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../README.md | 189 ++---------------- 1 file changed, 20 insertions(+), 169 deletions(-) diff --git a/aws/durable_function_event_forwarder/README.md b/aws/durable_function_event_forwarder/README.md index 69114177f..94606d5f6 100644 --- a/aws/durable_function_event_forwarder/README.md +++ b/aws/durable_function_event_forwarder/README.md @@ -22,10 +22,7 @@ EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) `https://aws-kinesis-http-intake.logs./v1/input` using the Datadog API key as the `X-Amz-Firehose-Access-Key` header. The stack does **not** attach any custom metadata to Firehose's outbound - requests; Datadog's AWS integration auto-tags incoming logs from the - EventBridge envelope (`source:lambda`, `service:lambda`, - `region:`, `aws_account:`, - `sourcecategory:aws`) and from the Firehose ARN. + requests; tagging and reshaping are handled on the Datadog side. - Records the endpoint rejects are written to the S3 backup bucket (`S3BackupMode: FailedDataOnly`); under normal operation the bucket stays empty. @@ -38,8 +35,8 @@ EventBridge rule -> Firehose -> Datadog HTTP intake (raw EventBridge JSON) | `DdApiKeySecretArn` | one of three | "" | ARN of a Secrets Manager secret whose `SecretString` is the API key. Resolved via `{{resolve:secretsmanager:...}}`. | | `DdApiKeySsmParameterName` | one of three | "" | Name of an SSM SecureString parameter holding the API key. Resolved via `{{resolve:ssm-secure:...}}`. | | `DdSite` | no | `datadoghq.com` | Datadog site; used to build the Firehose destination URL. | -| `Statuses` | no | "" | EventBridge `detail.status` values to forward (uppercase, comma-delimited). 
Empty (the default) forwards **all** statuses — no status filter is added to the rule. | -| `FunctionArnFilter1` … `FunctionArnFilter5` | no | "" | Up to 5 independent function-ARN filters. Each accepts an **unqualified** function ARN or an EventBridge wildcard over one (for example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); do not add a version/alias suffix — `:*` is appended automatically. All five empty matches all functions in the region. See [Filtering multiple functions](#filtering-multiple-functions). | +| `Statuses` | no | "" | EventBridge `detail.status` values to forward (uppercase, comma-delimited). Empty (the default) forwards **all** statuses. | +| `FunctionArnFilter1` … `FunctionArnFilter5` | no | "" | Up to 5 independent function-ARN filters. Each accepts an **unqualified** function ARN or an EventBridge wildcard over one (for example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); do not add a version/alias suffix — `:*` is appended automatically. All five empty matches all functions in the region. | | `BufferIntervalSeconds` | no | `60` | Firehose buffer interval (60–900). | `Rules.ApiKeyRequired` asserts at least one of the three API key parameters @@ -58,183 +55,37 @@ is set and fails the stack action with a clear message otherwise. The stack does **no transformation in AWS**. Firehose forwards each EventBridge record to Datadog verbatim, so Datadog receives the raw -envelope: +envelope. 
See AWS's +[Monitoring durable functions](https://docs.aws.amazon.com/lambda/latest/dg/durable-monitoring.html#durable-monitoring-eventbridge) +for the full event schema and the five `status` values (`RUNNING`, +`SUCCEEDED`, `FAILED`, `TIMED_OUT`, `STOPPED`): ```json { "version": "0", - "id": "...", + "id": "d019b03c-a8a3-9d58-85de-241e96206538", "detail-type": "Durable Execution Status Change", "source": "aws.lambda", "account": "123456789012", - "time": "", + "time": "2025-11-20T13:08:22Z", "region": "us-east-1", "resources": [], "detail": { - "functionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-fn:$LATEST", - "executionName": "...", - "executionStartTime": "", - "executionEndTime": "", - "status": "TIMED_OUT" + "durableExecutionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-function:$LATEST/durable-execution/090c4189-b18b-4296-9d0c-cfd01dc3a122/9f7d84c9-ea3d-3ffc-b3e5-5ec51c34ffc9", + "durableExecutionName": "order-123", + "functionArn": "arn:aws:lambda:us-east-1:123456789012:function:my-function:2", + "status": "RUNNING", + "startTimestamp": "2025-11-20T13:08:22.345Z" } } ``` -The stack itself does not attach metadata to the Firehose request. -Datadog's AWS integration auto-derives these tags from the envelope and -the Firehose ARN: - -- `source:lambda` and `service:lambda` (from `source:aws.lambda`) -- `region:` -- `aws_account:` -- `sourcecategory:aws` - -Anything beyond these — a service override -(`service:my-orders-service`), `env`/`version`, custom tags, attribute -flattening, ARN qualifier stripping, timestamp parsing for relative-time -tooltips — is the Datadog log processing pipeline's responsibility (see -below). The stack intentionally exposes no service/env/version/tags -parameters: the Firehose intake can't carry them as proper facets, so -configure them in the pipeline instead. +Terminal states (`SUCCEEDED`, `STOPPED`, `FAILED`, `TIMED_OUT`) also include +an `endTimestamp`. 
### Datadog-side processing pipeline -Configure a Datadog logs processing pipeline (Logs → Configuration → -Pipelines → New Pipeline, filter `source:lambda` + -`@detail-type:"Durable Execution Status Change"`) with these -processors: - -1. **Date Remapper** on `time` so EventBridge's `time` becomes the log's - official date. -2. **Attribute Remapper** to flatten `detail.*` to top-level attributes — - for example `detail.functionArn` → `function_arn`, - `detail.executionName` → `lambda.durable_function.execution_name`, - etc. (Use snake_case names so they match the rest of the Lambda - namespace.) -3. **Grok / String Builder** to strip the `:` suffix - (`:$LATEST`, `:prod`, `:1`, …) from `function_arn`, so all events for - the same function share a single ARN value regardless of how it was - invoked. -4. **Arithmetic Processor** on `detail.executionStartTime` / - `detail.executionEndTime` (parse to epoch ms) if you want numeric - range facets and the relative-time tooltip on those fields. -5. **Message Remapper** if you want a human-readable message like - `Durable execution is `. - -These are all UI-configurable; no template changes needed. The benefit -of doing this in Datadog rather than in a transformer Lambda is that -pipeline tweaks ship instantly without redeploying the stack, and you -get to test against a sample log via Datadog's pipeline preview. - -## Publishing the template (Datadog operators) - -Once the template is hosted at a public S3 URL, customers can reference it -directly — no zip artifact, region replication, or layer publish is needed. -The template is the only thing to ship. - -Publishing `template.yaml` to the public `datadog-cloudformation-template` -bucket is handled by separate release tooling, not by this PR. 
The published -keys are: - -- `aws/lambda-durable-function-event-forwarder/.yaml` (immutable; - recommended for nested stacks) -- `aws/lambda-durable-function-event-forwarder/latest.yaml` (floating pointer, - always updated to the latest published version) - -Customer-facing URLs after publish: - -- Versioned (recommended for nested stacks): - `https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/.yaml` -- Latest (convenient for one-off console deploys; not pinned, will change): - `https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/latest.yaml` - -### Quick-create link (give this to customers) - -Drop the URL into a CloudFormation quick-create deeplink so customers can -launch the stack with one click. Anything not pre-filled defaults to the -parameter's `Default:` value. - -``` -https://console.aws.amazon.com/cloudformation/home#/stacks/quickcreate - ?stackName=datadog-durable-function-event-forwarder - &templateURL=https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/latest.yaml -``` - -(Removed the line breaks — paste as a single URL.) The customer fills in -the API key parameter on the console form; never pre-fill `DdApiKey` in a -link. 
- -## Deploying directly - -```bash -aws cloudformation deploy \ - --template-file template.yaml \ - --stack-name datadog-durable-function-event-forwarder \ - --capabilities CAPABILITY_IAM \ - --parameter-overrides \ - DdApiKeySecretArn=arn:aws:secretsmanager:us-east-1:123456789012:secret:datadog/api-key-AbCdEf -``` - -## Consuming as a nested stack - -```yaml -DurableFunctionEvents: - Type: AWS::CloudFormation::Stack - Properties: - TemplateURL: https://datadog-cloudformation-template.s3.amazonaws.com/aws/lambda-durable-function-event-forwarder/.yaml - Parameters: - DdApiKeySecretArn: !Ref DatadogApiKeySecret -``` - -The template is fully self-contained — no Lambda zip artifact, no region -replication, no `ZipCopier` custom resource. Firehose forwards -EventBridge records to Datadog directly; all reshaping happens in -Datadog's logs processing pipeline. - -## Filtering multiple functions - -Up to 5 independent function-ARN filters are exposed as separate -parameters (`FunctionArnFilter1` … `FunctionArnFilter5`). Each accepts an -**unqualified** function ARN or an EventBridge wildcard over one (for -example `arn:aws:lambda:us-east-2:123456789012:function:my-durable-*`); -scope by region and account by including them in the pattern. Each -populated slot contributes one matcher to the EventBridge rule: the -supplied ARN with `:*` appended. The durable-execution `detail.functionArn` -always carries a version/alias qualifier (per the AWS -[Monitoring durable functions](https://docs.aws.amazon.com/lambda/latest/dg/durable-monitoring.html#durable-monitoring-eventbridge) -docs), so `:*` is what matches — do not add a qualifier yourself. An -`AllowedPattern` rejects a trailing qualifier so a pasted qualified ARN -fails at deploy time rather than silently matching nothing. Empty slots -are stripped from the rendered list at deploy time, so leaving gaps (e.g., -populating slots 1, 3, 5) is fine. 
- -Why five separate parameters instead of one comma-separated list: -`AWS::Events::Rule.EventPattern` is typed `Json` (an arbitrary blob), so -CloudFormation does not auto-convert `Fn::ForEach` Map output to a list -the way it does for schema-typed list properties. The only ways to -build a dynamic-length list inside an `EventPattern` are (a) a -custom-resource macro, (b) `CommaDelimitedList` with `!Select` plus -inline comma-padding tricks repeated per slot, or (c) fixed N slots -exposed as individual parameters. We chose (c) because each slot is -locally simple to read in the template. - -If you need more than 5 filters in one region, either widen one of the -slots with a wildcard (`...:function:prod-*` covers every function whose -name starts with `prod-`) or deploy a second stack — they're independent. - -## Files - -| File | Purpose | -| --- | --- | -| `template.yaml` | Canonical CloudFormation template. | - -## Notes - -- The function-ARN filter emits one `wildcard` pattern per value: the - supplied unqualified ARN with `:*` appended. The event's - `detail.functionArn` always carries a version/alias qualifier, so the - `:*` is what matches; a bare-ARN matcher would never fire. -- `BufferingHints` is set explicitly even at its default value: omitting - it has historically caused CloudFormation drift on subsequent updates. -- The backup bucket is retained on stack deletion - (`DeletionPolicy: Retain`) so failed records survive teardown. +Install the **AWS Lambda** integration in Datadog; its out-of-the-box logs +pipeline is provisioned automatically and reshapes these events (field +renaming, ARN qualifier stripping, timestamp parsing, human-readable +message). No manual pipeline setup is required.
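For reviewers, the filter behaviour described in the README text this series touches (one `wildcard` matcher per populated `FunctionArnFilter` slot with `:*` appended, empty slots dropped, and no status key when `Statuses` is empty) can be sketched in Python. This is only an illustration of the resulting EventBridge pattern, not the template's implementation, which does the same with CloudFormation intrinsics. The function name `build_event_pattern` and its arguments are hypothetical.

```python
import json


def build_event_pattern(function_arn_filters, statuses):
    """Return an EventBridge event pattern as a dict (illustrative only).

    function_arn_filters: up to five unqualified ARNs or wildcards; "" means unused.
    statuses: detail.status values to forward; an empty list forwards all.
    """
    detail = {}

    # One wildcard matcher per populated slot. ":*" is appended because the
    # event's detail.functionArn always carries a version/alias qualifier.
    matchers = [{"wildcard": f"{arn}:*"} for arn in function_arn_filters if arn]
    if matchers:
        detail["functionArn"] = matchers

    # An empty status list adds no status key, so every status is forwarded.
    wanted = [status for status in statuses if status]
    if wanted:
        detail["status"] = wanted

    pattern = {
        "source": ["aws.lambda"],
        "detail-type": ["Durable Execution Status Change"],
    }
    if detail:
        pattern["detail"] = detail
    return pattern


if __name__ == "__main__":
    example = build_event_pattern(
        ["arn:aws:lambda:us-east-2:123456789012:function:my-durable-*", "", ""],
        ["FAILED", "TIMED_OUT"],
    )
    print(json.dumps(example, indent=2))
```

Leaving gaps between populated slots makes no difference in this sketch, which mirrors the README's statement that empty slots are stripped from the rendered rule.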