chore: add e2e lambda instrumentation test suite #706

Draft
ava-silver wants to merge 6 commits into main from ava.silver/chore/add-e2e-lambda-instrumentation-test-suite

@ava-silver ava-silver commented Jun 16, 2026

Contributor

What does this PR do?

Adds a full-lifecycle end-to-end test suite for the AWS Lambda instrumentation this plugin performs. It deploys a real, ephemeral Lambda with the plugin enabled, verifies the deployed config and the telemetry it ships to Datadog, proves re-deploy is idempotent, then tears the stack down and verifies a clean end-state.

Conforms to the shared contract in serverless-ci/e2e/spec.md and mirrors the datadog-ci reference suite (e2e/cloud-run.test.ts + e2e/helpers/*).

Full lifecycle (e2e/lambda.test.ts):

sls deploy (APPLY: provision + instrument)  -> verify CONFIG
  -> aws lambda invoke (trigger)             -> verify TELEMETRY (traces + logs)
  -> sls deploy again                         -> assert IDEMPOTENT (no diff/dup)
  -> sls remove (REMOVE)                      -> verify CLEAN (function gone)
  -> teardown (always, even on failure)

For this tool the plugin runs as part of sls deploy, so provisioning the uninstrumented workload and APPLY are the same step. REMOVE deletes the whole CloudFormation stack, so the clean end-state is the function (and all its DD config) being absent -- asserted explicitly.
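The ordering above, including the guarantee that teardown runs even when an earlier step fails, can be sketched roughly as follows. This is a minimal illustration, not the code in e2e/lambda.test.ts: the `LifecycleSteps` shape and `runLifecycle` are hypothetical names, and each step is injected as a plain async function so that the control flow is the only thing on display.

```typescript
// Hypothetical sketch: step names mirror the lifecycle diagram, not the suite's real API.
type Step = () => Promise<void>;

interface LifecycleSteps {
  deploy: Step;           // sls deploy (APPLY: provision + instrument)
  verifyConfig: Step;     // check the deployed function configuration
  invoke: Step;           // aws lambda invoke (trigger)
  verifyTelemetry: Step;  // traces + logs arrive with the expected identity
  verifyIdempotent: Step; // second sls deploy, compare snapshots
  remove: Step;           // sls remove (REMOVE)
  verifyClean: Step;      // function is gone
  teardown: Step;         // best-effort cleanup; should tolerate an already-removed stack
}

async function runLifecycle(steps: LifecycleSteps): Promise<void> {
  try {
    await steps.deploy();
    await steps.verifyConfig();
    await steps.invoke();
    await steps.verifyTelemetry();
    await steps.verifyIdempotent();
    await steps.remove();
    await steps.verifyClean();
  } finally {
    // Runs whether the steps above succeeded or threw.
    await steps.teardown();
  }
}
```

In the suite itself this role is played by afterAll; the `finally` block here is only the runner-agnostic equivalent.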

Config (helpers/lambda-verifier.ts): asserts the pinned Datadog Node layer + extension layer (versions read from src/layers.json, so drift blames the plugin), the redirected handler with the original preserved in DD_LAMBDA_HANDLER, the required DD_* env vars, and the service / env / version / dd_sls_plugin tags. Identity (run-id service name, env, version) is asserted, not mere presence.
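As a rough illustration of what asserting identity rather than presence means for the config check, the sketch below compares exact values instead of testing that a field exists. It is not the contents of helpers/lambda-verifier.ts: `ExpectedConfig` and `verifyInstrumentedConfig` are hypothetical names, and the expected handler, layer ARNs, and env values are assumed to be supplied by the caller (for example, derived from src/layers.json, whose shape is not shown here). The `FunctionConfig` fields are a subset of the Lambda GetFunctionConfiguration response.

```typescript
import { strictEqual } from "node:assert";

// Subset of the Lambda GetFunctionConfiguration response used by this sketch.
interface FunctionConfig {
  Handler?: string;
  Layers?: { Arn?: string }[];
  Environment?: { Variables?: Record<string, string> };
}

// Hypothetical shape: every expected value is provided by the caller.
interface ExpectedConfig {
  redirectedHandler: string;   // handler the plugin is expected to install
  originalHandler: string;     // expected to be preserved in DD_LAMBDA_HANDLER
  layerArns: string[];         // pinned layer ARNs, each expected exactly once
  env: Record<string, string>; // required DD_* variables and their exact values
}

function verifyInstrumentedConfig(cfg: FunctionConfig, exp: ExpectedConfig): void {
  strictEqual(cfg.Handler, exp.redirectedHandler, "handler was not redirected");

  const vars = cfg.Environment?.Variables ?? {};
  strictEqual(vars.DD_LAMBDA_HANDLER, exp.originalHandler, "original handler not preserved");

  // Compare values, not mere presence.
  for (const [key, value] of Object.entries(exp.env)) {
    strictEqual(vars[key], value, `env var ${key} has the wrong value`);
  }

  // Each pinned layer must be attached exactly once (catches missing and duplicate layers).
  const attached = (cfg.Layers ?? []).map((layer) => layer.Arn);
  for (const arn of exp.layerArns) {
    const count = attached.filter((a) => a === arn).length;
    strictEqual(count, 1, `layer ${arn} attached ${count} times`);
  }
}
```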

Telemetry (helpers/lambda-telemetry-checker.ts): polls spans + logs (15s × 20) filtered by the unique service name; matched records must carry the full identity (service + env + version), asserting identity, not existence.
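A budgeted poll of this kind might look like the sketch below. It is an illustration under assumptions, not the code in helpers/lambda-telemetry-checker.ts: `pollForIdentity` and `matchesIdentity` are hypothetical names, records are assumed to be already flattened into plain key/value objects, and the actual Datadog query is abstracted behind an injected `fetchRecords` function. The defaults mirror the 15s × 20 budget described above.

```typescript
// Hypothetical flattened view of a span or log record.
type TelemetryRecord = Record<string, unknown>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// A record matches only if every required field carries the exact expected value.
function matchesIdentity(record: TelemetryRecord, required: Record<string, string>): boolean {
  return Object.entries(required).every(([key, value]) => record[key] === value);
}

async function pollForIdentity(
  fetchRecords: () => Promise<TelemetryRecord[]>,
  required: Record<string, string>,
  budget: { intervalMs: number; attempts: number } = { intervalMs: 15000, attempts: 20 },
): Promise<TelemetryRecord> {
  for (let attempt = 1; attempt <= budget.attempts; attempt++) {
    const records = await fetchRecords();
    const hit = records.find((record) => matchesIdentity(record, required));
    if (hit) return hit;
    // Do not sleep after the final attempt; fail as soon as the budget is spent.
    if (attempt < budget.attempts) await sleep(budget.intervalMs);
  }
  throw new Error(
    `no record matching ${JSON.stringify(required)} after ${budget.attempts} attempts`,
  );
}
```

Passing the required fields as a map lets the same helper demand service + env + version for spans and a narrower set for logs.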

Idempotent: re-deploy must produce a byte-for-byte identical instrumentation snapshot -- no double-wrap, no duplicate layers.
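One way to make a before/after comparison like this strict is to serialize the instrumentation-relevant fields into a canonical string and compare the strings. The sketch below assumes that approach; how the suite actually builds its snapshot is not shown in this description, and `canonicalize`, `snapshot`, and `assertIdempotent` are hypothetical names. Object keys are sorted so that key order cannot cause a false diff, while array order is preserved so that reordered or duplicated layers do register as a change.

```typescript
import { strictEqual } from "node:assert";

// Recursively sort object keys; leave arrays in their original order.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value;
}

// Canonical string form of whatever the caller considers "instrumentation state".
function snapshot(state: unknown): string {
  return JSON.stringify(canonicalize(state));
}

// Fails if the second deploy changed anything in the captured state.
function assertIdempotent(beforeRedeploy: unknown, afterRedeploy: unknown): void {
  strictEqual(snapshot(afterRedeploy), snapshot(beforeRedeploy), "re-deploy changed the instrumentation");
}
```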

Motivation

The plugin had no end-to-end coverage of the instrumentation it actually performs against a live AWS account and a live Datadog org. Unit tests verify the generated config; this suite verifies that a real deploy produces a working, instrumented function that ships correctly-tagged telemetry, and that removal leaves nothing behind.

Testing Guidelines

Ran the full suite locally end-to-end against the serverless sandbox account (us-east-1):

Test Files  1 passed (1)
     Tests  4 passed (4)

Config verified on the real deployed function; spans (identity: service + env e2e + version 1.0.0) and logs (service + env e2e) found in Datadog; re-apply was idempotent; removal was clean, with no leaked CloudFormation stacks.

cd e2e
cp .env.local.example .env.local   # fill in DATADOG_API_KEY / DATADOG_APP_KEY
npm install
aws-vault exec sso-serverless-sandbox-account-admin -- npm test

See e2e/README.md for full local-run + auth prerequisites.

Additional Notes

Auth (CI) -- no static Datadog keys live in this repo:

  • AWS: GitHub→AWS OIDC via aws-actions/configure-aws-credentials, assuming vars.AWS_ROLE_ARN_E2E (scoped to this repo, e2e sandbox account).
  • Datadog: short-lived API + App keys minted at runtime via DataDog/dd-sts-action under the serverless-plugin-datadog-e2e policy, exported to the suite as DATADOG_API_KEY / DATADOG_APP_KEY.

Fail-loud, skip-quiet: the suite fails loudly on any auth or telemetry failure. It only no-ops via the SKIP_LAMBDA_TESTS flag / the dorny/paths-filter gate when no relevant files changed (src/**, e2e/**, the workflow file).
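The distinction between skipping and failing can be sketched as a small preflight check. This is only an illustration: `preflight` and `Decision` are hypothetical names, and treating the flag as set when it equals the string "true" is an assumption, since the description does not say how SKIP_LAMBDA_TESTS is interpreted.

```typescript
type Decision = { action: "skip"; reason: string } | { action: "run" };

// Assumption: the flag counts as set only when it is exactly "true".
function preflight(env: Record<string, string | undefined>): Decision {
  if (env.SKIP_LAMBDA_TESTS === "true") {
    return { action: "skip", reason: "SKIP_LAMBDA_TESTS is set" };
  }
  // Missing credentials are a loud failure, never a silent skip.
  const missing = ["DATADOG_API_KEY", "DATADOG_APP_KEY"].filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`missing required credentials: ${missing.join(", ")}`);
  }
  return { action: "run" };
}
```

A caller would typically pass `process.env` and skip the suite only on an explicit "skip" decision.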

Resource hygiene: every run uses a unique name one-e2e-slsplugin-lambda-<runid> and stamps one_e2e_created:<unix-ts> atomically at creation (helpers/naming.ts) on the stack and every resource, so the cross-repo sweeper can age it out. In-test teardown runs in afterAll regardless of outcome.
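The naming and tagging scheme could be expressed along these lines. This is a sketch rather than the contents of helpers/naming.ts: the function names are hypothetical, the timestamp is assumed to be in whole seconds, and `ageSeconds` only illustrates how a sweeper might read the tag back.

```typescript
// Name pattern taken from the description: one-e2e-slsplugin-lambda-<runid>.
function resourceName(runId: string): string {
  return `one-e2e-slsplugin-lambda-${runId}`;
}

// Tag stamped at creation time. Assumption: the value is a Unix timestamp in seconds.
function creationTag(now: Date = new Date()): { key: string; value: string } {
  return { key: "one_e2e_created", value: String(Math.floor(now.getTime() / 1000)) };
}

// Hypothetical sweeper-side helper: age of a resource given its tag value.
function ageSeconds(tagValue: string, now: Date = new Date()): number {
  const created = Number(tagValue);
  if (!Number.isFinite(created)) {
    throw new Error(`unparseable one_e2e_created value: ${tagValue}`);
  }
  return Math.floor(now.getTime() / 1000) - created;
}
```

Computing the name and the tag before the deploy starts is what allows both to be applied at creation rather than patched on afterwards.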

Pinned + bounded: pinned artifacts (Node layer + extension via layers.json, serverless@3, one canonical runtime nodejs20.x), bounded retries on transient cloud errors (helpers/exec.ts), telemetry polled on a budget. TS helpers are runner-agnostic (node:assert, no vitest imports).
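Bounded retries on transient errors can be sketched as below. This is not the code in helpers/exec.ts: `withRetry` is a hypothetical name, the exponential backoff is one possible delay policy chosen for the example, and deciding which errors count as transient is left to a caller-supplied predicate.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  fn: () => Promise<T>,
  isTransient: (err: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Non-transient errors and an exhausted budget both fail immediately.
      if (!isTransient(err) || attempt === maxAttempts) throw err;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
  // Only reachable if maxAttempts < 1.
  throw lastError ?? new Error("withRetry called with maxAttempts < 1");
}
```

Keeping the attempt count explicit means a persistent failure surfaces as a test failure within a known time bound instead of hanging the job.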

Packaging: standalone npm project under e2e/ (isolated from the plugin Yarn Berry setup). setup.sh builds the plugin and installs it into the workload fixture from a packed tarball -- a file: link would recurse, since the repo root contains the fixture.

Required repo settings: variables AWS_ROLE_ARN_E2E (OIDC deploy role), AWS_REGION_E2E (default us-east-1), optionally DD_SITE_E2E. The OIDC role and the dd-sts policy backing it are cataloged in serverless-ci/e2e/iam-infra.md.

Types of changes

  • Bug fix
  • New feature
  • Breaking change
  • Misc (docs, refactoring, dependency upgrade, etc.)

Check all that apply

  • This PR's description is comprehensive
  • This PR contains breaking changes that are documented in the description
  • This PR introduces new APIs or parameters that are documented and unlikely to change in the foreseeable future
  • This PR impacts documentation, and it has been updated (or a ticket has been logged)
  • This PR's changes are covered by the automated tests
  • This PR collects user input/sensitive content into Datadog

Contributor Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@codecov-commenter

codecov-commenter commented Jun 16, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.69%. Comparing base (2d7ee65) to head (589965f).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #706   +/-   ##
=======================================
  Coverage   77.69%   77.69%           
=======================================
  Files          12       12           
  Lines        1112     1112           
  Branches      350      350           
=======================================
  Hits          864      864           
  Misses        118      118           
  Partials      130      130           

☔ View full report in Codecov by Harness.

@datadog-datadog-prod-us1-2

datadog-datadog-prod-us1-2 Bot commented Jun 16, 2026


Pipelines

⚠️ Warnings

🚦 1 Pipeline job failed

e2e | Lambda e2e (Node 20)   View in Datadog   GitHub Actions

🔗 Commit SHA: 589965f
