Draft
4 changes: 4 additions & 0 deletions .gitlab/datasources/environments.yaml
@@ -5,6 +5,10 @@ environments:
account: 425362996713
add_layer_version_permissions: 0
automatically_bump_version: 1
+serverless_testing:
+external_id: serverless-testing-publish-externalid
+role_to_assume: lambda-extension-image-publisher
+account: 093468662994
Copilot AI Apr 13, 2026

The new `serverless_testing` environment entry is picked up by the `range $environment_name, $environment := (ds "environments").environments` loop in `.gitlab/templates/pipeline.yaml.tpl`, which generates a `publish layer <env>` job for each environment and expects `add_layer_version_permissions` and `automatically_bump_version` to be set. Because those keys are missing here, the generated pipeline will either render invalid values or run publish-layer jobs with unintended defaults.

Consider either (a) adding the missing fields with explicit values for `serverless_testing` and ensuring the role can publish layers, or (b) changing the pipeline template or data model to exclude `serverless_testing` from the layer-publish loop (e.g., add a flag to each environment and skip the entry when the flag is false).

Suggested change
-account: 093468662994
+account: 093468662994
+add_layer_version_permissions: 0
+automatically_bump_version: 0
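
One way to catch this kind of omission before the template renders is a pre-flight check over the data file. The sketch below is an illustration under stated assumptions, not part of the PR: `missing_env_keys` is a hypothetical helper, and it assumes environment names sit two spaces under `environments:` with their keys four spaces deep.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the PR): list environments that
# lack keys the publish-layer jobs read. Assumes two-space YAML indentation.

missing_env_keys() {
  # $1: path to the environments file; remaining args: required key names
  local file="$1"; shift
  awk -v required="$*" '
    # Print one line per required key that the current environment lacks.
    function report(   i, n, keys) {
      if (cur == "") return
      n = split(required, keys, " ")
      for (i = 1; i <= n; i++)
        if (!((cur, keys[i]) in seen)) print cur ": missing " keys[i]
    }
    # Environment name: exactly two leading spaces, nothing after the colon.
    /^  [A-Za-z0-9_]+: *$/ {
      report(); cur = $1; sub(/:$/, "", cur); next
    }
    # Key inside an environment: four leading spaces.
    /^    [A-Za-z0-9_]+:/ {
      key = $1; sub(/:.*$/, "", key); seen[cur, key] = 1
    }
    END { report() }
  ' "$file"
}
```

Called as `missing_env_keys .gitlab/datasources/environments.yaml add_layer_version_permissions automatically_bump_version`, it prints one line per missing key and nothing when every environment is complete. A real pipeline would more likely use a YAML-aware tool than line matching.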

prod:
external_id: prod-publish-externalid
role_to_assume: dd-serverless-layer-deployer-role
23 changes: 13 additions & 10 deletions .gitlab/scripts/build_private_image.sh
@@ -7,17 +7,17 @@

set -e

-DOCKER_TARGET_IMAGE="425362996713.dkr.ecr.us-east-1.amazonaws.com/self-monitoring-lambda-extension"
+# ECR target for private extension images, used by self-monitoring container runtimes.
+# Defaults to the serverless-testing account's datadog-lambda-extension repo.
+PRIVATE_IMAGE_ECR_ACCOUNT="${PRIVATE_IMAGE_ECR_ACCOUNT:-093468662994}"
+PRIVATE_IMAGE_ECR_REPO="${PRIVATE_IMAGE_ECR_REPO:-datadog-lambda-extension}"
+DOCKER_TARGET_IMAGE="${PRIVATE_IMAGE_ECR_ACCOUNT}.dkr.ecr.us-east-1.amazonaws.com/${PRIVATE_IMAGE_ECR_REPO}"
EXTENSION_DIR=".layers"
IMAGE_TAG="latest"

-printf "Authenticating Docker to ECR...\n"
-aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 425362996713.dkr.ecr.us-east-1.amazonaws.com
+printf "Authenticating Docker to ECR (%s)...\n" "$PRIVATE_IMAGE_ECR_ACCOUNT"
+aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin "${PRIVATE_IMAGE_ECR_ACCOUNT}.dkr.ecr.us-east-1.amazonaws.com"

-# NOTE: this probably does not work the way that we expect it to, especially
-# when suffixes are involved. This is a known bug but we don't really check
-# anything other than the basic `self-monitoring-lambda-extension:latest` image
-# in our self-monitoring, so it's not a thing we're going to fix right now.
LAYER_NAME="Datadog-Extension"
if [ -z "$PIPELINE_LAYER_SUFFIX" ]; then
printf "Building container images tagged without suffix\n"
@@ -26,8 +26,11 @@ else
LAYER_NAME="${LAYER_NAME}-${PIPELINE_LAYER_SUFFIX}"
fi

-# Increment last version
-latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name $LAYER_NAME --query 'LayerVersions[0].Version || `0`')
+# Get the latest published layer version to derive the image tag.
+# Layers are published in the sandbox account (425362996713), so query there
+# regardless of which account we're pushing images to.
+SANDBOX_ACCOUNT="425362996713"
+latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name "arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}" --query 'LayerVersions[0].Version || `0`')
VERSION=$(($latest_version + 1))
Comment on lines +33 to 34
Copilot AI Apr 13, 2026

This script now calls `aws lambda list-layer-versions` against a layer ARN in the sandbox account (425362996713), but the CI job that runs it was updated to assume the role from the `serverless_testing` environment. Unless that role is granted `lambda:ListLayerVersions` on the sandbox layer resource, the call will fail with AccessDenied and the image publish job will exit (because of `set -e`).

Either ensure the `lambda-extension-image-publisher` role has cross-account permission to list versions for the sandbox layer ARN(s), or adjust the workflow so that the layer version lookup runs with sandbox credentials (e.g., assume the sandbox role just for this lookup).

Suggested change
-latest_version=$(aws lambda list-layer-versions --region us-east-1 --layer-name "arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}" --query 'LayerVersions[0].Version || `0`')
-VERSION=$(($latest_version + 1))
+SANDBOX_LAYER_ARN="arn:aws:lambda:us-east-1:${SANDBOX_ACCOUNT}:layer:${LAYER_NAME}"
+if [ -n "${SANDBOX_LAYER_LOOKUP_ROLE_ARN:-}" ]; then
+    printf "Assuming sandbox role for layer version lookup in account %s...\n" "$SANDBOX_ACCOUNT"
+    assume_role_output=$(aws sts assume-role \
+        --region us-east-1 \
+        --role-arn "$SANDBOX_LAYER_LOOKUP_ROLE_ARN" \
+        --role-session-name "build-private-image-layer-lookup" \
+        --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
+        --output text)
+    read -r sandbox_access_key_id sandbox_secret_access_key sandbox_session_token <<EOF
+$assume_role_output
+EOF
+    latest_version=$(
+        AWS_ACCESS_KEY_ID="$sandbox_access_key_id" \
+        AWS_SECRET_ACCESS_KEY="$sandbox_secret_access_key" \
+        AWS_SESSION_TOKEN="$sandbox_session_token" \
+        aws lambda list-layer-versions \
+            --region us-east-1 \
+            --layer-name "$SANDBOX_LAYER_ARN" \
+            --query 'LayerVersions[0].Version || `0`' \
+            --output text
+    )
+else
+    latest_version=$(aws lambda list-layer-versions \
+        --region us-east-1 \
+        --layer-name "$SANDBOX_LAYER_ARN" \
+        --query 'LayerVersions[0].Version || `0`' \
+        --output text)
+fi
+if ! [[ "$latest_version" =~ ^[0-9]+$ ]]; then
+    printf "Failed to resolve a numeric sandbox layer version for %s. Configure SANDBOX_LAYER_LOOKUP_ROLE_ARN with a role that can call lambda:ListLayerVersions on the sandbox layer, or grant the current role that permission.\n" "$SANDBOX_LAYER_ARN" >&2
+    exit 1
+fi
+VERSION=$((latest_version + 1))
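
The `set -e` behaviour this comment relies on can be reproduced without AWS access. In the minimal sketch below, `denied_lookup` is a hypothetical stand-in for a rejected `aws lambda list-layer-versions` call, and the two wrapper functions are illustrative names only; everything else is plain bash.

```shell
#!/usr/bin/env bash
# Illustration only: `denied_lookup` imitates a CLI call that the service
# rejects (prints nothing, returns non-zero). It is not part of the PR.

# Same shape as the script: `set -e`, then a plain assignment from $(...).
# The assignment takes the exit status of the command substitution, so the
# shell stops before the arithmetic line is reached.
lookup_with_errexit() {
  bash -c '
    set -e
    denied_lookup() { return 1; }
    latest_version=$(denied_lookup)
    echo "VERSION=$((latest_version + 1))"
  '
}

# Without `set -e`, the empty result is treated as 0 in arithmetic, so the
# script would carry on and compute version 1.
lookup_without_errexit() {
  bash -c '
    denied_lookup() { return 1; }
    latest_version=$(denied_lookup)
    echo "VERSION=$((latest_version + 1))"
  '
}
```

The second function shows one reason a numeric check on `latest_version` is still worth having: if errexit were ever disabled or masked, an empty lookup result would quietly turn into version 1 instead of failing the job.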

printf "Tagging container image with version: $VERSION and latest\n"

@@ -39,4 +42,4 @@ docker buildx build \
--tag "$DOCKER_TARGET_IMAGE:${VERSION}${SUFFIX}" \
--push .

-printf "Image built and pushed to $DOCKER_TARGET_IMAGE:${IMAGE_TAG}${SUFFIX} for ${PLATFORM}\n"
+printf "Image built and pushed to $DOCKER_TARGET_IMAGE:${IMAGE_TAG}${SUFFIX}\n"
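
The `${VAR:-default}` expansions introduced at the top of this script mean the ECR target falls back to the serverless-testing defaults whenever the variables are unset or empty. A small sketch of that behaviour follows; `ecr_image_uri` is a hypothetical helper used only for illustration, since the PR inlines the expansions rather than wrapping them in a function.

```shell
#!/usr/bin/env bash
# Sketch of the override behaviour; `ecr_image_uri` is a hypothetical helper.
# The default values are the ones shown in the diff.
ecr_image_uri() {
  # `:-` substitutes the default when the variable is unset OR empty.
  local account="${PRIVATE_IMAGE_ECR_ACCOUNT:-093468662994}"
  local repo="${PRIVATE_IMAGE_ECR_REPO:-datadog-lambda-extension}"
  printf '%s.dkr.ecr.us-east-1.amazonaws.com/%s\n' "$account" "$repo"
}
```

A job could point the build back at the previous repository by exporting both variables before the script runs, for example `PRIVATE_IMAGE_ECR_ACCOUNT=425362996713 PRIVATE_IMAGE_ECR_REPO=self-monitoring-lambda-extension`. Note that setting a variable to an empty string does not clear the default.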
2 changes: 1 addition & 1 deletion .gitlab/templates/pipeline.yaml.tpl
@@ -235,7 +235,7 @@ publish private images ({{ $multi_arch_image_flavor.name }}):
variables:
SUFFIX: {{ $multi_arch_image_flavor.suffix }}
before_script:
-{{ with $environment := (ds "environments").environments.sandbox }}
+{{ with $environment := (ds "environments").environments.serverless_testing }}
- EXTERNAL_ID_NAME={{ $environment.external_id }} ROLE_TO_ASSUME={{ $environment.role_to_assume }} AWS_ACCOUNT={{ $environment.account }} source .gitlab/scripts/get_secrets.sh
{{ end }}
script: