Claude instructions to reproduce/troubleshoot CI failures by cataphract · Pull Request #3778 · DataDog/dd-trace-php

cataphract · 2026-04-07T13:52:55Z

Description

Add instructions so that Claude can monitor CI pipelines, download artifacts, and reproduce and troubleshoot CI failures locally.

These were mostly generated over several iterations in which Claude was asked to execute jobs by reading the instructions and to suggest changes to the files; the suggestions were then reviewed manually.

In the end, the instructions don't need to be 100% accurate, and Claude doesn't follow them closely every time anyway; the point is that the instructions and the helper scripts allow it to mimic the CI jobs locally in a reasonable amount of time, without excessive trial and error.

Reviewer checklist

Test coverage seems ok.
Appropriate labels assigned.

datadog-datadog-prod-us1 · 2026-04-07T14:05:23Z

⚠️ Tests

⚠️ Other Violations

🧪 9 Tests failed

ext/openssl/tests/bug74796.phpt (Bug #74796: TLS encryption fails behind HTTP proxy) from php.ext.openssl.tests

001- string(19) "Hello from server 0"
002- NULL
003- string(19) "Hello from server 1"
004- NULL
005- string(19) "Hello from server 2"
006- NULL
007- cs.php.net
008- uk.php.net
009- us.php.net
002+ error:0A000086:SSL routines::certificate verify failed in /usr/local/src/php/ext/openssl/tests/ServerClientTestCase.inc(191) : eval()'d code on line 15
...

ext/openssl/tests/sni_server_key_cert.phpt (sni_server with separate pk and cert) from php.ext.openssl.tests

001- string(%d) "cs.php.net"
002- string(%d) "uk.php.net"
003- string(%d) "us.php.net"
002+ error:0A000086:SSL routines::certificate verify failed in /usr/local/src/php/ext/openssl/tests/ServerClientTestCase.inc(191) : eval()'d code on line 9
003+ 
005+ 
007+ 
009+ 
010+ Deprecated: openssl_x509_parse(): Passing null to parameter #1 ($certificate) of type OpenSSLCertificate|string is deprecated in /usr/local/src/php/ext/openssl/tests/ServerClientTestCase.inc(191) : eval()'d code on line 11
011+ 
...

ext/openssl/tests/sni_server.phpt (sni_server) from php.ext.openssl.tests

001- string(%d) "cs.php.net"
002- string(%d) "uk.php.net"
003- string(%d) "us.php.net"
002+ error:0A000086:SSL routines::certificate verify failed in /usr/local/src/php/ext/openssl/tests/ServerClientTestCase.inc(191) : eval()'d code on line 9
003+ 
005+ 
007+ 
009+ 
010+ Deprecated: openssl_x509_parse(): Passing null to parameter #1 ($certificate) of type OpenSSLCertificate|string is deprecated in /usr/local/src/php/ext/openssl/tests/ServerClientTestCase.inc(191) : eval()'d code on line 11
011+ 
...

ℹ️ Info

No other issues found

❄️ No new flaky tests detected

🎯 Code Coverage (details)
• Patch Coverage: 100.00%
• Overall Coverage: 60.64% (-0.05%)

This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 8d0cfcd

bwoebi · 2026-04-08T11:21:59Z

.claude/ci/build-slim-package.py

@@ -0,0 +1,305 @@
+#!/usr/bin/env -S uv run --script


This is essentially doing the same as tooling/bin/build-debug-artifact?
Only that it seemingly doesn't take care of caching the build, i.e. a full rebuild every time? - I see, dockerh does that. Odd.

I just looked at it, and if we want to use tooling/bin/build-debug-artifact instead, it needs to be updated to use the centos build, because otherwise the bookworm images are too recent for system-tests.

Well, I've been using tooling/bin/build-debug-artifact all the time successfully for system tests. Did you actually try it? (./tooling/bin/build-debug-artifact gnu-aarch64-8.2-nts /Users/bob.weinand/system-tests/binaries is what I've been using mostly)

I'll try, but the images for apache-mod-... are the appsec-ci images, which are based on bullseye

yeah, fails:

[08-Apr-2026 15:02:37 UTC] PHP Warning: PHP Startup: Unable to load dynamic library 'ddtrace.so'
(tried: .../ddtrace.so (/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found
(required by .../ddtrace.so)), ...

I wasn't aware that we were using different base images for system tests :-(

I see, I was working with php-fpm-* images.

We really ought to unify images between appsec and tracer. (Ideally by just putting everything missing into bookworm images)

I've changed it to build on centos, like on ci.
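
The failure above comes down to a version comparison: the artifact references a `GLIBC_2.32` symbol version, and the weblog image's libc is older than that. A minimal sketch of that check follows, assuming the loader message has the shape quoted in this thread. The function names are hypothetical, and the 2.31 value used for the bullseye-based image is an assumption to verify against the actual image.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int]


def required_glibc(loader_error: str) -> Optional[Version]:
    """Return the highest GLIBC_x.y symbol version named in a loader error."""
    found = re.findall(r"GLIBC_(\d+)\.(\d+)", loader_error)
    if not found:
        return None
    return max((int(major), int(minor)) for major, minor in found)


def image_satisfies(image_glibc: Version, needed: Version) -> bool:
    """True when the image's glibc is at least as new as the artifact needs."""
    return image_glibc >= needed


if __name__ == "__main__":
    error = "/lib/aarch64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found"
    needed = required_glibc(error)
    print("artifact needs glibc", needed)
    # Assumption: the bullseye-based weblog image ships glibc 2.31.
    print("loads on 2.31 image:", image_satisfies((2, 31), needed))
    # An artifact built against glibc 2.17 needs at most 2.17.
    print("2.17 build loads on 2.31 image:", image_satisfies((2, 31), (2, 17)))
```

Building against the oldest glibc among the target images keeps the artifact loadable on all of them, which is the reasoning behind building on centos as CI does.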

bwoebi · 2026-04-08T11:25:40Z

.claude/ci/system-tests.md

+variant. Empty stub files satisfy this for versions you don't build.
+Profiling can also be fully stubbed. See
+[section 1c](#1c-slim-build--one-php-version-only) for all stub
+commands.


Why is this talking about generate-final-artifact at all? It seems like noise; you always want a local build?

Yeah, this has leftovers from a previous strategy. I'll clean that up

bwoebi · 2026-04-08T12:04:41Z

.claude/ci/download-artifacts

@@ -0,0 +1,390 @@
+#!/usr/bin/env -S uv run --script


Two questions:
a) Can we move this under tooling/bin ?
b) Would it be reasonable to have it as PHP script instead of python?

bwoebi · 2026-04-08T12:08:25Z

.claude/ci/github-actions-profiler.md

+Output: `target/debug/libdatadog_php_profiling.so` (~144 MB vs ~20 MB for profiler-release).
+Use the same `php -d extension=...` command, just point to the debug path.
+
+## ZTS tests -- parallel PECL extension


https://github.com/DataDog/dd-trace-php/blob/master/dockerfiles/ci/bookworm/build-extensions.sh#L171-L175
What's this about?

Yeah, once it moved to the CI bookworm images, this was no longer necessary. I'll reword.

bwoebi · 2026-04-08T12:16:37Z

.claude/debugging-system-tests.md

+```
+logs/docker/weblog/logs/php_error.log
+logs/docker/weblog/logs/tracer.log
+```


Something I noticed is that it sometimes adds debug info, and then becomes frustrated because it doesn't show up.
fprintf(stderr), php_log_err(E_NOTICE) etc.

Basically, it has the tracer logs, so it should simply use LOG(ERROR, <printf compatible args>) in tracer code or error!() in profiler code, and it'll definitely show up.

I've tested this and added a section explaining debug methods. Interestingly, error! doesn't print anything either with the current config in system-tests.
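
One way to confirm that an added debug statement actually reached the logs is to search the two weblog log files for a unique marker string. The sketch below assumes the log directory layout quoted above (`logs/docker/weblog/logs/`); the helper name and the marker are hypothetical.

```python
from pathlib import Path
from typing import Dict, List

# File names taken from the paths quoted in the review thread.
LOG_FILES = ("php_error.log", "tracer.log")


def find_marker(log_dir: Path, marker: str) -> Dict[str, List[str]]:
    """Return {log file name: matching lines} for every log containing marker."""
    hits: Dict[str, List[str]] = {}
    for name in LOG_FILES:
        path = log_dir / name
        if not path.is_file():
            continue  # a missing log is simply reported as "no hits"
        lines = path.read_text(errors="replace").splitlines()
        matching = [line for line in lines if marker in line]
        if matching:
            hits[name] = matching
    return hits


if __name__ == "__main__":
    # Hypothetical marker; use whatever unique string the debug statement prints.
    result = find_marker(Path("logs/docker/weblog/logs"), "DEBUG-MARKER-1234")
    for name, lines in result.items():
        print(f"{name}: {len(lines)} matching line(s)")
    if not result:
        print("marker not found in any log")
```

An empty result means the chosen logging call does not end up in either file under the current configuration, which is the situation described in this thread.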

.claude/general.md

- Move download-artifacts from .claude/ci/ to tooling/bin/ and update all references in documentation.
- Rewrite download-artifacts in PHP and use curl multi handles. Time to find the datadog-setup.php from a commit drops from ~35 s to ~10 s.
- Delete build-slim-package.py; its functionality is now more or less covered by build-debug-artifact, the difference being that build-debug-artifact builds, well, debug artifacts, unlike CI.
- Switch build-debug-artifact from bookworm to centos-7 images for GLIBC 2.17 compatibility with all weblog base images.
- Use debug build profiles (CFLAGS=-O0, cargo dev profile, cmake Debug) instead of release/RelWithDebInfo.
- Add Rust appsec helper build support to build-debug-artifact.
- Resolve libddwaf commit on the host to avoid git failures in worktrees; pass LIBDDWAF_GIT_COMMIT to cmake in build-appsec-helper.sh.
- Suppress git safe.directory errors in build-appsec-helper-rust.sh.
- Simplify system-tests.md: remove stub file instructions, point to "Slim package with debug binaries" section in building-locally.md.
- Simplify github-actions-profiler.md: note that bookworm images already include the parallel PECL extension.
- Add "Slim package with debug binaries" section to building-locally.md documenting build-debug-artifact usage.
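
The speedup attributed to curl multi handles comes from issuing the HTTP requests concurrently rather than one after another. The following is a Python analogue of that idea using a thread pool; it is not the PHP implementation from this PR, and `fetch_all` and `http_get` are hypothetical names.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def http_get(url: str) -> bytes:
    """Plain blocking GET; each call runs in its own worker thread."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = http_get,
    max_workers: int = 8,
) -> Dict[str, bytes]:
    """Fetch every URL concurrently and return {url: body}."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so zip pairs them correctly.
        bodies = pool.map(fetch, url_list)
        return dict(zip(url_list, bodies))
```

Passing the fetch function as a parameter keeps the concurrency logic testable without network access. With network-bound requests, total time approaches that of the slowest request per batch instead of the sum of all requests.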

bwoebi · 2026-04-08T17:06:33Z

.claude/ci/windows-tests.md

+Matrix: PHP 7.2--8.5 (versions where `version_compare($v, "7.2", ">=")`)
+
+## What It Tests
+
+`windows test_c` starts `httpbin-windows` and `php-request-replayer-2.0-windows`


Can we try avoiding info which becomes trivially stale, e.g. say Matrix: PHP 7.2+, php-request-replayer-*-windows etc.? (also in the other files)

Done in 9c628cb
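
The matrix rule quoted in the diff (`version_compare($v, "7.2", ">=")`) is an open-ended lower bound, which is why "PHP 7.2+" describes it without going stale. A rough Python analogue, assuming purely numeric dotted versions (PHP's `version_compare` handles more forms than this), might look as follows; the candidate list is hypothetical.

```python
from typing import List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '8.10' into (8, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


def matrix(candidates: List[str], minimum: str = "7.2") -> List[str]:
    """Keep only the versions at or above the minimum, preserving order."""
    floor = parse_version(minimum)
    return [v for v in candidates if parse_version(v) >= floor]


if __name__ == "__main__":
    # Hypothetical candidate list; newer versions pass without any doc change.
    print(matrix(["7.0", "7.1", "7.2", "7.4", "8.0", "8.5"]))
```

Comparing tuples rather than strings matters once a minor version reaches two digits, since `"8.10" < "8.9"` as text.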

pr-commenter · 2026-04-09T14:41:11Z

Benchmarks [ tracer ]

Benchmark execution time: 2026-04-09 14:38:49

Comparing candidate commit 8d0cfcd in PR branch glopes/claude-ci with baseline commit 42c7c25 in branch master.

Found 5 performance improvements and 3 performance regressions! Performance is the same for 186 metrics, 0 unstable metrics.

scenario:ContextPropagationBench/benchInject64Bit-opcache

🟩 execution_time [-1.723µs; -1.405µs] or [-11.946%; -9.740%]

scenario:EmptyFileBench/benchEmptyFileOverhead

🟥 execution_time [+137.585µs; +343.675µs] or [+4.206%; +10.506%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization

🟥 execution_time [+3.048µs; +5.172µs] or [+2.966%; +5.034%]

scenario:MessagePackSerializationBench/benchMessagePackSerialization-opcache

🟥 execution_time [+4.467µs; +6.733µs] or [+4.461%; +6.724%]

scenario:PDOBench/benchPDOOverhead

🟩 execution_time [-9.200µs; -6.990µs] or [-3.729%; -2.833%]

scenario:PDOBench/benchPDOOverheadWithDBM

🟩 execution_time [-8.669µs; -6.135µs] or [-3.513%; -2.486%]

scenario:PHPRedisBench/benchRedisOverhead

🟩 execution_time [-48.600µs; -33.113µs] or [-4.927%; -3.357%]

scenario:SamplingRuleMatchingBench/benchRegexMatching4-opcache

🟩 execution_time [-11.918µs; -11.671µs] or [-89.259%; -87.413%]
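
Each result line above carries an interval in absolute and relative terms. In the lines shown, green entries have both bounds below zero and red entries have both bounds above zero. The sketch below parses the relative interval and applies that reading; it is an assumption inferred from the lines in this comment, not the benchmarking bot's documented criteria, and the function names are hypothetical.

```python
import re
from typing import Tuple

# Matches the percentage interval, e.g. "[-11.946%; -9.740%]".
_INTERVAL = re.compile(r"\[([+-]?\d+(?:\.\d+)?)%; ([+-]?\d+(?:\.\d+)?)%\]")


def relative_interval(line: str) -> Tuple[float, float]:
    """Extract (low, high) in percent from one benchmark result line."""
    match = _INTERVAL.search(line)
    if match is None:
        raise ValueError(f"no percentage interval in: {line!r}")
    return float(match.group(1)), float(match.group(2))


def classify(line: str) -> str:
    """Assumed rule: the whole interval must sit on one side of zero."""
    low, high = relative_interval(line)
    if high < 0:
        return "improvement"
    if low > 0:
        return "regression"
    return "unchanged"


if __name__ == "__main__":
    print(classify("execution_time [-1.723µs; -1.405µs] or [-11.946%; -9.740%]"))
    print(classify("execution_time [+137.585µs; +343.675µs] or [+4.206%; +10.506%]"))
```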

cataphract requested a review from a team as a code owner April 7, 2026 13:52

Claude instructions to reproduce/troubleshoot CI failures

b057948

cataphract force-pushed the glopes/claude-ci branch from 25f80d4 to b057948 · April 7, 2026 13:53

Remove the hack with fake binaries for system-tests.md

7297680

bwoebi reviewed Apr 8, 2026

cataphract force-pushed the glopes/claude-ci branch from ef7269e to ad04b9f · April 8, 2026 16:58

cataphract requested a review from bwoebi April 8, 2026 17:00

bwoebi reviewed Apr 8, 2026

Simplify description of version variants

9c628cb

cataphract requested a review from bwoebi April 9, 2026 11:24

Try to improve CI monitoring protocol

8d0cfcd
