feat(FeatureFlags): FFE APM feature-flag span enrichment (experimental, gated) #3996
leoromanovsky wants to merge 12 commits
…n-enrichment gate
- Add serial_id (i64) + has_serial_id (bool) to the Rust FfeResult struct and populate from assignment.serial_id (unwrap_or(0) + is_some()) in all ctors; regenerate the cbindgen common.h ABI to match.
- Surface serialId as a nullable int on the DDTrace\FfeResult object in the C reader (tracer/functions.c), guarded by has_serial_id so absence stays null (Pattern B: missing != 0); update the stub + arginfo.
- Thread serialId into ResultMapper::exposureData (only when present).
- Add the gate CONFIG(BOOL, DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED, "false") to ext/configuration.h (distinct from the provider-enabled gate).
- Update existing FFE phpt EXPECT blocks for the new serialId field.
…oot-close write
- Add DDTrace\FeatureFlags\SpanEnrichmentAccumulator: per-root-span accumulator + ULEB128 delta-varint/base64/SHA256 codec ported verbatim from the frozen Node reference (dd-trace-js#8343). Limits 200/10/20/5/64, dedupe+sort, object defaults via json_encode, UTF-8-safe 64-char truncation; tag shapes ffe_flags_enc (bare base64), ffe_subjects_enc / ffe_runtime_defaults (JSON objects).
- DataDogProvider: accumulate INLINE in resolve() right after recordEvaluationMetric (DG-004, no finally hook); gate-gated lazy accumulator (DG-005 zero-idle); error isolation via try/catch(\Throwable); runtime-default detection via missing variant.
- Native request-scoped staging store in tracer/ffe.c (+ ddtrace_globals.h) flushed into the root span meta on the ddtrace_close_span root branch and cleared on root close / RSHUTDOWN (no cross-request leak); gate-off path does no work.
- Add DDTrace\Internal\set_ffe_span_enrichment_tags() PHP-callable staging fn.
- Tests: SpanEnrichmentAccumulatorTest (7 required L0 cases incl. gate-off control + codec golden round-trip), serial_id_passthrough.phpt (C bridge), ResultMapper serialId threading cases.
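The codec named in this commit can be illustrated with a minimal Python sketch. It assumes the encoding is: dedupe and sort the serial IDs, take each ID's delta from the previous one (the first from 0), write each delta as unsigned LEB128, and base64 the bytes. That reading is consistent with the golden value quoted in this PR (ZAgUAg== decoding to [100,108,128,130]), but the function names below are hypothetical and this is not the PHP or Node implementation.

```python
import base64


def uleb128(n):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if n < 0:
        raise ValueError("ULEB128 only encodes non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_serial_ids(ids):
    """Dedupe + sort, delta-encode, ULEB128 each delta, then base64 (assumed layout)."""
    buf = bytearray()
    prev = 0
    for sid in sorted(set(ids)):
        buf += uleb128(sid - prev)
        prev = sid
    return base64.b64encode(bytes(buf)).decode("ascii")


def decode_serial_ids(tag):
    """Inverse of encode_serial_ids: base64 -> ULEB128 deltas -> running sum."""
    ids, total, value, shift = [], 0, 0, 0
    for byte in base64.b64decode(tag):
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7  # continuation bit set: keep reading this varint
        else:
            total += value
            ids.append(total)
            value, shift = 0, 0
    return ids
```

Because the IDs are sorted before the deltas are taken, every delta is non-negative and small neighbouring IDs cost one byte each, which is presumably why the frozen contract uses this layout.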
…ry (CR-01)
The per-provider SpanEnrichmentAccumulator was only ever added to: clear() had zero production callers and accumulateSpanEnrichment() re-staged the FULL accumulated set on every resolve(). After a root span closed, the next root span re-staged the prior root's serial ids / hashed subjects / runtime defaults (within-request multi-root contamination), and because OpenFeature providers are process-level singletons the accumulator leaked across requests in persistent SAPIs -- a privacy leak of SHA256 subject keys.
Fix: reset the PHP accumulator on the root-span boundary, in lockstep with the native close-span flush (which already clears the native staging slots on the same ddtrace_close_span root branch + RSHUTDOWN):
- Track the active root span id (spl_object_id of DDTrace\root_span()). On any boundary transition, clear the accumulator + native staging store so a dropped/abandoned root (which never runs its onClose) and a new request both start clean.
- Bind a one-shot accumulator clear to the root span's $onClose so the PHP object is reset when the root closes (mirrors the frozen Node reference #onSpanFinish cleanup).
- Lifecycle is injectable (rootIdResolver / rootCloseScheduler) so the pure-PHP L0 suite can drive root transitions without the extension.
Regression tests (fail-before / pass-after): two sequential root spans in one request -> root 2 stages only its own serial ids/subjects/defaults; dropped-root and cross-request reset -> no carryover incl. no leaked hashed subject keys; root close clears the accumulator with no subsequent eval. Plus a Node String(value) runtime-default parity test (null/true/false/scalars/objects).
Native ABI passthrough, codec (ZAgUAg==), limits, gate-off DG-005, and DG-004 inline accumulation are unchanged.
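The root-boundary reset described in this commit can be sketched in Python as follows. This is an illustration under stated assumptions, not the PHP code: the class and method names are hypothetical, and the injected root_id_resolver only plays the role the commit assigns to rootIdResolver.

```python
class RootScopedAccumulator:
    """Accumulates serial ids for one root span and resets on any root transition."""

    def __init__(self, root_id_resolver):
        # root_id_resolver is injected so tests can drive root transitions
        # without a real tracer (hypothetical stand-in for rootIdResolver).
        self._resolve_root_id = root_id_resolver
        self._active_root_id = None
        self.serial_ids = set()

    def record(self, serial_id):
        root_id = self._resolve_root_id()
        if root_id != self._active_root_id:
            # Boundary transition: a new root, or a dropped root whose
            # close callback never ran. Start clean either way.
            self.serial_ids.clear()
            self._active_root_id = root_id
        self.serial_ids.add(serial_id)

    def on_root_close(self):
        # One-shot reset bound to the root span's close.
        self.serial_ids.clear()
        self._active_root_id = None


current_root = {"id": 1}
acc = RootScopedAccumulator(lambda: current_root["id"])
acc.record(100)
acc.record(108)
first_root_ids = sorted(acc.serial_ids)

current_root["id"] = 2  # root 1 was dropped without closing
acc.record(2312)
second_root_ids = sorted(acc.serial_ids)
```

Checking the root id on every record, rather than relying only on the close callback, is what covers the dropped-root case: a root that never closes still cannot leak into the next one.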
Benchmarks [ tracer ]
Benchmark execution time: 2026-06-17 08:47:40
Comparing candidate commit 3229c80 in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 194 metrics, 0 unstable metrics.
Long-running CLI servers (parametric test apps) starve the SIGVTALRM-driven
remote-config refresh because the process is mostly blocked in IO rather than
burning CPU time, so an FFE evaluation issued right after the agent ACKs a
pushed UFC config still sees no config and falls back to defaults. Add a
dd_trace_internal_fn('await_ffe_config') testing hook that actively pumps
remote configs (mirrors await_agent_info) until ddog_ffe_has_config() is true.
Enables the FROZEN system-tests span-enrichment parametric suite to load UFC
via Remote Config in the long-running PHP parametric server.
Span enrichment was accumulated only inside the OpenFeature DataDogProvider (DG-004 inline path). The native DDTrace\FeatureFlags\Client evaluates flags without going through the provider, so consumers on the native path (the parametric system-tests app, and any non-OpenFeature caller) produced ffe_* tags on the root span for OpenFeature but NOT for the native Client. Extract the per-root-span accumulate/encode/root-boundary lifecycle into a reusable PHP7-compatible SpanEnrichmentBinder and bind it on Client::evaluate(), so both the provider and the native Client stage identical ffe_* tags from the same EvaluationDetails and stay in lockstep with the native close-span write. Honours the FROZEN contract (limits 200/10/20/5/64, delta-varint, SHA256 subjects, runtime-default detection). DG-005: no-op with the gate off.
…ment gate
Register DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED in metadata/supported-configurations.json by running tooling/generate-supported-configurations.sh. The config was added to ext/configuration.h but the generated metadata was not regenerated, causing the Configuration Consistency CI check to fail.
assertIsInt() is only available in PHPUnit 7.5+, so the new serialId exposure-data test errored on the PHP 7.0 API unit-test job (older PHPUnit). assertInternalType() is unavailable too (removed in PHPUnit 9, and the matrix runs up to PHPUnit <10). Replace with assertTrue(is_int(...)), which works across the whole 7.0-8.5 matrix. The preceding strict assertSame already enforces the integer type.
…-only
PR review (#3996), two native findings:
- should-fix: DDTrace\root_span() calls dd_ensure_root_span(), which CREATES an autoroot span when none exists. Resolving the root id while merely evaluating a feature flag must not have that side effect. Add a non-creating DDTrace\Internal\peek_root_span_id() that reads DDTRACE_G(active_stack)->root_span directly (no dd_ensure_root_span) and returns its object handle, identical to spl_object_id(\DDTrace\root_span()) but without trace-state creation. Wired into the stub + committed arginfo (phpize build uses the committed header as-is; no CI stub-hash gate).
- should-fix: await_ffe_config sits in the production dd_trace_internal_fn dispatcher and actively pumps Remote Config, blocking up to 5s. Guard it behind a new DD_TEST_HELPERS compile flag (config.m4, defined for the standard CI/test/package builds the system-tests + ffe-dogfooding harnesses run against) so a hardened production build can compile the heavyweight test helper out of the dispatcher entirely.
ZTS-safe (DDTRACE_G accessor); no allocation, no refcount changes.
… all paths
PR review (#3996) blocker + should-fix.
blocker: tracer/ffe.c set_ffe_span_enrichment_tags() REPLACES the three request-global tag slots on every call. Both DataDogProvider and each FeatureFlags\Client/SpanEnrichmentBinder owned a SEPARATE accumulator and staged independently, so two clients, two providers, or a mixed OpenFeature + native-client evaluation under ONE root span would OVERWRITE earlier serial ids / hashed subjects / runtime defaults instead of aggregating them.
Fix: introduce SpanEnrichmentRegistry, a single request-scoped accumulator that ALL PHP evaluation paths feed. The staged tag set is now the union of every evaluation on the active root span, matching the frozen Node contract. No tag/encoding/limit semantics changed.
should-fix (per-binder onClose retention): the lifecycle is centralized in the registry, which binds AT MOST ONE root-close reset per root span (tracked by rootCloseBoundRootId). Many short-lived clients under one long-lived root no longer each retain a closure + accumulator. SpanEnrichmentBinder is now a thin gate-checked adapter; DataDogProvider drops its inline accumulator + lifecycle.
should-fix (gate-off not inert): Client and DataDogProvider now construct NO binder unless DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED is on, and evaluate()/resolve() skip the enrichment call entirely when the binder is absent; no per-evaluation config read with the gate off (DG-005).
should-fix (root side effect): the registry resolves the root id via the new non-creating DDTrace\Internal\peek_root_span_id(), falling back to the (creating) DDTrace\root_span() only on older extensions.
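The overwrite-versus-aggregate problem fixed here can be shown with a small Python sketch. All names are hypothetical; the sketch only models the idea that the staging slot is replaced on every call, so each caller must stage the union held by one shared registry rather than its own private set.

```python
class StagingSlot:
    """Models a request-global slot that is REPLACED on every stage() call."""

    def __init__(self):
        self.serial_ids = None

    def stage(self, serial_ids):
        self.serial_ids = list(serial_ids)


class SharedRegistry:
    """Single accumulator fed by every evaluation path (hypothetical)."""

    def __init__(self, slot):
        self._slot = slot
        self._serial_ids = set()

    def record(self, serial_id):
        self._serial_ids.add(serial_id)
        # Always stage the full union, so a replace never loses earlier ids.
        self._slot.stage(sorted(self._serial_ids))


def make_binder(registry, gate_enabled):
    """Return a recorder only when the gate is on; None keeps gate-off inert."""
    return registry.record if gate_enabled else None


slot = StagingSlot()
registry = SharedRegistry(slot)
client_binder = make_binder(registry, gate_enabled=True)
provider_binder = make_binder(registry, gate_enabled=True)

client_binder(829)
provider_binder(2312)
client_binder(1442)
staged = slot.serial_ids

disabled_binder = make_binder(registry, gate_enabled=False)
```

With one accumulator per binder instead, the last caller's private set would replace the slot and the other caller's ids would be lost, which is the blocker the review describes.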
…non-creating root
PR review (#3996) regression coverage.
- SpanEnrichmentRegistryTest (PHPUnit, runs without the native ext): two binders (standing in for two clients / a client + a provider) under one simulated root AGGREGATE their serial ids, hashed subjects, and runtime defaults into one staged payload rather than overwriting; CR-01 per-root reset still holds; at most ONE root-close reset is bound across many short-lived binders; the root-close reset clears the shared accumulator.
- ClientTest: gate-off Client allocates no SpanEnrichmentBinder and evaluate() short-circuits enrichment without error.
- SpanEnrichmentAccumulatorTest: rewired the DG-004 inline + CR-01 multi-root harness to drive the shared registry's seams (the lifecycle moved out of the provider); gate-off assertions now check spanEnrichmentBinder is null.
- peek_root_span_id_non_creating.phpt (orchestrator L2, needs built ext): proves peek_root_span_id() returns null without creating a root span (active_span() stays null) and otherwise equals spl_object_id(root_span()).
…arity)
json_encode() without flags escaped non-ASCII to \uXXXX and '/' to '\/', diverging from the frozen Node JSON.stringify contract for ffe_subjects_enc and ffe_runtime_defaults. For object/struct runtime defaults the \uXXXX inflation also pushed the value past the 64-char limit so the truncation cut mid-escape-sequence, yielding invalid JSON inside the tag.
Add JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES at all three json_encode sites (toSpanTags subjects + runtime defaults, and stringifyDefault for object/array values) so the emitted bytes match Node exactly.
Verified via the ffe-dogfooding unicode scenario: decoded ffe_runtime_defaults is now raw UTF-8 (héllo-wörld-☃-日本語-Ω / こんにちは / 🎉), valid JSON, codepoint-safe 64-char truncation.
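The length inflation behind this fix can be reproduced with a Python analogue. This is only a sketch: Python's json.dumps(ensure_ascii=True) stands in for PHP json_encode() without flags, ensure_ascii=False for JSON_UNESCAPED_UNICODE, and the helper name and sample value are hypothetical. Python never escapes '/', so the JSON_UNESCAPED_SLASHES half of the fix is not shown.

```python
import json

LIMIT = 64  # per-value character limit from the frozen contract


def stringify_default(value, ensure_ascii):
    """Serialize an object default compactly, then truncate by code points."""
    text = json.dumps(value, ensure_ascii=ensure_ascii, separators=(",", ":"))
    return text[:LIMIT]  # Python slices str by code point, never mid-character


value = {"k": "日本語" * 10}  # 30 non-ASCII characters

escaped = stringify_default(value, ensure_ascii=True)   # \uXXXX per character
raw = stringify_default(value, ensure_ascii=False)      # raw UTF-8 characters

# The escaped form needs 6 characters per code point, so it overflows the
# limit and the cut lands inside an escape sequence; the raw form fits whole.
cut_mid_escape = escaped.endswith("\\u65")
raw_round_trips = json.loads(raw) == value
```

The raw form stays well under the limit here, so it survives truncation untouched, while the escaped form is cut partway through a \uXXXX sequence.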
feat(FeatureFlags): FFE APM feature-flag span enrichment
Summary
Adds Feature Flag Events (FFE) span enrichment to the feature-flag integration. When feature
flags are evaluated, the evaluation metadata is attached to the root APM span so APM customers
can filter traces and errors by active flag variant, and the FFE/Experimentation platform can
correlate spans with experiments. The wire format matches the merged reference implementation
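(dd-trace-js#8343) so backend/Trino decode is identical.
How it works
- Feature flags are evaluated (through the OpenFeature DataDogProvider or the native DDTrace\FeatureFlags\Client).
- When the root span closes, the accumulated evaluation metadata is written to it as ffe_* tags.
Configuration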
Opt-in, off by default: set DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED=true.
This is distinct from DD_EXPERIMENTAL_FLAGGING_PROVIDER_ENABLED.
Span tags added
- ffe_flags_enc: bare base64 of the delta-varint-encoded serial IDs
- ffe_subjects_enc (only when doLog=true): { sha256(key): encodedIds }
- ffe_runtime_defaults: { flagKey: value }
Limits: 200 serial IDs, 10 subjects, 20 experiments/subject, 5 runtime defaults, 64 chars/runtime-default value (UTF-8-safe truncation).
Changes
- DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED (ext/configuration.h), off by default; thread the split serial_id Rust → C → PHP mapper (components-rs/ffe.rs, tracer/ffe.c/.h, ResultMapper.php).
- Per-root-span accumulator (SpanEnrichmentAccumulator.php) with delta-varint serial IDs + SHA256-hashed subject keys; write ffe_* tags at root-span close (tracer/span.c).
- SpanEnrichmentBinder: binds enrichment to the native DDTrace\FeatureFlags\Client path in addition to the OpenFeature DataDogProvider, so non-OpenFeature consumers are enriched too.
- .phpt ext tests for native bridge, serial-id passthrough, eval metrics, and remote-config lifecycle.
Decisions
- DG-004 (inline, no finally hook): PHP OpenFeature does not pass ResolutionDetails to finally hooks, so enrichment is accumulated inline.
- DG-005 (zero-idle): with the gate off, no work is done and no ffe_* tags are written.
- ffe_* are bare tag names on span meta (not _dd.-prefixed); subject keys are SHA256 hashes emitted only when logging is authorized.
Validation
FFE dogfooding app
Validated live against the ffe-dogfooding app via a trace-intake tee-proxy that captures the raw /v0.4/traces payload and decodes the ffe_* tags. Flag ffe-dogfooding-string-flag (serial 2312):
- Gate on: the root span (web.request, auto-instrumented web SAPI) carried ffe_flags_enc decoding to serial [2312] plus a SHA256-hashed ffe_subjects_enc → [2312].
- Gate off: no ffe_* tags.
Local system-tests run
Ran the frozen system-tests parametric suite (tests/parametric/test_ffe/test_span_enrichment.py, unchanged) against this branch's tracer (dd-library-php-1.21.0, C extension built from source for aarch64-linux-gnu, PHP 8.2 NTS).
All 18 cases pass: ffe_flags_enc aggregates serial IDs across evaluations and propagates from child spans to the root (ZAgUAg== → [100,108,128,130]); ffe_subjects_enc carries SHA256-hashed targeting keys gated on doLog; ffe_runtime_defaults is added for not-found flags with 64-char truncation; and all frozen limits are enforced. The SpanEnrichmentBinder change above was required so the native DDTrace\FeatureFlags\Client path (used by the parametric server) is enriched. The system-tests enablement (parametric server.php + manifests/php.yml) is a separate draft PR against DataDog/system-tests.
Full dogfooding matrix + fix (2026-06-17)
Re-validated end-to-end through the real OpenFeature provider path behind the trace-intake tee-proxy, decoding ffe_* with scripts/decode_ffe_span_tags.py (root span web.request, service ffe-dogfooding-php8-openfeature, extension built from this branch):
- ffe_flags_enc → [2312]; ffe_subjects_enc = {sha256(targeting key): ids} only when do_log
- Gate off: no ffe_* tags; no binder constructed
- ffe_flags_enc = [829,1442,2311,2312], nothing overwritten (shared SpanEnrichmentRegistry)
- ffe_runtime_defaults raw UTF-8 (héllo-wörld-☃-日本語-Ω, こんにちは, 🎉), valid JSON, values truncated to 64
- ZAgUAg== → [100,108,128,130]
Fix found by the matrix (commit fix(ffe): emit ffe_* JSON as raw UTF-8 with unescaped slashes): the unicode scenario showed ffe_runtime_defaults was \uXXXX-escaped (and object values were truncated mid-escape-sequence, yielding invalid JSON) because json_encode() was called without flags. Added JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES at the three json_encode sites in SpanEnrichmentAccumulator.php so the emitted bytes match the frozen Node JSON.stringify contract (raw UTF-8, bare /). Existing accumulator unit tests json_decode the tags (normalizing escapes) so they are unaffected; the dogfooding loop is what surfaced the divergence.
System-tests re-confirmed against a tarball rebuilt from this branch's source: 18 passed (TEST_LIBRARY=php ./run.sh PARAMETRIC -k span_enrichment, library php@1.21.0).