feat(openfeature): emit server-side EVP flagevaluation #8902
leoromanovsky wants to merge 26 commits into
…egation
- Add FLAGEVALUATIONS_ENDPOINT constant to constants/constants.js
- Implement FlagEvaluationsWriter extending BaseFFEWriter with:
  - Two-tier aggregation (full → degraded → drop-counted)
  - Comparable canonical-context key (sorted, type-tagged, length-delimited; no hash)
  - Caps: globalCap=131072 / perFlagCap=10000 / degradedCap=32768
  - Context pruning: 256 fields / 256 chars before keying
  - Flush interval 10000ms; endpoint /evp_proxy/v2/api/v2/flagevaluations
  - runtime_default_used from absent variant; omitempty optional fields per tier
- Add writer unit spec (21 tests, all passing)
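For readers unfamiliar with the keying scheme, here is a minimal sketch of a sorted, type-tagged, length-delimited context key with pruning applied first. It is not the PR's code: the function names `pruneContext` and `canonicalContextKey`, the tag letters, and the choice to skip non-scalar values are assumptions made for illustration. Only the 256-field / 256-char limits come from the commit message.

```javascript
'use strict'

// Limits quoted in the commit message; everything else here is illustrative.
const MAX_FIELDS = 256
const MAX_CHARS = 256

// Keep at most MAX_FIELDS scalar attributes and truncate long strings.
function pruneContext (context) {
  const pruned = {}
  let kept = 0
  for (const name of Object.keys(context).sort()) {
    if (kept === MAX_FIELDS) break
    const value = context[name]
    const type = typeof value
    if (type === 'string') {
      pruned[name] = value.slice(0, MAX_CHARS)
    } else if (type === 'number' || type === 'boolean') {
      pruned[name] = value
    } else {
      continue // assumption: non-scalar values are skipped in this sketch
    }
    kept++
  }
  return pruned
}

// Build a comparable key without hashing: fields are sorted by name, each value
// carries a one-letter type tag, and each token is prefixed with its length so
// that two different contexts cannot concatenate to the same string.
function canonicalContextKey (context) {
  const pruned = pruneContext(context)
  const parts = []
  for (const name of Object.keys(pruned).sort()) {
    const value = pruned[name]
    const tag = typeof value === 'string' ? 's' : typeof value === 'number' ? 'n' : 'b'
    const text = String(value)
    parts.push(`${name.length}:${name}${tag}${text.length}:${text}`)
  }
  return parts.join('')
}
```

With this shape, `{ a: 'x', b: 1 }` and `{ b: 1, a: 'x' }` produce the same key, while `{ a: '1' }` and `{ a: 1 }` do not, which is the property the aggregation map needs.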
… wiring
- Add FlagEvalEVPHook: cheap scalar extraction in finally() + non-blocking enqueue (reviewer concern #7: Finally covers error/default; no inline aggregation)
- Wire FlagEvaluationsWriter + FlagEvalEVPHook in flagging_provider.js behind DD_FLAGGING_EVALUATION_COUNTS_ENABLED killswitch (default on)
- OTel EvalMetricsHook remains always registered (PRES-01 non-regression)
- Destroy FlagEvaluationsWriter in onClose() alongside SpanEnrichmentHook
- Update flagging_provider.spec.js to stub new writer/hook, cover killswitch, update hook count assertions (3 hooks when all enabled)
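As a rough illustration of "cheap scalar extraction in finally() + non-blocking enqueue", the sketch below shows the general shape of such a hook. It is an assumption-laden outline, not the PR's FlagEvalEVPHook: the factory name `createFlagEvalHook`, the event field names, and the `allocationKey` metadata key are hypothetical, and it assumes the finally stage receives the evaluation details as its second argument.

```javascript
'use strict'

// Hypothetical factory: `enqueue` is any non-blocking sink supplied by the caller.
function createFlagEvalHook (enqueue) {
  return {
    // Runs for successful, errored, and defaulted evaluations alike.
    finally (hookContext, evaluationDetails) {
      try {
        const details = evaluationDetails || {}
        const metadata = details.flagMetadata || {}
        // Copy references and scalars only; no pruning or keying happens here.
        enqueue({
          flagKey: hookContext.flagKey,
          variant: details.variant,
          allocationKey: metadata.allocationKey, // assumed metadata field name
          context: hookContext.context,
          timestamp: Date.now()
        })
      } catch {
        // A telemetry hook should never throw into the caller's evaluation.
      }
    }
  }
}
```

Gating registration on the killswitch then reduces to pushing this hook onto the provider's hook list only when the config flag is true.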
- flag_evaluations.js: remove extra spaces before inline comment (no-multi-spaces); change interval literal 10000 → 10_000 (numeric-separators-style)
- flagging_provider.js: route DD_FLAGGING_EVALUATION_COUNTS_ENABLED through the config system instead of reading process.env directly (eslint-process-env rule); flip negated condition to positive (no-negated-condition rule); register env var in supported-configurations.json as experimental.flaggingProvider.evaluationCountsEnabled (default true) and update generated-config-types.d.ts accordingly
- Tests updated to set mockConfig.experimental.flaggingProvider.evaluationCountsEnabled instead of process.env for killswitch coverage; behaviour unchanged
Overall package size

Self size: 6.35 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 3.2.0 | 104.26 kB | 843.44 kB |
| opentracing | 0.14.7 | 194.81 kB | 194.81 kB |
| dc-polyfill | 0.1.11 | 25.74 kB | 25.74 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
Benchmarks

Benchmark execution time: 2026-06-24 00:35:10

Comparing candidate commit 167eff1 in PR branch

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1982 metrics, 13 unstable metrics.
…val hot path Split the Finally hook into a cheap synchronous capture and a deferred aggregator: the hook enqueues a raw event onto a bounded hand-off queue (drop-and-count on overflow) and a setImmediate-scheduled drain runs the prune + canonical-key + two-tier aggregation off the caller's evaluation call stack. flush()/destroy() drain pending events first so no queued evaluation is lost on flush or shutdown. Source variant from evaluationDetails.variant (not the evaluated value) and allocationKey/eval-time from evaluationDetails.flagMetadata, matching the OTel eval-metrics hook. Eval-time falls back to hook-fire time since the Datadog Node evaluator does not stamp dd.eval.timestamp_ms. Add evaluationCountsEnabled to the public type declarations, wire the hook and aggregator into the openfeature benchmark, and add lifecycle, JSON-schema, async-boundary, and backpressure tests.
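The hand-off described in this commit can be pictured with the sketch below: a bounded queue that drops and counts on overflow, schedules a single setImmediate drain, and drains synchronously on flush. The class name `HandOffQueue` and its fields are hypothetical and the real writer may differ in detail (for example in how it handles destroy or timer unref).

```javascript
'use strict'

class HandOffQueue {
  constructor (capacity, aggregate) {
    this.capacity = capacity
    this.aggregate = aggregate // called once per drained event, off the hook's stack
    this.pending = []
    this.dropped = 0
    this.scheduled = false
  }

  // Called from the hook: constant-time, never blocks, never aggregates inline.
  enqueue (event) {
    if (this.pending.length >= this.capacity) {
      this.dropped++ // drop-and-count on overflow
      return false
    }
    this.pending.push(event)
    if (!this.scheduled) {
      this.scheduled = true
      setImmediate(() => this.drain())
    }
    return true
  }

  // Runs the expensive work (prune, key, aggregate) for everything queued so far.
  drain () {
    this.scheduled = false
    const events = this.pending
    this.pending = []
    for (const event of events) this.aggregate(event)
  }

  // Drain first so that no queued evaluation is lost on flush or shutdown.
  flush () {
    this.drain()
  }
}
```

The `scheduled` flag keeps a burst of evaluations from scheduling one immediate per event; a single drain picks up the whole burst.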
The EVP flagevaluation hot-path benchmark lived only in benchmark/openfeature.js, which is reached via benchmark/index.js (npm run bench) — a path CI does not run. The canonical CI benchmark suite is sirun (benchmark/sirun/), executed via .gitlab/benchmarks/bp-runner.yml (BENCHMARKS_PATH: benchmark/sirun -> runall.sh), which discovers benchmarks by iterating subdirectories with a meta.json. Add a benchmark/sirun/openfeature/ directory (meta.json + index.js + README) mirroring the llmobs sirun bench: require.cache-stub the egress request module, require the real FlagEvaluationsWriter + FlagEvalEVPHook from packages/dd-trace/src, a pre-flight sanity assertion, then a startup-guard-fenced timed loop. Two variants: flag-eval-hook (the synchronous Finally-hook cost charged to the caller's evaluation) and aggregate (the deferred off-hot-path aggregator). Register the dir in CODEOWNERS under the feature-flagging team, matching how llmobs owns its bench dir.
…-guard ceiling The flag-eval-hook variant's COUNT=12000000 left the timed loop too short relative to module load (the FlagEvaluationsWriter require chain). On a fast runner the startup share crosses the startup-guard 10% ceiling, so done() asserts and the variant exits non-zero — failing the GROUP 3 shard (the bucket both openfeature variants land in) on every Node major in CI. Raise COUNT to 40000000 so the loop dominates: measured startup share drops to ~2% with comfortable headroom across repeated runs. The aggregate variant (per-iteration aggregator work, share ~0.15%) already had ample margin and is unchanged.
The previous batching path rebuilt and re-encoded the whole candidate payload for every aggregate event. On a 10,050-event degradation batch, a throwaway benchmark measured old batching at 13.7-14.8s per flush versus 3.7-4.3ms with incremental byte accounting, with identical output length. Validation: npm run test:openfeature
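To show why incremental byte accounting removes the quadratic cost, here is a simplified sketch: each row is encoded exactly once and a running byte total decides when to close a batch, instead of re-serializing the whole candidate payload per row. The function name `splitIntoBatches` is hypothetical, and the payload is modelled as a bare JSON array rather than the actual wrapper object, so the framing constants are assumptions.

```javascript
'use strict'

// Split rows into JSON array payloads no larger than maxBytes each.
function splitIntoBatches (rows, maxBytes) {
  const batches = []
  let dropped = 0
  let current = []
  let currentBytes = 2 // the enclosing "[" and "]"

  for (const row of rows) {
    const encoded = JSON.stringify(row) // encoded once, reused in the payload
    const rowBytes = Buffer.byteLength(encoded)
    if (rowBytes + 2 > maxBytes) {
      dropped++ // cannot fit even in an empty batch
      continue
    }
    const separator = current.length === 0 ? 0 : 1 // the "," between rows
    if (currentBytes + separator + rowBytes > maxBytes) {
      batches.push(`[${current.join(',')}]`)
      current = [encoded]
      currentBytes = 2 + rowBytes
    } else {
      current.push(encoded)
      currentBytes += separator + rowBytes
    }
  }

  if (current.length > 0) batches.push(`[${current.join(',')}]`)
  return { batches, dropped }
}
```

Each row costs one `JSON.stringify` and one addition, so the work grows linearly with the number of rows rather than with the square of it.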
…uations-cross-sdk
…2446-evp-flagevaluation-nodejs # Conflicts: # package.json
…2446-evp-flagevaluation-nodejs # Conflicts: # benchmark/sirun/runall.sh
Motivation
Customers and APM need the same server-side feature-flag evaluation signal across SDKs so rollout behavior can be correlated with application behavior without SDK-specific blind spots. This Node.js contribution adds bounded EVP flagevaluation delivery while preserving the existing OTel feature_flag.evaluations metric path, so adoption can be judged on smoothness and time-to-land with APM approval.

Changes
- … DD_FLAGGING_EVALUATION_COUNTS_ENABLED.
- … flag-eval-metrics-hook.js path unchanged.
- … FlagEvalEVPHook/flag_eval_evp_hook.js so it is distinct from the OTel metrics hook.
- … /evp_proxy/v2/api/v2/flagevaluation.
- … reason out of the enqueue path, event model, aggregation keys, payload shape, tests, and TypeScript declarations.
- … error.message.
- … {context, flagEvaluations} wrapper, before posting to EVP.
- … targeting_key and context; drops only if the degraded row still cannot fit the event or payload limit.

Decisions
- reason is intentionally not part of EVP payloads or aggregate keys because the worker schema does not accept it.
- … ffe-dogfooding-nodejs; ffe-dogfooding-node did not receive the required RC configuration.

flowchart TD
    A[drained aggregated rows] --> B[serialize candidate batch as JSON]
    B --> C{batch <= 5 MiB?}
    C -- yes --> D[post asynchronously through EVP proxy]
    C -- no --> E{current row fits degraded?}
    E -- yes --> F[omit targeting_key and context]
    F --> B
    E -- no --> G[drop, log, count]

Validation Evidence
Dogfooding App
- ffe-dogfoodingapp-nodejs was run with local dd-trace-js, @datadog/flagging-core, and @datadog/openfeature-node-server artifacts and reached PROVIDER_READY.
- … ffe-dogfooding-string-flag five times for each public-safe targeting key, keeping the evaluation context stable per key so batching/aggregation is observable:
  - nodejs-batch-evp-agent-20260623T024835Z-alpha
  - nodejs-batch-evp-agent-20260623T024835Z-bravo
  - nodejs-batch-evp-agent-20260623T024835Z-charlie
- … variant_2 with reason=TARGETING_MATCH.

System Tests
Staging End-To-End
- … eventplatform.system.track(TRACK => 'flagevaluation') against us1.staging.dog for the exact targeting keys above.
- nodejs-batch-evp-agent-20260623T024835Z-alpha: evaluation_count=5, flag.key=ffe-dogfooding-string-flag, variant.key=variant_2, allocation.key=allocation-override-392dd7c149f8
- nodejs-batch-evp-agent-20260623T024835Z-bravo: evaluation_count=5, flag.key=ffe-dogfooding-string-flag, variant.key=variant_2, allocation.key=allocation-override-392dd7c149f8
- nodejs-batch-evp-agent-20260623T024835Z-charlie: evaluation_count=5, flag.key=ffe-dogfooding-string-flag, variant.key=variant_2, allocation.key=allocation-override-392dd7c149f8