[PROF-13798] [Browser Profiler] Quota check #4514

Draft
thomasbertet wants to merge 18 commits into main from thomas.bertet/PROF-13798-profiling-quota-in-sdk

@thomasbertet
Collaborator

@thomasbertet thomasbertet commented Apr 21, 2026

Motivation

Adds client-side enforcement of the per-org profiling quota by calling the quota admission API before allowing profiling data to be sent. Ticket: PROF-13798.

Changes

  • New quotaCheck.ts: calls GET https://quota.browser-intake-<site>/api/v2/profiling/quota?session_id=<id> with a DD-CLIENT-TOKEN header and returns a QuotaResult { decision: 'quota_ok' | 'quota_ko', reason: QuotaReason }. A client-side 5 s timeout is enforced via AbortController + Promise.race, using the Zone.js-safe fetch and setTimeout wrappers from @datadog/browser-core. The check fails open: a timeout yields reason: 'timeout', and a network error or unparseable response yields reason: 'api-error'.

  • QuotaReason type: the union of BackendQuotaReason (exact strings from the API: quota_ok, quota_exceeded, org_disabled, backend_unavailable, backend_client_not_initialized, undefined) and FrontendQuotaReason (timeout, api-error).

  • profiler.ts: the profiler starts recording immediately (optimistic) and the quota check fires in parallel. On a quota_ko decision, the profiler stops, the trace is discarded (no data is sent), and _dd.profiling.quota_reason is set on RUM events with the specific reason. A generation counter prevents stale results from a prior session from being applied to a new one. SESSION_RENEWED restarts the profiler (and re-checks quota) when it was previously stopped due to quota_ko.

  • rumProfiler.types.ts: RumProfilerStoppedInstance.stateReason uses 'quota_ko' to cover all quota-denied outcomes.

  • transportConfiguration.ts: exposes clientToken on TransportConfiguration (and thus RumConfiguration) so it is accessible from the profiler lazy-loaded chunk.
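
The fail-open timeout described for quotaCheck.ts can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: checkQuotaWithTimeout, FetchLike, and the reason field in the response body are hypothetical names, the fetch function is injected instead of using the @datadog/browser-core wrappers, and only the admitted field is taken from the description above.

```typescript
type QuotaDecision = 'quota_ok' | 'quota_ko'

interface QuotaResult {
  decision: QuotaDecision
  reason: string | undefined
}

// Minimal fetch shape needed by this sketch (hypothetical); injected so it can be faked.
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<{ json(): Promise<unknown> }>

const QUOTA_TIMEOUT_MS = 5_000

async function checkQuotaWithTimeout(
  fetchFn: FetchLike,
  url: string,
  timeoutMs: number = QUOTA_TIMEOUT_MS
): Promise<QuotaResult> {
  const controller = new AbortController()
  let timer: ReturnType<typeof setTimeout> | undefined

  // Timeout branch: abort the request and fail open.
  const timeout = new Promise<QuotaResult>((resolve) => {
    timer = setTimeout(() => {
      controller.abort()
      resolve({ decision: 'quota_ok', reason: 'timeout' })
    }, timeoutMs)
  })

  // Request branch: any network or parsing failure also fails open.
  const request = fetchFn(url, { signal: controller.signal })
    .then((response) => response.json())
    .then((body): QuotaResult => {
      // Assumed body shape: { admitted: boolean, reason?: string }.
      const { admitted, reason } = body as { admitted?: unknown; reason?: string }
      if (typeof admitted !== 'boolean') {
        throw new Error('unparseable quota response')
      }
      return { decision: admitted ? 'quota_ok' : 'quota_ko', reason }
    })
    .catch((): QuotaResult => ({ decision: 'quota_ok', reason: 'api-error' }))

  try {
    // Whichever branch settles first wins.
    return await Promise.race([request, timeout])
  } finally {
    clearTimeout(timer)
  }
}
```

Only an explicit admitted: false produces quota_ko in this sketch; every other outcome lets profiling continue.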

Note: _dd.profiling.quota_reason is passed with an `as any` cast until the rum-events-format schema is updated.

Test instructions

  • Unit tests: yarn test:unit --spec packages/rum/src/domain/profiling/quotaCheck.spec.ts (13 tests)
  • Unit tests: yarn test:unit --spec packages/rum/src/domain/profiling/profiler.spec.ts (28 tests, includes quota check scenarios)
  • Verify that _dd.profiling.quota_reason appears on RUM events when the quota API returns admitted: false

Checklist

  • Added unit tests for this change.
  • Tested locally
  • Tested on staging
  • Added e2e/integration tests for this change.
  • Updated documentation and/or relevant AGENTS.md file

@thomasbertet changed the title from "[PROF-13798] ✨ Gate profiling on quota admission API" to "[PROF-13798] [Browser Profiler] Quota check" on Apr 21, 2026
@datadog-prod-us1-4

datadog-prod-us1-4 Bot commented Apr 21, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 80.85%
Overall Coverage: 76.98% (+0.02%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 1177a58

@cit-pr-commenter-54b7da

cit-pr-commenter-54b7da Bot commented Apr 21, 2026

Bundles Sizes Evolution

| 📦 Bundle Name | Base Size | Local Size | 𝚫 | 𝚫% | Status |
|---|---|---|---|---|---|
| Rum | 169.51 KiB | 169.97 KiB | +469 B | +0.27% | |
| Rum Profiler | 5.97 KiB | 7.31 KiB | +1.34 KiB | +22.42% | |
| Rum Recorder | 21.23 KiB | 21.23 KiB | 0 B | 0.00% | |
| Logs | 54.70 KiB | 54.73 KiB | +40 B | +0.07% | |
| Rum Slim | 127.85 KiB | 127.89 KiB | +40 B | +0.03% | |
| Worker | 22.99 KiB | 22.99 KiB | 0 B | 0.00% | |

🚀 CPU Performance

| Action Name | Base CPU Time (ms) | Local CPU Time (ms) | 𝚫% |
|---|---|---|---|
| RUM - add global context | 0.0021 | 0.0029 | +38.10% |
| RUM - add action | 0.0105 | 0.0129 | +22.86% |
| RUM - add error | 0.0097 | 0.0131 | +35.05% |
| RUM - add timing | 0.0004 | 0.0007 | +75.00% |
| RUM - start view | 0.0092 | 0.0137 | +48.91% |
| RUM - start/stop session replay recording | 0.0007 | 0.0009 | +28.57% |
| Logs - log message | 0.0139 | 0.0185 | +33.09% |

🧠 Memory Performance

| Action Name | Base Memory Consumption | Local Memory Consumption | 𝚫 |
|---|---|---|---|
| RUM - add global context | 38.24 KiB | 40.55 KiB | +2.32 KiB |
| RUM - add action | 64.64 KiB | 68.76 KiB | +4.12 KiB |
| RUM - add timing | 36.73 KiB | 37.28 KiB | +563 B |
| RUM - add error | 70.01 KiB | 69.26 KiB | -768 B |
| RUM - start/stop session replay recording | 45.58 KiB | 45.14 KiB | -445 B |
| RUM - start view | 483.88 KiB | 468.80 KiB | -15.08 KiB |
| Logs - log message | 55.00 KiB | 54.59 KiB | -415 B |

🔗 RealWorld

New module that calls GET /api/unstable/profiling/admission with the
RUM session ID. Returns 'quota-ok' (HTTP 200, timeout, network error)
or 'quota-exceeded' (HTTP 429). Client-side 5s timeout via
AbortController + Promise.race. Uses the Zone.js-safe fetch wrapper
from @datadog/browser-core.

Profiler starts recording immediately (optimistic), then fires
checkProfilingQuota() in parallel. On quota-exceeded (HTTP 429):
- Profiler stops and discards the in-flight trace (no data sent)
- _dd.profiling.error_reason is set to 'quota-exceeded' on RUM events

Stale results (from a prior session) are discarded via a generation
counter incremented on each start(). Within-session cancellation
(user stop, session expiry) is handled by an instance state guard.
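
The generation counter and state guard described in this commit message could look roughly like the sketch below. createQuotaGate and its callbacks are hypothetical names used for illustration only; the real profiler.ts tracks more state than a single boolean.

```typescript
type QuotaDecision = 'quota_ok' | 'quota_ko'

// Hypothetical wrapper: runs a quota check on each start() and applies only the latest result.
function createQuotaGate(checkQuota: () => Promise<QuotaDecision>, onDenied: () => void) {
  let generation = 0
  let running = false

  return {
    start(): Promise<void> {
      generation += 1
      const startedGeneration = generation // captured for this start() call
      running = true

      return checkQuota().then((decision) => {
        // Stale result: a newer start() happened while this check was in flight.
        if (startedGeneration !== generation) return
        // State guard: the profiler was stopped (user stop, session expiry) in the meantime.
        if (!running) return
        if (decision === 'quota_ko') {
          running = false
          onDenied()
        }
      })
    },
    stop(): void {
      running = false
    },
    isRunning: (): boolean => running,
  }
}
```

Capturing the generation at call time means a slow response from an earlier session cannot stop a profiler that has since been restarted.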

SESSION_RENEWED now also restarts the profiler when it was stopped
due to quota-exceeded, re-checking quota for the new session.

Expose clientToken on TransportConfiguration so it is accessible from
RumConfiguration downstream. Previously clientToken was only available
at init time when building endpoint builders, making it inaccessible
from the profiler chunk.

Also run Prettier on the two spec files that had formatting issues.
…eckProfilingQuota

Move client token from dd-api-key query param to DD-CLIENT-TOKEN header
to avoid leaking it in URL logs. Add getQuotaBaseURL() to resolve the
correct base per site (datad0g.com uses dd.datad0g.com, others use
app.<site>). Add credentials: 'omit' to suppress cookie sending.
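
As an illustration of the change described above, a request can carry the token in a header and opt out of cookies as sketched below. buildQuotaRequest and QuotaRequest are hypothetical names, and the base URL is passed in as a parameter instead of being resolved per site.

```typescript
interface QuotaRequest {
  url: string
  init: {
    method: 'GET'
    headers: Record<string, string>
    credentials: 'omit'
  }
}

// Hypothetical helper: the token travels in a header, so it never appears in the URL.
function buildQuotaRequest(baseUrl: string, sessionId: string, clientToken: string): QuotaRequest {
  return {
    url: `${baseUrl}?session_id=${encodeURIComponent(sessionId)}`,
    init: {
      method: 'GET',
      headers: { 'DD-CLIENT-TOKEN': clientToken },
      credentials: 'omit', // do not send cookies with the cross-origin request
    },
  }
}
```

The returned url and init can be passed directly to fetch(url, init).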
…ingQuota

Replace bespoke getQuotaBaseURL() with buildEndpointHost() so all sites
are handled consistently (US1, EU1, AP1, AP2, GOV, staging).
Drop the /api/unstable/profiling/admission path — session_id is now
appended directly as a query param to the quota host.
…aReason and FrontendQuotaReason

- checkProfilingQuota returns QuotaResult { decision: 'quota_ok' | 'quota_ko', reason: QuotaReason }
- BackendQuotaReason: exact strings from the API (quota_ok, quota_exceeded, org_disabled,
  backend_unavailable, backend_client_not_initialized, undefined)
- FrontendQuotaReason: SDK-only reasons for fail-open cases (timeout, api-error)
- decision drives profiler stop logic; reason flows to quota_reason in RUM events
- stateReason simplified to 'quota_ko' covering all denied cases
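
The type split listed above can be written out as in this sketch. The type names and reason strings come from the commit message; treating undefined as a literal string, and the toProfilerOutcome helper with its ProfilerOutcome shape, are assumptions made for illustration.

```typescript
// Assumption: "undefined" is a literal string returned by the API, not the JS value.
type BackendQuotaReason =
  | 'quota_ok'
  | 'quota_exceeded'
  | 'org_disabled'
  | 'backend_unavailable'
  | 'backend_client_not_initialized'
  | 'undefined'

// SDK-only reasons, used when the check fails open.
type FrontendQuotaReason = 'timeout' | 'api-error'

type QuotaReason = BackendQuotaReason | FrontendQuotaReason

interface QuotaResult {
  decision: 'quota_ok' | 'quota_ko'
  reason: QuotaReason
}

// Hypothetical outcome shape: decision drives the stop, reason is kept for RUM events.
type ProfilerOutcome =
  | { shouldStop: false }
  | { shouldStop: true; stateReason: 'quota_ko'; quotaReason: QuotaReason }

function toProfilerOutcome(result: QuotaResult): ProfilerOutcome {
  if (result.decision === 'quota_ko') {
    // A single stateReason covers every denied case; the specific reason is carried separately.
    return { shouldStop: true, stateReason: 'quota_ko', quotaReason: result.reason }
  }
  return { shouldStop: false }
}
```

Keeping decision separate from reason lets the stop logic stay a two-way branch while the reason list grows independently.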
@thomasbertet thomasbertet force-pushed the thomas.bertet/PROF-13798-profiling-quota-in-sdk branch from 2a65de1 to 525fb77 Compare May 13, 2026 15:39
…void CSP violations

The quota check was sending a direct cross-origin fetch to
quota.browser-intake-* which violated the E2E test environment's CSP
(connect-src restricted to the local test server). Fix:

- quotaCheck.ts: respects configuration.proxy (string or function),
  routing the request through the local proxy server in E2E
- transportConfiguration.ts: exposes proxy on TransportConfiguration
  (and thus RumConfiguration) alongside clientToken
- intake.ts: adds a GET handler for the quota path that always returns
  admitted:true in the test environment
- profiler.spec.ts: fix post-rebase breakages (minNumberOfSamples
  removed, SESSION_RENEWED payload required)
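
Routing the quota request through configuration.proxy could be sketched as below. resolveQuotaUrl is a hypothetical helper; the ddforward query parameter and the { path, parameters } callback argument follow the proxy convention the Browser SDK documents for intake requests, and whether quotaCheck.ts builds its URL exactly this way is an assumption.

```typescript
type ProxyFn = (options: { path: string; parameters: string }) => string

// Hypothetical helper: pick the direct URL or a proxied one depending on the proxy option.
function resolveQuotaUrl(host: string, path: string, parameters: string, proxy?: string | ProxyFn): string {
  if (typeof proxy === 'function') {
    // Function form: the caller builds the full proxied URL.
    return proxy({ path, parameters })
  }
  if (typeof proxy === 'string') {
    // String form: forward the original path and query through a single parameter.
    return `${proxy}?ddforward=${encodeURIComponent(`${path}?${parameters}`)}`
  }
  // No proxy: direct cross-origin request to the quota host.
  return `https://${host}${path}?${parameters}`
}
```

With a proxy on the page's own origin, the request stays within a connect-src policy that would block the direct cross-origin call.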
@thomasbertet
Collaborator Author

/to-staging

@gh-worker-devflow-routing-ef8351

gh-worker-devflow-routing-ef8351 Bot commented May 13, 2026

View all feedbacks in Devflow UI.

2026-05-13 19:24:45 UTC ℹ️ Start processing command /to-staging


2026-05-13 19:24:51 UTC ℹ️ Branch Integration: starting soon, merge expected in approximately 0s (p90)

Commit 1177a58550 will soon be integrated into staging-20.


2026-05-13 19:39:41 UTC ℹ️ Branch Integration: this commit was successfully integrated

Commit 1177a58550 has been merged into staging-20 in merge commit 0b7f455dee.

If you need to revert this integration, you can use the following command: /code revert-integration -b staging-20

gh-worker-dd-mergequeue-cf854d Bot added a commit that referenced this pull request May 13, 2026
…o staging-20

Integrated commit sha: 1177a58

Co-authored-by: thomasbertet <thomas.bertet@datadoghq.com>