Skip to content

CRE-4405: Add gateway request tracing; use default HTTP transport#22559

Merged
cedric-cordenier merged 1 commit into
developfrom
CRE-4405-add-request-tracing
May 25, 2026
Merged

CRE-4405: Add gateway request tracing; use default HTTP transport#22559
cedric-cordenier merged 1 commit into
developfrom
CRE-4405-add-request-tracing

Conversation

@cedric-cordenier

@cedric-cordenier cedric-cordenier commented May 20, 2026

Copy link
Copy Markdown
Contributor
  • Add http tracing to outbound requests made in the Gateway. This will allow us to monitor request latencies at a more granular level.
  • Import my fork of doyensec/safeurl; this adds the ability to override the http.Transport being used. By default, safeurl uses an empty http.Transport which means we don't benefit sensible defaults. One of the consequences of this is that connection pooling is disabled. I am trying to upstream the changes, and will remove the replace once the changes have been upstreamed.

See this screenshot from a before/after of using the default HTTP transport using the metrics:
Screenshot 2026-05-21 at 13 12 54

@github-actions

Copy link
Copy Markdown
Contributor

I see you updated files related to core. Please run make gocs in the root directory to add a changeset as well as in the text include at least one of the following tags:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.

@github-actions

github-actions Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

✅ No conflicts with other open PRs targeting develop

@cedric-cordenier cedric-cordenier added the build-publish Build and Publish image to SDLC label May 20, 2026
@cedric-cordenier cedric-cordenier force-pushed the CRE-4405-add-request-tracing branch from 4b4b31f to 9d8b7e1 Compare May 20, 2026 12:55
@trunk-io

trunk-io Bot commented May 20, 2026

Copy link
Copy Markdown

Static BadgeStatic BadgeStatic BadgeStatic Badge

View Full Report ↗︎Docs

@cl-sonarqube-production

Copy link
Copy Markdown

@cedric-cordenier cedric-cordenier force-pushed the CRE-4405-add-request-tracing branch 2 times, most recently from 1554a55 to c28c309 Compare May 21, 2026 12:17
@cedric-cordenier cedric-cordenier changed the title CRE-4405: Add gateway request tracing CRE-4405: Add gateway request tracing; use default HTTP transport May 21, 2026
@cedric-cordenier cedric-cordenier force-pushed the CRE-4405-add-request-tracing branch 2 times, most recently from 824c827 to 714bf49 Compare May 21, 2026 13:33
Comment thread go.mod Outdated
@cedric-cordenier cedric-cordenier force-pushed the CRE-4405-add-request-tracing branch 2 times, most recently from 5f7921f to cd68438 Compare May 21, 2026 14:00
@cedric-cordenier cedric-cordenier marked this pull request as ready for review May 21, 2026 15:27
Copilot AI review requested due to automatic review settings May 21, 2026 15:28
@cedric-cordenier cedric-cordenier requested review from a team as code owners May 21, 2026 15:28

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Risk Rating: MEDIUM

Adds outbound HTTP request tracing/metrics for the Gateway HTTP client and switches safeurl to use a transport with Go’s sensible defaults (enabling connection pooling), via a temporary fork.

Changes:

  • Instrument Gateway outbound HTTP requests with net/http/httptrace phase timing and a total request-duration metric, plus OTEL histogram views registration.
  • Configure safeurl to use the default HTTP transport (via a temporary replace to a forked safeurl).
  • Propagate the safeurl replace across submodules (deployment, integration-tests, system-tests, scripts) with updated go.sum entries.

Scrupulous human review recommended (high impact areas):

  • core/services/gateway/network/httpclient.go: transport wiring and safety around http.DefaultTransport usage (panic risk) and ensuring the chosen transport semantics match desired pooling/timeouts.
  • core/services/gateway/network/httpclient_trace.go + httpclient_metrics.go: metric semantics (phase “durations” vs “time-to” events) and label meaning/usage.

Reviewed changes

Copilot reviewed 12 out of 19 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
core/services/gateway/network/httpclient.go Adds transport override for safeurl client and records trace/total metrics on request paths.
core/services/gateway/network/httpclient_trace.go Introduces httptrace.ClientTrace callbacks to record phase timings and connection reuse.
core/services/gateway/network/httpclient_metrics.go Defines OTEL histogram instrument, attributes, and metric view/bucket configuration.
core/services/gateway/network/httpclient_metrics_test.go Basic construction/recording smoke tests for the new metrics helpers.
core/cmd/shell.go Registers the new Gateway HTTP client metric views in the node’s metric view list.
go.mod Adds replace to forked safeurl pending upstream merge.
go.sum Updates checksums for the forked safeurl.
deployment/go.mod Adds replace to forked safeurl in deployment module.
deployment/go.sum Updates checksums for the forked safeurl in deployment module.
integration-tests/go.mod Adds replace to forked safeurl in integration-tests module.
integration-tests/go.sum Updates checksums for the forked safeurl in integration-tests module.
integration-tests/load/go.mod Adds replace to forked safeurl in integration-tests/load module.
integration-tests/load/go.sum Updates checksums for the forked safeurl in integration-tests/load module.
system-tests/lib/go.mod Adds replace to forked safeurl in system-tests/lib module.
system-tests/lib/go.sum Updates checksums for the forked safeurl in system-tests/lib module.
system-tests/tests/go.mod Adds replace to forked safeurl in system-tests/tests module.
system-tests/tests/go.sum Updates checksums for the forked safeurl in system-tests/tests module.
core/scripts/go.mod Adds replace to forked safeurl in core/scripts module.
core/scripts/go.sum Updates checksums for the forked safeurl in core/scripts module.

Comment thread core/services/gateway/network/httpclient.go Outdated
Comment thread core/services/gateway/network/httpclient_trace.go
Comment thread core/services/gateway/network/httpclient_metrics.go
Comment thread system-tests/tests/go.mod
Comment on lines +311 to +314
requestStart := time.Now()
trace, traceState := newClientTrace(ctx, req.Method, requestStart, c.metrics)
timeoutCtx = httptrace.WithClientTrace(timeoutCtx, trace)

bolekk
bolekk previously approved these changes May 21, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 19 changed files in this pull request and generated 2 comments.

Comment on lines +57 to +67
// recordTotal records the total request lifetime with the result attributes.
// The histogram's count for phase=total doubles as the request counter.
// Success means the request returned a response, it does not imply a successful 2xx statusCode.
func (m *httpClientMetrics) recordTotal(ctx context.Context, method string, statusCode int, success, connReused bool, d time.Duration) {
m.phaseDuration.Record(ctx, d.Milliseconds(), metric.WithAttributes(
attribute.String("method", method),
attribute.String("phase", string(PhaseTotal)),
attribute.String("statusCode", strconv.Itoa(statusCode)),
attribute.String("success", strconv.FormatBool(success)),
attribute.String("connectionReused", strconv.FormatBool(connReused)),
))

defaultTransport, ok := http.DefaultTransport.(*http.Transport)
if !ok {
return nil, errors.New("could not coerce http.DefaultTransport to *http.Transport")
@cedric-cordenier cedric-cordenier force-pushed the CRE-4405-add-request-tracing branch from 3bddc70 to facd17c Compare May 25, 2026 13:02
@cl-sonarqube-production

Copy link
Copy Markdown

@cedric-cordenier cedric-cordenier added this pull request to the merge queue May 25, 2026
Merged via the queue into develop with commit 3d5e5c7 May 25, 2026
239 of 240 checks passed
@cedric-cordenier cedric-cordenier deleted the CRE-4405-add-request-tracing branch May 25, 2026 15:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

build-publish Build and Publish image to SDLC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants