Skip to content

fix: corrects and stabilizes finagle instrumentation for http#17867

Merged
trask merged 25 commits into
open-telemetry:mainfrom
dmarkwat:fix/finagle-http
May 4, 2026
Merged

fix: corrects and stabilizes finagle instrumentation for http#17867
trask merged 25 commits into
open-telemetry:mainfrom
dmarkwat:fix/finagle-http

Conversation

@dmarkwat
Copy link
Copy Markdown
Contributor

@dmarkwat dmarkwat commented Apr 17, 2026

Instruments finagle-http/twitter-util with more rigor than the previous iteration. Chiefly:

  • more complete instrumentation of Promise state machine (with inspiration from Kamon)
  • instrumentation of the FuturePool edge case which otel executors instrumentation can be blind to
  • correct chaining of netty http messages to finagle requests across various async boundaries in finagle code paths
  • cleanup of previous iterations which were broken in practice but not under test (note: use of different "offload" pools addresses this)

A note on the adjustment/removal/addition of instrumentations: introduction of FuturePool as the "offloader" -- or really any use of FuturePool within the service -- should have been done originally to ensure thread boundaries were crossed and the implementation was sufficiently flexed. Since it wasn't, once a FuturePool was introduced to the previous iteration's tests, the tests broke -- both here in isolation (however thorough) and in practical code applications. The addition of different implementations based on the different threading models (unbounded pool, immediate pool, none defined) is to ensure all different modes are flexed from here on out (Parameterized tests don't work on BeforeAll methods, where the server construction occurs).

@dmarkwat dmarkwat requested a review from a team as a code owner April 17, 2026 05:50
@dmarkwat dmarkwat marked this pull request as draft April 17, 2026 06:11
@dmarkwat
Copy link
Copy Markdown
Contributor Author

Client tests need addressing. Server was the central focus in issue debugging and evaluation. Will mark as ready once complete.

@dmarkwat dmarkwat force-pushed the fix/finagle-http branch 6 times, most recently from d650adf to 6794a2a Compare April 20, 2026 03:45
@dmarkwat dmarkwat marked this pull request as ready for review April 20, 2026 03:47
@dmarkwat
Copy link
Copy Markdown
Contributor Author

Tests are passing, including indy modules. This is now ready for review.

@dmarkwat
Copy link
Copy Markdown
Contributor Author

Ran some IRL cases, and while I didn't uncover anything concrete, I did notice some edge cases related to scheduling and evaluating the Promise state machine's outcomes that could leave users open to polluted ThreadLocal Contexts. That and a gap in the H2 bridge has now been addressed.

Copy link
Copy Markdown
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dmarkwat! AI code review incoming...

The big-picture restructuring (per-Promise-K virtual field, per-FuturePool/respond/transform wrapping, netty→finagle bridging through Request.ctx(), and the offloaded vs. immediate test variants) all looks reasonable. A few correctness and javaagent-hygiene issues, mostly around null/NPE handling on hot paths and a couple of throw new IllegalArgumentException(...) calls in advice helper code that violate the "javaagent code must never throw" rule (they're suppressed inside advice, but the one inside channelRead is invoked by netty itself, not by advice, so it is not suppressed and would propagate to the application).

@dmarkwat
Copy link
Copy Markdown
Contributor Author

@trask , my pleasure -- it was a challenging project, happy to contribute whatever I can! I think I've addressed all the comments and rebased.

Comment thread instrumentation/finagle-http-23.11/javaagent/build.gradle.kts Outdated
@dmarkwat
Copy link
Copy Markdown
Contributor Author

@trask All comments should be addressed. LMK if there's anything you'd prefer I do with the ContextStorageCloser printing. I found it net helpful when sifting through the leaky storage issues I was experiencing.

@otelbot-java-instrumentation otelbot-java-instrumentation Bot added the test native This label can be applied to PRs to trigger them to run native tests label Apr 30, 2026
Copy link
Copy Markdown
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last AI-review round:

@dmarkwat
Copy link
Copy Markdown
Contributor Author

@trask addressed comments and rebased

Copy link
Copy Markdown
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment thread instrumentation/finagle-http-23.11/javaagent/build.gradle.kts Outdated
Comment thread instrumentation/finagle-http-23.11/javaagent/build.gradle.kts Outdated
@dmarkwat
Copy link
Copy Markdown
Contributor Author

dmarkwat commented May 1, 2026

@trask done 👍 !

@trask trask merged commit 8a4225b into open-telemetry:main May 4, 2026
94 checks passed
@otelbot
Copy link
Copy Markdown
Contributor

otelbot Bot commented May 4, 2026

Thank you for your contribution @dmarkwat! 🎉 We would like to hear from you about your experience contributing to OpenTelemetry by taking a few minutes to fill out this survey.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test native This label can be applied to PRs to trigger them to run native tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants