Skip to content

kafka/server: remove foreign_ptr from response_ptr#30042

Merged
travisdowns merged 2 commits intoredpanda-data:devfrom
travisdowns:td-remove-response-foreign-ptr
Apr 6, 2026
Merged

kafka/server: remove foreign_ptr from response_ptr#30042
travisdowns merged 2 commits intoredpanda-data:devfrom
travisdowns:td-remove-response-foreign-ptr

Conversation

@travisdowns
Copy link
Copy Markdown
Member

The response_ptr type was ss::foreign_ptr<std::unique_ptr<response>>,
but the foreign_ptr wrapper is unnecessary: the response is always
allocated and destroyed on the connection's shard. The wrapper was added
in 2020 (1313f09) when iobuf stopped forwarding fragment deleters
to the home core, but this is now handled at a higher level by
foreign_record_batch_reader. The respond() path in request_context
serializes into fresh local buffers, so the response iobuf never
contains foreign fragments.

This PR adds an oncore_auto helper (an oncore that debug-asserts
same-shard in its destructor), embeds it in response, and replaces
foreign_ptr<unique_ptr<response>> with a plain unique_ptr<response>.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.3.x
  • v25.2.x
  • v25.1.x

Release Notes

  • none

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR simplifies Kafka response lifetime management by removing the ss::foreign_ptr wrapper from response_ptr, based on the invariant that response objects are allocated and destroyed on the connection’s shard. It adds a debug-only cross-shard destruction check via a new oncore_auto helper embedded in kafka::response.

Changes:

  • Introduce oncore_auto (destructor verifies same-shard in debug builds) in base/oncore.h.
  • Embed oncore_auto into kafka::response and change response_ptr to std::unique_ptr<response>.
  • Update related includes/commentary to reflect the new ownership model (e.g. scattered message lifetime note).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
src/v/kafka/server/response.h Embed oncore_auto in response and switch response_ptr to std::unique_ptr.
src/v/kafka/server/protocol_utils.cc Clarify that the response must outlive scattered_message due to fragment references.
src/v/kafka/server/connection_context.h Remove local response_ptr alias and include response.h for the canonical definition.
src/v/base/oncore.h Add oncore_auto helper to assert same-shard destruction in debug builds.

@travisdowns travisdowns requested a review from ballard26 April 1, 2026 19:42
@vbotbuildovich
Copy link
Copy Markdown
Collaborator

vbotbuildovich commented Apr 1, 2026

CI test results

test results on build#82648
test_class test_method test_arguments test_kind job_url test_status passed reason test_history
DatalakeClusterRestoreTest test_restore_partition_spec {"catalog_type": "rest_hadoop", "cloud_storage_type": 1} integration https://buildkite.com/redpanda/redpanda/builds/82648#019d4aab-4f1b-4df3-94bd-130f12f4cfa5 FLAKY 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0000, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=DatalakeClusterRestoreTest&test_method=test_restore_partition_spec
test results on build#82682
test_class test_method test_arguments test_kind job_url test_status passed reason test_history
ControllerLogLimitMirrorMakerTests test_mirror_maker_with_limits null integration https://buildkite.com/redpanda/redpanda/builds/82682#019d4e90-48df-4de5-b927-35d292adc57b FLAKY 10/11 Test PASSES after retries.No significant increase in flaky rate(baseline=0.0312, p0=1.0000, reject_threshold=0.0100. adj_baseline=0.1000, p1=0.3487, trust_threshold=0.5000) https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ControllerLogLimitMirrorMakerTests&test_method=test_mirror_maker_with_limits

StephanDollberg
StephanDollberg previously approved these changes Apr 2, 2026
pgellert
pgellert previously approved these changes Apr 2, 2026
Copy link
Copy Markdown
Contributor

@pgellert pgellert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just one question about the no_unique_address

Comment thread src/v/base/oncore.h Outdated
// NOLINTNEXTLINE(cppcoreguidelines-special-member-functions,hicpp-special-member-functions)
struct oncore_auto final {
~oncore_auto() noexcept { oncore_debug_verify(_oncore); }
[[no_unique_address]] oncore _oncore;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For [[no_unique_address]] to work, are we relying here on the compiler optimizing away the unused oncore::_owner_shard field in release builds? Is that a safe assumption, or would we need to wrap the field in an ifndef NDEBUG macro as well?

Copy link
Copy Markdown
Member Author

@travisdowns travisdowns Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiler will ~never optimize away unused fields in an object. So this annotation has no purpose here, as the object it is applied to is never empty. So I will remove it.

I think a better overall pattern for objects like this where you sometimes want to do it in debug, or sometimes in release, is to have the object itself (oncore) do the hard work, and have two variants oncore and oncore_debug, where the former is "always on" and the latter is on only in DEBUG but, both offer the same API regardless of compile mode (but "debug" version function is empty and object is empty as well in release mode), then the consumers can use them with little friction: no conditional compilation, only need [[no_unique_address]] on members.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it, it wasn't doing anything useful here since oncore is never empty.

An oncore that debug-asserts same-shard in its destructor. Embed as a
member to catch cross-shard destruction in debug builds.
@travisdowns travisdowns dismissed stale reviews from pgellert and StephanDollberg via 64a6b6a April 2, 2026 13:42
@travisdowns travisdowns force-pushed the td-remove-response-foreign-ptr branch from 71c8567 to 64a6b6a Compare April 2, 2026 13:42
Comment thread src/v/kafka/server/response.h Outdated
std::optional<tagged_fields> _tags;
iobuf _buf;
protocol::encoder _writer;
[[no_unique_address]] oncore_auto _oncore;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: there's one more of this

Suggested change
[[no_unique_address]] oncore_auto _oncore;
oncore_auto _oncore;

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry about that — removed. Should have caught both in the first pass.

The response object is always created and destroyed on the connection's
shard. The foreign_ptr wrapper was a historical artifact or oversight,
since in general a response may contain data from many shards so a
single-response level foreign pointer doesn't work. Today cross shard
deletion is handled by things _inside_ the response, e.g.,
by foreign_record_batch_readers nested somewhere inside the response.

The respond() path in request_context always serializes into fresh local
buffers, so the response iobuf never contains foreign fragments.

Replace foreign_ptr with a plain unique_ptr and add an oncore_auto
member to response to assert same-shard destruction in debug builds.
@travisdowns travisdowns force-pushed the td-remove-response-foreign-ptr branch from 64a6b6a to f3a9454 Compare April 2, 2026 14:01
@travisdowns travisdowns merged commit 175d8d9 into redpanda-data:dev Apr 6, 2026
19 checks passed
@travisdowns travisdowns deleted the td-remove-response-foreign-ptr branch April 22, 2026 19:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants