fix: [SLES-2810] prune sorted_reparenting_info on context release to stop warning flood #1161
Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 1 commit into main on Apr 6, 2026
…stop warning flood

After on_platform_report removes a context from context_buffer, the corresponding ReparentingInfo entry was left in sorted_reparenting_info indefinitely. Every subsequent trace batch caused update_reparenting to iterate all stale entries and emit a WARN for each one, producing a flood of "Mismatched request info. Context not found for request_id" messages in CloudWatch.

Fix: retain only entries whose request_id matches a live context when releasing in on_platform_report. Add two regression tests covering the pruning behaviour and the update_reparenting read path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
duncanista
approved these changes
Apr 3, 2026
Contributor
Pull request overview
This pull request fixes SLES-2810, where stale ReparentingInfo entries were accumulating in the sorted_reparenting_info buffer and causing a flood of warning messages. When invocations completed and their contexts were removed, the corresponding reparenting entries were left behind, causing update_reparenting() to emit warnings for every subsequent trace batch.
Changes:
- Added pruning logic in on_platform_report() to remove stale reparenting entries when their contexts are released
- Added a helper function make_trace_sender() to reduce test code duplication
- Added two regression tests to verify the fix works correctly
Summary
- After on_platform_report removes a context from context_buffer, the corresponding ReparentingInfo entry was left in sorted_reparenting_info indefinitely (capacity 500). Every subsequent trace batch caused update_reparenting to iterate all stale entries and emit a WARN for each one, producing a flood of "Mismatched request info. Context not found for request_id" messages in CloudWatch.
- Add sorted_reparenting_info.retain(...) immediately after context_buffer.remove(request_id) in on_platform_report to prune the completed invocation's entry.
- Add two regression tests: one covering pruning in on_platform_report, one reproducing the exact production sequence (invoke → add_reparenting → report → trace batch) to confirm stale entries no longer appear in get_reparenting_info().

Fixes https://datadoghq.atlassian.net/browse/SLES-2810
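The pruning described in this summary can be sketched with a small standalone model. This is not the code from the PR: the field and method names follow the description, but the `Processor` struct, the container types (`HashSet`, `VecDeque`), the `prune` flag, and the `stale_after_reports` helper are assumptions made only for illustration. The `retain` predicate here follows the commit message wording (keep entries whose request_id matches a live context).

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for the real ReparentingInfo; only the field the fix needs.
#[derive(Debug, Clone)]
struct ReparentingInfo {
    request_id: String,
}

/// Simplified model of the processor state. Field names follow the PR text;
/// the container types are assumptions.
#[derive(Default)]
struct Processor {
    context_buffer: HashSet<String>,
    sorted_reparenting_info: VecDeque<ReparentingInfo>,
}

impl Processor {
    fn on_invoke(&mut self, request_id: &str) {
        self.context_buffer.insert(request_id.to_string());
    }

    fn add_reparenting(&mut self, request_id: &str) {
        self.sorted_reparenting_info.push_back(ReparentingInfo {
            request_id: request_id.to_string(),
        });
    }

    /// Releases the context. With `prune` set, also drops every reparenting
    /// entry that no longer has a live context (the behaviour the fix adds).
    fn on_platform_report(&mut self, request_id: &str, prune: bool) {
        self.context_buffer.remove(request_id);
        if prune {
            let live = &self.context_buffer;
            self.sorted_reparenting_info
                .retain(|info| live.contains(&info.request_id));
        }
    }

    /// Counts entries without a live context. In the scenario from the PR,
    /// each such entry would produce one WARN per trace batch.
    fn update_reparenting(&self) -> usize {
        self.sorted_reparenting_info
            .iter()
            .filter(|info| !self.context_buffer.contains(&info.request_id))
            .count()
    }
}

/// Runs invoke -> add_reparenting -> report for `invocations` requests and
/// returns how many stale entries the next trace batch would see.
fn stale_after_reports(prune: bool, invocations: usize) -> usize {
    let mut p = Processor::default();
    for i in 0..invocations {
        let id = format!("req-{}", i);
        p.on_invoke(&id);
        p.add_reparenting(&id);
        p.on_platform_report(&id, prune);
    }
    p.update_reparenting()
}
```

In this model, `stale_after_reports(false, 3)` leaves three stale entries behind, one per completed invocation, while `stale_after_reports(true, 3)` leaves none.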
Test plan
- cargo test test_reparenting_info_pruned_after_on_platform_report passes
- cargo test test_update_reparenting_ignores_completed_invocations passes

🤖 Generated with Claude Code