perf(worker): Optimize flake processing by batching testruns #874
sentry[bot] wants to merge 1 commit into main
@@ -79,38 +78,18 @@ def handle_failure(
    testrun.outcome = "flaky_fail"
Bug: Processing testruns globally by timestamp, instead of per-upload, can alter flake state calculations when testrun timestamps from different uploads overlap, leading to incorrect flake counts.
Severity: MEDIUM
Suggested Fix
To preserve the original processing logic while retaining the performance benefit of a single query, first fetch all testruns ordered by timestamp. Then, group the testruns by upload_id in memory. Finally, iterate through the uploads in a deterministic order (e.g., by upload_id) and process the testruns for each upload, ensuring the processing order remains consistent with the previous behavior.
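The suggested fix can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `Testrun` dataclass and the `group_by_upload` helper are hypothetical stand-ins, and the real model and query live in `ta_process_flakes.py`.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Testrun:
    # Hypothetical stand-in for the real Testrun model; only the fields
    # needed to show the ordering are included.
    upload_id: int
    timestamp: datetime
    outcome: str


def group_by_upload(testruns):
    """Group testruns by upload_id, keeping timestamp order inside each group."""
    grouped = defaultdict(list)
    # Sorting once up front mirrors an ORDER BY timestamp on the single query.
    for testrun in sorted(testruns, key=lambda t: t.timestamp):
        grouped[testrun.upload_id].append(testrun)
    # Visit uploads in a deterministic order (here: ascending upload_id).
    return [(upload_id, grouped[upload_id]) for upload_id in sorted(grouped)]


testruns = [
    Testrun(upload_id=2, timestamp=datetime(2024, 1, 1, 10, 0), outcome="pass"),
    Testrun(upload_id=1, timestamp=datetime(2024, 1, 1, 10, 5), outcome="failure"),
    Testrun(upload_id=1, timestamp=datetime(2024, 1, 1, 9, 55), outcome="pass"),
]

for upload_id, runs in group_by_upload(testruns):
    # Each upload's testruns would be handed to the flake logic here.
    print(upload_id, [r.outcome for r in runs])
```

Grouping in memory keeps the single query while restoring the upload-by-upload processing order. Whether ascending `upload_id` matches the order the previous code used is an assumption that would need checking against the original loop.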
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: apps/worker/services/test_analytics/ta_process_flakes.py#L80
Potential issue: The logic was changed to process testruns from all associated uploads
in a single batch, ordered globally by timestamp. Previously, testruns were processed
sequentially for each upload. This change in ordering can lead to incorrect flake
statistics. If a 'pass' testrun from a later upload has an earlier timestamp than a
'failure' testrun from an earlier upload, the pass may be processed first. This can
cause a flake to be prematurely marked as resolved (e.g., by reaching 30 passes) and
deleted, only for the subsequent failure to create a new, separate flake record with
reset counts, leading to inaccurate analytics.
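To make the ordering concern concrete, here is a toy replay of the scenario described above. The `Flake` dataclass, the `replay` function, and the state rules are simplified assumptions for illustration only; they are not the actual flake logic in the worker. The threshold is lowered from the 30 passes mentioned above to 3 to keep the example short.

```python
from dataclasses import dataclass

# Assumption: the comment mentions 30 passes; 3 keeps the example small.
PASS_THRESHOLD = 3


@dataclass
class Flake:
    # Hypothetical, simplified flake state.
    fail_count: int = 0
    recent_passes: int = 0


def replay(outcomes, flake):
    """Replay outcomes against the toy state; return (flake, flakes_created)."""
    created = 0
    for outcome in outcomes:
        if outcome == "failure":
            if flake is None:
                # No active flake: a failure starts a new record.
                flake = Flake()
                created += 1
            flake.fail_count += 1
            flake.recent_passes = 0
        elif outcome == "pass" and flake is not None:
            flake.recent_passes += 1
            if flake.recent_passes >= PASS_THRESHOLD:
                # Enough consecutive passes: the flake is resolved and dropped.
                flake = None
    return flake, created


# Earlier upload holds a failure; later upload holds a pass with an earlier timestamp.
per_upload, created_per_upload = replay(
    ["failure", "pass"], Flake(fail_count=5, recent_passes=2)
)
by_timestamp, created_by_timestamp = replay(
    ["pass", "failure"], Flake(fail_count=5, recent_passes=2)
)

print(per_upload, created_per_upload)      # existing flake kept, fail_count grows
print(by_timestamp, created_by_timestamp)  # old flake dropped, new one created
```

Under these assumed rules, the per-upload order keeps the existing flake with its history, while the global timestamp order resolves it first and then opens a fresh record with reset counts.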
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #874      +/-   ##
==========================================
- Coverage   92.25%   92.25%   -0.01%
==========================================
  Files        1307     1307
  Lines       48017    48011       -6
  Branches     1636     1636
==========================================
- Hits        44299    44293       -6
  Misses       3407     3407
  Partials      311      311
Fixes WORKER-Y97. The issue was that: The process_single_upload function executes an N+1 query for testruns, causing excessive database load and task timeouts.

- get_testruns now accepts a list of upload IDs, enabling batch retrieval of testruns.
- The process_single_upload function is removed and its logic is integrated into process_flakes_for_commit.
- process_flakes_for_commit fetches all relevant testruns across multiple uploads in a single query.
- Testruns are processed inside process_flakes_for_commit, eliminating the per-upload iteration.
- A single bulk_update is issued for all processed testruns.

This fix was generated by Seer in Sentry, triggered automatically. 👁️ Run ID: 13573144
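The difference between the N+1 pattern and the batched fetch can be illustrated with a small, self-contained sketch. It uses the standard-library `sqlite3` module rather than the project's ORM, and the `testrun` table, its columns, and the helpers `fetch_per_upload` and `fetch_batched` are hypothetical. The final `executemany` call only loosely stands in for `bulk_update`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testrun ("
    "id INTEGER PRIMARY KEY, upload_id INTEGER, timestamp INTEGER, outcome TEXT)"
)
conn.executemany(
    "INSERT INTO testrun (upload_id, timestamp, outcome) VALUES (?, ?, ?)",
    [(1, 100, "failure"), (1, 101, "pass"), (2, 99, "pass"), (3, 102, "failure")],
)

SELECT = "SELECT id, upload_id, timestamp, outcome FROM testrun"


def fetch_per_upload(conn, upload_ids):
    """N+1 style: one query per upload."""
    rows = []
    for upload_id in upload_ids:
        rows.extend(
            conn.execute(
                f"{SELECT} WHERE upload_id = ? ORDER BY timestamp", (upload_id,)
            ).fetchall()
        )
    return rows


def fetch_batched(conn, upload_ids):
    """Batched style: one query for all uploads, ordered globally by timestamp."""
    if not upload_ids:
        # An empty IN () list is invalid SQL, so short-circuit explicitly.
        return []
    placeholders = ", ".join("?" for _ in upload_ids)
    query = f"{SELECT} WHERE upload_id IN ({placeholders}) ORDER BY timestamp"
    return conn.execute(query, list(upload_ids)).fetchall()


upload_ids = [1, 2, 3]
batched = fetch_batched(conn, upload_ids)

# Collect all outcome changes first, then write them back in one call.
updates = [("flaky_fail", row[0]) for row in batched if row[3] == "failure"]
conn.executemany("UPDATE testrun SET outcome = ? WHERE id = ?", updates)
```

Both helpers return the same rows, but in a different order: the batched query interleaves uploads by timestamp. That reordering is what the review comment on this PR asks about.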
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.
Note
Medium Risk
Refactors flake processing to batch-fetch and bulk-update Testruns across uploads, which could subtly change processing order/coverage if query semantics differ (e.g., empty upload sets) but is otherwise a contained performance change.

Overview
Optimizes commit-level flake detection by removing per-upload testrun fetching and instead retrieving all recent Testruns for the commit’s relevant uploads in a single upload_id__in query.

Consolidates the per-upload processing loop into process_flakes_for_commit, logging upload IDs directly and performing one bulk_update of testrun outcomes after processing.

Reviewed by Cursor Bugbot for commit ef6cfa4. Bugbot is set up for automated code reviews on this repo.