perf(worker): Optimize flake processing with bulk testrun fetching by sentry[bot] · Pull Request #866 · codecov/umbrella

sentry · 2026-04-20T10:53:02Z

Fixes WORKER-Y90. The issue was that: Iterating ReportSession objects and fetching Testrun objects individually causes N+1 queries to ta_timeseries database.

Refactored flake processing to fetch all relevant testruns for a commit's uploads in a single query.
Grouped fetched testruns by upload ID to avoid N+1 queries when iterating through uploads.
Modified process_single_upload to accept pre-fetched testruns.
Moved Testrun bulk update to occur once at the end of process_flakes_for_commit for all modified testruns.
Renamed get_testruns to get_all_testruns to reflect its new functionality of fetching for multiple uploads.

This fix was generated by Seer in Sentry, triggered automatically. 👁️ Run ID: 13565747

Not quite right? Click here to continue debugging with Seer.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

Note

Low Risk
Performance-focused refactor that changes query batching and when Testrun.outcome updates are persisted; low risk but could affect which testruns get updated if upload filtering/order assumptions are wrong.

Overview
Optimizes flake processing to eliminate N+1 DB queries by fetching all Testruns for a commit’s relevant uploads in one query and grouping them by upload_id before per-upload processing.

Refactors process_single_upload to accept pre-fetched testruns, renames get_testruns to get_all_testruns, and defers Testrun.objects.bulk_update(..., ["outcome"]) to run once per commit over the full fetched set.

^{Reviewed by Cursor Bugbot for commit 36d0691. Bugbot is set up for automated code reviews on this repo. Configure here.}

sentry · 2026-04-20T10:56:12Z

        update_fields=["end_date", "count", "recent_passes_count", "fail_count"],
    )

+    Testrun.objects.bulk_update(all_testruns, ["outcome"])


Bug: The bulk_update on Testrun objects uses timestamp as the key. Since multiple testruns can share the same timestamp, this can lead to updating incorrect rows.
_{Severity: HIGH}

Suggested Fix

To prevent incorrect updates, ensure a unique identifier is used for the bulk_update operation. Consider adding a composite unique constraint to the Testrun model, for example on timestamp and upload_id, to uniquely identify each row. If bulk_update cannot use a composite key, consider iterating and updating each object individually to ensure correctness.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: apps/worker/services/test_analytics/ta_process_flakes.py#L150 Potential issue: The `Testrun` model uses `timestamp` as its primary key, but a database migration drops the underlying primary key constraint to create a TimescaleDB hypertable. The code calls `Testrun.objects.bulk_update` to update the `outcome` field. Because multiple `Testrun` instances from different uploads can be created with identical timestamps, Django's `bulk_update` will generate an `UPDATE` statement that incorrectly modifies multiple rows that share the same `timestamp`. This leads to data corruption where the outcome of the wrong testrun is updated. This is confirmed by test data patterns where multiple testruns are created with `timezone.now()`.

_{Did we get this right? 👍 / 👎 to inform future reviews.}

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 36d0691. Configure here.}

cursor · 2026-04-20T10:59:01Z

 @sentry_sdk.trace
 def process_single_upload(
-    upload: ReportSession, curr_flakes: dict[bytes, Flake], repo_id: int
+    upload: ReportSession,


Unused upload parameter in process_single_upload

Low Severity

The upload parameter in process_single_upload is never used within the function body. It was previously needed to call get_testruns(upload), but now that testruns are passed directly as a parameter, upload serves no purpose. This dead parameter adds confusion about the function's interface and could mislead future maintainers into thinking the function depends on the upload object.

^{Reviewed by Cursor Bugbot for commit 36d0691. Configure here.}

sentry · 2026-04-20T11:00:45Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.25%. Comparing base (0ad8a0c) to head (36d0691).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #866   +/-   ##
=======================================
  Coverage   92.25%   92.25%           
=======================================
  Files        1307     1307           
  Lines       48017    48021    +4     
  Branches     1636     1636           
=======================================
+ Hits        44299    44303    +4     
  Misses       3407     3407           
  Partials      311      311

Flag	Coverage Δ
workerintegration	`58.53% <10.00%> (-0.02%)`	⬇️
workerunit	`90.39% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

codecov-notifications · 2026-04-20T11:01:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

perf(worker): Optimize flake processing with bulk testrun fetching

36d0691

sentry Bot commented Apr 20, 2026

View reviewed changes

cursor Bot reviewed Apr 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(worker): Optimize flake processing with bulk testrun fetching#866

perf(worker): Optimize flake processing with bulk testrun fetching#866
sentry[bot] wants to merge 1 commit intomainfrom
seer/perf/bulk-flake-testruns

sentry Bot commented Apr 20, 2026 •

edited by cursor Bot

Loading

Uh oh!

sentry Bot Apr 20, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

cursor Bot Apr 20, 2026

Uh oh!

sentry Bot commented Apr 20, 2026 •

edited

Loading

Uh oh!

codecov-notifications Bot commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

Conversation

sentry Bot commented Apr 20, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Legal Boilerplate

Uh oh!

sentry Bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot Apr 20, 2026

Choose a reason for hiding this comment

Unused upload parameter in process_single_upload

Uh oh!

sentry Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

codecov-notifications Bot commented Apr 20, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

sentry Bot commented Apr 20, 2026 •

edited by cursor Bot

Loading

Unused `upload` parameter in `process_single_upload`

sentry Bot commented Apr 20, 2026 •

edited

Loading