This repository was archived by the owner on May 5, 2025. It is now read-only.

Allow unrolling subqueries in cleanup task #1156

Merged
Swatinem merged 1 commit into main from swatinem/unroll-delete-subqueries
Mar 20, 2025

Conversation

@Swatinem
Contributor

For some reason, our Postgres is unable to properly optimize delete queries on certain tables when they use a subquery in the WHERE clause.

It turns out, however, that unrolling that subquery within Python works just fine. So let's do just that.

As we are building all of the delete queries dynamically based on Django relations (plus some manually defined ones), we also have to use some queryset magic to reverse those automatically created subqueries.

So another method was added which can parse the `IN (...)` subquery out of a queryset, in order to execute it separately and then issue N+1 delete queries on the target table.
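
To illustrate the general idea of unrolling, here is a minimal sketch that is not taken from this PR. It uses Python's built-in `sqlite3` instead of Postgres and Django, and the `reports` and `uploads` tables as well as all function names are hypothetical. It assumes that "unrolling" means running the subquery on its own first and then issuing one simple delete per returned id.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE reports (id INTEGER PRIMARY KEY, repo_id INTEGER);
        CREATE TABLE uploads (id INTEGER PRIMARY KEY, report_id INTEGER);
        INSERT INTO reports (id, repo_id) VALUES (1, 1), (2, 1), (3, 2);
        INSERT INTO uploads (id, report_id) VALUES (1, 1), (2, 1), (3, 2), (4, 3);
        """
    )
    return conn


def delete_with_subquery(conn: sqlite3.Connection, repo_id: int) -> int:
    """Single statement: the database has to plan the IN (...) subquery itself."""
    cur = conn.execute(
        "DELETE FROM uploads WHERE report_id IN "
        "(SELECT id FROM reports WHERE repo_id = ?)",
        (repo_id,),
    )
    return cur.rowcount


def delete_unrolled(conn: sqlite3.Connection, repo_id: int) -> int:
    """Unrolled variant: run the subquery first, then one delete per id."""
    # One query to materialize the subquery result in Python.
    report_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM reports WHERE repo_id = ?", (repo_id,)
        )
    ]
    deleted = 0
    # N simple deletes, each with a plain equality predicate.
    for report_id in report_ids:
        cur = conn.execute(
            "DELETE FROM uploads WHERE report_id = ?", (report_id,)
        )
        deleted += cur.rowcount
    return deleted


def remaining_upload_ids(conn: sqlite3.Connection) -> list[int]:
    """Return the ids of the uploads that survived the delete."""
    return [row[0] for row in conn.execute("SELECT id FROM uploads ORDER BY id")]
```

Both functions remove the same rows. The unrolled variant trades extra round trips for deletes that are trivial to plan, which is presumably why it behaves better in a case like the one described above.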

@Swatinem requested a review from a team on March 19, 2025 16:08
@Swatinem self-assigned this on Mar 19, 2025
Comment on lines +76 to +81
# This test factory implicitly creates:
# - Test with a Repository and an Owner, Upload, CommitReport.
# - An Upload with a CommitReport, a Commit that has a different Owner and
#   one more Repository with yet another different Owner.
# And then also a Branch and a Pull via DB triggers because of the Commit.
remaining = TestInstanceFactory()
Contributor Author


lol, these damn recursive factories :-D

@seer-by-sentry
Contributor

✅ Sentry found no issues in your recent changes ✅

@codecov-notifications

codecov-notifications Bot commented Mar 19, 2025

Codecov Report

Attention: Patch coverage is 98.48485% with 1 line in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| services/cleanup/relations.py | 92.85% | 1 Missing ⚠️ |


@codecov

codecov Bot commented Mar 19, 2025

Codecov Report

Attention: Patch coverage is 98.48485% with 1 line in your changes missing coverage. Please review.

Project coverage is 97.76%. Comparing base (91fa89f) to head (0fec242).
Report is 6 commits behind head on main.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| services/cleanup/relations.py | 92.85% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1156      +/-   ##
==========================================
- Coverage   97.77%   97.76%   -0.01%     
==========================================
  Files         444      443       -1     
  Lines       36574    36529      -45     
==========================================
- Hits        35760    35714      -46     
- Misses        814      815       +1     
| Flag | Coverage Δ | |
|---|---|---|
| integration | 42.97% <21.21%> (+0.03%) | ⬆️ |
| unit | 90.48% <98.48%> (-0.02%) | ⬇️ |

Flags with carried forward coverage won't be shown.


Contributor

@giovanni-guidini left a comment


That's some very advanced query magic

@Swatinem force-pushed the swatinem/unroll-delete-subqueries branch from 78dc751 to 0fec242 on March 20, 2025 11:53
@Swatinem enabled auto-merge on March 20, 2025 11:53
@Swatinem added this pull request to the merge queue on Mar 20, 2025
Merged via the queue into main with commit 868862f Mar 20, 2025
22 of 29 checks passed
@Swatinem deleted the swatinem/unroll-delete-subqueries branch on March 20, 2025 12:03

2 participants