Skip to content

perf: drop unused ORDER BY in old-user-record pruning queries#2302

Merged
taddes merged 3 commits into
masterfrom
perf/unneeded-order-by-STOR-117
May 14, 2026
Merged

perf: drop unused ORDER BY in old-user-record pruning queries#2302
taddes merged 3 commits into
masterfrom
perf/unneeded-order-by-STOR-117

Conversation

@taddes

@taddes taddes commented May 12, 2026

Copy link
Copy Markdown
Collaborator

Description

The two _GET_OLD_USER_RECORDS_FOR_SERVICE queries in database.py are used by
purge_old_records.py sort their entire matching set by
(replaced_at DESC, uid DESC) before applying LIMIT/OFFSET. In production
the matching set is a lot of rows, so the planner must sort them
all to hand back 10, repeated on every iteration of the purge loop, which is very much suboptimal.

This removes that ORDER BY from both queries.

Also had to update relevant tests in test_purge_old_recordssince the test enumerated service_requests by index and compared each fxa_kid positionally, relying on the dropped ORDER BY replaced_at DESC to make the order deterministic. The test's actual intent is "both replaced records were purged with valid HAWK-signed DELETE
requests against the storage node" order isn't part of it. Comparing as a set makes that intent explicit and keeps the test stable regardless of scan order.

Testing

How should reviewers test?

Issue(s)

Closes STOR-117.

@taddes taddes self-assigned this May 12, 2026
@taddes taddes requested review from chenba and pjenvey May 12, 2026 17:45
@chenba

chenba commented May 12, 2026

Copy link
Copy Markdown
Collaborator

The fetched records are checked in

if rows == previous_list:
raise Exception("Loop detected")
. Without sorting, that check works differently; the same set but in different order is not considered a "loop". Might be okay in practice though.

@taddes taddes force-pushed the perf/unneeded-order-by-STOR-117 branch from 1e74c20 to 52bdd11 Compare May 13, 2026 20:28
@taddes

taddes commented May 13, 2026

Copy link
Copy Markdown
Collaborator Author

The fetched records are checked in

if rows == previous_list:
raise Exception("Loop detected")

. Without sorting, that check works differently; the same set but in different order is not considered a "loop". Might be okay in practice though.

The fetched records are checked in

if rows == previous_list:
raise Exception("Loop detected")

. Without sorting, that check works differently; the same set but in different order is not considered a "loop". Might be okay in practice though.

Thanks for pointing this out. The test change plus this made me think then that you could have cases where the loop would not be detected if the lists are technically the same, but the items inside them are in a different order. Moved it to check on UID comparison and to use a set for comparison, like in the test change.

@pjenvey

pjenvey commented May 13, 2026

Copy link
Copy Markdown
Member

Looking at it more closely, similarly I'm also a little hesitant about this as pagination (limit+offset) is generally discouraged without an order by or something similar, but I'm not sure that matters here.

@taddes taddes force-pushed the perf/unneeded-order-by-STOR-117 branch from 52bdd11 to 6c77b44 Compare May 14, 2026 16:51

@pjenvey pjenvey left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at it more closely, similarly I'm also a little hesitant about this as pagination (limit+offset) is generally discouraged without an order by or something similar, but I'm not sure that matters here.

Per Taddes: "the way this script is doing this is iterative deletion, not read-only pagination. Get the rows, delete these rows, fetch next batch and repeat until empty. The processed rows are gone before the next query runs. So scan order doesn’t really matter"

On top of that we default to a hardcoded offset of 0, and when one is set it's a random offset anyway. So I don't see how potentially inconsistent ordering would ever come in to play here 👍

@taddes taddes merged commit 218f5a3 into master May 14, 2026
26 checks passed
@taddes taddes deleted the perf/unneeded-order-by-STOR-117 branch May 14, 2026 17:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants