perf: drop unused ORDER BY in old-user-record pruning queries#2302
Conversation
|
The fetched records are checked in syncstorage-rs/tools/tokenserver/purge_old_records.py Lines 81 to 82 in 004d027 |
1e74c20 to
52bdd11
Compare
Thanks for pointing this out. The test change plus this made me think then that you could have cases where the loop would not be detected if the lists are technically the same, but the items inside them are in a different order. Moved it to check on UID comparison and to use a |
|
Looking at it more closely, similarly I'm also a little hesitant about this as pagination ( |
52bdd11 to
6c77b44
Compare
pjenvey
left a comment
There was a problem hiding this comment.
Looking at it more closely, similarly I'm also a little hesitant about this as pagination (
limit+offset) is generally discouraged without anorder byor something similar, but I'm not sure that matters here.
Per Taddes: "the way this script is doing this is iterative deletion, not read-only pagination. Get the rows, delete these rows, fetch next batch and repeat until empty. The processed rows are gone before the next query runs. So scan order doesn’t really matter"
On top of that we default to a hardcoded offset of 0, and when one is set it's a random offset anyway. So I don't see how potentially inconsistent ordering would ever come in to play here 👍
Description
The two
_GET_OLD_USER_RECORDS_FOR_SERVICEqueries in database.py are used bypurge_old_records.pysort their entire matching set by(replaced_at DESC, uid DESC)before applying LIMIT/OFFSET. In productionthe matching set is a lot of rows, so the planner must sort them
all to hand back 10, repeated on every iteration of the purge loop, which is very much suboptimal.
This removes that
ORDER BYfrom both queries.Also had to update relevant tests in
test_purge_old_recordssince the test enumerated service_requests by index and compared each fxa_kid positionally, relying on the dropped ORDER BY replaced_at DESC to make the order deterministic. The test's actual intent is "both replaced records were purged with valid HAWK-signed DELETErequests against the storage node" order isn't part of it. Comparing as a set makes that intent explicit and keeps the test stable regardless of scan order.
Testing
How should reviewers test?
Issue(s)
Closes STOR-117.