[WIP] test restructure for improving sync_local performance #179

Draft
rkistner wants to merge 2 commits into main from optimize-sync-local-2

rkistner commented Apr 22, 2026

See details here: #178

This is an AI-assisted implementation of part 1 of the proposal. This is not a fully stable implementation yet - it's just enough to do some performance testing. In particular:

  1. Migrations and/or the migration tests still have some issues.
  2. powersync_trigger_resync() isn't updated yet.
  3. The ps_oplog_opid index isn't removed yet (it does not affect these performance tests).
  4. Deletes should be skipped for partial checkpoints, but are still processed with this change. Handling this adds some complexity, since we no longer have the distinction of "anything in ps_updated_rows should be skipped during partial updates".

Performance

For some background on the performance tests and previous optimizations, see #78.

Running the performance tests in dart/test/sync_local_performance_test.dart, before these changes:

00:05 +0: test filesystem operations with unique ids sync_local (full)                                   
2691ms Reads: 1289381 + 0 | Writes: 69023 + 0
00:08 +1: test filesystem operations with unique ids sync_local (partial)                                
1110ms Reads: 519383 + 0 | Writes: 27605 + 0

After:

00:05 +0: test filesystem operations with unique ids sync_local (full)                                   
2136ms Reads: 895294 + 0 | Writes: 69051 + 0
00:09 +1: test filesystem operations with unique ids sync_local (partial)                                
907ms Reads: 364909 + 0 | Writes: 34273 + 0

Test notes:

  1. These performance tests do not take into account the increased overhead from writes to ps_updated_rows during the initial download.
  2. The real running time is not that significant: the tests primarily focus on recording the number of filesystem reads and writes.

Note the increased writes in the sync_local (partial) case (27605 before, 34273 after). I assume this is because we now delete data from ps_updated_rows that we did not delete in that case before.
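
To put the before/after figures in relative terms, here is a minimal sketch that parses the result lines quoted above and prints the percentage change per metric. The helper names (`parse`, `pct_change`, `compare`) are hypothetical and not part of the test suite. The meaning of the second counter after each `+` is not stated here, so the sketch simply adds it to the first (it is 0 in every line).

```python
import re

# Result lines copied verbatim from the test output above.
BEFORE = {
    "full": "2691ms Reads: 1289381 + 0 | Writes: 69023 + 0",
    "partial": "1110ms Reads: 519383 + 0 | Writes: 27605 + 0",
}
AFTER = {
    "full": "2136ms Reads: 895294 + 0 | Writes: 69051 + 0",
    "partial": "907ms Reads: 364909 + 0 | Writes: 34273 + 0",
}

LINE = re.compile(r"(\d+)ms Reads: (\d+) \+ (\d+) \| Writes: (\d+) \+ (\d+)")


def parse(line):
    """Parse one result line into time, reads and writes.

    Assumption: the two counters around '+' can be summed.
    """
    ms, r1, r2, w1, w2 = map(int, LINE.fullmatch(line).groups())
    return {"ms": ms, "reads": r1 + r2, "writes": w1 + w2}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


def compare(before, after):
    """Percentage change of every metric between two result lines."""
    b, a = parse(before), parse(after)
    return {key: pct_change(b[key], a[key]) for key in b}


for case in BEFORE:
    changes = compare(BEFORE[case], AFTER[case])
    print(case, {k: f"{v:+.1f}%" for k, v in changes.items()})
```

With these figures, reads drop by roughly 30% in both cases, while writes in the partial case rise by roughly 24%.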

We can also compare the individual read queries directly, excluding the "write data" bits:

new query:     
1040ms Reads: 512974 + 0 | Writes: 0 + 0
previous query, but excluding `ps_updated_rows`, since those weren't there before:
1499ms Reads: 917631 + 0 | Writes: 0 + 0
experimental "full scan" query:
873ms Reads: 516964 + 0 | Writes: 0 + 0

Interesting notes here:

  1. The "full scan" query (part 2 of the proposal) is not significantly faster than the new query using ps_updated_rows, and uses more read operations in this case. This is likely due to the test treating the first 10k/500k operations as "already applied".
  2. The difference between the queries here is bigger than the difference in sync_local before and after this change. This is likely due to SQLite's page cache.
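
The three query results above can also be expressed relative to the previous query. This is only a small arithmetic sketch over the quoted figures. The dictionary layout and the function name `relative_to_baseline` are assumptions made for illustration.

```python
# (time in ms, filesystem reads) copied from the query comparison above.
QUERIES = {
    "new query": (1040, 512974),
    "previous query (excluding ps_updated_rows)": (1499, 917631),
    "full scan query": (873, 516964),
}

BASELINE = "previous query (excluding ps_updated_rows)"


def relative_to_baseline(queries, baseline):
    """Express each query's time and reads as a fraction of the baseline."""
    base_ms, base_reads = queries[baseline]
    return {
        name: {"ms": ms / base_ms, "reads": reads / base_reads}
        for name, (ms, reads) in queries.items()
    }


ratios = relative_to_baseline(QUERIES, BASELINE)
for name, r in ratios.items():
    print(f"{name}: {r['reads']:.2f}x reads, {r['ms']:.2f}x time")

# How many more reads the full scan query needs than the new query.
extra_reads = QUERIES["full scan query"][1] - QUERIES["new query"][1]
print("full scan reads minus new query reads:", extra_reads)
```

Both newer queries need a little over half the reads of the previous query, and the full scan query needs 3990 more reads than the new query in this run.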
