fix library sync backfill O(N^2) hotspots on both sides of the protocol by jamiepine · Pull Request #3061 · spacedriveapp/spacedrive

jamiepine · 2026-04-18T05:05:59Z

Summary

Fix progressive slowdown during initial library sync by removing O(N^2) patterns on sender and receiver.
Addresses Library sync backfill slows progressively on large shared tables (O(N^2) sender work + per-record receiver event emission) #3058 (shared models, content_identity) and Entries (device-owned) sync backfill is slow: missing cursor index, per-record FK resolution, redundant per-entry closure rebuild #3060 (device-owned entries).
Observed slowdown from ~5k records/batch down to ~100/batch past 100k applied is gone — batch size holds steady at the configured backfill_batch_size.

Changes

Sender

PeerLog::get_since now takes limit: Option<usize> and pushes LIMIT ? into SQL. Callers fetch limit + 1 to derive has_more. Previously every SharedChangeRequest reloaded and parsed the entire remaining shared_changes log before truncating in memory.
Entry::query_for_sync and ContentIdentity::query_for_sync batch FK→UUID conversion across the whole batch via a new convert_fks_to_uuids_batch helper — one DB round trip per FK type instead of per record × FK.

Schema

New migration m20260417_000001_add_entries_sync_cursor_index adding idx_entries_indexed_at_uuid to back the cursor used by Entry::query_for_sync. Without it each batch request fell back to a full-table scan ordered by indexed_at.

Receiver

New infra::sync::backfill_context module exposes a task-local in_backfill scope. The backfill coordinator wraps phases 2 + 3 in it; per-record hooks check is_in_backfill() to skip work that's redundant with the post-backfill pass.
Entry::apply_state_change skips per-entry entry_closure rebuild during backfill. The existing post_backfill_rebuild pass already does a single bulk rebuild at the end.
ResourceManager::emit_batch_resource_events short-circuits during backfill (avoids the per-UUID fan-out queries in file.rs::route_from_dependency). The backfill coordinator emits one Event::Refresh after the post-backfill rebuild so the frontend drops stale cached views.
Entry FK resolution in backfill.rs splits mappings into self-referential (parent_id) and the rest. Non-self FKs resolve in one batch across the whole batch; only parent_id still runs per-entry so children can see just-inserted parents.

Test plan

Fresh pair + library sync with >100k content_identity records — batch throughput holds at configured backfill_batch_size end to end
Fresh pair + library sync with >100k entries — same
Incremental catch-up after offline period still updates entry_closure correctly (catch-up path is outside in_backfill)
UI views populate after initial backfill completes (the coarse Event::Refresh fires and invalidates caches)
Existing sync tests pass (cargo test -p sd-core --lib sync)

Closes #3058
Closes #3060

Note

Performance fix for library sync backfill: eliminates O(N²) query patterns on both sender and receiver sides. Sender-side: implements database-level pagination in PeerLog::get_since and batches FK→UUID resolution. Receiver-side: introduces backfill_context module to skip redundant per-record work during bulk sync, defers entry_closure rebuilds and resource event emissions to post-backfill pass. Result: batch throughput remains constant even with 100k+ records instead of degrading to 100 records/batch.
_{Written by Tembo for commit 17fdc0a.}

Addresses #3058 (shared models, content_identity) and #3060 (device-owned entries). The progressive slowdown on initial pair came from several independent hotspots that compound: Sender side - PeerLog::get_since now pushes LIMIT into SQL; callers fetch limit + 1 to derive has_more. Previously every SharedChangeRequest reloaded and parsed the entire remaining shared_changes log in memory before truncating, making sender work O(N * batches). - Entry and content_identity query_for_sync batch FK to UUID conversion across the whole batch via a new convert_fks_to_uuids_batch helper — one DB round trip per FK type instead of per record per FK. Schema - New index idx_entries_indexed_at_uuid backing the (indexed_at, uuid) cursor in Entry::query_for_sync. Without it every batch request fell back to a full-table scan. Receiver side - New task-local in_backfill scope wraps the backfill apply phases. Entry::apply_state_change uses it to skip per-entry entry_closure rebuild; the existing post_backfill_rebuild pass does a single bulk rebuild at the end. emit_batch_resource_events short-circuits during backfill and the coordinator emits one Event::Refresh after post- backfill rebuild so the UI invalidates cached views. - Entry FK resolution in backfill splits mappings into self-referential (parent_id) and the rest. Non-self FKs resolve in one batch across the whole batch; only parent_id still runs per-entry so children can see just-inserted parents. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor · 2026-04-18T05:06:06Z

PR Summary

Medium Risk
Touches core sync backfill send/receive paths and adds a new SQLite index; mistakes could cause missing updates, incorrect dependency resolution, or stale UI caches during/after initial sync.

Overview
Speeds up initial library sync/backfill by removing per-record DB work on both protocol sides.

On the sender, PeerLog::get_since now accepts an optional SQL LIMIT, and callers fetch limit + 1 to compute has_more without loading the full remaining log. Entry::query_for_sync and ContentIdentity::query_for_sync switch FK id → uuid conversion to a new batch helper (convert_fks_to_uuids_batch) to reduce DB round trips.

On the receiver, phases 2–3 of backfill run inside a new task-local in_backfill scope so hooks can short-circuit: Entry::apply_state_change skips per-entry entry_closure rebuilds and ResourceManager::emit_batch_resource_events suppresses per-record invalidations; the coordinator runs post-backfill rebuilds then emits a single Event::Refresh. Entry FK mapping during backfill is optimized to batch non-self FKs while still resolving parent_id per-entry.

Adds a migration creating idx_entries_indexed_at_uuid to support entries cursor pagination (ORDER BY indexed_at, uuid) and avoid full table scans during backfill.

^{Reviewed by Cursor Bugbot for commit 17fdc0a. Configure here.}

coderabbitai · 2026-04-18T05:07:50Z

Walkthrough

Adds a task-local backfill marker and uses it to: skip per-record resource event fan-out and per-entry closure rebuilds during backfill; batch FK→UUID resolution on sender and receiver; add SQL LIMIT to peer log pagination and a composite cursor index for entries; and adjust backfill orchestration to run inside the backfill scope.

Changes

Cohort / File(s)	Summary
Backfill Context Infrastructure `core/src/infra/sync/backfill_context.rs`, `core/src/infra/sync/mod.rs`	Introduce Tokio task-local `IN_BACKFILL`, `in_backfill(...)`, and `is_in_backfill()`; re-exported for sync code.
FK Resolution Batching `core/src/infra/sync/fk_mapper.rs`, `core/src/infra/sync/mod.rs`	Add `convert_fks_to_uuids_batch(...)` to convert integer FKs to UUIDs across a batch of JSON records; update public exports.
ContentIdentity & Entry Sync Query Changes `core/src/infra/db/entities/content_identity.rs`, `core/src/infra/db/entities/entry.rs`	`query_for_sync` refactored to stage rows, serialize JSON payloads, and perform per-FK-type batch FK→UUID conversion; drop/aggregate failed indices on batch failures; `entry::apply_state_change` skips per-entry closure rebuild when in backfill.
Database Cursor Index & Migration `core/src/infra/db/migration/m20260417_000001_add_entries_sync_cursor_index.rs`, `core/src/infra/db/migration/mod.rs`	Add migration creating/dropping composite index `idx_entries_indexed_at_uuid` on `entries(indexed_at, uuid)` and register migration.
Peer Log Pagination & Call Sites `core/src/infra/sync/peer_log.rs`, `core/src/service/sync/peer.rs`, `core/src/service/sync/protocol_handler.rs`	`PeerLog::get_since` gains `limit: Option<usize>` and pushes `LIMIT ?` into SQL; callers compute `fetch_limit = limit + 1`, reject `limit == 0`, and derive `has_more` and truncation from fetched rows.
Backfill Orchestration & Entry FK handling `core/src/service/sync/backfill.rs`	Backfill phases run inside `in_backfill(...)`. For `entry`, non-self FKs are batch-resolved across the batch and self-referential (`parent_id`) FKs handled per-entry; phase results returned from the scoped future; emits a coarse `Event::Refresh` after post-backfill rebuilds.
Resource Event Emission Optimization `core/src/domain/resource_manager.rs`	`emit_batch_resource_events` now early-returns when `is_in_backfill()` is true (emits only a trace message), avoiding per-record fan-out during backfill.

Sequence Diagram

sequenceDiagram
    autonumber
    actor BackfillLoop as Backfill Loop
    participant DB as Database
    participant FKBatch as FK Batch Resolver
    participant Applier as State Applier
    participant Closure as Closure Rebuilder
    participant EventBus as Event Bus

    BackfillLoop->>BackfillLoop: in_backfill(...) (set IN_BACKFILL)
    BackfillLoop->>DB: Fetch next batch (cursor + LIMIT)
    DB-->>BackfillLoop: Batch of rows (JSON payloads)

    BackfillLoop->>FKBatch: Batch-resolve non-self FKs
    FKBatch->>DB: Lookup UUIDs for collected local IDs
    DB-->>FKBatch: UUID mappings
    FKBatch-->>BackfillLoop: Resolved payloads

    BackfillLoop->>Applier: Apply batch (per-entry for self-FKs as needed)
    Applier->>DB: Insert/update rows
    DB-->>Applier: OK

    Applier->>Closure: check is_in_backfill()
    Closure-->>Applier: true (skip per-entry rebuild)

    Applier->>EventBus: emit_batch_resource_events()
    EventBus-->>Applier: trace-only (no per-record fan-out)

    BackfillLoop->>Closure: run_post_backfill_rebuild (bulk)
    Closure->>DB: Rebuild all closures
    DB-->>Closure: Done

    BackfillLoop->>EventBus: emit Event::Refresh
    BackfillLoop->>BackfillLoop: end in_backfill (clear IN_BACKFILL)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I hop in scoped-backfill, flag held tight,

Batch my FKs and limit SQL's flight,
Skip per-row noise, defer the heavy chore,
Rebuild once, then nudge the UI door,
Tiny traces first — then sync restored. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title accurately summarizes the main change: eliminating O(N²) performance hotspots on both sender and receiver sides during library sync backfill.
Description check	✅ Passed	The PR description provides comprehensive detail: addresses linked issues `#3058` and `#3060`, clearly explains sender/receiver/schema changes, cites expected improvements, and includes a concrete test plan.
Linked Issues check	✅ Passed	All primary coding objectives from `#3058` and `#3060` are met: SQL LIMIT in PeerLog, batch FK conversion in query_for_sync, composite index creation, backfill_context task-local, skipped closure rebuild during backfill, short-circuited resource event emission, and non-self FK batching.
Out of Scope Changes check	✅ Passed	All changes are tightly scoped to the linked issues: sender pagination, FK batching, index migration, backfill context infrastructure, and receiver-side optimizations. No extraneous modifications detected.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch spacedrive-sync-fixes

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

core/src/service/sync/backfill.rs (2)
212-247: ⚠️ Potential issue | 🟠 Major

Do not continue to Ready if deferred rebuilds fail.

Now that in_backfill suppresses per-record hooks, run_post_backfill_rebuilds is the only path that rebuilds derived tables for those records. The surrounding warn-and-continue path can leave entry_closure incomplete, then emit Refresh and transition the device to ready with stale hierarchy data.
Proposed change to make post-backfill rebuild mandatory
-		if let Err(e) =
-			crate::infra::sync::registry::run_post_backfill_rebuilds(self.peer_sync.db().clone())
-				.await
-		{
-			tracing::warn!("Post-backfill rebuild had errors: {}", e);
-			// Don't fail backfill, just warn
-		}
+		crate::infra::sync::registry::run_post_backfill_rebuilds(self.peer_sync.db().clone())
+			.await
+			.map_err(|e| anyhow::anyhow!("Post-backfill rebuild failed: {}", e))?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/service/sync/backfill.rs` around lines 212 - 247, The post-backfill
rebuild errors are currently only warned about, which lets the device continue
to Ready with incomplete derived tables; change the code so failures in
crate::infra::sync::registry::run_post_backfill_rebuilds(self.peer_sync.db().clone()).await
are propagated to abort the backfill: replace the if-let-warn block with a
fallible call (use ? or return Err(anyhow::Error::new(e))) so the function
returns the error, and only emit
self.peer_sync.event_bus().emit(crate::infra::event::Event::Refresh) after
successful rebuilds; ensure the in_backfill scope, run_post_backfill_rebuilds,
and event emission lines are updated accordingly.
627-690: ⚠️ Potential issue | 🟠 Major

Interleave self-FK resolution with applying parents.

This loop resolves every parent_uuid before any entry from pre_resolved is inserted by the later apply loop. Same-batch children therefore cannot “see” just-inserted parents; they get routed through the dependency tracker instead, which reintroduces extra per-record retry work and can strand deeper same-batch descendants if dependency resolution is not cascaded.

Consider applying entries immediately after successful self-FK resolution, or topologically processing parent chains so each parent is inserted before resolving its children.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/service/sync/backfill.rs` around lines 627 - 690, The loop currently
resolves self-referential FKs for all pre_resolved entries before any inserts,
causing same-batch children to miss newly-inserted parents; change it so that
after crate::infra::sync::batch_map_sync_json_to_local(vec![data.clone()],
self_ref_mappings.clone(), &db).await returns a non-empty result.succeeded you
immediately apply/insert those succeeded entries (i.e., invoke the same
per-entry "apply" logic used later in the file) instead of deferring to the
later apply loop or routing to
self.peer_sync.dependency_tracker().add_dependency; alternatively implement a
simple topological parent-first ordering of pre_resolved chains so parents
(identified by parent_id/parent_uuid in the JSON) are inserted before their
children, ensuring super::state::StateChangeMessage is only sent to the
dependency tracker for genuinely missing external parents.

🧹 Nitpick comments (1)

core/src/infra/db/migration/m20260417_000001_add_entries_sync_cursor_index.rs (1)
1-36: LGTM — index matches the cursor shape.

Composite (indexed_at, uuid) matches Entry::query_for_sync’s ORDER BY indexed_at ASC, uuid ASC + tie-breaker filter, so SQLite can serve pagination and the range predicate directly from the index. IF NOT EXISTS / IF EXISTS make up/down idempotent, which is appropriate here.

One optional consideration: on very large entries tables the initial CREATE INDEX can hold a write lock for a noticeable duration during migration. If that's a concern for upgrade UX, surface it in release notes — no code change needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/infra/db/migration/m20260417_000001_add_entries_sync_cursor_index.rs`
around lines 1 - 36, No code changes required: the migration in
Migration::up/Down creates/drops the composite index idx_entries_indexed_at_uuid
that matches Entry::query_for_sync's ORDER BY indexed_at ASC, uuid ASC and uses
IF NOT EXISTS/IF EXISTS for idempotence; proceed to approve/merge this
migration, and optionally add a release note warning that CREATE INDEX on very
large entries tables can hold a brief write lock during upgrades.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/src/infra/db/entities/content_identity.rs`:
- Around line 268-298: The batch FK conversion in query_for_sync (seen in
content_identity.rs: variables fk_mappings, payloads, sync_results) blindly
replaces per-record JSON with the converted payloads from
convert_fks_to_uuids_batch, which leaves unresolved rows unchanged (local FK
still present) and leads map_sync_json_to_local to write sender-local FK ints
into the DB; fix by detecting unresolved conversions after the batch call (for
each fk: if resolved_payload lacks the uuid_field but still has local_field) and
either skip those records when zipping back into sync_results or return an error
— i.e., only overwrite *json in the zip-back loop for records that have the
uuid_field present (and remove or mark others), and apply the same change to the
identical logic in entry.rs (or alternatively change convert_fks_to_uuids_batch
to return failed indices and handle them here).

In `@core/src/infra/db/entities/entry.rs`:
- Around line 337-359: After calling convert_fks_to_uuids_batch for each fk
mapping, verify that no payload still contains the source/local FK field
(fk.local_field) and fail the whole sender batch if any remain; specifically,
after the call that produces payloads (before the assignment into staged)
iterate the payloads' serde_json::Value items and check for the presence of the
fk.local_field key (or nested path used), and if found log a warning including
fk.local_field and return Err(sea_orm::DbErr::Custom(...)) to stop the send;
update the block around convert_fks_to_uuids_batch / payloads -> staged
assignment to perform this guard for each fk in fk_mappings.

In `@core/src/infra/sync/fk_mapper.rs`:
- Around line 111-182: convert_fks_to_uuids_batch currently leaves
fk.local_field intact when a target ID cannot be resolved, making unresolved
records indistinguishable from resolved ones; change its behavior so that for
any unresolved case (missing mapping or non-i64 local value) the function sets
fk.uuid_field to Value::Null and removes fk.local_field from the record (and
still emit the existing tracing::warn), so downstream code cannot accidentally
treat a stale integer as a resolved FK; locate and update the match arms in
convert_fks_to_uuids_batch (referencing FKMapping::local_field,
FKMapping::uuid_field_name, and batch_lookup_uuids_for_local_ids) to remove the
local field and set the uuid field to Null for both the None branch of
id_to_uuid.get and for the Some(v) but non-i64 case.

---

Outside diff comments:
In `@core/src/service/sync/backfill.rs`:
- Around line 212-247: The post-backfill rebuild errors are currently only
warned about, which lets the device continue to Ready with incomplete derived
tables; change the code so failures in
crate::infra::sync::registry::run_post_backfill_rebuilds(self.peer_sync.db().clone()).await
are propagated to abort the backfill: replace the if-let-warn block with a
fallible call (use ? or return Err(anyhow::Error::new(e))) so the function
returns the error, and only emit
self.peer_sync.event_bus().emit(crate::infra::event::Event::Refresh) after
successful rebuilds; ensure the in_backfill scope, run_post_backfill_rebuilds,
and event emission lines are updated accordingly.
- Around line 627-690: The loop currently resolves self-referential FKs for all
pre_resolved entries before any inserts, causing same-batch children to miss
newly-inserted parents; change it so that after
crate::infra::sync::batch_map_sync_json_to_local(vec![data.clone()],
self_ref_mappings.clone(), &db).await returns a non-empty result.succeeded you
immediately apply/insert those succeeded entries (i.e., invoke the same
per-entry "apply" logic used later in the file) instead of deferring to the
later apply loop or routing to
self.peer_sync.dependency_tracker().add_dependency; alternatively implement a
simple topological parent-first ordering of pre_resolved chains so parents
(identified by parent_id/parent_uuid in the JSON) are inserted before their
children, ensuring super::state::StateChangeMessage is only sent to the
dependency tracker for genuinely missing external parents.

---

Nitpick comments:
In
`@core/src/infra/db/migration/m20260417_000001_add_entries_sync_cursor_index.rs`:
- Around line 1-36: No code changes required: the migration in
Migration::up/Down creates/drops the composite index idx_entries_indexed_at_uuid
that matches Entry::query_for_sync's ORDER BY indexed_at ASC, uuid ASC and uses
IF NOT EXISTS/IF EXISTS for idempotence; proceed to approve/merge this
migration, and optionally add a release note warning that CREATE INDEX on very
large entries tables can hold a brief write lock during upgrades.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 8711006e-011f-489d-bf89-36bce7cd25f9

📥 Commits

Reviewing files that changed from the base of the PR and between 9945639 and 17fdc0a.

📒 Files selected for processing (12)

core/src/domain/resource_manager.rs
core/src/infra/db/entities/content_identity.rs
core/src/infra/db/entities/entry.rs
core/src/infra/db/migration/m20260417_000001_add_entries_sync_cursor_index.rs
core/src/infra/db/migration/mod.rs
core/src/infra/sync/backfill_context.rs
core/src/infra/sync/fk_mapper.rs
core/src/infra/sync/mod.rs
core/src/infra/sync/peer_log.rs
core/src/service/sync/backfill.rs
core/src/service/sync/peer.rs
core/src/service/sync/protocol_handler.rs

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit 17fdc0a. Configure here.}

convert_fks_to_uuids_batch now returns the set of record indices whose FK could not be resolved (missing target row or non-integer value). Callers in entry::query_for_sync and content_identity::query_for_sync drop those records from the outgoing sync batch instead of zipping the partially converted payload back. Previously the sender would ship a record with its local int field intact and no *_uuid field; the receiver's map_sync_json_to_local interpreted that as already-resolved and wrote the sender-local integer directly to the local DB, corrupting the FK. Also: reject limit == 0 on SharedChangeRequest / get_shared_changes up front (would have returned 0 rows with has_more = true and spun), and clamp the SQL LIMIT bind with i64::try_from(..).unwrap_or(i64::MAX) instead of a wrapping `as i64`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/src/infra/db/entities/content_identity.rs`:
- Around line 302-317: The current post-filtering of sync_results in
query_for_sync (the sync_results.zip(payloads).filter_map block) can shrink a
DB-limited page when unresolved FKs are dropped, causing empty/short pages and
no cursor to advance; instead either: 1) treat any presence of failed_indices as
a batch-level error and return an Err so the caller can advance/fix, or 2)
implement over-fetching/scanning: keep reading rows from the raw cursor beyond
the initial DB-limited page until you accumulate batch_size valid entries (using
failed_indices to skip invalid rows) or the cursor is exhausted. Update the
logic in query_for_sync (and the variables sync_results, payloads, batch_size,
failed_indices handling) to choose one approach and ensure the returned page
length is never silently reduced by dropping FK-failed rows.

In `@core/src/infra/db/entities/entry.rs`:
- Around line 326-327: The returned cursor uses
entry.indexed_at.unwrap_or(entry.modified_at) (assigned to timestamp and pushed
via staged.push((uuid, json, timestamp))) but the DB query filters/orders by
IndexedAt only, causing cursor mismatch for NULL indexed_at rows; fix by making
the DB query use the same COALESCE(indexed_at, modified_at) expression in its
SELECT/WHERE/ORDER BY so the returned timestamp matches the query cursor, or
alternatively change the code to skip rows with NULL entry.indexed_at (i.e.,
only use entry.indexed_at and do not fall back to modified_at) so the cursor
aligns with the IndexedAt predicate. Ensure the unique symbols entry.indexed_at,
entry.modified_at, timestamp, and staged.push are updated consistently.
- Around line 364-379: The current staged processing zips staged and payloads
and silently drops rows whose indices are in failed_indices, which can shrink a
SQL-limited page and cause the caller to stop scanning prematurely; instead,
when failed_indices is non-empty (the symbols to check are failed_indices,
staged, payloads, and the zip/enumerate/filter_map/collect block), either return
an explicit error for the batch (propagate Err) or implement over-fetching by
abandoning the filter_map approach and continuing to fetch/scan until the
returned page reaches the requested size or the raw cursor is exhausted; update
the block that currently does .zip(...).enumerate().filter_map(...) to detect
failed_indices first and handle it by returning Err or by looping to fetch
additional rows rather than dropping unresolved FK rows.

In `@core/src/infra/sync/fk_mapper.rs`:
- Around line 130-169: The None arm for local_field_value should mark the record
as failed instead of writing a null UUID; update the loop in fk_mapper.rs (the
block iterating records.iter_mut() that matches local_field_value) so that when
local_field_value is None you call failed.insert(idx) (and optionally
tracing::warn! with fk.local_field and idx) and continue, mirroring
convert_fk_to_uuid's behavior, rather than setting json[&uuid_field] =
Value::Null.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e72066e9-25a0-43c7-ace9-171e218fb4bd

📥 Commits

Reviewing files that changed from the base of the PR and between 17fdc0a and 96debbd.

📒 Files selected for processing (6)

core/src/infra/db/entities/content_identity.rs
core/src/infra/db/entities/entry.rs
core/src/infra/sync/fk_mapper.rs
core/src/infra/sync/peer_log.rs
core/src/service/sync/peer.rs
core/src/service/sync/protocol_handler.rs

🚧 Files skipped from review as they are similar to previous changes (2)

core/src/service/sync/protocol_handler.rs
core/src/infra/sync/peer_log.rs

…exed_at convert_fks_to_uuids_batch previously treated a missing local_field as null. convert_fk_to_uuid errors in that case — only an explicit JSON null is a legitimate null FK — so the batch helper now flags absent fields as failed to match the per-record contract. Entry::query_for_sync no longer falls back to modified_at when indexed_at is NULL. The cursor filter/order uses indexed_at exclusively, so a returned cursor derived from modified_at wouldn't match the next query's predicate. The indexed_at backfill migration populated existing rows; any NULL here is a data bug, logged and skipped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jamiepine · 2026-04-18T06:25:00Z

Pushed bd1c7130f addressing two of the four new CodeRabbit comments:

fk_mapper.rs absent-field handling — now treats None as failed, matching convert_fk_to_uuid's contract (only explicit JSON null is a legitimate null FK).
entry.rs cursor alignment — no longer falls back to modified_at when indexed_at is NULL. The query filters/orders on indexed_at only, so the returned cursor would otherwise misrepresent the scan position. The backfill migration populated existing rows; any NULL here is a data bug and is now logged + skipped.

Not addressing the two "SQL-limited page shrinkage" comments in this PR:

The concern is legitimate (tail-of-batch FK failures can stall the cursor) but the fix is out of scope here:

The pre-existing per-record convert_fk_to_uuid call sites had the same failure mode silently — they warned and continued the inner FK loop, then pushed the entry to sync_results anyway with the sender-local int intact. That silent FK-corruption path is what the earlier review already caught and what this PR fixes.
Proper resolution needs over-fetching in query_for_sync (or a parallel "max scanned cursor" return value), which is a bigger refactor than belongs in a perf PR.
In practice this triggers only when a record references a FK target that's missing from the sender's own DB — a data corruption bug, not normal operation.

Filed as a follow-up rather than blocking this PR. Open to doing the over-fetch refactor as a separate change if reviewers prefer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

core/src/service/sync/backfill.rs (1)
624-687: Optional: avoid unnecessary clones in the per-entry self-ref loop.

Minor: data is already owned by the for data in pre_resolved iteration, so vec![data.clone()] allocates twice. self_ref_mappings.clone() per iteration is also avoidable if you lift it once (e.g., pre-clone a reusable Vec or take &[FKMapping] if the signature allows). None of this is correctness-affecting — it's a tight inner loop over potentially 100k entries, so every copy counts in a perf PR.
♻️ Proposed tweak
-						if self_ref_mappings.is_empty() {
-							succeeded.extend(pre_resolved);
-						} else {
-							for data in pre_resolved {
-								let result = crate::infra::sync::batch_map_sync_json_to_local(
-									vec![data.clone()],
-									self_ref_mappings.clone(),
-									&db,
-								)
+						if self_ref_mappings.is_empty() {
+							succeeded.extend(pre_resolved);
+						} else {
+							for data in pre_resolved {
+								let result = crate::infra::sync::batch_map_sync_json_to_local(
+									vec![data],
+									self_ref_mappings.clone(),
+									&db,
+								)
Also note: the if !result.succeeded.is_empty() { ... } else if !result.failed.is_empty() { ... } structure implicitly assumes a single-record input (exactly one of the two is non-empty). That holds today, but a plain if/if would be more robust if someone later lifts this to multi-record batching.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/service/sync/backfill.rs` around lines 624 - 687, The per-entry loop
currently does unnecessary cloning: when iterating `for data in pre_resolved`
move `data` directly into the call to
`crate::infra::sync::batch_map_sync_json_to_local` instead of
`vec![data.clone()]` (use `vec![data]` or construct the one-item Vec by moving),
and hoist a single clone of `self_ref_mappings` (or pass a `&[FKMapping]` if
`batch_map_sync_json_to_local` can accept a slice) outside the loop so you don't
clone it each iteration; also change the `if !result.succeeded.is_empty() { ...
} else if !result.failed.is_empty() { ... }` into two independent `if` checks to
avoid assuming single-record input.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@core/src/service/sync/backfill.rs`:
- Around line 624-687: The per-entry loop currently does unnecessary cloning:
when iterating `for data in pre_resolved` move `data` directly into the call to
`crate::infra::sync::batch_map_sync_json_to_local` instead of
`vec![data.clone()]` (use `vec![data]` or construct the one-item Vec by moving),
and hoist a single clone of `self_ref_mappings` (or pass a `&[FKMapping]` if
`batch_map_sync_json_to_local` can accept a slice) outside the loop so you don't
clone it each iteration; also change the `if !result.succeeded.is_empty() { ...
} else if !result.failed.is_empty() { ... }` into two independent `if` checks to
avoid assuming single-record input.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 75b075f0-0f9d-4d2c-a766-a9ac93eed003

📥 Commits

Reviewing files that changed from the base of the PR and between 96debbd and 50c392e.

📒 Files selected for processing (4)

core/src/infra/db/entities/content_identity.rs
core/src/infra/db/entities/entry.rs
core/src/infra/sync/fk_mapper.rs
core/src/service/sync/backfill.rs

🚧 Files skipped from review as they are similar to previous changes (1)

core/src/infra/sync/fk_mapper.rs

tembo Bot reviewed Apr 18, 2026

View reviewed changes

Comment thread core/src/infra/sync/fk_mapper.rs Outdated

Comment thread core/src/infra/sync/peer_log.rs Outdated

Comment thread core/src/service/sync/protocol_handler.rs

coderabbitai Bot reviewed Apr 18, 2026

View reviewed changes

Comment thread core/src/infra/db/entities/content_identity.rs

Comment thread core/src/infra/db/entities/entry.rs Outdated

Comment thread core/src/infra/sync/fk_mapper.rs

cursor Bot reviewed Apr 18, 2026

View reviewed changes

Comment thread core/src/infra/sync/fk_mapper.rs

coderabbitai Bot reviewed Apr 18, 2026

View reviewed changes

Comment thread core/src/infra/db/entities/content_identity.rs

Comment thread core/src/infra/db/entities/entry.rs Outdated

Comment thread core/src/infra/db/entities/entry.rs

Comment thread core/src/infra/sync/fk_mapper.rs Outdated

cargo fmt

50c392e

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 18, 2026

View reviewed changes

jamiepine merged commit 75479df into main Apr 19, 2026
3 of 5 checks passed

jamiepine deleted the spacedrive-sync-fixes branch April 19, 2026 01:51

Uh oh!

Conversation

jamiepine commented Apr 18, 2026 • edited by tembo Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Test plan

Uh oh!

cursor Bot commented Apr 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Summary

Uh oh!

coderabbitai Bot commented Apr 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jamiepine commented Apr 18, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

jamiepine commented Apr 18, 2026 •

edited by tembo Bot

Loading

cursor Bot commented Apr 18, 2026 •

edited

Loading

coderabbitai Bot commented Apr 18, 2026 •

edited

Loading