fix: avoid OOM on large vaults by sharing EXISTING_IDS by reference #687

Open
dmantula wants to merge 1 commit into ObsidianToAnki:master from
dmantula:fix/oom-on-large-vault-existing-ids-clone

@dmantula
Summary

FileManager.dataToFileData() deep-clones the entire data object once
per scanned file via JSON.parse(JSON.stringify(...)). data.EXISTING_IDS
holds every Anki note id in the collection, so on a vault with N
markdown files and M existing notes the scan allocates O(N*M) numbers
just from this single clone path.
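
The growth described above can be reproduced with a small sketch. The `PluginData` shape and the helper names `deepClone` and `idsCopiedDuringScan` are hypothetical stand-ins, not the plugin's actual code; only the JSON round-trip pattern and the `EXISTING_IDS` field come from the description.

```typescript
// Hypothetical, simplified shape of the plugin data; only EXISTING_IDS matters here.
interface PluginData {
  EXISTING_IDS: number[];
  settings: Record<string, string>;
}

// Deep clone via a JSON round-trip, the pattern the summary describes.
function deepClone<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Counts how many id values are copied when every scanned file gets its own clone.
function idsCopiedDuringScan(data: PluginData, fileCount: number): number {
  let copied = 0;
  for (let i = 0; i < fileCount; i++) {
    const perFile = deepClone(data); // a fresh copy of all M ids for each file
    copied += perFile.EXISTING_IDS.length;
  }
  return copied;
}

const sample: PluginData = {
  EXISTING_IDS: [1, 2, 3, 4, 5],
  settings: { deck: "Default" },
};

// 4 files and 5 ids: N * M = 20 numbers copied in total.
const copiedIds = idsCopiedDuringScan(sample, 4);
console.log(copiedIds);
```

Each clone owns a separate array, so no per-file copy shares memory with the original or with any other copy.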

On large vaults this pushes Chromium past its JS heap limit. The visible
symptom is the DevTools banner "Paused before potential out-of-memory
crash" with getDefaultDeck on the call stack, followed by Obsidian
freezing into a black window once execution resumes.

EXISTING_IDS is set once in setting-to-data.ts and only ever read via
.includes() (3 call sites in files-manager.ts / file.ts), so it is
safe to share the same array reference across all per-file clones.

Fix

Detach EXISTING_IDS from this.data immediately before
JSON.stringify, run the clone, then re-attach the original reference
to both this.data and the cloned result. The array is never serialized,
which saves CPU on top of the memory win.
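
A minimal sketch of the detach, clone, re-attach sequence follows. It is an illustration under assumptions, not the patch itself: the `PluginData` shape and the function name `cloneSharingIds` are hypothetical, the detach is modeled by swapping in an empty array to keep the types simple, and the `try`/`finally` is an addition so the original reference is restored even if serialization throws.

```typescript
// Hypothetical, simplified data shape; the real plugin object has more fields.
interface PluginData {
  EXISTING_IDS: number[];
  template: { deckName: string };
}

// Deep-clones everything except EXISTING_IDS, which is shared by reference.
function cloneSharingIds(data: PluginData): PluginData {
  const ids = data.EXISTING_IDS;
  // Detach: an empty placeholder stands in for the large array (assumption),
  // so JSON.stringify never walks the real ids.
  data.EXISTING_IDS = [];
  try {
    const clone = JSON.parse(JSON.stringify(data)) as PluginData;
    clone.EXISTING_IDS = ids; // re-attach the shared reference to the clone
    return clone;
  } finally {
    data.EXISTING_IDS = ids; // always restore the original object
  }
}

const original: PluginData = {
  EXISTING_IDS: [101, 102, 103],
  template: { deckName: "Default" },
};
const perFile = cloneSharingIds(original);

console.log(perFile.EXISTING_IDS === original.EXISTING_IDS); // same array object
console.log(perFile.template === original.template); // separate nested object
```

Sharing is only safe while every consumer treats the array as read-only, which matches the `.includes()`-only usage described in the summary.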

Validation

I'm running this fix in a real environment against an 8,130-file vault
with 38,978 Anki notes. Without the patch the sync hits the OOM break
every time; with the patch it completes successfully.
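
For a sense of scale, the figures above can be plugged into the N*M estimate. This is a rough back-of-the-envelope sketch: it assumes one clone per file, that every clone copies all ids, and 8 bytes per stored number. The real in-memory representation depends on the engine, and some clones may be garbage-collected before the scan ends, so treat the result as an upper-bound illustration.

```typescript
const fileCount = 8130; // files in the vault, from the validation run
const noteCount = 38978; // existing Anki notes, from the validation run

// One full copy of EXISTING_IDS per scanned file.
const clonedNumbers = fileCount * noteCount;

// Assumption: each id occupies 8 bytes (a 64-bit float).
const bytesPerNumber = 8;
const totalGiB = (clonedNumbers * bytesPerNumber) / (1024 * 1024 * 1024);

console.log(clonedNumbers); // 316891140
console.log(totalGiB.toFixed(2)); // "2.36"
```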

Related issues

This is the most likely root cause of several long-standing reports
where syncing a sufficiently large vault freezes or crashes Obsidian:
#432, #428, #448, #655, #354.

Test plan

  • npm run build succeeds
  • Patched plugin loads in Obsidian and completes a full vault scan
    that previously OOMed
  • Maintainer review

dataToFileData() deep-clones the entire data object once per scanned file
via JSON.parse(JSON.stringify(...)). data.EXISTING_IDS holds every Anki
note id in the collection, so on a vault with N markdown files and M
existing notes the scan allocates O(N*M) numbers just from this clone.
On vaults with several thousand files and tens of thousands of notes the
allocation pushes Chromium past its heap limit, which surfaces as the
"Paused before potential out-of-memory crash" debugger break and a
frozen Obsidian window.

EXISTING_IDS is set once in setting-to-data.ts and only read via
.includes() inside files-manager.ts and file.ts, so it's safe to share
the same array reference across all per-file clones. Detach it before
JSON.stringify and re-attach immediately after; this also avoids
serializing the array at all, saving CPU.

Refs ObsidianToAnki#432, ObsidianToAnki#428, ObsidianToAnki#448, ObsidianToAnki#655, ObsidianToAnki#354.