fix: avoid OOM on large vaults by sharing EXISTING_IDS by reference #687
Open
dmantula wants to merge 1 commit into
dataToFileData() deep-clones the entire data object once per scanned file via JSON.parse(JSON.stringify(...)). data.EXISTING_IDS holds every Anki note id in the collection, so on a vault with N markdown files and M existing notes the scan allocates O(N*M) numbers just from this clone. On vaults with several thousand files and tens of thousands of notes the allocation pushes Chromium past its heap limit, which surfaces as the "Paused before potential out-of-memory crash" debugger break and a frozen Obsidian window.

EXISTING_IDS is set once in setting-to-data.ts and only read via .includes() inside files-manager.ts and file.ts, so it's safe to share the same array reference across all per-file clones. Detach it before JSON.stringify and re-attach immediately after; this also avoids serializing the array at all, saving CPU.

Refs ObsidianToAnki#432, ObsidianToAnki#428, ObsidianToAnki#448, ObsidianToAnki#655, ObsidianToAnki#354.
Summary
FileManager.dataToFileData() deep-clones the entire data object once
per scanned file via JSON.parse(JSON.stringify(...)).
data.EXISTING_IDS holds every Anki note id in the collection, so on a vault with N
markdown files and M existing notes the scan allocates O(N*M) numbers
just from this single clone path.
On large vaults this pushes Chromium past its JS heap limit. The visible
symptom is the DevTools banner "Paused before potential out-of-memory
crash" with getDefaultDeck on the call stack, followed by Obsidian
freezing into a black window once execution resumes.
EXISTING_IDS is set once in setting-to-data.ts and only ever read via
.includes() (3 call sites in files-manager.ts / file.ts), so it is
safe to share the same array reference across all per-file clones.
Fix
Detach EXISTING_IDS from this.data immediately before JSON.stringify,
run the clone, then re-attach the original reference to both
this.data and the cloned result. This also skips serializing
the array entirely, saving CPU on top of the memory win.
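The detach, clone, re-attach sequence described above can be sketched roughly as follows. This is a minimal illustration, not the actual diff: the SyncData interface, its template field, and the cloneSharingIds name are hypothetical, and swapping in a temporary empty array is just one assumed way to detach the ids.

```typescript
// Hypothetical stand-in for the plugin's data object; the real one has more fields.
interface SyncData {
  EXISTING_IDS: number[];
  template: Record<string, unknown>; // hypothetical field, deep-cloned per file
}

// Deep-clone everything except EXISTING_IDS, which is shared by reference.
function cloneSharingIds(data: SyncData): SyncData {
  const ids = data.EXISTING_IDS;
  // Detach: swap in an empty array so JSON.stringify never walks the large one.
  data.EXISTING_IDS = [];
  try {
    const result = JSON.parse(JSON.stringify(data)) as SyncData;
    // Re-attach the original reference to the clone.
    result.EXISTING_IDS = ids;
    return result;
  } finally {
    // Re-attach to the source object even if serialization throws.
    data.EXISTING_IDS = ids;
  }
}
```

Sharing the reference is only safe because, per the summary, the array is never mutated after it is first set; any future code that writes to EXISTING_IDS through one clone would be visible through all of them.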
Validation
I'm running this fix in a real environment against an 8,130-file vault
with 38,978 Anki notes. Without the patch the sync hits the OOM break
every time; with the patch it completes successfully.
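For a sense of scale, the O(N*M) estimate can be worked through with the vault figures above. The 8 bytes per id is an assumption (a double-precision number); the engine's actual in-memory representation and overhead may differ, so treat the byte figure as a rough lower-bound sketch. The function name is hypothetical.

```typescript
// Number of note ids copied across a full scan when every file gets its own clone.
function estimateClonedIds(files: number, notes: number): number {
  return files * notes;
}

const clonedIds = estimateClonedIds(8130, 38978); // 316,891,140 ids
const assumedBytesPerId = 8; // assumption: each id stored as a 64-bit double
const approxGiB = (clonedIds * assumedBytesPerId) / 2 ** 30; // a bit under 2.4 GiB

console.log(clonedIds, approxGiB.toFixed(2));
```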
Related issues
This is the most-likely root cause for several long-standing reports
where syncing a sufficiently large vault freezes or crashes Obsidian:
Test plan
npm run build succeeds
… that previously OOMed