
Fix OOM in search-sync + schedule daily cron#313

Merged
JamesReate merged 3 commits into main from fix/search-sync-oom on Apr 20, 2026


@JamesReate
Member

Summary

  • Swap the cron job in values-prod.yaml to run sync-device-definitions-search daily at 13:00 UTC (previously sync-r1-compatibilty, now deprecated and removed).
  • Fix a pagination bug in QueryDefinitionsCustom (OFFSET was set to the raw page index, not pageIndex * pageSize) and a mismatched termination threshold in the sync loop (it broke at <50 rows for a 500-row page size). Combined, every manufacturer with more than 50 definitions appended a near-duplicate 500-row page per iteration until the sliding window finally fell below 50 rows; the documents slice grew by roughly N*(N-50)/2 entries per make (N being the make's definition count) and OOM'd the job.
  • Refactor the sync command into a SearchIndexer interface plus two pure orchestration functions (buildManufacturerDocuments, runSearchSync). Uploads are flushed per manufacturer so steady-state memory is bounded by a single make, and per-document Upsert calls are replaced by batched Documents().Import(... upsert).
  • Add unit tests with gomock (no DB required) covering the year filter, field population, pagination termination, multi-page advance, error propagation, per-make flush, and skip-empty-make behaviour.
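
The corrected pagination contract from the second bullet can be illustrated with a minimal sketch. It is not the code from this PR: `fetchPage` and `collectAll` are hypothetical stand-ins that page over an in-memory slice instead of calling QueryDefinitionsCustom, and only the 500-row page size is taken from the description. The two points it shows are that OFFSET is `pageIndex * pageSize` and that the loop stops on the first page shorter than the page size.

```go
package main

import "fmt"

// pageSize mirrors the 500-row page size mentioned in the PR description.
const pageSize = 500

// offsetFor returns the row offset for a zero-based page index.
// The bug described above effectively used pageIndex on its own.
func offsetFor(pageIndex, size int) int {
	return pageIndex * size
}

// fetchPage is a hypothetical stand-in for the real query: it returns
// rows [offset, offset+size) of an in-memory table.
func fetchPage(rows []int, pageIndex, size int) []int {
	start := offsetFor(pageIndex, size)
	if start >= len(rows) {
		return nil
	}
	end := start + size
	if end > len(rows) {
		end = len(rows)
	}
	return rows[start:end]
}

// collectAll pages through rows and stops on the first short page,
// so the termination threshold always matches the page size.
func collectAll(rows []int, size int) []int {
	var out []int
	for page := 0; ; page++ {
		batch := fetchPage(rows, page, size)
		out = append(out, batch...)
		if len(batch) < size {
			break
		}
	}
	return out
}

func main() {
	rows := make([]int, 1234)
	for i := range rows {
		rows[i] = i
	}
	got := collectAll(rows, pageSize)
	fmt.Println("rows collected:", len(got))
}
```

With both fixes in place, each row is visited exactly once, so the collected slice has the same length as the source table.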

Test plan

  • go build ./... and go vet ./... clean
  • go test ./cmd/device-definitions-api/... — 9 new tests pass
  • Deploy to prod and confirm the daily job completes without OOM
  • Spot-check the device-definitions typesense index after the first run

🤖 Generated with Claude Code

JamesReate and others added 3 commits April 20, 2026 16:21
Replaces the r1-search-gsheet-sync cron entry with device-definitions-search-sync
(daily 13:00 UTC) and drops the sync-r1-compatibilty subcommand and its
registration. The runtime read path for r1_compatibility is untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
QueryDefinitionsCustom built its SQL with OFFSET set to the raw pageIndex
instead of pageIndex*500, so each page overlapped the previous by 499 rows.
The caller in sync-device-definitions-search also broke only when a page
returned fewer than 50 rows (vs. the 500-row page size), so for any
manufacturer with more than 50 definitions the loop kept advancing by one
row while appending a near-duplicate page each iteration. The resulting
documents slice grew by roughly N*(N-50)/2 per make and drove the job OOM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
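
To make the growth described in this commit message concrete, the pre-fix loop can be simulated without a database. This is a sketch under stated assumptions, not the original code: `buggyAppended` is a hypothetical helper, and it assumes the loop appended each page before checking the break condition, with OFFSET equal to the raw page index, a 500-row LIMIT, and a break threshold of 50 rows.

```go
package main

import "fmt"

// buggyAppended counts how many rows the pre-fix loop would append for a
// make with n definitions. Assumptions (hypothetical, for illustration):
// OFFSET = raw page index, LIMIT = pageSize, and the loop breaks only
// after appending a page with fewer than threshold rows.
func buggyAppended(n, pageSize, threshold int) int {
	total := 0
	for page := 0; ; page++ {
		remaining := n - page // rows at or beyond OFFSET = page
		if remaining < 0 {
			remaining = 0
		}
		got := remaining
		if got > pageSize {
			got = pageSize
		}
		total += got
		if got < threshold {
			break
		}
	}
	return total
}

func main() {
	for _, n := range []int{40, 400, 2000} {
		fmt.Printf("definitions=%d appended=%d\n", n, buggyAppended(n, 500, 50))
	}
}
```

Under these assumptions a make with 400 definitions appends 79,024 rows, in the same range as the N*(N-50)/2 = 70,000 estimate above, while a make with 40 definitions appends only its 40 rows because the first page is already below the threshold.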
…ility

Splits Execute() into a SearchIndexer interface (with a typesense-backed
implementation) and two pure orchestration functions: buildManufacturerDocuments
converts a single manufacturer's tableland definitions into search entries, and
runSearchSync walks manufacturers and upserts one make at a time. The
per-make flush bounds steady-state memory to one make's documents instead of
retaining the entire catalog until the end. Uploads also switch from
per-document Upsert calls to the batched Documents().Import upsert action.

Adds unit tests covering the year filter, field population, pagination
termination, error propagation, and per-make flush behaviour. Tests use the
existing gomock mocks for IdentityAPI and DeviceDefinitionOnChainService plus
a generated MockSearchIndexer; no database is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@JamesReate JamesReate merged commit c8fce31 into main Apr 20, 2026
2 of 3 checks passed
@JamesReate JamesReate deleted the fix/search-sync-oom branch April 20, 2026 20:37