Fix/semantic search build time & semantic search flow after pagination refactor #1109
…y context
When using git URL context with subdirectory (:apps/semantic-search), the file path must be relative to that subdirectory, not repo root.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

The label should be app.kubernetes.io/name=semantic-search, not the full deployment name.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Pre-download BAAI/bge-base-en-v1.5 model during Docker build so container doesn't need to download 420MB on every startup
- Increase startupProbe to 10 minutes (from 5) for safety
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ables in Docker Compose
…erms and (2) make index save in disk -> not deleted by every deployment
- Restore deleted semantic-search module files (client.ts, controller.ts, requirements.txt)
- Re-add semantic search routes to express loader
- Restore ClassBrowser AI search UI components
- Update fuzzy-find imports to use @repo/common
- Add semantic-search to typedef validation exclusions
- Restore semantic search config in packages/common
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Change import from @repo/common to @repo/common/models
- Add explicit type annotation for termsWithClasses.map
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Re-queue failed index builds with exponential backoff (up to 10 rounds)
- Retry entire startup cycle when backend isn't ready yet
- Enable PVC for dev environments so indexes persist across pod restarts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Datapuller needs this to call /refresh on the semantic search service after updating class data.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

No longer needed since we use hostPath instead of PVC.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nfra merge from main
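One of the commits above re-queues failed index builds with exponential backoff for up to 10 rounds. A minimal sketch of that retry pattern, not the service's actual code: `build_with_backoff`, `base_delay`, and `max_delay` are hypothetical names and values chosen for illustration; only the 10-round limit comes from the commit message.

```python
import time
from typing import Callable


def build_with_backoff(
    build: Callable[[], bool],
    max_rounds: int = 10,       # "up to 10 rounds", per the commit message
    base_delay: float = 1.0,    # assumed starting delay in seconds
    max_delay: float = 60.0,    # assumed upper bound on a single wait
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call build() until it returns True or max_rounds attempts are used."""
    for attempt in range(max_rounds):
        if build():
            return True
        # Wait before the next attempt, doubling the delay each round.
        if attempt < max_rounds - 1:
            sleep(min(max_delay, base_delay * 2 ** attempt))
    return False
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.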
@vaclisinc If build times are down, this should be good to merge. I'll let @PineND confirm the catalog is working correctly. It will be good to merge now and monitor how heavily it hits our servers.
Linting Failed
Note: The status check will always pass.
Semantic Search: Redis Migration & Vector Search Improvements
Summary
These commits overhaul the semantic search service's storage and runtime model. The previous implementation stored FAISS vector indexes and course metadata as files on a PersistentVolumeClaim (PVC) and blocked HTTP requests while building indexes. This PR replaces that with Redis-backed vector search (via RedisVL), non-blocking background indexing, and several performance and reliability improvements made in follow-up commits.
Changes
1. Replace FAISS + PVC with Redis Stack vector search
Commits:
The previous architecture serialized FAISS index files and pickled metadata to a shared PVC (
The new architecture uses RedisVL on top of Redis Stack (which ships the
2. Non-blocking background indexing
Commit:
Previously,
Now, when an index is missing,
Corresponding client-side changes:
3. Embedding batching and search result cap
Commit:
4. Race condition fix, HNSW algorithm, and query embedding cache
Commit:
Three targeted improvements:
Race condition in
HNSW vector index algorithm: The RedisVL schema previously used
Query embedding LRU cache: Every call to
Architecture: Semantic Search Service
Overview
The semantic search service is a standalone FastAPI microservice that provides natural-language course search for Berkeleytime. It is deployed as a separate pod in the K8s cluster and communicates with the backend's GraphQL API to fetch course catalog data and with the shared Redis instance for index persistence.
Embedding model
The service uses
Index structure
For each (year, semester, optional subject filter) combination, the service maintains a RedisVL
The composite
Index names follow the pattern
An in-process Python dict (
Index lifecycle
HTTP API
Search uses cosine similarity.
Configuration (environment variables)
Dependencies
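The summary comment above notes that search uses cosine similarity. As a rough illustration of what that ranking means, here is a pure-Python sketch on toy vectors. It does not use RedisVL or the real embedding model, and `cosine_similarity` and `rank` are hypothetical helper names.

```python
import math
from typing import List, Mapping, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(
    query: Sequence[float],
    candidates: Mapping[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k candidate ids with their scores, most similar first."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Because cosine similarity ignores vector length, a candidate pointing in the same direction as the query scores 1.0 regardless of its magnitude.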
Overview
This PR resets and simplifies the semantic search implementation after several reversals during development. The previous PR history became difficult to review, so this PR consolidates the final working version into a cleaner state.
In addition, this PR significantly speeds up Docker builds and fixes the semantic search flow after the pagination refactor.
Changes
1. Remove model pre-download from Docker build
Previously the Docker image downloaded the BAAI/bge-base-en-v1.5 model (~400MB) during build time, which made every image build slow.
This PR removes that step from the Dockerfile.
Instead, the model downloads on the first runtime start and the HuggingFace cache is persisted through a mounted volume.
Cache path:
/root/.cache/huggingface
This allows the model to persist across:
Local dev behavior:
docker compose up --build -d
First run → downloads the model
Later runs → loads directly from:
./data/semantic-search/model-cache/
This follows the same persistence pattern as the FAISS index.
2. Install CPU version of PyTorch
By default, sentence-transformers installs the GPU build of PyTorch (~4.3GB).
Since the semantic search service only performs inference, this PR installs the CPU version of PyTorch (~1GB) to reduce image size.
3. Fix semantic search after pagination refactor
The previous implementation assumed the entire catalog was loaded on the frontend, allowing client-side filtering.
After the pagination refactor (25 courses per page), this approach no longer worked.
Semantic search is now fully handled by the backend.
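A minimal sketch of why ranking has to happen before pagination: the backend ranks the full result set, then returns one 25-item slice per request. The `paginate` helper and the `Page` shape are hypothetical and do not reflect the actual `catalogSearch` resolver.

```python
from typing import List, Sequence, TypedDict

PAGE_SIZE = 25  # page size mentioned in the PR description


class Page(TypedDict):
    items: List[str]
    total: int
    has_next: bool


def paginate(ranked_ids: Sequence[str], page: int, page_size: int = PAGE_SIZE) -> Page:
    """Slice an already-ranked list of course ids into a 1-indexed page."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    items = list(ranked_ids[start:start + page_size])
    return {
        "items": items,
        "total": len(ranked_ids),
        "has_next": start + page_size < len(ranked_ids),
    }
```

Client-side filtering would only see the 25 courses on the current page, so a relevant course on a later page could never surface. Ranking on the server avoids that.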
Architecture
Unchanged Behavior
catalogSearch GraphQL response shape
Results
Semantic-search image build time improved significantly.
Before: ~20 minutes
After: ~2m 42s
≈ 86.5% reduction in CI build time (about 7.4× faster)
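The percentage follows directly from the two timings quoted above. A short check of the arithmetic, using the approximate before and after values:

```python
before_s = 20 * 60       # ~20 minutes, in seconds
after_s = 2 * 60 + 42    # ~2m 42s, in seconds

reduction = 1 - after_s / before_s   # fraction of build time removed
speedup = before_s / after_s         # how many times faster the build is

print(f"{reduction:.1%} less build time, {speedup:.1f}x faster")
```

Since the "before" figure is approximate, both numbers are best read as estimates.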
Review
Would appreciate a review from: