Fix/semantic search build time & semantic search flow after pagination refactor #1109

Merged
vaclisinc merged 57 commits into main from fix/semantic-search-infra
Apr 4, 2026

@vaclisinc
Contributor

Overview

This PR resets and simplifies the semantic search implementation after several reversals during development. The previous PR history became difficult to review, so this PR consolidates the final working version into a cleaner state.

In addition, this PR significantly speeds up Docker builds and fixes the semantic search flow after the pagination refactor.


Changes

1. Remove model pre-download from Docker build

Previously the Docker image downloaded the BAAI/bge-base-en-v1.5 model (~400MB) during build time, which made every image build slow.

This PR removes that step from the Dockerfile.

Instead, the model downloads on the first runtime start and the HuggingFace cache is persisted through a mounted volume.

Cache path:

/root/.cache/huggingface

This allows the model to persist across:

  • container rebuilds
  • pod restarts
  • local development runs

Local dev behavior:

docker compose up --build -d

First run → downloads the model
Later runs → loads directly from:

./data/semantic-search/model-cache/

This follows the same persistence pattern as the FAISS index.


2. Install CPU version of PyTorch

By default, installing sentence-transformers pulls in the CUDA-enabled (GPU) build of PyTorch (~4.3GB).

Since the semantic search service only performs inference, this PR installs the CPU version of PyTorch (~1GB) to reduce image size.


3. Fix semantic search after pagination refactor

The previous implementation assumed the entire catalog was loaded on the frontend, allowing client-side filtering.

After the pagination refactor (25 courses per page), this approach no longer worked.

Semantic search is now fully handled by the backend.


Architecture

User clicks "Search with AI →"
        │
        ▼
GraphQL catalogSearch(search, semanticSearch: true)
        │
        ▼
Backend calls Python FAISS /search
        │
        ▼
FAISS returns ranked (subject, courseNumber)
        │
        ▼
Backend performs MongoDB query using those identifiers
        │
        ├─ applies user filters
        └─ retrieves course documents
        │
        ▼
Backend reorders results to match FAISS ranking
        │
        ▼
Returns CatalogResult { results, totalCount }
        │
        ▼
Frontend renders paginated results normally
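
The reordering and pagination steps at the end of this flow can be sketched as below. This is only an illustration in Python, not the backend's TypeScript code: the function names, the `subject`/`courseNumber` document keys, and the zero-based `page` parameter are assumptions made for the example. The page size of 25 comes from the pagination refactor described above.

```python
from typing import Iterable

CourseKey = tuple[str, str]  # (subject, courseNumber), as returned by the ranking step


def reorder_by_ranking(ranked_keys: list[CourseKey], documents: Iterable[dict]) -> list[dict]:
    """Return documents in the order given by ranked_keys.

    Keys with no matching document (for example, courses removed by a user
    filter in the database query) are skipped.
    """
    by_key = {(doc["subject"], doc["courseNumber"]): doc for doc in documents}
    return [by_key[key] for key in ranked_keys if key in by_key]


def paginate(results: list[dict], page: int, page_size: int = 25) -> dict:
    """Slice one page and report the total, mirroring { results, totalCount }."""
    start = page * page_size
    return {"results": results[start:start + page_size], "totalCount": len(results)}


# Hypothetical data: the ranking has three courses, the filtered query kept two.
ranked = [("CS", "61A"), ("MATH", "54"), ("CS", "70")]
docs = [
    {"subject": "CS", "courseNumber": "70"},
    {"subject": "CS", "courseNumber": "61A"},
]
ordered = reorder_by_ranking(ranked, docs)
first_page = paginate(ordered, page=0)
```

Building a dictionary keyed by `(subject, courseNumber)` keeps the reordering linear in the number of results, which matters because the database returns documents in its own order, not the ranking order.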

Unchanged Behavior

  • Normal text search (Fuse.js)
  • Debounced search
  • Filters and sorting
  • Infinite scroll pagination
  • catalogSearch GraphQL response shape

Results

Semantic-search image build time improved significantly.

Before: ~20 minutes
After: ~2m 42s

That is an 86.5% reduction in CI build time (about 7.4× faster).
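
The percentage follows directly from the two durations quoted above. A quick check, treating "~20 minutes" as exactly 1200 seconds (an assumption, since the original figure is approximate):

```python
def build_time_reduction(before_s: float, after_s: float) -> tuple[float, float]:
    """Return (percent reduction, speed-up factor) for two build durations."""
    reduction = (before_s - after_s) / before_s * 100
    speedup = before_s / after_s
    return reduction, speedup


before = 20 * 60      # ~20 minutes, taken as exactly 1200 s
after = 2 * 60 + 42   # 2m 42s = 162 s
reduction, speedup = build_time_reduction(before, after)
print(f"{reduction:.1f}% less build time, {speedup:.1f}x faster")
# prints "86.5% less build time, 7.4x faster"
```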


Review

Would appreciate a review from:

vaclisinc and others added 30 commits January 15, 2026 17:20
…y context

When using git URL context with subdirectory (:apps/semantic-search),
the file path must be relative to that subdirectory, not repo root.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The label should be app.kubernetes.io/name=semantic-search, not the full deployment name

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Pre-download BAAI/bge-base-en-v1.5 model during Docker build
  so container doesn't need to download 420MB on every startup
- Increase startupProbe to 10 minutes (from 5) for safety

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…erms and (2) make index save in disk -> not deleted by every deployment
- Restore deleted semantic-search module files (client.ts, controller.ts, requirements.txt)
- Re-add semantic search routes to express loader
- Restore ClassBrowser AI search UI components
- Update fuzzy-find imports to use @repo/common
- Add semantic-search to typedef validation exclusions
- Restore semantic search config in packages/common

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change import from @repo/common to @repo/common/models
- Add explicit type annotation for termsWithClasses.map

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Re-queue failed index builds with exponential backoff (up to 10 rounds)
- Retry entire startup cycle when backend isn't ready yet
- Enable PVC for dev environments so indexes persist across pod restarts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Datapuller needs this to call /refresh on the semantic search service
after updating class data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
No longer needed since we use hostPath instead of PVC.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ARtheboss
Contributor

@vaclisinc If build times are down, should be good to merge. I'll let @PineND confirm catalog is working correctly. Will be good to merge now and monitor how heavy it's hitting our servers.

@github-actions

github-actions Bot commented Apr 1, 2026

Linting Failed

Note: The status check will always pass. Run npm run lint -- --continue to see the full output locally.

> lint
> turbo run lint --continue --output-logs=errors-only


• Packages in scope: @repo/common, @repo/eslint-config, @repo/gql-typedefs, @repo/shared, @repo/sis-api, @repo/storybook, @repo/theme, @repo/typescript-config, ag-frontend, api-sandbox, backend, datapuller, frontend, staff-frontend
• Running lint in 14 packages
• Remote caching disabled
frontend:lint
cache miss, executing 9c6d787fe8985647

> lint
> eslint src/


/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/BubbleCard/index.tsx
  106:10  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Capacity/index.tsx
  9:14  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Chart/ChartContext.tsx
  7:17  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Chart/index.tsx
   2:10  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
   7:10  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  10:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  11:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  12:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  13:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/ClassBrowser/Header/index.tsx
  81:32  error  Class or exported property 'semanticError' not found  css-modules/no-undef-class

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/ScheduleSummary/index.tsx
  11:14  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

✖ 11 problems (1 error, 10 warnings)

npm error Lifecycle script `lint` failed with error:
npm error code 1
npm error path /home/runner/work/berkeleytime/berkeleytime/apps/frontend
npm error workspace frontend
npm error location /home/runner/work/berkeleytime/berkeleytime/apps/frontend
npm error command failed
npm error command sh -c eslint src/
[WARN] command finished with error, but continuing...
::error::frontend#lint: command (/home/runner/work/berkeleytime/berkeleytime/apps/frontend) /opt/hostedtoolcache/node/22.12.0/x64/bin/npm run lint exited (1)

 Tasks:    6 successful, 7 total
Cached:    0 cached, 7 total
  Time:    11.44s 
Failed:    frontend#lint

 ERROR  run failed: command  exited (1)

@github-actions

github-actions Bot commented Apr 1, 2026

Linting Failed

Note: The status check will always pass. Run npm run lint -- --continue to see the full output locally.

> lint
> turbo run lint --continue --output-logs=errors-only


• Packages in scope: @repo/common, @repo/eslint-config, @repo/gql-typedefs, @repo/shared, @repo/sis-api, @repo/storybook, @repo/theme, @repo/typescript-config, ag-frontend, api-sandbox, backend, datapuller, frontend, staff-frontend
• Running lint in 14 packages
• Remote caching disabled
backend:lint
cache miss, executing 687b84dfaed5ce0a

> lint
> eslint src/


/home/runner/work/berkeleytime/berkeleytime/apps/backend/src/modules/semantic-search/client.ts
  50:13  error  Empty block statement  no-empty

✖ 1 problem (1 error, 0 warnings)

npm error Lifecycle script `lint` failed with error:
npm error code 1
npm error path /home/runner/work/berkeleytime/berkeleytime/apps/backend
npm error workspace backend
npm error location /home/runner/work/berkeleytime/berkeleytime/apps/backend
npm error command failed
npm error command sh -c eslint src/
[WARN] command finished with error, but continuing...
::error::backend#lint: command (/home/runner/work/berkeleytime/berkeleytime/apps/backend) /opt/hostedtoolcache/node/22.12.0/x64/bin/npm run lint exited (1)

 Tasks:    6 successful, 7 total
Cached:    0 cached, 7 total
  Time:    11.429s 
Failed:    backend#lint

 ERROR  run failed: command  exited (1)

@github-actions

github-actions Bot commented Apr 1, 2026

Linting Failed

Note: The status check will always pass. Run npm run lint -- --continue to see the full output locally.

> lint
> turbo run lint --continue --output-logs=errors-only


• Packages in scope: @repo/common, @repo/eslint-config, @repo/gql-typedefs, @repo/shared, @repo/sis-api, @repo/storybook, @repo/theme, @repo/typescript-config, ag-frontend, api-sandbox, backend, datapuller, frontend, staff-frontend
• Running lint in 14 packages
• Remote caching disabled
backend:lint
cache miss, executing 687b84dfaed5ce0a

> lint
> eslint src/


/home/runner/work/berkeleytime/berkeleytime/apps/backend/src/modules/semantic-search/client.ts
  50:13  error  Empty block statement  no-empty

✖ 1 problem (1 error, 0 warnings)

npm error Lifecycle script `lint` failed with error:
npm error code 1
npm error path /home/runner/work/berkeleytime/berkeleytime/apps/backend
npm error workspace backend
npm error location /home/runner/work/berkeleytime/berkeleytime/apps/backend
npm error command failed
npm error command sh -c eslint src/
[WARN] command finished with error, but continuing...
frontend:lint
cache miss, executing 48dea3c991d81b37

> lint
> eslint src/


/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/BubbleCard/index.tsx
  106:10  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Capacity/index.tsx
  9:14  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Chart/ChartContext.tsx
  7:17  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/Chart/index.tsx
   2:10  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
   7:10  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  10:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  11:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  12:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components
  13:3   warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/ClassBrowser/browser.ts
  1:10  error  'ClassGradingBasis' is defined but never used  @typescript-eslint/no-unused-vars

/home/runner/work/berkeleytime/berkeleytime/apps/frontend/src/components/ScheduleSummary/index.tsx
  11:14  warning  Fast refresh only works when a file only exports components. Use a new file to share constants or functions between components  react-refresh/only-export-components

✖ 11 problems (1 error, 10 warnings)

npm error Lifecycle script `lint` failed with error:
npm error code 1
npm error path /home/runner/work/berkeleytime/berkeleytime/apps/frontend
npm error workspace frontend
npm error location /home/runner/work/berkeleytime/berkeleytime/apps/frontend
npm error command failed
npm error command sh -c eslint src/
[WARN] command finished with error, but continuing...
::error::backend#lint: command (/home/runner/work/berkeleytime/berkeleytime/apps/backend) /opt/hostedtoolcache/node/22.12.0/x64/bin/npm run lint exited (1)
::error::frontend#lint: command (/home/runner/work/berkeleytime/berkeleytime/apps/frontend) /opt/hostedtoolcache/node/22.12.0/x64/bin/npm run lint exited (1)

 Tasks:    5 successful, 7 total
Cached:    0 cached, 7 total
  Time:    11.932s 
Failed:    backend#lint, frontend#lint

 ERROR  run failed: command  exited (1)

@github-actions

github-actions Bot commented Apr 2, 2026

Linting Failed

Note: The status check will always pass. Run npm run lint -- --continue to see the full output locally.

> lint
> turbo run lint --continue --output-logs=errors-only


• Packages in scope: @repo/BtLL, @repo/common, @repo/eslint-config, @repo/gql-typedefs, @repo/shared, @repo/sis-api, @repo/storybook, @repo/theme, @repo/typescript-config, ag-frontend, api-sandbox, backend, datapuller, frontend, staff-frontend
• Running lint in 15 packages
• Remote caching disabled
backend:lint
cache miss, executing 96747c547a079aa4

> lint
> eslint src/


/home/runner/work/berkeleytime/berkeleytime/apps/backend/src/modules/semantic-search/client.ts
  50:13  error  Empty block statement  no-empty

✖ 1 problem (1 error, 0 warnings)

npm error Lifecycle script `lint` failed with error:
npm error code 1
npm error path /home/runner/work/berkeleytime/berkeleytime/apps/backend
npm error workspace backend
npm error location /home/runner/work/berkeleytime/berkeleytime/apps/backend
npm error command failed
npm error command sh -c eslint src/
[WARN] command finished with error, but continuing...
::error::backend#lint: command (/home/runner/work/berkeleytime/berkeleytime/apps/backend) /opt/hostedtoolcache/node/22.12.0/x64/bin/npm run lint exited (1)

 Tasks:    6 successful, 7 total
Cached:    0 cached, 7 total
  Time:    11.391s 
Failed:    backend#lint

 ERROR  run failed: command  exited (1)

@johngerving
Contributor

Semantic Search: Redis Migration & Vector Search Improvements

Summary

These commits overhaul the semantic search service's storage and runtime model. The previous implementation stored FAISS vector indexes and course metadata as files on a PersistentVolumeClaim (PVC) and blocked HTTP requests while building indexes. This PR replaces that with Redis-backed vector search (via RedisVL), non-blocking background indexing, and several performance and reliability improvements made in follow-up commits.


Changes

1. Replace FAISS + PVC with Redis Stack vector search

Commits: bbb443294, 5bada68d7

The previous architecture serialized FAISS index files and pickled metadata to a shared PVC (/app/indexes). This had several problems: PVC provisioning added infrastructure complexity, pickle files are fragile across Python version upgrades, and the FAISS dependency added ~4 GB to the image before the CPU-only pin reduced it.

The new architecture uses RedisVL on top of Redis Stack (which ships the redisearch module for vector search). Each term's courses are stored as Redis hashes under a namespaced index (semantic_search:{year}:{semester}:{subjects}), and a small JSON metadata blob (semantic_search:meta:...) is stored alongside it. On startup the service loads any pre-existing indexes from Redis rather than disk.
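
To make the key layout concrete, here is a small sketch of how such names and float32 payloads might be produced. The helper names, the `-` separator, the sorted subject order, and the `all` placeholder for "no subject filter" are assumptions for illustration; the service's real naming rules are not shown in this PR. The `struct`-based packer is a standard-library stand-in for what `array_to_buffer` is used for here, namely turning a vector into raw float32 bytes.

```python
import struct
from typing import Optional, Sequence


def index_name(year: int, semester: str, subjects: Optional[Sequence[str]] = None) -> str:
    """Build a namespaced index name of the form semantic_search:{year}:{semester}:{subjects}."""
    # Assumption: subjects are sorted and joined so the same filter always maps to one key.
    subject_part = "-".join(sorted(subjects)) if subjects else "all"
    return f"semantic_search:{year}:{semester}:{subject_part}"


def to_float32_bytes(vector: Sequence[float]) -> bytes:
    """Pack a vector as little-endian float32, 4 bytes per dimension."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_float32_bytes(buffer: bytes) -> list[float]:
    """Inverse of to_float32_bytes."""
    return list(struct.unpack(f"<{len(buffer) // 4}f", buffer))


name = index_name(2026, "Spring", ["MATH", "CS"])
payload = to_float32_bytes([0.0] * 768)  # a 768-dim vector occupies 3072 bytes
```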

  • Removed: faiss-cpu, pickle, pathlib.Path index storage, PVC volume mounts
  • Added: redisvl, redis, array_to_buffer for float32 serialization
  • docker-compose.yml: upgraded Redis image from redis:7.2.4 to redis/redis-stack-server:7.2.0-v14 (required for the Search module); removed ./data/semantic-search/indexes and model-cache volume mounts; added REDIS_URI env var
  • Helm chart: added REDIS_URI to the semantic-search ConfigMap

2. Non-blocking background indexing

Commit: 180fbb986

Previously, _get_or_build_index would call self.refresh(...) synchronously if the index wasn't available, meaning the first HTTP request for an uncached term would block for minutes while the index was built. Under K8s, this caused request timeouts and confused clients.

Now, when an index is missing, _get_or_build_index kicks off refresh_async (which starts a background thread) and immediately raises a RuntimeError. The FastAPI layer catches this and returns a 503 with a human-readable detail message. Clients are expected to retry.
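
A minimal sketch of this control flow, using only the standard library. The class, the `handle_search` function (standing in for the FastAPI route), the term string, and the fake build step are all hypothetical; the locking that prevents duplicate builds is deliberately left out to keep the example focused on the 503 path.

```python
import threading
import time
from typing import Optional


class SearchService:
    def __init__(self) -> None:
        self._indexes: dict[str, list[str]] = {}
        self._build_thread: Optional[threading.Thread] = None

    def _build(self, term: str) -> None:
        time.sleep(0.05)  # stands in for minutes of embedding work
        self._indexes[term] = ["CS 61A", "CS 70"]

    def refresh_async(self, term: str) -> None:
        # Simplified: no guard against starting two builds at once.
        self._build_thread = threading.Thread(target=self._build, args=(term,), daemon=True)
        self._build_thread.start()

    def _get_or_build_index(self, term: str) -> list[str]:
        index = self._indexes.get(term)
        if index is None:
            self.refresh_async(term)  # start building in the background
            raise RuntimeError(f"Index for {term} is being built; retry shortly")
        return index


def handle_search(service: SearchService, term: str) -> tuple[int, dict]:
    """Stand-in for the HTTP layer: map 'index not ready' to a 503 with a detail message."""
    try:
        return 200, {"results": service._get_or_build_index(term)}
    except RuntimeError as exc:
        return 503, {"detail": str(exc)}


service = SearchService()
first_status, first_body = handle_search(service, "2026:Spring")  # index missing
service._build_thread.join()                                      # wait for the build
second_status, second_body = handle_search(service, "2026:Spring")
```

The first call returns immediately with a 503 and a `detail` string instead of blocking; once the background thread finishes, the retry succeeds.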

Corresponding client-side changes:

  • Backend TypeScript client (client.ts): parses the JSON body on non-2xx responses and forwards the detail field as the error message, so the 503 reason reaches the frontend
  • Frontend hook (useCatalogQuery.ts): added optional chaining on error.graphQLErrors?.[0] to avoid a TypeError when the error has no GraphQL errors array
  • Frontend CSS (Header.module.scss): added .semanticError style for displaying the error message in the search header

3. Embedding batching and search result cap

Commit: 45f7168c8

  • model.encode(...) now passes batch_size=128, which improves GPU/CPU throughput during index builds by encoding courses in larger chunks than the sentence-transformers default of 32
  • search_k (the number of candidates returned from Redis) was reduced from 500 to 50. The old value was left over from the FAISS implementation where over-fetching before threshold filtering was needed; with Redis the threshold filter can be applied directly on the distance score, so 50 candidates is sufficient
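
Both ideas in this section can be illustrated without the model or Redis. The sketch below shows how a list of course texts splits into batches of 128, and how a candidate list capped at 50 can be filtered directly on distance. The function names, the sample courses, and the 0.5 distance threshold are assumptions for the example, not values taken from the service.

```python
from typing import Iterator, Sequence


def batched(items: Sequence[str], batch_size: int = 128) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def filter_candidates(
    candidates: Sequence[tuple[str, float]],
    max_distance: float,
    search_k: int = 50,
) -> list[str]:
    """Keep the search_k nearest candidates, then drop any beyond max_distance.

    Each candidate is (course, distance), where a smaller distance means a
    closer match.
    """
    nearest = sorted(candidates, key=lambda c: c[1])[:search_k]
    return [course for course, distance in nearest if distance <= max_distance]


batch_sizes = [len(batch) for batch in batched(["course text"] * 300)]
matches = filter_candidates(
    [("CS 61A", 0.12), ("ART 8", 0.91), ("CS 70", 0.30)],
    max_distance=0.5,  # hypothetical threshold
)
```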

4. Race condition fix, HNSW algorithm, and query embedding cache

Commit: a7287e5fe

Three targeted improvements:

Race condition in refresh_async: The check-then-act on self._building and self._build_thread.is_alive() was unprotected, meaning two concurrent requests could both pass the guard and launch duplicate background builds. The entire guard-and-launch block is now wrapped in with self._lock.
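
The guarded check-then-act pattern can be sketched as follows. The attribute names `_lock`, `_building`, and `_build_thread` follow the description above; the `Indexer` class, the injected `build_fn`, and the boolean return value are assumptions added so the example is self-contained.

```python
import threading
from typing import Callable, Optional


class Indexer:
    def __init__(self, build_fn: Callable[[], None]) -> None:
        self._lock = threading.Lock()
        self._building = False
        self._build_thread: Optional[threading.Thread] = None
        self._build_fn = build_fn

    def _run(self) -> None:
        try:
            self._build_fn()
        finally:
            with self._lock:
                self._building = False  # allow a later refresh

    def refresh_async(self) -> bool:
        """Start a background build unless one is already running.

        Returns True only for the caller that actually launched the build.
        """
        with self._lock:  # check and launch happen atomically
            thread_alive = self._build_thread is not None and self._build_thread.is_alive()
            if self._building or thread_alive:
                return False
            self._building = True
            self._build_thread = threading.Thread(target=self._run, daemon=True)
            self._build_thread.start()
            return True


# Eight concurrent callers; the build blocks until released so they all overlap.
release = threading.Event()
build_calls: list[int] = []


def slow_build() -> None:
    build_calls.append(1)
    release.wait(timeout=2)


indexer = Indexer(slow_build)
launched: list[bool] = []
callers = [threading.Thread(target=lambda: launched.append(indexer.refresh_async())) for _ in range(8)]
for caller in callers:
    caller.start()
for caller in callers:
    caller.join()
release.set()
indexer._build_thread.join()
```

Because the flag is set inside the same critical section as the check, only one of the eight callers sees "not building" and launches a thread.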

HNSW vector index algorithm: The RedisVL schema previously used "algorithm": "flat", which performs brute-force KNN (O(N) per query). Switched to "algorithm": "hnsw" (Hierarchical Navigable Small World), which gives approximate nearest-neighbor search in sub-linear time with minimal recall loss. Existing indexes in Redis will need a one-time /refresh to be rebuilt with the new algorithm.

Query embedding LRU cache: Every call to /search previously ran model.encode() even for repeated queries (e.g., from autocomplete or retries). A @lru_cache(maxsize=256) on _encode_query caches the resulting vector by the prefixed query string, skipping inference on cache hits.


Architecture: Semantic Search Service

Overview

The semantic search service is a standalone FastAPI microservice that provides natural-language course search for Berkeleytime. It is deployed as a separate pod in the K8s cluster and communicates with the backend's GraphQL API to fetch course catalog data and with the shared Redis instance for index persistence.

User
 └─► Frontend (React)
      └─► Backend
           ├─► Semantic Search Service  ◄──► Redis Stack

Embedding model

The service uses BAAI/bge-base-en-v1.5 via sentence-transformers. This is a 109M-parameter retrieval-optimized model that produces 768-dimensional float32 vectors. The model is no longer pre-downloaded in the Dockerfile: it downloads on the first runtime start, and the HuggingFace cache (/root/.cache/huggingface) is persisted through a mounted volume, so later starts load it from disk. BGE models require an instruction prefix for query vectors: "Represent this sentence for searching relevant passages: ".

Index structure

For each (year, semester, optional subject filter) combination, the service maintains a RedisVL SearchIndex with the following schema:

| Field | Type | Notes |
| --- | --- | --- |
| subject | tag | e.g. "CS" |
| course_number | tag | e.g. "61A" |
| title | text | course title |
| description | text | catalog description |
| course_text | text | composite text used during indexing |
| embedding | vector (HNSW, float32, cosine) | 768-dim BGE embedding of course_text |

The composite course_text field fed to the embedding model is:

SUBJECT: {subject} NUMBER: {number}
TITLE: {title}
DESCRIPTION: {description}
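
For illustration, the template above could be filled in as follows. The function name `build_course_text` and the handling of a missing description (treated as an empty string) are assumptions.

```python
def build_course_text(subject, number, title, description=None):
    """Build the composite text that is fed to the embedding model."""
    # Assumption: a missing description becomes an empty string.
    return (
        f"SUBJECT: {subject} NUMBER: {number}\n"
        f"TITLE: {title}\n"
        f"DESCRIPTION: {description or ''}"
    )
```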

Index names follow the pattern semantic_search:{year}:{semester}:{subjects|"all"}, e.g. semantic_search:2026:Spring:all. A companion metadata key (semantic_search:meta:...) stores last_refreshed, size, year, semester, and allowed_subjects as JSON.
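
A sketch of the naming rule and of a schema dictionary in the shape RedisVL accepts (for example via `SearchIndex.from_dict`). The helper names, the key prefix, the `hash` storage type, and the way multiple subjects are joined (sorted, comma-separated) are assumptions; the field list and vector attributes follow the table above.

```python
def index_name(year, semester, allowed_subjects=None):
    # Assumption: subject filters are sorted and comma-joined.
    subjects = ",".join(sorted(allowed_subjects)) if allowed_subjects else "all"
    return f"semantic_search:{year}:{semester}:{subjects}"


def build_schema(name):
    """Return a RedisVL-style schema dict for one term's index."""
    return {
        # Assumption: the index name doubles as the key prefix.
        "index": {"name": name, "prefix": name, "storage_type": "hash"},
        "fields": [
            {"name": "subject", "type": "tag"},
            {"name": "course_number", "type": "tag"},
            {"name": "title", "type": "text"},
            {"name": "description", "type": "text"},
            {"name": "course_text", "type": "text"},
            {
                "name": "embedding",
                "type": "vector",
                "attrs": {
                    "algorithm": "hnsw",
                    "dims": 768,
                    "distance_metric": "cosine",
                    "datatype": "float32",
                },
            },
        ],
    }
```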

An in-process Python dict (_indices) acts as a hot cache in front of Redis, keyed by {year}:{semester}:{subjects|"__all__"}.

Index lifecycle

  1. Startup (build_index in main.py): A daemon thread calls engine.build_startup_indexes(), which queries the backend's terms GraphQL endpoint to discover available semesters. The 4 most recent terms are targeted; any already in Redis are loaded immediately, the rest are queued. process_build_queue then builds each queued term sequentially. The startup thread retries the entire cycle up to 10 times (with increasing delays up to 5 minutes) to handle the case where the backend pod starts after the semantic-search pod.

  2. On-demand (_get_or_build_index): If a search arrives for a term not in the hot cache, the service first checks Redis. If the index exists there it's loaded and cached in-process. If it doesn't exist, refresh_async is called to start a background build and the request immediately receives a 503 asking the client to retry.

  3. Manual refresh (POST /refresh): Accepts {year, semester, allowed_subjects} and calls refresh_async, which starts a background thread calling refresh(). Returns immediately; check GET /health for build status.

  4. refresh(): Fetches the course catalog via GraphQL (with 12-attempt exponential-backoff retry to survive K8s startup races), deduplicates courses, optionally filters by subject, builds composite text, encodes with the BGE model in batches of 128, writes vectors to Redis via search_index.load(), and saves metadata.
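
The retry behaviour used for the catalog fetch can be sketched as below. The helper name, the 1-second base delay, and the 60-second cap are assumptions (only the 12-attempt exponential backoff comes from the description); `sleep` is injectable so the helper can be exercised without waiting.

```python
import time


def retry_with_backoff(fn, attempts=12, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call fn() until it succeeds, doubling the delay after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:  # broad on purpose: any failure is retried
            last_error = exc
            if attempt == attempts - 1:
                break  # no sleep after the final attempt
            sleep(min(base_delay * 2 ** attempt, max_delay))
    raise last_error
```

A caller would wrap the GraphQL fetch, e.g. `retry_with_backoff(lambda: fetch_catalog(year, semester))`, where `fetch_catalog` is a hypothetical name for the fetch function.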

HTTP API

| Endpoint | Method | Description |
| --- | --- | --- |
| /health | GET | Returns status (ok, building, queued, error, waiting), active build, queue, and index list |
| /search | POST | Semantic search; body: {query, threshold, year?, semester?, allowed_subjects?} |
| /refresh | POST | Trigger manual index (re)build; body: {year, semester, allowed_subjects?} |

Search uses cosine similarity. threshold (default 0.3) is the minimum similarity score (0–1); internally converted to a RedisVL distance threshold as 1 - threshold.
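
The conversion is a one-liner; the sketch below also shows the equivalent filter applied to a list of hits. The function names and the `distance` key on each hit are hypothetical and do not reflect the exact shape of RedisVL results.

```python
def similarity_to_distance(threshold):
    """Convert a minimum cosine similarity (0-1) to a maximum cosine distance."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return 1.0 - threshold


def filter_hits(hits, threshold=0.3):
    """Keep hits within the distance threshold and attach a similarity score."""
    max_distance = similarity_to_distance(threshold)
    return [
        {**hit, "score": 1.0 - hit["distance"]}
        for hit in hits
        if hit["distance"] <= max_distance
    ]
```

With the default threshold of 0.3, a hit at cosine distance 0.2 (similarity 0.8) is kept, while one at distance 0.9 (similarity 0.1) is dropped.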

Configuration (environment variables)

| Variable | Default | Description |
| --- | --- | --- |
| REDIS_URI | redis://redis:6379 | Redis Stack connection URL |
| BACKEND_URL | http://backend:5001 | Backend GraphQL base URL |
| SEMANTIC_SEARCH_YEAR | | Default term year (used when request omits year/semester) |
| SEMANTIC_SEARCH_SEMESTER | | Default term semester |
| SEMANTIC_SEARCH_ALLOWED_SUBJECTS | | Comma-separated subject filter applied to all indexes |
| SEMANTIC_SEARCH_LOG_LEVEL | INFO | Python logging level |
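
One way these variables might be read at startup, as a sketch. The `Settings` dataclass and `load_settings` helper are hypothetical; the variable names and defaults are the ones listed above, and unset optional values are assumed to map to `None` or an empty tuple.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class Settings:
    redis_uri: str
    backend_url: str
    year: Optional[int]
    semester: Optional[str]
    allowed_subjects: Tuple[str, ...]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read service configuration from a mapping (os.environ by default)."""
    year = env.get("SEMANTIC_SEARCH_YEAR")
    raw_subjects = env.get("SEMANTIC_SEARCH_ALLOWED_SUBJECTS", "")
    return Settings(
        redis_uri=env.get("REDIS_URI", "redis://redis:6379"),
        backend_url=env.get("BACKEND_URL", "http://backend:5001"),
        year=int(year) if year else None,
        semester=env.get("SEMANTIC_SEARCH_SEMESTER") or None,
        # Split the comma-separated list and drop empty entries.
        allowed_subjects=tuple(
            s.strip() for s in raw_subjects.split(",") if s.strip()
        ),
        log_level=env.get("SEMANTIC_SEARCH_LOG_LEVEL", "INFO"),
    )
```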

Dependencies

  • sentence-transformers — model loading and inference
  • redisvl — RedisVL client (schema definition, index creation, vector queries)
  • redis — raw Redis client (metadata reads/writes, health checks)
  • fastapi / uvicorn — HTTP server
  • numpy — float32 array manipulation

@vaclisinc vaclisinc merged commit 99bc46e into main Apr 4, 2026
18 checks passed
@vaclisinc vaclisinc deleted the fix/semantic-search-infra branch April 4, 2026 07:59