Commit bffb994
ci: pre-build FAISS indices in Docker image to fix ~60 min graph init
Graph initialisation during the docker-eval CI job was taking ~60 minutes
because all 6 FAISS vector indices were rebuilt from scratch at container
startup on every run. The rebuild ran either HuggingFace CPU inference over
the full corpus, or the Google Gemini embedding API, which exhausted its
quota and retried with a 60 s minimum backoff.
Root cause:
- HybridRetrieverChain.create_hybrid_retriever() takes the slow embed_docs()
path when faiss_db/<name> does not exist on disk.
- The faiss_data named volume is empty on every CI run (docker compose down
--volumes is called between jobs), so the indices are never reused.
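The load-or-build decision described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: load_or_build_index, fake_load and fake_embed are hypothetical names, and the two callables only stand in for the load_db() and embed_docs() paths.

```python
import tempfile
from pathlib import Path
from typing import Callable


def load_or_build_index(
    index_dir: Path,
    load_db: Callable[[Path], str],
    embed_docs: Callable[[Path], str],
) -> str:
    """Return an index, preferring one that is already saved on disk."""
    if index_dir.exists():
        # Fast path: an earlier build left faiss_db/<name> on disk.
        return load_db(index_dir)
    # Slow path: embed the whole corpus and write the index to index_dir.
    return embed_docs(index_dir)


def fake_load(path: Path) -> str:
    # Hypothetical stand-in for loading a saved index.
    return f"loaded {path.name}"


def fake_embed(path: Path) -> str:
    # Hypothetical stand-in for embedding the corpus and saving the index.
    path.mkdir(parents=True)
    return f"built {path.name}"


with tempfile.TemporaryDirectory() as tmp:
    index_dir = Path(tmp) / "faiss_db" / "example"
    first = load_or_build_index(index_dir, fake_load, fake_embed)
    second = load_or_build_index(index_dir, fake_load, fake_embed)
```

With an empty directory the first call takes the slow path and the second call finds the saved index. In CI the volume is wiped between jobs, so every run behaves like the first call.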
Fix:
- Add backend/scripts/build_faiss.py: runs RetrieverTools.initialize() with
EMBEDDINGS_TYPE=HF, FAST_MODE=true, and contextual_rerank=False at Docker
build time, saving all 6 FAISS indices into the image layer.
- Add a RUN step in the Dockerfile that calls the script after the dataset
is downloaded. Docker layer caching means the step is skipped on re-runs
where neither source nor data changed.
- Set ENV EMBEDDINGS_TYPE=HF / HF_EMBEDDINGS=thenlper/gte-large as container
defaults so runtime matches the pre-built indices (override in .env or via
docker run -e if a different model is needed).
- Add contextual_rerank: bool = True param to RetrieverTools.initialize() so
the build script can skip loading the cross-encoder model, keeping the
Docker build dependency-light.
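A rough sketch of how a build-time script might use the new flag is shown here. It assumes a stand-in class, since the real RetrieverTools is not reproduced: RetrieverToolsSketch, build_indices and the placeholder strings are hypothetical, and only the environment variable names and the contextual_rerank parameter come from this commit message.

```python
import os
from typing import Optional


class RetrieverToolsSketch:
    """Hypothetical stand-in for RetrieverTools; not the project's class."""

    def __init__(self) -> None:
        self.reranker: Optional[str] = None
        self.indices_built = 0

    def initialize(self, contextual_rerank: bool = True) -> None:
        if contextual_rerank:
            # The real code would load a cross-encoder model here.
            self.reranker = "cross-encoder (placeholder)"
        # Placeholder for building and saving the six FAISS indices.
        self.indices_built = 6


def build_indices() -> RetrieverToolsSketch:
    # Pin the embedding settings so the build matches the runtime defaults.
    os.environ["EMBEDDINGS_TYPE"] = "HF"
    os.environ["FAST_MODE"] = "true"
    tools = RetrieverToolsSketch()
    # Skip the reranker: it is not needed to build the indices.
    tools.initialize(contextual_rerank=False)
    return tools


tools = build_indices()
```

Defaulting contextual_rerank to True keeps existing callers unchanged, while the build script opts out explicitly.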
On the first CI run with an empty faiss_data volume, Docker automatically
copies the pre-built indices from the image into the volume, so the container
finds faiss_db/<name> at startup and takes the load_db() path instead. Graph
init drops from ~60 min to a few seconds.
Note: ensure backend/.env (or ci-secret.yaml) sets EMBEDDINGS_TYPE=HF to
match the pre-built indices; using a different model at runtime causes a
vector dimension mismatch.
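One way to surface such a mismatch early is a start-up check along these lines. This is only a sketch: check_embedding_dim is a hypothetical helper, not part of this commit, and the sizes used are illustrative rather than the dimensions of any particular model.

```python
def check_embedding_dim(index_dim: int, query_vector: list[float]) -> None:
    """Raise early if the runtime embedding size differs from the index."""
    if len(query_vector) != index_dim:
        raise ValueError(
            f"embedding has {len(query_vector)} dimensions, "
            f"but the index was built with {index_dim}"
        )


# Illustrative sizes only; real values depend on the models in use.
check_embedding_dim(4, [0.1, 0.2, 0.3, 0.4])  # matching sizes pass silently

try:
    check_embedding_dim(4, [0.1, 0.2, 0.3])
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```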
Signed-off-by: Jack Luar <jluar@precisioninno.com>
1 parent 052d03a, commit bffb994
3 files changed
Lines changed: 76 additions & 6 deletions