Commit 6f84edc

feat(docker): slim/full image variants + cached deps layer + [full] extra rename (#138)
* perf(docker): split install into stable deps + per-release layers; add GHA cache

  Dockerfile previously installed cocoindex-code, cocoindex, torch, sentence-transformers, and all transitive deps in one RUN. Any change to the source tree (via COPY . /ccc-src) invalidated that single layer, forcing a full re-install (~1 GB of wheels for torch + friends) on every release. Under QEMU for the arm64 cross-build this was slow enough to be painful.

  Split into two stages:
  - `deps`: install cocoindex + cocoindex-code[default] from PyPI. Cache key is just the RUN command string, so this layer is reused across releases until we bump the pins.
  - `builder`: overlay the release version via `CCC_INSTALL_SPEC=/ccc-src[default]` with `--no-deps --force-reinstall`. Only the cocoindex-code package is touched; the heavy deps layer stays untouched.

  Also add BuildKit layer cache (`type=gha`) to the publish-docker job so the deps layer persists across workflow runs, not just within a single build.

* feat(docker,packaging): slim/full image variants; rename [default]→[full] extra

  Build two Docker image variants per release:
  - slim (:latest, default): ~450 MB. LiteLLM-only. cocoindex + cocoindex-code without sentence-transformers. Targets cloud-backed embeddings.
  - full (:full): ~5 GB. Bundles sentence-transformers + torch + a pre-baked default model. Targets offline-ready local embeddings.

  Dockerfile gains a CCC_VARIANT build arg that gates stage 1's sentence-transformers install and stage 3's model bake. Release workflow matrices on {slim, full}; each variant has its own GHA cache scope so layer reuse works across releases without the variants evicting each other.

  Also rename the PyPI `[default]` umbrella extra to `[full]` so pip and Docker names match. `[embeddings-local]` remains the canonical primary extra (the one that specifically pulls in sentence-transformers); `[full]` is its umbrella alias that may bundle additional optional niceties later.

  CLI hints that point at missing sentence-transformers continue to name `[embeddings-local]` directly, the most specific pointer for that case. README documents both image variants with a comparison table and narrows the Mac-on-Docker MPS note to only :full users (slim + LiteLLM is unaffected).
1 parent 745dcd6 commit 6f84edc

9 files changed

Lines changed: 177 additions & 56 deletions

.github/workflows/release.yml

Lines changed: 53 additions & 18 deletions
@@ -112,7 +112,7 @@ jobs:
             --repo '${{ github.repository }}'

   publish-docker:
-    name: Build & push Docker image to Docker Hub and GHCR
+    name: Build & push Docker image (${{ matrix.variant }})
     # Runs on real releases, and on manual dispatch with `test_docker=true`
     # for verifying registry credentials before the first release.
     if: github.event_name == 'release' || (github.event_name == 'workflow_dispatch' && inputs.test_docker)
@@ -123,6 +123,17 @@ jobs:
     permissions:
       contents: read
       packages: write
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          # slim (default): LiteLLM-only, ~450 MB. Publishes as `:latest`.
+          - variant: slim
+            install_spec: /ccc-src
+          # full: bundles sentence-transformers + torch + baked model,
+          # ~5 GB. Publishes as `:full`.
+          - variant: full
+            install_spec: /ccc-src[full]
     steps:
       - uses: actions/checkout@v4
@@ -151,25 +162,43 @@ jobs:

       - name: Compute image tags
         id: tags
-        # Real releases: push `:latest` and `:<version>` to both registries.
-        # Manual dispatches: push only `:test` so we don't clobber `:latest`.
+        # Tag scheme:
+        #   slim on release:   :latest, :<version>
+        #   full on release:   :full,   :<version>-full
+        #   slim on dispatch:  :test
+        #   full on dispatch:  :test-full
+        # Dispatched tags stay out of the `:latest` / `:<version>` namespace
+        # so manual test runs don't clobber what users pull.
         run: |
+          variant="${{ matrix.variant }}"
+          if [ "$variant" = "slim" ]; then
+            slim_suffix=""
+          else
+            slim_suffix="-$variant"
+          fi
           if [ "${{ github.event_name }}" = "release" ]; then
-            {
-              echo "tags<<EOF"
-              echo "cocoindex/cocoindex-code:latest"
-              echo "cocoindex/cocoindex-code:${{ github.ref_name }}"
-              echo "ghcr.io/cocoindex-io/cocoindex-code:latest"
-              echo "ghcr.io/cocoindex-io/cocoindex-code:${{ github.ref_name }}"
-              echo "EOF"
-            } >> "$GITHUB_OUTPUT"
+            version="${{ github.ref_name }}"
+            if [ "$variant" = "slim" ]; then
+              latest_tag="latest"
+            else
+              latest_tag="$variant"
+            fi
+            {
+              echo "tags<<EOF"
+              echo "cocoindex/cocoindex-code:${latest_tag}"
+              echo "cocoindex/cocoindex-code:${version}${slim_suffix}"
+              echo "ghcr.io/cocoindex-io/cocoindex-code:${latest_tag}"
+              echo "ghcr.io/cocoindex-io/cocoindex-code:${version}${slim_suffix}"
+              echo "EOF"
+            } >> "$GITHUB_OUTPUT"
           else
-            {
-              echo "tags<<EOF"
-              echo "cocoindex/cocoindex-code:test"
-              echo "ghcr.io/cocoindex-io/cocoindex-code:test"
-              echo "EOF"
-            } >> "$GITHUB_OUTPUT"
+            test_tag="test${slim_suffix}"
+            {
+              echo "tags<<EOF"
+              echo "cocoindex/cocoindex-code:${test_tag}"
+              echo "ghcr.io/cocoindex-io/cocoindex-code:${test_tag}"
+              echo "EOF"
+            } >> "$GITHUB_OUTPUT"
           fi

       - name: Build and push to both registries
@@ -186,5 +215,11 @@ jobs:
           # PyPI's CDN yet (which happened on v0.2.24 release), and ensures
           # the image matches the tagged commit byte-for-byte.
           build-args: |
-            CCC_INSTALL_SPEC=/ccc-src[default]
+            CCC_VARIANT=${{ matrix.variant }}
+            CCC_INSTALL_SPEC=${{ matrix.install_spec }}
           tags: ${{ steps.tags.outputs.tags }}
+          # Per-variant BuildKit cache so slim and full don't evict each
+          # other's layers. The heavy `deps` layer (torch + friends for
+          # full; empty for slim) reuses across releases.
+          cache-from: type=gha,scope=${{ matrix.variant }}
+          cache-to: type=gha,mode=max,scope=${{ matrix.variant }}
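
The tag scheme from the `Compute image tags` step can be sketched in Python to make the four cases easier to check. This is an illustrative reimplementation of the shell logic, not code from the repository: the function name `compute_tags`, the `REPOS` constant, and the sample version `v1.2.3` used below are all hypothetical.

```python
# Hypothetical helper mirroring the shell logic of the "Compute image tags" step.
REPOS = ("cocoindex/cocoindex-code", "ghcr.io/cocoindex-io/cocoindex-code")


def compute_tags(variant: str, event_name: str, ref_name: str = "") -> list[str]:
    """Return the image tags for one matrix entry, in the order the step emits them."""
    # slim owns the unsuffixed names; any other variant gets a "-<variant>" suffix.
    suffix = "" if variant == "slim" else f"-{variant}"
    if event_name == "release":
        latest_tag = "latest" if variant == "slim" else variant
        names = [latest_tag, f"{ref_name}{suffix}"]
    else:
        # Manual dispatch: stay out of the :latest / :<version> namespace.
        names = [f"test{suffix}"]
    return [f"{repo}:{name}" for repo in REPOS for name in names]


if __name__ == "__main__":
    for tag in compute_tags("full", "release", "v1.2.3"):
        print(tag)
```

Because slim keeps the unsuffixed names, existing `docker pull cocoindex/cocoindex-code:latest` users land on the smaller image without changing anything.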

README.md

Lines changed: 26 additions & 7 deletions
@@ -46,18 +46,18 @@ A lightweight, effective **(AST-based)** semantic code search tool for your code

 Using [pipx](https://pipx.pypa.io/stable/installation/):
 ```bash
-pipx install 'cocoindex-code[default]' # batteries included (local embeddings)
+pipx install 'cocoindex-code[full]' # batteries included (local embeddings)
 pipx upgrade cocoindex-code # upgrade
 ```

 Using [uv](https://docs.astral.sh/uv/getting-started/installation/):
 ```bash
-uv tool install --upgrade 'cocoindex-code[default]' --prerelease explicit --with "cocoindex>=1.0.0a24"
+uv tool install --upgrade 'cocoindex-code[full]' --prerelease explicit --with "cocoindex>=1.0.0a24"
 ```

-Two install styles:
-- `cocoindex-code[default]`: batteries-included. Pulls in `sentence-transformers` so local embeddings (no API key required) work out of the box. The `ccc init` interactive prompt defaults to [Snowflake/snowflake-arctic-embed-xs](https://huggingface.co/Snowflake/snowflake-arctic-embed-xs).
-- `cocoindex-code` slim. LiteLLM-only; requires a cloud embedding provider and API key. Use when you don't want the local-embedding deps (~1 GB of torch + transformers).
+Two install styles (they mirror the Docker image variants of the same names):
+- `cocoindex-code[full]`: batteries-included. Pulls in `sentence-transformers` so local embeddings (no API key required) work out of the box. The `ccc init` interactive prompt defaults to [Snowflake/snowflake-arctic-embed-xs](https://huggingface.co/Snowflake/snowflake-arctic-embed-xs).
+- `cocoindex-code` (slim): LiteLLM-only; requires a cloud embedding provider and API key. Use when you don't want the local-embedding deps (~1 GB of torch + transformers).

 Next, set up your [coding agent integration](#coding-agent-integration), or jump to [Manual CLI Usage](#manual-cli-usage) if you prefer direct control.

@@ -198,6 +198,25 @@ The recommended approach is a **persistent container**: start it once, and use
 `docker exec` to run CLI commands or connect MCP sessions to it. The daemon
 inside stays warm across sessions, so the embedding model is loaded only once.

+### Choosing an image
+
+Two variants are published from each release:
+
+| Tag | Size | Embedding backends | When to pick |
+|---|---|---|---|
+| `cocoindex/cocoindex-code:latest` (slim, default) | ~450 MB | LiteLLM (cloud: OpenAI, Voyage, Gemini, Ollama, …) | Most users. Cloud-backed embeddings, smaller image, fast pulls. |
+| `cocoindex/cocoindex-code:full` | ~5 GB | sentence-transformers (local) + LiteLLM | When you want local embeddings without an API key, or an offline-ready container. Heavier because of torch + transformers. |
+
+The rest of this section uses `:latest`; substitute `:full` in the `image:` /
+`docker run` commands if you want the full variant.
+
+> **Mac users running the `:full` variant:** local embedding inference is
+> CPU-only inside Docker, because Docker on macOS can't access Apple's Metal
+> (MPS) GPU. If you want local embeddings and fast inference, install
+> natively instead: `pipx install 'cocoindex-code[full]'`. The `:latest`
+> (slim) variant is unaffected: LiteLLM runs the model on the provider's
+> side, so Docker vs. native makes no difference.
+
 ### Quick start: `docker compose up -d`

 Grab [`docker/docker-compose.yml`](./docker/docker-compose.yml) from this repo and run:
@@ -352,7 +371,7 @@ docker build -t cocoindex-code:local -f docker/Dockerfile .
 - **Ultra Performant**: ⚡ Built on top of ultra performant [Rust indexing engine](https://github.com/cocoindex-io/cocoindex). Only re-indexes changed files for fast updates.
 - **Multi-Language Support**: Python, JavaScript/TypeScript, Rust, Go, Java, C/C++, C#, SQL, Shell, and more.
 - **Embedded**: Portable and just works, no database setup required!
-- **Flexible Embeddings**: Local SentenceTransformers via the `[default]` extra (free, no API key!) or 100+ cloud providers via LiteLLM.
+- **Flexible Embeddings**: Local SentenceTransformers via the `[full]` extra (free, no API key!) or 100+ cloud providers via LiteLLM.

 ## Configuration

@@ -439,7 +458,7 @@ See [`src/cocoindex_code/chunking.py`](./src/cocoindex_code/chunking.py) for the

 ## Embedding Models

-With the `[default]` extra installed, `ccc init` defaults to a local SentenceTransformers model ([Snowflake/snowflake-arctic-embed-xs](https://huggingface.co/Snowflake/snowflake-arctic-embed-xs)), no API key required. To use a different model, edit `~/.cocoindex_code/global_settings.yml`.
+With the `[full]` extra installed, `ccc init` defaults to a local SentenceTransformers model ([Snowflake/snowflake-arctic-embed-xs](https://huggingface.co/Snowflake/snowflake-arctic-embed-xs)), no API key required. To use a different model, edit `~/.cocoindex_code/global_settings.yml`.

 > The `envs` entries below are only needed if the key isn't already in your shell environment; the daemon inherits your environment automatically.

docker/Dockerfile

Lines changed: 66 additions & 18 deletions
@@ -1,37 +1,85 @@
-# ─── Stage 1: install dependencies ───────────────────────────────────────────
+# ─── Stage 1: heavy stable dependencies (variant-aware) ──────────────────────
+# Two image variants are published from this Dockerfile:
+#   - slim (default, `:latest`): ~450 MB. cocoindex-code + LiteLLM only.
+#     For users who'll point the embedding at a cloud provider (OpenAI,
+#     Voyage, Gemini, …).
+#   - full (`:full`): ~5 GB. Also bundles sentence-transformers
+#     + torch + a pre-baked default model. For users who want offline-ready
+#     local embeddings without an API key.
+#
+# This stage installs only the big, slow-changing deps that are shared across
+# releases:
+#   - full: `sentence-transformers` (pulls torch + transformers + tokenizers
+#     transitively, ~1 GB of wheels).
+#   - slim: nothing; cocoindex-code's LiteLLM deps get installed in stage 2.
+#
+# The cache key is the RUN command string, which changes with CCC_VARIANT, so
+# BuildKit keeps separate cache entries per variant and reuses each across
+# releases until we bump the deps.
+#
+# `cocoindex` and `cocoindex-code` are deliberately NOT installed here:
+# they bump often, so pinning them at this layer would invalidate the heavy
+# cache on every release. Stage 2 installs them on top; transitive deps are
+# already satisfied, so uv only fetches the two packages themselves.
+#
 # Use slim (glibc-based): cocoindex ships pre-built Rust wheels that need glibc.
 # Alpine / musl-libc would require building from source.
-FROM python:3.12-slim AS builder
+#
+# `--system` tells uv to install into the base Python at
+# /usr/local/lib/python3.12/... since there's no virtualenv in the image.
+FROM python:3.12-slim AS deps

 RUN pip install --quiet uv

+ARG CCC_VARIANT=slim
+RUN if [ "$CCC_VARIANT" = "full" ]; then \
+        uv pip install --system --prerelease=allow sentence-transformers; \
+    fi
+
+# ─── Stage 2: install cocoindex + cocoindex-code (per release) ───────────────
+# Cheap relative to stage 1: transitive deps like torch are already in place
+# for the full variant; for slim there are no heavy deps to pull. uv only
+# needs to fetch the cocoindex + cocoindex-code wheels themselves.
+FROM deps AS builder
 WORKDIR /build
+ARG CCC_VARIANT=slim

-# Default: install the released cocoindex-code from PyPI (release flow).
-# Tests/local dev override with:
-#   --build-arg CCC_INSTALL_SPEC=/ccc-src[default]
-# which installs from the copied-in source tree instead. The COPY always runs;
-# with .dockerignore trimming build artifacts it adds ~nothing.
-ARG CCC_INSTALL_SPEC="cocoindex-code[default]"
+# Default behaviour: install cocoindex-code from PyPI, picking the extras
+# that match CCC_VARIANT.
+# Release workflow / local tests override with (respectively):
+#   --build-arg CCC_INSTALL_SPEC=/ccc-src
+#   --build-arg CCC_INSTALL_SPEC=/ccc-src[full]
+ARG CCC_INSTALL_SPEC=""
 COPY . /ccc-src
+RUN if [ -z "$CCC_INSTALL_SPEC" ]; then \
+        if [ "$CCC_VARIANT" = "full" ]; then \
+            CCC_INSTALL_SPEC="cocoindex-code[full]"; \
+        else \
+            CCC_INSTALL_SPEC="cocoindex-code"; \
+        fi; \
+    fi; \
+    uv pip install --system --prerelease=allow \
+        "cocoindex>=1.0.0a33" \
+        "${CCC_INSTALL_SPEC}"

-RUN uv pip install --system --prerelease=allow \
-    "cocoindex>=1.0.0a33" \
-    "${CCC_INSTALL_SPEC}"
-
-# ─── Stage 2: pre-bake the default embedding model ────────────────────────────
-# Bakes Snowflake/snowflake-arctic-embed-xs into the merged data directory at
-# /var/cocoindex/cache/..., so on first run Docker's volume copy-up populates
-# the cocoindex-data volume with the model; no network fetch needed.
+# ─── Stage 3: pre-bake the default embedding model (full only) ───────────────
+# For the full variant, bakes Snowflake/snowflake-arctic-embed-xs into
+# /var/cocoindex/cache/... so Docker's first-mount copy-up populates the
+# cocoindex-data volume with the model; no network fetch on first start.
+# For slim, just creates empty cache dirs so the runtime stage's COPY works
+# regardless of variant.
 FROM builder AS model_cache
+ARG CCC_VARIANT=slim

 ENV HF_HOME=/var/cocoindex/cache/huggingface \
     SENTENCE_TRANSFORMERS_HOME=/var/cocoindex/cache/sentence-transformers

 RUN mkdir -p /var/cocoindex/cache/huggingface /var/cocoindex/cache/sentence-transformers \
-    && python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('Snowflake/snowflake-arctic-embed-xs'); print('Model cached.')"
+    && if [ "$CCC_VARIANT" = "full" ]; then \
+        python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('Snowflake/snowflake-arctic-embed-xs'); print('Model cached.')"; \
+    fi

-# ─── Stage 3: runtime ─────────────────────────────────────────────────────────
+# ─── Stage 4: runtime ─────────────────────────────────────────────────────────
 FROM python:3.12-slim AS runtime

 # gosu for privilege-drop (PUID/PGID pattern); create non-root coco user.
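
The variant gating in stages 1 and 2 of the Dockerfile above comes down to two small decisions: which heavy packages the `deps` stage installs, and which install spec the `builder` stage falls back to when `CCC_INSTALL_SPEC` is empty. A minimal Python sketch of that decision logic follows. The function names `deps_stage_packages` and `resolve_install_spec` are hypothetical and exist only for illustration; the real logic lives in the shell `RUN` commands.

```python
def deps_stage_packages(variant: str = "slim") -> list[str]:
    """Packages the heavy, slow-changing deps stage installs for a variant."""
    # Only the full variant pre-installs sentence-transformers (and, transitively, torch).
    return ["sentence-transformers"] if variant == "full" else []


def resolve_install_spec(variant: str = "slim", install_spec: str = "") -> str:
    """Mirror the stage 2 fallback: an explicit spec wins, else pick by variant."""
    if install_spec:
        return install_spec
    return "cocoindex-code[full]" if variant == "full" else "cocoindex-code"


if __name__ == "__main__":
    for variant in ("slim", "full"):
        print(variant, deps_stage_packages(variant), resolve_install_spec(variant))
```

Note that the two build args are independent: passing `CCC_INSTALL_SPEC=/ccc-src[full]` without `CCC_VARIANT=full` would still leave the model bake in stage 3 disabled, which is why the workflow and the tests set both.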

pyproject.toml

Lines changed: 11 additions & 1 deletion
@@ -36,10 +36,20 @@ dependencies = [
 ]

 [project.optional-dependencies]
+# `embeddings-local` is the primary feature extra: it pulls in
+# `sentence-transformers` (via cocoindex) so local embeddings work without
+# an API key.
 embeddings-local = [
     "cocoindex[sentence-transformers]==1.0.0a43",
 ]
-default = [
+# `full` is the umbrella "batteries-included" alias. Today it's just
+# `embeddings-local`, but we expect to bundle more optional niceties under
+# it over time; users who want everything can keep using `[full]` and pick
+# up the additions automatically. The name also matches the Docker
+# `:full` image variant for consistency across install paths. Contents are
+# inlined rather than self-referencing `cocoindex-code[embeddings-local]`
+# to avoid resolver edge cases with older pip.
+full = [
     "cocoindex[sentence-transformers]==1.0.0a43",
 ]
 dev = [

skills/ccc/references/management.md

Lines changed: 2 additions & 2 deletions
@@ -5,11 +5,11 @@
 Install CocoIndex Code via pipx. Two install styles:

 ```bash
-pipx install 'cocoindex-code[default]' # batteries included (local embeddings via sentence-transformers)
+pipx install 'cocoindex-code[full]' # batteries included (local embeddings via sentence-transformers)
 pipx install cocoindex-code # slim (LiteLLM-only; requires a cloud embedding provider + API key)
 ```

-The `[default]` extra pulls in `sentence-transformers` so the first-run default (local embeddings, no API key) works out of the box. The slim install is for environments where you don't want the torch/transformers deps and plan to use a LiteLLM-supported cloud provider instead.
+The `[full]` extra pulls in `sentence-transformers` so the first-run default (local embeddings, no API key) works out of the box. The slim install is for environments where you don't want the torch/transformers deps and plan to use a LiteLLM-supported cloud provider instead.

 To upgrade to the latest version:

src/cocoindex_code/cli.py

Lines changed: 1 addition & 1 deletion
@@ -327,7 +327,7 @@ def _resolve_embedding_choice(
         return EmbeddingSettings(provider="sentence-transformers", model=DEFAULT_ST_MODEL)
     _typer.echo(
         "Error: sentence-transformers is not installed and stdin is not a TTY.\n"
-        "Either install the extra (`pip install cocoindex-code[embeddings-local]`)\n"
+        "Either install the extra (`pip install 'cocoindex-code[embeddings-local]'`)\n"
         "or pass `--litellm-model MODEL` to select a LiteLLM model.",
         err=True,
     )
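
The only change in this hunk is the pair of single quotes around the requirement. The commit does not state the reason; a likely one (an assumption here) is that some shells, zsh in particular, treat `[...]` as a glob pattern, so an unquoted `cocoindex-code[embeddings-local]` can fail before pip ever runs. A small sketch using the standard library's `shlex.quote` shows how such a hint could be generated with quoting applied automatically. The helper name `pip_install_hint` is hypothetical.

```python
import shlex


def pip_install_hint(extra: str) -> str:
    """Build a copy-pasteable pip command for a cocoindex-code extra."""
    requirement = f"cocoindex-code[{extra}]"
    # shlex.quote wraps the string in single quotes because "[" and "]"
    # are not in its set of shell-safe characters.
    return f"pip install {shlex.quote(requirement)}"


if __name__ == "__main__":
    print(pip_install_hint("embeddings-local"))
    # prints: pip install 'cocoindex-code[embeddings-local]'
```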

tests/e2e_docker/conftest.py

Lines changed: 6 additions & 1 deletion
@@ -26,6 +26,9 @@ def docker_image() -> str:
     """Build the image once per test session, installing cocoindex-code from the
     local source tree (not PyPI) so tests exercise the current changes. Returns the tag.
     """
+    # Tests exercise the `full` variant so `ccc init -f` in non-TTY mode can
+    # fall back to sentence-transformers (the slim variant requires
+    # `--litellm-model`, which would add setup boilerplate to every test).
     tag = "cocoindex-code:pytest"
     subprocess.run(
         [
@@ -34,7 +37,9 @@ def docker_image() -> str:
             "-f",
             str(DOCKERFILE),
             "--build-arg",
-            "CCC_INSTALL_SPEC=/ccc-src[default]",
+            "CCC_VARIANT=full",
+            "--build-arg",
+            "CCC_INSTALL_SPEC=/ccc-src[full]",
             "-t",
             tag,
             str(REPO_ROOT),
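
For readers who want to build either variant from a local checkout, the argument list in the fixture above generalizes naturally. The sketch below assembles the same `docker build` arguments for any variant. The function name `docker_build_cmd` is hypothetical, and the leading `"docker", "build"` elements are an assumption, since those lines fall outside the hunks shown.

```python
from pathlib import Path


def docker_build_cmd(variant: str, tag: str, dockerfile: Path, context: Path) -> list[str]:
    """Assemble a `docker build` argv that installs from the copied-in source tree."""
    # Keep CCC_VARIANT and CCC_INSTALL_SPEC consistent, as the fixture and workflow do.
    install_spec = "/ccc-src[full]" if variant == "full" else "/ccc-src"
    return [
        "docker",  # assumed: the start of the command is not visible in the diff
        "build",
        "-f",
        str(dockerfile),
        "--build-arg",
        f"CCC_VARIANT={variant}",
        "--build-arg",
        f"CCC_INSTALL_SPEC={install_spec}",
        "-t",
        tag,
        str(context),
    ]


if __name__ == "__main__":
    cmd = docker_build_cmd("full", "cocoindex-code:pytest", Path("docker/Dockerfile"), Path("."))
    print(" ".join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`; using a list avoids shell quoting issues with the brackets in `/ccc-src[full]`.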

tests/test_e2e.py

Lines changed: 7 additions & 3 deletions
@@ -838,12 +838,16 @@ async def embed(self, text: str) -> object: # noqa: ARG002
 # ---------------------------------------------------------------------------


-def test_dockerfile_install_line_uses_default_extra() -> None:
-    """Dockerfile should install via `cocoindex-code[default]`, no separate ST pin."""
+def test_dockerfile_install_line_uses_full_extra() -> None:
+    """Dockerfile should install via `cocoindex-code[full]` (not the old
+    `[default]` alias) and should not hard-pin sentence-transformers.
+    """
     repo_root = Path(__file__).resolve().parent.parent
     content = (repo_root / "docker" / "Dockerfile").read_text()
-    assert "cocoindex-code[default]" in content
+    assert "cocoindex-code[full]" in content
+    assert "cocoindex-code[default]" not in content
     assert "sentence-transformers>=" not in content
+    assert "sentence-transformers==" not in content


 # ---------------------------------------------------------------------------
