Commit 9a1d493

docs(ci): explain the GGUF model cache (purpose, no flag, vs sccache)

Expand the inline comment on the model-cache step: it exists to avoid re-downloading ~5 GB of GGUF test models from HuggingFace every run (and to dodge HF rate-limits). It is always ON by design — no on/off flag — unlike the sccache compiler cache, which the use_cache input / USE_CACHE env toggles. Notes it uses GitHub's free cache, not Depot. Comment-only; no behaviour change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01LjWiKSyNzqqpobSKYRiew5
1 parent 8f064c7 commit 9a1d493

1 file changed

Lines changed: 32 additions & 16 deletions

.github/workflows/publish.yml

@@ -586,14 +586,18 @@ jobs:
       with:
         name: Linux-x86_64-libraries
         path: ${{ github.workspace }}/src/main/resources/net/ladenthin/llama/
+    # GGUF model cache — introduced to stop re-downloading ~5 GB of test models from
+    # HuggingFace on every run (also dodges HF rate-limits). Complements the sccache compiler
+    # cache but is always ON: there is intentionally NO on/off flag for it (it is GitHub's
+    # free cache, safe + free), whereas the sccache cache is toggled by the `use_cache`
+    # workflow_dispatch input / USE_CACHE env. Not Depot — GB-scale blobs are usage-priced
+    # there and its file cache needs Depot-hosted runners. See CLAUDE.md.
     - name: Cache GGUF models (GitHub Actions cache; avoids re-downloading from HuggingFace)
       uses: actions/cache@v5
       with:
         path: models/
-        # Shared, stable key across all test jobs (GGUF files are platform-independent, so
-        # ubuntu + macOS share one entry). Bump the suffix when the model set/URLs change.
-        # Uses GitHub's free 10 GB/repo cache — NOT Depot: these are GB-scale blobs and Depot
-        # is usage-priced + its file cache needs Depot-hosted runners (see CLAUDE.md).
+        # GGUF is platform-independent, so ubuntu + macOS share one entry;
+        # bump the suffix when the model set / URLs change.
         key: gguf-models-v1
     - name: Download text generation model
       run: test -f models/${MODEL_NAME} || curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors ${MODEL_URL} --create-dirs -o models/${MODEL_NAME}
@@ -719,14 +723,18 @@ jobs:
       with:
         name: macos-14-libraries
         path: ${{ github.workspace }}/src/main/resources/net/ladenthin/llama/
+    # GGUF model cache — introduced to stop re-downloading ~5 GB of test models from
+    # HuggingFace on every run (also dodges HF rate-limits). Complements the sccache compiler
+    # cache but is always ON: there is intentionally NO on/off flag for it (it is GitHub's
+    # free cache, safe + free), whereas the sccache cache is toggled by the `use_cache`
+    # workflow_dispatch input / USE_CACHE env. Not Depot — GB-scale blobs are usage-priced
+    # there and its file cache needs Depot-hosted runners. See CLAUDE.md.
     - name: Cache GGUF models (GitHub Actions cache; avoids re-downloading from HuggingFace)
       uses: actions/cache@v5
       with:
         path: models/
-        # Shared, stable key across all test jobs (GGUF files are platform-independent, so
-        # ubuntu + macOS share one entry). Bump the suffix when the model set/URLs change.
-        # Uses GitHub's free 10 GB/repo cache — NOT Depot: these are GB-scale blobs and Depot
-        # is usage-priced + its file cache needs Depot-hosted runners (see CLAUDE.md).
+        # GGUF is platform-independent, so ubuntu + macOS share one entry;
+        # bump the suffix when the model set / URLs change.
         key: gguf-models-v1
     - name: Download text generation model
       run: test -f models/${MODEL_NAME} || curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors ${MODEL_URL} --create-dirs -o models/${MODEL_NAME}
@@ -795,14 +803,18 @@ jobs:
       with:
         name: macos-15-libraries
         path: ${{ github.workspace }}/src/main/resources/net/ladenthin/llama/
+    # GGUF model cache — introduced to stop re-downloading ~5 GB of test models from
+    # HuggingFace on every run (also dodges HF rate-limits). Complements the sccache compiler
+    # cache but is always ON: there is intentionally NO on/off flag for it (it is GitHub's
+    # free cache, safe + free), whereas the sccache cache is toggled by the `use_cache`
+    # workflow_dispatch input / USE_CACHE env. Not Depot — GB-scale blobs are usage-priced
+    # there and its file cache needs Depot-hosted runners. See CLAUDE.md.
     - name: Cache GGUF models (GitHub Actions cache; avoids re-downloading from HuggingFace)
       uses: actions/cache@v5
       with:
         path: models/
-        # Shared, stable key across all test jobs (GGUF files are platform-independent, so
-        # ubuntu + macOS share one entry). Bump the suffix when the model set/URLs change.
-        # Uses GitHub's free 10 GB/repo cache — NOT Depot: these are GB-scale blobs and Depot
-        # is usage-priced + its file cache needs Depot-hosted runners (see CLAUDE.md).
+        # GGUF is platform-independent, so ubuntu + macOS share one entry;
+        # bump the suffix when the model set / URLs change.
         key: gguf-models-v1
     - name: Download text generation model
       run: test -f models/${MODEL_NAME} || curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors ${MODEL_URL} --create-dirs -o models/${MODEL_NAME}
@@ -871,14 +883,18 @@ jobs:
       with:
         name: macos-15-metal-libraries
         path: ${{ github.workspace }}/src/main/resources/net/ladenthin/llama/
+    # GGUF model cache — introduced to stop re-downloading ~5 GB of test models from
+    # HuggingFace on every run (also dodges HF rate-limits). Complements the sccache compiler
+    # cache but is always ON: there is intentionally NO on/off flag for it (it is GitHub's
+    # free cache, safe + free), whereas the sccache cache is toggled by the `use_cache`
+    # workflow_dispatch input / USE_CACHE env. Not Depot — GB-scale blobs are usage-priced
+    # there and its file cache needs Depot-hosted runners. See CLAUDE.md.
     - name: Cache GGUF models (GitHub Actions cache; avoids re-downloading from HuggingFace)
       uses: actions/cache@v5
       with:
         path: models/
-        # Shared, stable key across all test jobs (GGUF files are platform-independent, so
-        # ubuntu + macOS share one entry). Bump the suffix when the model set/URLs change.
-        # Uses GitHub's free 10 GB/repo cache — NOT Depot: these are GB-scale blobs and Depot
-        # is usage-priced + its file cache needs Depot-hosted runners (see CLAUDE.md).
+        # GGUF is platform-independent, so ubuntu + macOS share one entry;
+        # bump the suffix when the model set / URLs change.
         key: gguf-models-v1
     - name: Download text generation model
       run: test -f models/${MODEL_NAME} || curl -L --proto =https --proto-redir =https --fail --retry 5 --retry-all-errors ${MODEL_URL} --create-dirs -o models/${MODEL_NAME}
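
The contrast the new comment draws (a compiler cache that can be switched off versus a model cache that is always on) could be wired roughly as in the sketch below. This is an illustration under stated assumptions, not the repository's actual publish.yml: the `use_cache` input, the `USE_CACHE` env name, the cache path and the `gguf-models-v1` key come from the diff above, while the job name, the way `USE_CACHE` is derived, the placeholder sccache step and the model name/URL are hypothetical.

```yaml
# Hypothetical minimal workflow; only use_cache, USE_CACHE, models/ and
# gguf-models-v1 are taken from the commit. Everything else is assumed.
on:
  push:
  workflow_dispatch:
    inputs:
      use_cache:
        description: "Enable the sccache compiler cache"
        type: boolean
        default: true

env:
  # Assumed wiring: on for every non-dispatch trigger, otherwise follow the input.
  USE_CACHE: ${{ github.event_name != 'workflow_dispatch' || inputs.use_cache }}

jobs:
  test:  # hypothetical job name
    runs-on: ubuntu-latest
    env:
      MODEL_NAME: example.gguf                      # placeholder
      MODEL_URL: https://example.com/example.gguf   # placeholder
    steps:
      - uses: actions/checkout@v4

      # Toggled: skipped entirely when USE_CACHE is not 'true'.
      - name: Enable sccache (placeholder for the real setup step)
        if: env.USE_CACHE == 'true'
        run: echo "sccache would be configured here"

      # Always on: no `if:` guard, so the model cache is restored on every run.
      - name: Cache GGUF models
        uses: actions/cache@v5
        with:
          path: models/
          key: gguf-models-v1

      # Download only when the restored cache did not already provide the file.
      - name: Download text generation model
        run: test -f "models/${MODEL_NAME}" || curl -L --fail --retry 5 "${MODEL_URL}" --create-dirs -o "models/${MODEL_NAME}"
```

With a fixed key like this, `actions/cache` saves the `models/` directory only when no entry for that key exists yet; an existing entry is not overwritten. That is why the comment asks for the key suffix to be bumped (for example to `gguf-models-v2`) whenever the model set or URLs change.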
