Cache Hugging Face tokenizers in CI #1542

Merged: xeophon merged 1 commit into main from codex/hf-cache-tokenizers-ci on Jun 4, 2026

xeophon (Member) commented Jun 4, 2026

Summary

  • add a Hugging Face cache directory for the Test workflow
  • restore/save that cache in the existing Python test matrix
  • pass HF_TOKEN through to the main Verifiers test job to reduce public Hub rate-limit pressure

Context

This is the smaller follow-up for the Hugging Face 429 flake observed on #1529. It keeps the existing Test workflow topology and avoids the separate cache-warm job.

Validation

  • ruby YAML parse for .github/workflows/test.yml
  • git diff --check

Note

Low Risk
Workflow-only changes (cache path, secrets env, cache step) with no application runtime or auth logic changes.

Overview
The Test workflow now pins Hugging Face artifacts under a workspace-local HF_HOME and restores/saves that directory with actions/cache, keyed off uv.lock and the renderer test files that drive tokenizer downloads.

The main Verifiers matrix job also receives HF_TOKEN from repo secrets so Hub access is less likely to hit public rate limits (addressing CI 429 flakes on tokenizer loads).
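A minimal sketch of what the environment side of this change could look like. It is not copied from the PR diff: the job name, the `.hf-cache` directory name, and the test command are assumptions for illustration. `HF_HOME` and `HF_TOKEN` are the standard environment variables read by the Hugging Face Hub client.

```yaml
# Sketch only: names below are hypothetical, not taken from the PR's diff.
jobs:
  tests:                                   # hypothetical job name
    runs-on: ubuntu-latest
    env:
      # Workspace-local cache root, so a later cache step can restore/save it.
      HF_HOME: ${{ github.workspace }}/.hf-cache   # directory name is an assumption
      # Token from repo secrets, so Hub requests are authenticated and
      # less likely to hit public rate limits.
      HF_TOKEN: ${{ secrets.HF_TOKEN }}
    steps:
      - uses: actions/checkout@v4
      - run: uv run pytest                 # assumed test command
```

Keeping the cache directory inside the workspace means the path is known up front and does not depend on the runner's home directory.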

Reviewed by Cursor Bugbot for commit 1811e61.

Note

Cache Hugging Face tokenizers in CI to speed up test runs

Adds a caching step in test.yml that stores and restores the Hugging Face tokenizer directory (HF_HOME) across CI runs. The cache key is derived from the OS and hashes of uv.lock and the relevant test files, with a fallback restore key. HF_TOKEN is also now exposed to the job via repository secrets.
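The cache step described above might look roughly like the following. This is a hedged sketch rather than the PR's actual step: the `hf-` key prefix and the `tests/renderers/**` glob are hypothetical stand-ins for the real renderer test files, and it assumes `HF_HOME` is already set in the job's `env`.

```yaml
# Sketch only: key prefix and test-file glob are assumptions, not the PR's diff.
- name: Cache Hugging Face tokenizers
  uses: actions/cache@v4
  with:
    # Directory to restore before the tests and save after the job.
    path: ${{ env.HF_HOME }}
    # Exact key: OS plus a hash of the lockfile and the tests that trigger downloads.
    key: hf-${{ runner.os }}-${{ hashFiles('uv.lock', 'tests/renderers/**') }}
    # Fallback: reuse the most recent cache for this OS when the hash changes.
    restore-keys: |
      hf-${{ runner.os }}-
```

With a fallback restore key, a changed `uv.lock` still starts from a mostly warm cache, and only new or changed tokenizer files need to be fetched from the Hub.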

Macroscope summarized 1811e61.

macroscopeapp (Bot) commented Jun 4, 2026

Approvability

Verdict: Approved

CI-only change that adds caching for Hugging Face tokenizers to speed up test runs. Standard GitHub Actions caching pattern with no impact on production code or runtime behavior.

@xeophon xeophon merged commit 3420677 into main Jun 4, 2026
15 checks passed