
Fix HuggingFace API rate limiting in CI (#1291)#1296

Open
ak91456 wants to merge 5 commits into TransformerLensOrg:dev from ak91456:main

Conversation


@ak91456 ak91456 commented May 9, 2026

Description

Fixes #1291 — CI jobs were hitting HTTP 429 (Too Many Requests) from HuggingFace Hub when multiple PRs or pushes triggered simultaneous workflow runs.

Even when model files are cached locally, huggingface_hub still makes lightweight "resolve" API calls by default to check whether the cache is fresh. With three matrix Python versions plus coverage and notebook jobs all running in parallel across multiple PRs, these calls exceeded HuggingFace's rate limit.

Changes:

  • Concurrency group (checks.yml): Cancels the stale in-progress run on the same PR branch when a new push arrives. Push-to-main, tags, and
    workflow_call (release) events are exempt and never cancelled.
  • HF offline mode on cache hit (checks.yml): After a warm cache restore, sets HF_HUB_OFFLINE=1 for all subsequent steps in compatibility-checks
    and coverage-test, preventing resolve API calls when models are already local. Auth still runs before offline mode activates.
  • Retry with backoff (hf_utils.py): download_file_from_hf now retries up to 3 times (10s / 20s / 30s waits) on HTTP 429, matching the pattern
    already used in hf_scraper.py.
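The two `checks.yml` changes above might look roughly like the following sketch (job names, step names, and cache keys are illustrative, not copied from the actual workflow):

```yaml
# Cancel a superseded in-progress run on the same PR branch. Non-PR events
# (push-to-main, tags, workflow_call) get a unique group keyed on run_id,
# so they are never cancelled.
concurrency:
  group: ${{ github.workflow }}-${{ github.event_name == 'pull_request' && github.ref || github.run_id }}
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

jobs:
  compatibility-checks:
    steps:
      - name: Restore HF model cache
        id: hf-cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/huggingface
          key: hf-${{ runner.os }}-${{ hashFiles('**/poetry.lock') }}
      # Auth / token setup would run here, before offline mode activates.
      # On a warm cache, skip HF "resolve" calls for all subsequent steps.
      - name: Enable HF offline mode on cache hit
        if: steps.hf-cache.outputs.cache-hit == 'true'
        run: echo "HF_HUB_OFFLINE=1" >> "$GITHUB_ENV"
```

`HF_HUB_OFFLINE=1` makes huggingface_hub serve files from the local cache without any network calls, which is exactly the behavior wanted after a warm cache restore.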

Fixes #1291
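The retry-with-backoff behavior described for `hf_utils.py` can be sketched generically as follows (a hedged sketch: `RateLimitError` and `download_with_backoff` are illustrative names, not the actual identifiers in the PR; the real code would catch huggingface_hub's HTTP error and inspect its status code):

```python
import time


class RateLimitError(Exception):
    """Stand-in for an HTTP 429 response (hypothetical; the real code
    would check the status code on huggingface_hub's HTTP error)."""


def download_with_backoff(download_fn, max_retries=3, base_wait=10):
    """Call download_fn, retrying on rate-limit errors with linear backoff.

    Waits base_wait, 2*base_wait, 3*base_wait seconds between attempts,
    mirroring the 10s / 20s / 30s schedule described in this PR.
    """
    for attempt in range(max_retries + 1):
        try:
            return download_fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries; surface the 429 to the caller
            time.sleep(base_wait * (attempt + 1))
```

Linear (rather than exponential) backoff keeps the worst-case wait bounded at about a minute, which matches the pattern already used in `hf_scraper.py`.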

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

brendanlong and others added 4 commits April 20, 2026 14:50
* Fix type of HookedTransformerConfig.device

This is typed as `Optional[str]` but sometimes returns `torch.device`.
Updated the code to return the `str` directly instead of wrapping it in a
device.

I'm not confident that every function which takes a device will
always be passed a string, so I didn't change functions like
warn_if_mps.

Found while working on TransformerLensOrg#1219
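The normalization this commit describes could be sketched like so (illustrative only: the real fix lives on `HookedTransformerConfig.device`, and `device_as_str` is a hypothetical helper name):

```python
def device_as_str(device):
    """Normalize a device spec to a plain string.

    Accepts either a string like "cuda:0" or a torch.device instance;
    str() on a torch.device yields its canonical string form, so the
    return type matches the declared Optional[str] annotation.
    """
    return device if isinstance(device, str) else str(device)
```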

* more cleanup

* 3.0 CI Bugs (TransformerLensOrg#1261)

* Fixing `utils` imports

* skip gated notebooks on PR from forks

* Updating notebooks

* Ensure LLaMA only runs when HF_TOKEN is available

---------

Co-authored-by: jlarson4 <jonahalarson@comcast.net>
@ak91456 ak91456 marked this pull request as draft May 9, 2026 17:42
@ak91456 ak91456 marked this pull request as ready for review May 9, 2026 17:42
