
Fix HuggingFace API rate limiting in CI (#1291)#1296

Open
ak91456 wants to merge 5 commits into TransformerLensOrg:dev from ak91456:main

Conversation


@ak91456 ak91456 commented May 9, 2026

Description

Fixes #1291 — CI jobs were hitting HTTP 429 (Too Many Requests) from HuggingFace Hub when multiple PRs or pushes triggered simultaneous workflow runs.

Even when model files are cached locally, huggingface_hub still makes lightweight "resolve" API calls by default to check whether the cache is fresh. With three matrix Python versions plus coverage and notebook jobs all running in parallel across multiple PRs, these calls exceeded HuggingFace's rate limit.

Changes:

  • Concurrency group (checks.yml): Cancels the stale in-progress run on the same PR branch when a new push arrives. Push-to-main, tags, and
    workflow_call (release) events are exempt and never cancelled.
  • HF offline mode on cache hit (checks.yml): After a warm cache restore, sets HF_HUB_OFFLINE=1 for all subsequent steps in compatibility-checks
    and coverage-test, preventing resolve API calls when models are already local. Auth still runs before offline mode activates.
  • Retry with backoff (hf_utils.py): download_file_from_hf now retries up to 3 times (10s / 20s / 30s waits) on HTTP 429, matching the pattern
    already used in hf_scraper.py.
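The two `checks.yml` changes above might look roughly like the following sketch (job names, step names, and cache keys are illustrative, not copied from the actual workflow):

```yaml
# Cancel a superseded in-progress run on the same PR branch. Non-PR events
# (push-to-main, tags, workflow_call) get a unique group keyed on run_id,
# so they are never cancelled.
concurrency:
  group: ${{ github.workflow }}-${{ github.event_name == 'pull_request' && github.ref || github.run_id }}
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

jobs:
  compatibility-checks:
    steps:
      - name: Restore HF model cache
        id: hf-cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/huggingface
          key: hf-${{ runner.os }}-${{ hashFiles('**/poetry.lock') }}
      # Auth / token setup would run here, before offline mode activates.
      # On a warm cache, skip HF "resolve" calls for all subsequent steps.
      - name: Enable HF offline mode on cache hit
        if: steps.hf-cache.outputs.cache-hit == 'true'
        run: echo "HF_HUB_OFFLINE=1" >> "$GITHUB_ENV"
```

`HF_HUB_OFFLINE=1` makes huggingface_hub serve files from the local cache without any network calls, which is exactly the behavior wanted after a warm cache restore.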

Fixes #1291
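The retry-with-backoff behavior described for `hf_utils.py` can be sketched generically as follows (a hedged sketch: `RateLimitError` and `download_with_backoff` are illustrative names, not the actual identifiers in the PR; the real code would catch huggingface_hub's HTTP error and inspect its status code):

```python
import time


class RateLimitError(Exception):
    """Stand-in for an HTTP 429 response (hypothetical; the real code
    would check the status code on huggingface_hub's HTTP error)."""


def download_with_backoff(download_fn, max_retries=3, base_wait=10):
    """Call download_fn, retrying on rate-limit errors with linear backoff.

    Waits base_wait, 2*base_wait, 3*base_wait seconds between attempts,
    mirroring the 10s / 20s / 30s schedule described in this PR.
    """
    for attempt in range(max_retries + 1):
        try:
            return download_fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries; surface the 429 to the caller
            time.sleep(base_wait * (attempt + 1))
```

Linear (rather than exponential) backoff keeps the worst-case wait bounded at about a minute, which matches the pattern already used in `hf_scraper.py`.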

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

brendanlong and others added 4 commits April 20, 2026 14:50
* Fix type of HookedTransformerConfig.device

This is typed as `Optional[str]` but sometimes returns `torch.device`.
Updated the code to return the `str` directly instead of wrapping it in a
device.

I'm not confident that every function which takes a device will
always be passed a string, so I didn't change functions like
warn_if_mps.

Found while working on TransformerLensOrg#1219
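The normalization this commit describes could be sketched like so (illustrative only: the real fix lives on `HookedTransformerConfig.device`, and `device_as_str` is a hypothetical helper name):

```python
def device_as_str(device):
    """Normalize a device spec to a plain string.

    Accepts either a string like "cuda:0" or a torch.device instance;
    str() on a torch.device yields its canonical string form, so the
    return type matches the declared Optional[str] annotation.
    """
    return device if isinstance(device, str) else str(device)
```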

* more cleanup

* 3.0 CI Bugs (TransformerLensOrg#1261)

* Fixing `utils` imports

* skip gated notebooks on PR from forks

* Updating notebooks

* Ensure LLaMA only runs when HF_TOKEN is available

---------

Co-authored-by: jlarson4 <jonahalarson@comcast.net>
@ak91456 ak91456 marked this pull request as draft May 9, 2026 17:42
@ak91456 ak91456 marked this pull request as ready for review May 9, 2026 17:42
