feat(deps): upgrade transformers to 5.x and sentence-transformers to 5.2+ by voorhs · Pull Request #341 · deeppavlov/AutoIntent

voorhs · 2026-06-27T19:44:42Z

Closes #295.

Why

transformers 4.57.x calls huggingface_hub.model_info() on every
tokenizer load for any model with vocab_size > 100000 (e.g. our default
intfloat/multilingual-e5-* family). That probe is uncacheable, fires
through the SHA pin, and 429s under CI matrix / parallel load — and
because the tests-only conftest monkey-patch isn't in production code,
real users on the classic/zero-shot-encoder presets hit it too.

transformers 5.0+ caches the probe per-process and respects
local_files_only / HF_HUB_OFFLINE (HF #45444). Per transformers
release notes the v5 fix is intentionally not backported to 4.x —
upgrading is the only path.

Scope of the bump

This is necessarily a coordinated two-package migration. sentence-transformers
3.x pins transformers<5.0.0, and the cap persists through ST 5.1.x;
ST 5.2.0 is the first release that lifts it to transformers<6.0.0.
Resolved versions: transformers==5.12.1, sentence-transformers==5.6.0.

Changes

pyproject.toml: bump both extras.
src/autointent/_wrappers/ranker.py — ST 5 restructured CrossEncoder
into a nn.Sequential of modules:
- cross_encoder.model.classifier → cross_encoder[0].auto_model.classifier
- predict(activation_fct=...) → predict(activation_fn=...)
- cross_encoder.model.cpu() → cross_encoder.cpu() (wrapper is itself a nn.Module).
src/autointent/_wrappers/embedder/sentence_transformers.py:
- Import losses / training_args from sentence_transformers.sentence_transformer
  (top-level submodule path is deprecated in 5.x).
- warmup_ratio= → warmup_steps= (v5 TrainingArguments accepts a
  float < 1.0 there as a fraction of total training steps).
tests/conftest.py: remove _disable_transformers_mistral_regex_patch
(the underlying bug is fixed in v5).
tests/embedder/test_sentence_transformers_backend.py:
get_sentence_embedding_dimension() → get_embedding_dimension().

Test plan

Verified locally on Python 3.14:

pytest tests/embedder — 83 passed
pytest tests/modules/scoring/{test_dnnc,test_description_cross,test_rerank_scorer} — 7 passed (Ranker / CrossEncoder paths)
pytest tests/modules/test_dumper.py — 8 passed (HF model save/load)
pytest --collect-only — 611 tests collect cleanly
ruff check on changed files — clean
Full CI matrix — pending (intentionally pushed to let CI exercise the long suites)

🤖 Generated with Claude Code

…5.2+ (#295)

The 4.57.x mistral-regex codepath called `huggingface_hub.model_info()` on every tokenizer load with vocab >100k (e.g. `intfloat/multilingual-e5-*`), hammering HF's rate limit in CI and in production. transformers 5.0+ caches that probe per-process and respects `local_files_only`/`HF_HUB_OFFLINE`.

The bump is necessarily a coordinated two-package migration: ST 5.2.0 is the first release that lifts the `transformers<5.0.0` cap. Resolved versions: transformers 5.12.1, sentence-transformers 5.6.0.

Adjusts the v5.x surfaces that actually broke:
- ranker.py: `cross_encoder.model.classifier` → `cross_encoder[0].auto_model.classifier` (ST 5 restructured CrossEncoder into a nn.Sequential of modules).
- ranker.py: CrossEncoder.predict() renamed `activation_fct` → `activation_fn`.
- ranker.py: `cross_encoder.model.cpu()` → `cross_encoder.cpu()` (the wrapper is itself an nn.Module now, no underlying `.model` attribute).
- embedder/sentence_transformers.py: import `losses`/`training_args` from `sentence_transformers.sentence_transformer` (top-level path deprecated).
- embedder/sentence_transformers.py: `warmup_ratio=` → `warmup_steps=` (v5 TrainingArguments accepts a float <1.0 there as a ratio).
- test_sentence_transformers_backend.py: `get_sentence_embedding_dimension()` → `get_embedding_dimension()`.

Removes the `_disable_transformers_mistral_regex_patch` workaround from tests/conftest.py; the underlying bug is fixed in v5.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Bump sentence-transformers lower bound 5.2.0 → 5.4.0. The new ranker / embedder paths (cross_encoder[0] subscript, sentence_transformers.sentence_transformer subpackage, get_embedding_dimension) all landed in 5.4.0; the previous floor would have ModuleNotFoundError'd / AttributeError'd anyone resolving 5.2.x–5.3.x.
- Constrain EmbedderFineTuningConfig.warmup_ratio to (0, 1). v5 TrainingArguments interprets warmup_steps>=1 as a raw step count and <1 as a fraction, so a stray warmup_ratio=1.0 would silently produce one warmup step instead of full-training warmup.
- Refresh tests/test_deps.py synthetic metadata fixtures to v5 version strings so the resolver tests exercise the version range we ship, not the v4 range we just left behind.
- Trim the v4→v5 narrating comments down to the WHY of the current code; per-line migration history belongs in the commit log.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Reviewer flagged that `gt=0` rejects the legal `warmup_ratio=0.0` config (disable warmup). Relax to `ge=0`; `lt=1` is kept because that's the v5 boundary where warmup_steps flips from ratio to raw step count.

Regenerate the published JSON schema so it reflects the constraint; otherwise YAML authoring against the schema would pass schema validation and fail at runtime.

Pushed back on the reviewer's claim that `warmup_steps=0.1` runs zero warmup: transformers v5 typed `warmup_steps: float` and `get_warmup_steps` branches on `>= 1`, not `> 0`, so `0.1` takes the `math.ceil(N * 0.1)` fraction branch (training_args.py:2089 in v5.12.1).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
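The boundary described in the commit above can be sketched in plain Python. This is a minimal illustration of the behaviour as the commit message describes it, not the transformers source: `resolve_warmup_steps` and `validate_warmup_ratio` are hypothetical names, and the only facts taken from the text are the `>= 1` branch, the `math.ceil(N * fraction)` branch, and the `ge=0` / `lt=1` bounds.

```python
import math


def validate_warmup_ratio(ratio: float) -> float:
    """Hypothetical check mirroring the ge=0 / lt=1 bounds from the commit."""
    if not (0 <= ratio < 1):
        raise ValueError(f"warmup_ratio must be in [0, 1), got {ratio}")
    return ratio


def resolve_warmup_steps(warmup_steps: float, num_training_steps: int) -> int:
    """Sketch of the described semantics: >= 1 is a raw step count,
    anything below 1 is a fraction of the total training steps."""
    if warmup_steps >= 1:
        return int(warmup_steps)
    return math.ceil(num_training_steps * warmup_steps)


if __name__ == "__main__":
    # A fraction below 1 scales with the run length.
    print(resolve_warmup_steps(validate_warmup_ratio(0.25), 200))
    # 1.0 falls on the raw-count side of the boundary, which is why
    # the config rejects it before it reaches the trainer.
    print(resolve_warmup_steps(1.0, 200))
```

Under these assumptions a ratio of exactly 1.0 would yield a single warmup step rather than full-run warmup, which is the failure mode the `lt=1` bound guards against.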

- _bert.py: coerce label2id/id2label keys to str. huggingface_hub 1.x StrictDataclassFieldValidationError rejects int-keyed label2id; the v5 AutoModelForSequenceClassification.from_pretrained pipeline now routes through that validator, so the previous {int: int} mapping raised on every BertScorer.fit (and cascaded into a fallback hf_hub_download call that the test guard caught as 'unpinned').
- ranker.py: cast cross_encoder[0] to Any for auto_model.classifier access (nn.Sequential.__getitem__ is typed Tensor | Module on v5); add arg-type ignores on CrossEncoder.predict(list[tuple[str,str]]) calls. The v5 stub demands the much wider Sequence type, but the list-of-pairs form is the documented call shape.
- Drop type: ignore comments mypy now reports as unused (AutoTokenizer.from_pretrained gained a typed stub in transformers v5; max_length matches TokenizerConfig.max_length cleanly).
- conftest.py: SentenceTransformer's constructor is typed Any on v5, so add no-any-return ignore at the fixture boundary.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
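The key coercion from the first bullet above can be illustrated without any Hugging Face imports. This is a sketch under assumptions: `stringify_keys` is a hypothetical helper, not the code in `_bert.py`, and the only shape taken from the commit messages is the `{str(i): i}` mapping.

```python
from collections.abc import Mapping
from typing import TypeVar

V = TypeVar("V")


def stringify_keys(mapping: Mapping[object, V]) -> dict[str, V]:
    """Return a copy of `mapping` whose keys are all str.

    Hypothetical helper: it only demonstrates the int -> str key
    coercion the commit describes, leaving values untouched.
    """
    return {str(key): value for key, value in mapping.items()}


if __name__ == "__main__":
    n_classes = 3
    # The int-keyed shape that the commit says the validator rejects.
    int_keyed = {i: i for i in range(n_classes)}
    # The str-keyed shape the commit switches to, i.e. {str(i): i}.
    label2id = stringify_keys(int_keyed)
    print(label2id)
```

The coercion looks like a no-op to a type checker, so it is worth keeping it in a named helper or behind a comment that states why the keys must be strings.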

A future refactor sees `{str(i): i}` as a no-op coercion and "simplifies" back to `{i: i}`; mypy passes, then BertScorer.fit raises StrictDataclassFieldValidationError at runtime. Comment makes the WHY explicit at the call site, matching the WHY-only comment policy from 14f9576.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

When PEFT is installed, transformers v5 calls find_adapter_config_file on every AutoModelForSequenceClassification.from_pretrained. The auto_factory only propagates `_commit_hash` (used for the cache lookup) but NOT the outer `revision` to the fall-through hf_hub_download. On a cold cache (i.e. our CI warm-cache job, which populates model files but no negative marker for adapter_config.json) that probe fires `hf_hub_download(repo_id, adapter_config.json, revision=None)` and our test guard rightly flagged it as unpinned.

Pass `adapter_kwargs={"revision": revision}` so the adapter probe inherits the pin. The first run still writes a `.no_exist` marker, but all subsequent runs (and CI's pinned-only contract) stay clean.

Reproduces with:
    rm -rf ~/.cache/huggingface/hub/models--prajjwal1--bert-tiny/.no_exist
then
    pytest tests/pipeline/test_inference.py::test_inference_from_config[multiclass]

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

PEFT's get_peft_model_state_dict (save_and_load.py:380-384) runs an embedding-resize sanity check on every Trainer.save_checkpoint by calling model.config.__class__.from_pretrained(base_model_name_or_path) with no revision. transformers fills in revision='main' as the default, so the call hits hf_hub_download('prajjwal1/bert-tiny', 'config.json', revision='main'), which is unpinned and which our CI guard correctly flags. On a cold cache (CI), this trips on every LoRA/PTuning trial that runs through Trainer.

Clear base_model_name_or_path on the peft_config after get_peft_model so the vocab check short-circuits at `if model_id is not None`. Our dumper (PeftModelDumper / HFModelDumper) saves the base model separately and the load path passes it explicitly, so the adapter config doesn't need to remember it.

Reproduces with:
    rm -rf ~/.cache/huggingface/hub/models--prajjwal1--bert-tiny/.no_exist
    pytest tests/pipeline/test_inference.py::test_inference_from_config[multiclass]

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

voorhs and others added 7 commits June 27, 2026 22:42

voorhs added the full-ci label (Run test suite on full OS and Python matrix) Jun 28, 2026

voorhs wants to merge 7 commits into dev from migrate-295-transformers-v5
