LCORE-2539: bump-up transformers package [konflux] #1892
Conversation
Walkthrough
This PR updates pinned Python dependencies and their associated hash values across the build configuration: it bumps the huggingface-hub, trl, and transformers versions, adds two new dependencies (hf-xet and typer-slim), and synchronizes the Tekton pipeline prefetch configuration with the dependency changes.
Changes
Dependency Updates and Build Configuration
Estimated code review effort: 2 (Simple), ~12 minutes
Pre-merge checks: 5 passed
70a96d9 to cfecef6
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.konflux/requirements.hashes.wheel.txt:
- Around line 217-218: The requirement pins transformers==5.0.0 which is a major
upgrade; locate the entry "transformers==5.0.0" in
.konflux/requirements.hashes.wheel.txt and (1) add a short runtime validation in
the app startup to assert transformers.__version__ startswith "5." and that
required libs (check imports for "transformers" and "sentence_transformers") and
compatible versions of huggingface_hub and trl are installed (raise clear error
if not), (2) run the HF v5 migration checklist against any code paths using
transformers (search for imports of transformers / sentence-transformers,
model/tokenizer loading and Trainer/TrainingArguments usage) and apply migration
fixes (tokenizer/model loading calls, serialization, Trainer arg changes), (3)
add or update pinned compatible versions for huggingface-hub and trl in
requirements (or add a compatibility note next to "transformers==5.0.0") and (4)
add automated tests that load a representative checkpoint + tokenizer and run a
simple inference (and training loop if used) to verify model/tokenizer behavior.
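Point (1) of the prompt above asks for a startup check on the installed versions. A minimal sketch of such a check, assuming it only needs to compare the major version of an installed distribution; the helper name `require_major` is hypothetical and not part of this repository, while `importlib.metadata.version` and `PackageNotFoundError` are standard library:

```python
from importlib import metadata


def require_major(dist_name: str, expected_major: int, lookup=metadata.version) -> str:
    """Return the installed version of dist_name, or raise if the major differs.

    `lookup` is injectable so the check can be unit-tested without the
    package being installed (hypothetical design choice for this sketch).
    """
    try:
        installed = lookup(dist_name)
    except metadata.PackageNotFoundError as exc:
        raise RuntimeError(f"{dist_name} is not installed") from exc

    # Compare only the leading component, e.g. "5" in "5.0.0".
    major = installed.split(".", 1)[0]
    if major != str(expected_major):
        raise RuntimeError(
            f"{dist_name} {installed} is installed, "
            f"but major version {expected_major} is required"
        )
    return installed
```

At startup this could be called as `require_major("transformers", 5)`, and similarly for huggingface-hub and trl once the expected majors are agreed on. Reading distribution metadata avoids importing transformers itself just to read its version.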
📒 Files selected for processing (5)
- .konflux/requirements.hashes.source.txt
- .konflux/requirements.hashes.wheel.txt
- .konflux/requirements.overrides.txt
- .tekton/lightspeed-stack-pull-request.yaml
- .tekton/lightspeed-stack-push.yaml
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Pylinter
- GitHub Check: build-pr
- GitHub Check: unit_tests (3.12)
- GitHub Check: unit_tests (3.13)
- GitHub Check: mypy
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (7)
.konflux/requirements.overrides.txt (1)
3-3: LGTM!
.tekton/lightspeed-stack-pull-request.yaml (1)
61-61: LGTM!
.tekton/lightspeed-stack-push.yaml (1)
57-57: LGTM!
.konflux/requirements.hashes.source.txt (4)
1120-1122: Validate TRL 1.5.1 ↔ transformers 5.0.0 for the specific TRL components you use.
TRL 1.5.1 is designed to work with transformers 5.0.0, but some TRL components impose stricter version constraints (e.g., PairRMJudge requires transformers < 5.0.0). Also, there are no direct trl imports in the repo's Python sources, so confirm which TRL classes are exercised by your training stack; if any constrained components are used, adjust the transformers/trl pins accordingly.
550-552: Check compatibility of huggingface-hub==1.18.0 with the pinned stack
Repo scan found no direct huggingface_hub imports in the Python code, so any breakage would come via transitive usage (e.g., transformers). huggingface-hub v1.x includes breaking behavior changes such as the HTTP backend migration (requests → httpx), removal of configure_http_backend in favor of client factories, hf_hub_download switching to fresh temp-file downloads (no resuming partial files), stricter token/dir permissions, and a raised minimum Python version (>=3.9).
Confirm the pinned transformers (and Python) versions are compatible with these changes, and that your expected download/resume and filesystem permission behavior matches your runtime environments.
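The "no direct imports" findings in the two comments above could be re-checked without relying on text search. A rough sketch using the standard `ast` module, assuming the sources live under a directory such as `src`; the function names are hypothetical and not taken from this repository:

```python
import ast
from pathlib import Path


def imported_top_level_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import, which cannot be a third-party package.
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return found


def find_direct_imports(root: str, targets: set[str]) -> dict[str, set[str]]:
    """Map each .py file under root to the target modules it imports directly."""
    hits = {}
    for path in Path(root).rglob("*.py"):
        try:
            modules = imported_top_level_modules(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
        matched = modules & targets
        if matched:
            hits[str(path)] = matched
    return hits
```

For example, `find_direct_imports("src", {"trl", "huggingface_hub", "transformers"})` returning an empty dict would agree with the scan results quoted above. Unlike a regex search, this ignores mentions in docstrings and comments, but it does not see dynamic imports done through `importlib`.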
1123-1125: Clarify why typer-slim==0.24.0 was added.
typer-slim==0.24.0 is pinned only in .konflux/requirements.hashes.source.txt (lines 1123-1125) and isn't mentioned in the PR scope. Please confirm:
- Whether typer-slim is a transitive dependency of an upgraded package (e.g., transformers/huggingface-hub) and which one pulls it in.
- Whether it's required for runtime functionality or only for ancillary tooling/CLI entrypoints.
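One way to answer the first question is to ask the installed environment which distributions declare the package as a requirement. A minimal sketch, assuming it runs inside the built environment; `find_dependents` and its helpers are hypothetical names, and the name parsing is a simplification of the full requirement syntax:

```python
import re
from importlib import metadata

# Leading project name of a requirement string such as "typer-slim>=0.1; extra == 'cli'".
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalize(name: str) -> str:
    """Normalize a project name so that '-', '_' and '.' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement: str):
    """Extract the normalized project name from a requirement string."""
    match = _NAME_RE.match(requirement)
    return normalize(match.group(1)) if match else None


def find_dependents(target: str, dists=None) -> dict:
    """Map each distribution to the requirement lines that mention target.

    `dists` is an iterable of (name, requires) pairs; by default it is built
    from the distributions installed in the current environment.
    """
    if dists is None:
        dists = (
            (d.metadata.get("Name") or "<unknown>", d.requires or [])
            for d in metadata.distributions()
        )
    wanted = normalize(target)
    result = {}
    for name, requires in dists:
        matching = [r for r in requires if requirement_name(r) == wanted]
        if matching:
            result[name] = matching
    return result
```

Calling `find_dependents("typer-slim")` lists the packages that pull it in, including requirements that only apply under an extra, which shows up in the returned requirement strings as a marker such as `extra == "..."`.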
524-549: Validate hf-xet 1.4.3 source build & runtime compatibility with pinned transformers/hub
- hf-xet==1.4.3 is pinned in .konflux/requirements.hashes.source.txt, and scripts/konflux_requirements.sh routes PyPI packages to requirements.hashes.source.txt (wheel-prefetch list comes from the Red Hat index), so hf-xet should no longer be treated as a wheel-prefetch binary in Tekton.
- .tekton/lightspeed-stack-pull-request.yaml lists hf_xet among deps “need cargo to build”, and cargo is present in .konflux/rpms.lock.yaml, so the build environment must provide Rust/cargo for the source install.
- transformers==5.0.0 (wheel) and huggingface-hub==1.18.0 (source) are pinned; PyPI metadata for hf-xet==1.4.3 doesn't provide runtime dependency constraints beyond a pytest extra, so compatibility can't be inferred; run the build + a minimal runtime/import check on all target platforms.
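The "minimal runtime/import check" suggested above could be as small as the following sketch. The helper name `smoke_import` is hypothetical, and the module names passed to it (for instance `hf_xet`, `huggingface_hub`, `transformers`) are assumptions about the import names of the pinned packages:

```python
import importlib


def smoke_import(module_names) -> dict:
    """Try to import each module and return {module_name: error summary} for failures."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:
            # Broad on purpose: a badly built native extension can fail with
            # errors other than ImportError (e.g. OSError while loading a library).
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

Run inside the built image on each target platform, an empty result from `smoke_import(["hf_xet", "huggingface_hub", "transformers"])` would indicate that the source-built wheel at least loads; it says nothing about download behavior, which still needs a functional test.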
transformers==5.0.0 \
    --hash=sha256:d17545021abc14a7e631ce23023a50cd6962968fc6bdbae5ed3ede2b9bb4b9d7
🧩 Analysis chain
🌐 Web query:
What are the breaking changes in transformers 5.0.0 compared to 4.57.6?
💡 Result:
Transformers v5.0.0 represents a major architectural shift for the library, focusing on simplification, performance, and a unified PyTorch-centric backend. Key breaking changes compared to v4.57.6 include:
Library-wide Changes:
- Removal of TensorFlow and JAX: The library now focuses exclusively on PyTorch as the primary backend to reduce maintenance complexity [1][2][3].
- HTTP Backend Shift: The underlying HTTP client has been switched from requests to httpx [1][4]. Users catching requests.HTTPError must update to catch httpx.HTTPError [1][4].
- Hub Integration: The hf_transfer environment variable (HF_HUB_ENABLE_HF_TRANSFER) has been dropped in favor of hf_xet [1][4]. Additionally, the library now pins huggingface_hub to >=1.0.0 [1][5].
- CLI Changes: The transformers-cli command has been deprecated; users should move to the new CLI entry point [6].
- Training Arguments: The report_to argument now defaults to "none", meaning logging integrations are no longer auto-detected by default and must be explicitly enabled [6][7].
Tokenization:
- Backend Unification: The distinction between "Fast" and "Slow" tokenizers has been removed; the library now focuses on a unified tokenizer backend [4][8][3].
- File Structure: Tokenizer definitions are consolidated into a single file per model (tokenization_.py) [4][9][10].
Command Line Interface (CLI):
- transformers chat: This command has been overhauled. The feature to start a Chat Completion server has been removed from transformers chat and is now handled by the new transformers serve command to separate client and server concerns [6][7][3].
General API & Cleanup:
- Extensive removal of long-deprecated arguments and methods throughout the Trainer, TrainingArguments, and modeling files [6][7].
- Removal of legacy configuration file saving [6][7].
- Stricter serialization requirements, including the removal of support for safe_serialization=False [1][2].
Users are advised to consult the official v5 Migration Guide for detailed instructions on handling these updates, as this release removes many legacy features that had been deprecated over several years [6][1][5].
Citations:
- 1: https://github.com/huggingface/transformers/blob/eb981ae8688d459f40f35b0e0c352b5ca3cb3613/MIGRATION_GUIDE_V5.md
- 2: https://github.com/huggingface/transformers/blob/v5.5.3/MIGRATION_GUIDE_V5.md
- 3: https://huggingface.co/blog/transformers-v5
- 4: https://github.com/huggingface/transformers/blob/6f90cbb5729ed94ae05d385c359fa277172663b9/MIGRATION_GUIDE_V5.md
- 5: https://newreleases.io/project/pypi/transformers/release/5.0.0
- 6: https://github.com/huggingface/transformers/releases/tag/v5.0.0
- 7: https://github.com/huggingface/transformers/releases/tag/v5.0.0rc0
- 8: https://github.com/huggingface/blog/blob/main/transformers-v5.md
- 9: https://github.com/huggingface/transformers/blob/v5.0.0rc1/MIGRATION_GUIDE_V5.md
- 10: https://github.com/huggingface/transformers/blob/main/MIGRATION_GUIDE_V5.md
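If any code in this stack catches HTTP errors raised by Hub downloads, the requests to httpx switch reported in the query result above would matter during the transition. A hedged sketch that tolerates either client being installed; it assumes only that both libraries expose a top-level `HTTPError` class (which requests and httpx do), and the names `http_error_types` and `call_with_http_guard` are hypothetical:

```python
import importlib


def http_error_types() -> tuple:
    """Collect the HTTPError classes of whichever HTTP clients are importable."""
    found = []
    for module_name in ("httpx", "requests"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # client not installed in this environment
        error_cls = getattr(module, "HTTPError", None)
        if isinstance(error_cls, type) and issubclass(error_cls, BaseException):
            found.append(error_cls)
    return tuple(found)


# Evaluated once at import time; an empty tuple simply catches nothing.
HTTP_ERRORS = http_error_types()


def call_with_http_guard(func, *args, **kwargs):
    """Call func and convert HTTP client errors into one RuntimeError type."""
    try:
        return func(*args, **kwargs)
    except HTTP_ERRORS as exc:
        raise RuntimeError(f"Hub request failed: {exc}") from exc
```

Catching both classes keeps the handler working before and after the bump, at the cost of hiding which backend raised; once the upgrade is settled, narrowing to the httpx class would be cleaner.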
🌐 Web query:
Is transformers 5.0.0 officially released and stable?
💡 Result:
Yes. Hugging Face’s Transformers 5.0.0 is officially released (as a final/non-RC release) and is marked stable on PyPI.
Evidence:
- The official GitHub release tag “huggingface/transformers v5.0.0” exists and describes it as the “full V5 release” (not an RC). [1]
- On PyPI, the package version 5.0.0 has “Status: 5 - Production/Stable”. [2]
- The release announcement also indicates they consider it a full major release and start weekly minor releases from v5 onward (which is consistent with a stable production release rather than an RC). [1]
Note on “stable” in practice: “Production/Stable” is the PyPI classifier, but, like any major release, you should still test your specific models/pipelines when upgrading from an earlier Transformers version.
Citations:
- 1: https://github.com/huggingface/transformers/releases/tag/v5.0.0
- 2: https://pypi.org/project/transformers/5.0.0/
🏁 Script executed:
#!/bin/bash
# Search for transformers usage patterns that may be affected
rg -n --type=py 'from transformers import|import transformers' -A 3
# Look for common API patterns that often change in major versions
rg -n --type=py 'AutoModel|AutoTokenizer|pipeline|Trainer|PreTrainedModel' -C 3
# Check for deprecated API usage
rg -n --type=py '\.from_pretrained|\.save_pretrained|\.generate' -C 2
Repository: lightspeed-core/lightspeed-stack
Length of output: 16042
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show the exact transformers lines in the hash requirements file
echo "=== transformers entries in .konflux/requirements.hashes.wheel.txt ==="
rg -n "^(transformers==|transformers==5\.0\.0)" .konflux/requirements.hashes.wheel.txt -n || true
sed -n '190,235p' .konflux/requirements.hashes.wheel.txt | nl -ba | sed -n '1,120p'
# Check whether this repo imports/uses transformers anywhere
echo "=== rg transformers usage in repo (Python) ==="
rg -n --type=py "(from transformers import|import transformers|transformers\.)" . || true
# Also search for requirements/lockfiles usage
echo "=== rg transformers in dependency files ==="
rg -n "transformers==|transformers>=" . || true
Repository: lightspeed-core/lightspeed-stack
Length of output: 234
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".konflux/requirements.hashes.wheel.txt"
echo "=== transformers entries (context) in $FILE ==="
rg -n "^(transformers==|transformers==5\.0\.0)" "$FILE" -n || true
# Print the exact snippet around the transformers line
LINE_NUM=$(rg -n "^transformers==5\.0\.0" "$FILE" | cut -d: -f1 | head -n1 || true)
if [ -n "${LINE_NUM:-}" ]; then
  start=$((LINE_NUM-3))
  end=$((LINE_NUM+3))
  sed -n "${start},${end}p" "$FILE" | nl -ba 2>/dev/null || sed -n "${start},${end}p" "$FILE"
fi
echo "=== pinned torch / huggingface-hub / trl entries in $FILE ==="
rg -n "^(torch==|huggingface-hub==|trl==)" "$FILE" -n || true
echo "=== transformers usage in repo (Python) ==="
rg -n --type=py "(from transformers import|import transformers|transformers\.)" . || true
Repository: lightspeed-core/lightspeed-stack
Length of output: 851
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".konflux/requirements.hashes.wheel.txt"
echo "=== pinned sentence-transformers in $FILE ==="
rg -n "^(sentence-transformers==)" "$FILE" -n || true
echo "=== pinned huggingface-hub in $FILE (any version) ==="
rg -n "^(huggingface-hub==)" "$FILE" -n || true
rg -n "huggingface-hub" "$FILE" -n | head -n 50 || true
echo "=== pinned trl in $FILE (any version) ==="
rg -n "^(trl==)" "$FILE" -n || true
rg -n "trl" "$FILE" -n | head -n 50 || true
echo "=== any direct hf/transformers imports in python (broader match) ==="
rg -n --type=py "(from\s+transformers\s+import|import\s+transformers|transformers\.)" src tests || true
echo "=== check around the config line mentioning sentence-transformers ==="
sed -n '2050,2085p' src/models/config.py | nl -ba | head -n 80
Repository: lightspeed-core/lightspeed-stack
Length of output: 633
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".konflux/requirements.hashes.wheel.txt"
echo "=== transformers blocks around all occurrences ==="
rg -n "^(transformers==5\.0\.0)" "$FILE" -n
# show context around each occurrence (within a small window)
for L in $(rg -n "^(transformers==5\.0\.0)" "$FILE" | cut -d: -f1); do
  start=$((L-5))
  end=$((L+6))
  echo "--- around transformers==5.0.0 at line $L ---"
  sed -n "${start},${end}p" "$FILE" | cat -n
done
echo "=== torch / huggingface-hub / trl pins in this file (exact) ==="
rg -n "^(torch==|huggingface-hub==|trl==)" "$FILE" -n || true
echo "=== other exact pins elsewhere in repo ==="
rg -n "huggingface-hub==1\.18\.0|trl==1\.5\.1" . || true
rg -n "huggingface-hub==|trl==" . || true
echo "=== direct transformers imports (beyond docstring matches) ==="
rg -n --type=py "(from\s+transformers\s+import|import\s+transformers|transformers\.)" src tests || true
Repository: lightspeed-core/lightspeed-stack
Length of output: 1484
Require validation for transformers 5.0.0 major upgrade
transformers is pinned to 5.0.0 in .konflux/requirements.hashes.wheel.txt (lines 217-218); Transformers v5 includes multiple documented breaking changes, so this is high-risk without targeted runtime checks.
transformers==5.0.0 \
    --hash=sha256:d17545021abc14a7e631ce23023a50cd6962968fc6bdbae5ed3ede2b9bb4b9d7
- Confirm what part of this stack actually uses transformers (repo shows no direct imports; only a docstring mention of sentence-transformers)
- Apply the official v5 migration guide and address breaking areas relevant to your usage (backend/tokenizer/trainer+training-args cleanup/serialization behavior)
- Test model loading + tokenizer behavior + inference (and training loops if applicable) against representative checkpoints/tokenizers
- Verify dependency compatibility using the resolved huggingface-hub and trl versions for this environment (this file only pins torch==2.9.1; huggingface-hub==1.18.0 and trl==1.5.1 are not present here)
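The last point, checking which of the related packages a given hash file actually pins, can be scripted. A small sketch, assuming the files follow the usual pip hash-checking layout of `name==version \` followed by indented `--hash=` lines; `parse_pins` and `missing_pins` are hypothetical helper names:

```python
import re

# A pin line starts at column 0 with "name==version"; hash lines are indented.
_PIN_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([^\s\\;]+)")


def normalize(name: str) -> str:
    """Normalize a project name so that '-', '_' and '.' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(text: str) -> dict:
    """Return {normalized package name: pinned version} for a requirements file."""
    pins = {}
    for line in text.splitlines():
        match = _PIN_RE.match(line)
        if match:
            pins[normalize(match.group(1))] = match.group(2)
    return pins


def missing_pins(pins: dict, required) -> list:
    """List the required package names that have no pin in `pins`."""
    return sorted(name for name in required if normalize(name) not in pins)
```

Feeding it the contents of .konflux/requirements.hashes.wheel.txt and the names transformers, huggingface-hub and trl should, per the finding above, report the latter two as missing there; merging the result with `parse_pins` of the source hash file gives the full set of resolved versions to compare.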
Description
LCORE-2539: bump-up transformers package [konflux]
Type of change
Tools used to create PR
Related Tickets & Documents
Summary by CodeRabbit