LCORE-2539: bump-up transformers package [konflux] #1892

Closed
tisnik wants to merge 1 commit into lightspeed-core:main from tisnik:lcore-2593

Conversation

@tisnik tisnik commented Jun 10, 2026

Contributor

Description

LCORE-2539: bump-up transformers package [konflux]

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

  • Assisted-by: N/A
  • Generated by: N/A

Related Tickets & Documents

  • Related Issue #LCORE-2539

Summary by CodeRabbit

  • Chores
    • Updated core Python dependencies including transformers (4.57.6 → 5.0.0), huggingface-hub (0.36.2 → 1.18.0), and trl (1.0.0 → 1.5.1).
    • Added new dependencies: hf-xet (1.4.3) and typer-slim (0.24.0).
    • Updated build pipeline configurations for dependency prefetching.

coderabbitai Bot commented Jun 10, 2026

Contributor

Walkthrough

This PR updates pinned Python dependencies and their associated hash values across the build configuration, including bumping huggingface-hub, trl, and transformers versions; adding new dependencies (hf-xet and typer-slim); and synchronizing Tekton pipeline prefetch configuration to align with the dependency changes.

Changes

Dependency Updates and Build Configuration

| Layer | File(s) | Summary |
|---|---|---|
| Source dependency updates with hashes | .konflux/requirements.hashes.source.txt | huggingface-hub upgraded from 0.36.2 to 1.18.0 with updated hashes; hf-xet version 1.4.3 added with hash entries; trl upgraded from 1.0.0 to 1.5.1 with updated hashes; typer-slim version 0.24.0 added with hash entries. |
| Generated wheel requirements and dependency overrides | .konflux/requirements.hashes.wheel.txt, .konflux/requirements.overrides.txt | Autogenerated wheel requirements list removes hf-xet entry; transformers updated from 4.57.6 to 5.0.0 in wheel hashes with new hash values; transformers override updated to 5.0.0 in overrides file. |
| Tekton pipeline prefetch configuration | .tekton/lightspeed-stack-pull-request.yaml, .tekton/lightspeed-stack-push.yaml | hf-xet removed from binary.packages prefetch list in both pull-request and push Tekton pipeline configurations. |
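
Because the pins are spread across several hash-pinned files, it can help to list which file pins which version of a package. The following is a minimal sketch, assuming the files use pip's `name==version \` plus `--hash=...` layout quoted in this review; `parse_pins` and `find_pin` are hypothetical helper names, not part of the repository.

```python
import re
from pathlib import Path

# Matches the first line of a hash-pinned entry, e.g. "transformers==5.0.0 \".
PIN_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([^\s\\;]+)")


def normalize(name: str) -> str:
    """Normalize a project name so 'hf_xet' and 'hf-xet' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_pins(text: str) -> dict[str, str]:
    """Return {normalized-name: version} for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        match = PIN_RE.match(line.strip())
        if match:  # '--hash=...' continuation lines and comments do not match
            pins[normalize(match.group(1))] = match.group(2)
    return pins


def find_pin(package: str, paths: list[str]) -> dict[str, str]:
    """Map each file that pins `package` to the version pinned there."""
    wanted = normalize(package)
    found = {}
    for path in paths:
        pins = parse_pins(Path(path).read_text(encoding="utf-8"))
        if wanted in pins:
            found[path] = pins[wanted]
    return found
```

Calling `find_pin("transformers", [...])` with the two `.konflux/requirements.hashes.*.txt` paths would show whether a package is pinned in the source file, the wheel file, or both.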

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

  • #1381 — Both changes modify Tekton pipeline configuration fields in .tekton/lightspeed-stack-pull-request.yaml and .tekton/lightspeed-stack-push.yaml binary.packages lists to manage prefetched dependencies.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: updating the transformers package version (from 4.57.6 to 5.0.0) with Konflux configuration updates. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@tisnik tisnik changed the title from "LCORE-2593: bump-up transformers package [konflux]" to "LCORE-2539: bump-up transformers package [konflux]" Jun 10, 2026
@tisnik tisnik force-pushed the lcore-2593 branch 2 times, most recently from 70a96d9 to cfecef6 Compare June 10, 2026 07:31

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.konflux/requirements.hashes.wheel.txt:
- Around line 217-218: The requirement pins transformers==5.0.0 which is a major
upgrade; locate the entry "transformers==5.0.0" in
.konflux/requirements.hashes.wheel.txt and (1) add a short runtime validation in
the app startup to assert transformers.__version__ startswith "5." and that
required libs (check imports for "transformers" and "sentence_transformers") and
compatible versions of huggingface_hub and trl are installed (raise clear error
if not), (2) run the HF v5 migration checklist against any code paths using
transformers (search for imports of transformers / sentence-transformers,
model/tokenizer loading and Trainer/TrainingArguments usage) and apply migration
fixes (tokenizer/model loading calls, serialization, Trainer arg changes), (3)
add or update pinned compatible versions for huggingface-hub and trl in
requirements (or add a compatibility note next to "transformers==5.0.0") and (4)
add automated tests that load a representative checkpoint + tokenizer and run a
simple inference (and training loop if used) to verify model/tokenizer behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 687ff909-629e-4f69-8ce6-0efa99bda1da

📥 Commits

Reviewing files that changed from the base of the PR and between ccebaff and 9244517.

📒 Files selected for processing (5)
  • .konflux/requirements.hashes.source.txt
  • .konflux/requirements.hashes.wheel.txt
  • .konflux/requirements.overrides.txt
  • .tekton/lightspeed-stack-pull-request.yaml
  • .tekton/lightspeed-stack-push.yaml
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: Pylinter
  • GitHub Check: build-pr
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: mypy
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (7)
.konflux/requirements.overrides.txt (1)

3-3: LGTM!

.tekton/lightspeed-stack-pull-request.yaml (1)

61-61: LGTM!

.tekton/lightspeed-stack-push.yaml (1)

57-57: LGTM!

.konflux/requirements.hashes.source.txt (4)

1120-1122: Validate TRL 1.5.1 ↔ transformers 5.0.0 for the specific TRL components you use.

TRL 1.5.1 is designed to work with transformers 5.0.0, but some TRL components impose stricter version constraints (e.g., PairRMJudge requires transformers < 5.0.0). Also, there are no direct trl imports in the repo’s Python sources, so confirm which TRL classes are exercised by your training stack; if any constrained components are used, adjust the transformers/trl pins accordingly.


550-552: Check compatibility of huggingface-hub==1.18.0 with the pinned stack

Repo scan found no direct huggingface_hub imports in the Python code, so any breakage would come via transitive usage (e.g., transformers). huggingface-hub v1.x includes breaking behavior changes such as the HTTP backend migration (requests→httpx), removal of configure_http_backend in favor of client factories, hf_hub_download switching to fresh temp-file downloads (no resuming partial files), stricter token/dir permissions, and a raised minimum Python version (>=3.9).

Confirm the pinned transformers (and Python) versions are compatible with these changes, and that your expected download/resume and filesystem permission behavior matches your runtime environments.


1123-1125: Clarify why typer-slim==0.24.0 was added.

typer-slim==0.24.0 is pinned only in .konflux/requirements.hashes.source.txt (lines 1123-1125) and isn’t mentioned in the PR scope. Please confirm:

  • Whether typer-slim is a transitive dependency of an upgraded package (e.g., transformers/huggingface-hub) and which one pulls it in.
  • Whether it’s required for runtime functionality or only for ancillary tooling/CLI entrypoints.
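
One way to answer the first question is to ask the installed environment which distributions declare typer-slim as a requirement. This is a minimal sketch using only the standard library's `importlib.metadata`; `required_by` is a hypothetical helper name, and the result only reflects whatever environment the snippet runs in.

```python
import re
from importlib import metadata


def normalize(name: str) -> str:
    """Normalize a distribution name so '-', '_' and '.' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def required_by(package: str) -> dict[str, list[str]]:
    """Map each installed distribution to its requirement strings naming `package`."""
    wanted = normalize(package)
    dependents: dict[str, list[str]] = {}
    for dist in metadata.distributions():
        for requirement in dist.requires or []:
            # The project name is the leading run of name characters in the string.
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement)
            if match and normalize(match.group(0)) == wanted:
                owner = dist.metadata.get("Name", "<unknown>")
                dependents.setdefault(owner, []).append(requirement)
    return dependents
```

Running `required_by("typer-slim")` inside the built image would list the packages that pull it in, including any environment markers or extras attached to the requirement, which also hints at whether it is needed at runtime or only for a CLI extra.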

524-549: Validate hf-xet 1.4.3 source build & runtime compatibility with pinned transformers/hub

  • hf-xet==1.4.3 is pinned in .konflux/requirements.hashes.source.txt, and scripts/konflux_requirements.sh routes PyPI packages to requirements.hashes.source.txt (wheel-prefetch list comes from the Red Hat index), so hf-xet should no longer be treated as a wheel-prefetch binary in Tekton.
  • .tekton/lightspeed-stack-pull-request.yaml lists hf_xet among deps “need cargo to build”, and cargo is present in .konflux/rpms.lock.yaml, so the build environment must provide Rust/cargo for the source install.
  • transformers==5.0.0 (wheel) and huggingface-hub==1.18.0 (source) are pinned; PyPI metadata for hf-xet==1.4.3 doesn’t provide runtime dependency constraints beyond a pytest extra, so compatibility can’t be inferred—run the build + a minimal runtime/import check on all target platforms.
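
The minimal runtime/import check suggested above could look roughly like the sketch below. It assumes the import names are `hf_xet`, `huggingface_hub`, and `transformers`; `import_check` is a hypothetical helper, and a successful import does not prove the packages work together.

```python
import importlib


def import_check(modules: list[str]) -> dict[str, str]:
    """Try to import each module; return {module: 'ok (...)' or the error text}."""
    results = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # native extensions can fail with more than ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
        else:
            version = getattr(module, "__version__", "unknown version")
            results[name] = f"ok ({version})"
    return results


if __name__ == "__main__":
    # Assumed import names for the packages discussed in this review.
    for module_name, status in import_check(
        ["hf_xet", "huggingface_hub", "transformers"]
    ).items():
        print(f"{module_name}: {status}")
```

Running this once per target platform inside the built image gives a quick signal that the source-built hf-xet extension loads next to the pinned transformers and huggingface-hub versions.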

Comment on lines +217 to +218
transformers==5.0.0 \
--hash=sha256:d17545021abc14a7e631ce23023a50cd6962968fc6bdbae5ed3ede2b9bb4b9d7

Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

What are the breaking changes in transformers 5.0.0 compared to 4.57.6?

💡 Result:

Transformers v5.0.0 represents a major architectural shift for the library, focusing on simplification, performance, and a unified PyTorch-centric backend. Key breaking changes compared to v4.57.6 include:

Library-wide Changes:
  • Removal of TensorFlow and JAX: The library now focuses exclusively on PyTorch as the primary backend to reduce maintenance complexity [1][2][3].
  • HTTP Backend Shift: The underlying HTTP client has been switched from requests to httpx [1][4]. Users catching requests.HTTPError must update to catch httpx.HTTPError [1][4].
  • Hub Integration: The hf_transfer environment variable (HF_HUB_ENABLE_HF_TRANSFER) has been dropped in favor of hf_xet [1][4]. Additionally, the library now pins huggingface_hub to >=1.0.0 [1][5].
  • CLI Changes: The transformers-cli command has been deprecated; users should move to the new CLI entry point [6].
  • Training Arguments: The report_to argument now defaults to "none", meaning logging integrations are no longer auto-detected by default and must be explicitly enabled [6][7].

Tokenization:
  • Backend Unification: The distinction between "Fast" and "Slow" tokenizers has been removed; the library now focuses on a unified tokenizer backend [4][8][3].
  • File Structure: Tokenizer definitions are consolidated into a single file per model (tokenization_.py) [4][9][10].

Command Line Interface (CLI):
  • transformers chat: This command has been overhauled. The feature to start a Chat Completion server has been removed from transformers chat and is now handled by the new transformers serve command to separate client and server concerns [6][7][3].

General API & Cleanup:
  • Extensive removal of long-deprecated arguments and methods throughout the Trainer, TrainingArguments, and modeling files [6][7].
  • Removal of legacy configuration file saving [6][7].
  • Stricter serialization requirements, including the removal of support for safe_serialization=False [1][2].

Users are advised to consult the official v5 Migration Guide for detailed instructions on handling these updates, as this release removes many legacy features that had been deprecated over several years [6][1][5].
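
For code that currently catches `requests.HTTPError`, one transitional pattern during the migration is to catch the error type of whichever HTTP client is installed. This is only a sketch: `http_error_types` and `call_hub` are hypothetical names, and whether any code in this repository actually catches these exceptions is not established by the review.

```python
import importlib


def http_error_types() -> tuple[type[BaseException], ...]:
    """Collect the HTTPError classes of whichever HTTP clients are installed."""
    found = []
    for module_name in ("requests", "httpx"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # that client is not installed in this environment
        found.append(module.HTTPError)
    return tuple(found)


# Evaluated once at import time; an empty tuple simply catches nothing.
HTTP_ERRORS = http_error_types()


def call_hub(operation):
    """Run `operation` and convert client HTTP failures into one error type."""
    try:
        return operation()
    except HTTP_ERRORS as exc:
        raise RuntimeError(f"Hub request failed: {exc}") from exc
```

Once the old client is fully gone from the dependency set, the tuple can be replaced by a plain `except httpx.HTTPError`.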


🌐 Web query:

Is transformers 5.0.0 officially released and stable?

💡 Result:

Yes. Hugging Face’s Transformers 5.0.0 is officially released (as a final/non-RC release) and is marked stable on PyPI.

Evidence:
  • The official GitHub release tag “huggingface/transformers v5.0.0” exists and describes it as the “full V5 release” (not an RC). [1]
  • On PyPI, the package version 5.0.0 has “Status: 5 - Production/Stable”. [2]
  • The release announcement also indicates they consider it a full major release and start weekly minor releases from v5 onward (which is consistent with a stable production release rather than an RC). [1]

Note on “stable” in practice: “Production/Stable” is the PyPI classifier, but, like any major release, you should still test your specific models/pipelines when upgrading from an earlier Transformers version.


🏁 Script executed:

#!/bin/bash
# Search for transformers usage patterns that may be affected
rg -n --type=py 'from transformers import|import transformers' -A 3

# Look for common API patterns that often change in major versions
rg -n --type=py 'AutoModel|AutoTokenizer|pipeline|Trainer|PreTrainedModel' -C 3

# Check for deprecated API usage
rg -n --type=py '\.from_pretrained|\.save_pretrained|\.generate' -C 2

Repository: lightspeed-core/lightspeed-stack

Length of output: 16042


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the exact transformers lines in the hash requirements file
echo "=== transformers entries in .konflux/requirements.hashes.wheel.txt ==="
rg -n "^(transformers==|transformers==5\.0\.0)" .konflux/requirements.hashes.wheel.txt -n || true
sed -n '190,235p' .konflux/requirements.hashes.wheel.txt | nl -ba | sed -n '1,120p'

# Check whether this repo imports/uses transformers anywhere
echo "=== rg transformers usage in repo (Python) ==="
rg -n --type=py "(from transformers import|import transformers|transformers\.)" . || true

# Also search for requirements/lockfiles usage
echo "=== rg transformers in dependency files ==="
rg -n "transformers==|transformers>=" . || true

Repository: lightspeed-core/lightspeed-stack

Length of output: 234


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".konflux/requirements.hashes.wheel.txt"

echo "=== transformers entries (context) in $FILE ==="
rg -n "^(transformers==|transformers==5\.0\.0)" "$FILE" -n || true
# Print the exact snippet around the transformers line
LINE_NUM=$(rg -n "^transformers==5\.0\.0" "$FILE" | cut -d: -f1 | head -n1 || true)
if [ -n "${LINE_NUM:-}" ]; then
  start=$((LINE_NUM-3))
  end=$((LINE_NUM+3))
  sed -n "${start},${end}p" "$FILE" | nl -ba 2>/dev/null || sed -n "${start},${end}p" "$FILE"
fi

echo "=== pinned torch / huggingface-hub / trl entries in $FILE ==="
rg -n "^(torch==|huggingface-hub==|trl==)" "$FILE" -n || true

echo "=== transformers usage in repo (Python) ==="
rg -n --type=py "(from transformers import|import transformers|transformers\.)" . || true

Repository: lightspeed-core/lightspeed-stack

Length of output: 851


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".konflux/requirements.hashes.wheel.txt"

echo "=== pinned sentence-transformers in $FILE ==="
rg -n "^(sentence-transformers==)" "$FILE" -n || true

echo "=== pinned huggingface-hub in $FILE (any version) ==="
rg -n "^(huggingface-hub==)" "$FILE" -n || true
rg -n "huggingface-hub" "$FILE" -n | head -n 50 || true

echo "=== pinned trl in $FILE (any version) ==="
rg -n "^(trl==)" "$FILE" -n || true
rg -n "trl" "$FILE" -n | head -n 50 || true

echo "=== any direct hf/transformers imports in python (broader match) ==="
rg -n --type=py "(from\s+transformers\s+import|import\s+transformers|transformers\.)" src tests || true

echo "=== check around the config line mentioning sentence-transformers ==="
sed -n '2050,2085p' src/models/config.py | nl -ba | head -n 80

Repository: lightspeed-core/lightspeed-stack

Length of output: 633


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".konflux/requirements.hashes.wheel.txt"

echo "=== transformers blocks around all occurrences ==="
rg -n "^(transformers==5\.0\.0)" "$FILE" -n
# show context around each occurrence (within a small window)
for L in $(rg -n "^(transformers==5\.0\.0)" "$FILE" | cut -d: -f1); do
  start=$((L-5))
  end=$((L+6))
  echo "--- around transformers==5.0.0 at line $L ---"
  sed -n "${start},${end}p" "$FILE" | cat -n
done

echo "=== torch / huggingface-hub / trl pins in this file (exact) ==="
rg -n "^(torch==|huggingface-hub==|trl==)" "$FILE" -n || true

echo "=== other exact pins elsewhere in repo ==="
rg -n "huggingface-hub==1\.18\.0|trl==1\.5\.1" . || true
rg -n "huggingface-hub==|trl==" . || true

echo "=== direct transformers imports (beyond docstring matches) ==="
rg -n --type=py "(from\s+transformers\s+import|import\s+transformers|transformers\.)" src tests || true

Repository: lightspeed-core/lightspeed-stack

Length of output: 1484


Require validation for transformers 5.0.0 major upgrade

transformers is pinned to 5.0.0 in .konflux/requirements.hashes.wheel.txt (lines 217-218); Transformers v5 includes multiple documented breaking changes, so this is high-risk without targeted runtime checks.

transformers==5.0.0 \
    --hash=sha256:d17545021abc14a7e631ce23023a50cd6962968fc6bdbae5ed3ede2b9bb4b9d7
  • Confirm what part of this stack actually uses transformers (repo shows no direct imports; only a docstring mention of sentence-transformers)
  • Apply the official v5 migration guide and address breaking areas relevant to your usage (backend/tokenizer/trainer+training-args cleanup/serialization behavior)
  • Test model loading + tokenizer behavior + inference (and training loops if applicable) against representative checkpoints/tokenizers
  • Verify dependency compatibility using the resolved huggingface-hub and trl versions for this environment (this file only pins torch==2.9.1; huggingface-hub==1.18.0 and trl==1.5.1 are not present here)
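
The runtime validation proposed in this comment could be sketched as follows. The expected major versions mirror the pins discussed in this PR and are assumptions for illustration; `check_versions` and `validate_or_raise` are hypothetical names, and trl is treated as optional because the repo scan found no direct trl imports.

```python
from importlib import metadata

# Assumed expectations, taken from the versions pinned in this PR.
EXPECTED_MAJOR = {"transformers": 5, "huggingface-hub": 1, "trl": 1}
REQUIRED = {"transformers", "huggingface-hub"}


def check_versions(expected: dict[str, int], required: set[str]) -> list[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems = []
    for name, major in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            if name in required:
                problems.append(f"{name} is not installed")
            continue  # optional packages may be absent
        # Compare only the leading version component, e.g. '5' in '5.0.0'.
        if installed.split(".", 1)[0] != str(major):
            problems.append(f"{name} {installed} found, expected major version {major}")
    return problems


def validate_or_raise() -> None:
    """Fail fast at startup with one clear message listing every mismatch."""
    problems = check_versions(EXPECTED_MAJOR, REQUIRED)
    if problems:
        raise RuntimeError("Incompatible ML stack: " + "; ".join(problems))
```

Reading versions from package metadata avoids importing transformers just to check it, which keeps the startup cost small. The check confirms only that the expected versions are installed, so it complements the model-loading tests suggested above rather than replacing them.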

@tisnik tisnik marked this pull request as draft June 10, 2026 09:40
@tisnik tisnik closed this Jun 11, 2026