Upgrade transformers to 5.9 and huggingface-hub to 1.16 #1472

Merged
dxqb merged 4 commits into Nerogar:merge from dxqb:transformers5
Jun 4, 2026
@dxqb dxqb commented May 24, 2026

Collaborator
  • Remove HF_HUB_DISABLE_XET workaround from startup scripts; Xet is stable in hub 1.16

  • test

  • Remove _prepare_sub_modules / snapshot_download prefetching; Xet should use your full downstream bandwidth even on a single large file, so prefetching multiple files just to reach full download speed is no longer necessary

  • test

  • Suppress httpx INFO logs; hub 1.16 uses httpx internally and logs every HTTP request
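One way to quiet the per-request log lines mentioned in the last bullet is to raise the level of the `httpx` logger. This is a minimal sketch using only the standard `logging` module; the choice of `WARNING` as the threshold is an assumption and may differ from what this PR actually does.

```python
import logging

# Typical application setup: everything at INFO and above is printed.
logging.basicConfig(level=logging.INFO)

# httpx emits one INFO record per HTTP request under the "httpx" logger name.
# Raising only this logger's level hides those records and leaves the
# application's own INFO output untouched.
httpx_logger = logging.getLogger("httpx")
httpx_logger.setLevel(logging.WARNING)  # threshold is an assumption
```

Warnings and errors from httpx still get through, so connection problems remain visible.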

Remove workarounds for transformers v4:

  • Delete thread_safety.py and apply_thread_safe_forward calls; they worked around transformers#42673, which was fixed upstream in v5
  • Switch ErnieModelLoader to AutoTokenizer; eliminates the tokenization-logger suppress workaround

This is the only real conflict we had with transformers v5:

  • Replace _remove_added_embeddings_from_tokenizer (relied on internal Trie, removed in v5) with orig_tokenizer deep-copies stored at load time; model savers pass use_original_tokenizers=True to create_pipeline() so saved checkpoints use the unmodified tokenizer

The previous code used an internal API of transformers v4 for training embeddings. But OneTrainer already had a solution:
the tokenizers of HiDream and Hunyuan never supported this internal API, even in transformers v4. This PR copies that solution to all other models that support embedding training.
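The deep-copy approach described above can be sketched as follows. This is an illustration only: `ToyTokenizer` and `ToyModel` are hypothetical stand-ins, not OneTrainer or transformers classes. Only the names `orig_tokenizer`, `create_pipeline()` and `use_original_tokenizers` come from the PR description.

```python
import copy


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer; it only tracks a vocabulary."""

    def __init__(self, vocab):
        self.vocab = dict(vocab)

    def add_tokens(self, tokens):
        # Append each unseen token with the next free id.
        for token in tokens:
            if token not in self.vocab:
                self.vocab[token] = len(self.vocab)

    def __len__(self):
        return len(self.vocab)


class ToyModel:
    """Hypothetical model container with a working and a pristine tokenizer."""

    def __init__(self, tokenizer):
        # Snapshot taken at load time, before any embedding tokens are added.
        self.orig_tokenizer = copy.deepcopy(tokenizer)
        self.tokenizer = tokenizer

    def create_pipeline(self, use_original_tokenizers=False):
        # Savers request the pristine copy so checkpoints carry no added tokens.
        tok = self.orig_tokenizer if use_original_tokenizers else self.tokenizer
        return {"tokenizer": tok}


model = ToyModel(ToyTokenizer({"a": 0, "b": 1}))
model.tokenizer.add_tokens(["<embedding_0>", "<embedding_1>"])

training_pipeline = model.create_pipeline()
saving_pipeline = model.create_pipeline(use_original_tokenizers=True)
```

Because the pristine copy is never mutated, nothing has to be removed from the training tokenizer afterwards, which is what made the internal Trie access unnecessary.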

- Bump requirements: transformers 4.57.6 → 5.9, huggingface-hub 0.34.4 → 1.16.1
- Remove HF_HUB_DISABLE_XET workaround from startup scripts; Xet is stable in hub 1.16
- Remove _prepare_sub_modules / snapshot_download prefetching; hub 1.16 fetches lazily on demand
- Delete thread_safety.py and apply_thread_safe_forward calls; they worked around
  transformers#42673, which was fixed upstream in v5
- Replace _remove_added_embeddings_from_tokenizer (relied on internal Trie, removed in v5) with
  orig_tokenizer deep-copies stored at load time; model savers pass use_original_tokenizers=True
  to create_pipeline() so saved checkpoints use the unmodified tokenizer
- Switch ErnieModelLoader to AutoTokenizer; eliminates the tokenization-logger suppress workaround
- Suppress httpx INFO logs; hub 1.16 uses httpx internally and logs every HTTP request

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@dxqb dxqb added the preview merged in the preview branch label May 29, 2026
@dxqb

dxqb commented Jun 3, 2026

Collaborator Author

@O-J1 XET looks good to me on these versions:

  • downloads at 100 Mbit/s even with a single file; no longer any need to download multiple files in parallel
  • no problems so far with XET stalling

@O-J1

O-J1 commented Jun 3, 2026

Collaborator

> @O-J1 XET looks good to me on these versions:
>
> * downloads at 100 Mbit/s even with a single file; no longer any need to download multiple files in parallel
> * no problems so far with XET stalling

Yep, I agree. HF staff reached out after I complained a bunch lol. It's much better now and doesn't DDoS the network devices or computer. I see no issue with removing the Xet stuff, which we can do as part of the pixi upgrade.

@dxqb

dxqb commented Jun 3, 2026

Collaborator Author

this PR already removes the XET env toggles

@O-J1

O-J1 commented Jun 3, 2026

Collaborator

> this PR already removes the XET env toggles

Yes, but it also does the transformers upgrade, which was completely fucked last time we tested. It can't be merged unless every single model and behaviour with Transformers has been checked. Has this changed?

@dxqb

dxqb commented Jun 3, 2026

Collaborator Author

> > this PR already removes the XET env toggles
>
> Yes, but it also does the transformers upgrade, which was completely fucked last time we tested. It can't be merged unless every single model and behaviour with Transformers has been checked. Has this changed?

That was this point: OneTrainer used an internal API of transformers v4 that wasn't meant for public use.

This is the only real conflict we had with transformers v5:

Replace _remove_added_embeddings_from_tokenizer (relied on internal Trie, removed in v5) with orig_tokenizer deep-copies stored at load time; model savers pass use_original_tokenizers=True to create_pipeline() so saved checkpoints use the unmodified tokenizer

The previous code used an internal API of transformers v4 for training embeddings. But OneTrainer already had a solution:
the tokenizers of HiDream and Hunyuan never supported this internal API, even in transformers v4. This PR copies that solution to all other models that support embedding training.

this PR is ready for merge after the remaining test above.

dxqb added a commit to TheForgotten69/OneTrainer that referenced this pull request Jun 3, 2026
dxqb and others added 2 commits June 4, 2026 21:09
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@dxqb

dxqb commented Jun 4, 2026

Collaborator Author
* [x]  test [[Bug]: Request improved rate limit/worker limit for Hugging Face datasets #1409](https://github.com/Nerogar/OneTrainer/issues/1409)

This needed a small change because the Hugging Face login API has changed, but now it downloads 1000 small files with no errors. The speed is good but not full speed, so I guess Hugging Face rate-limits the number of files, but having no errors is the important point.
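The login change is described in the follow-up commit messages in this thread: the `new_session` argument is dropped, and login is skipped when no token is configured. A minimal sketch of that logic follows. `login_if_configured` is a hypothetical helper name, and the login callable is injected so the example runs without network access; in real use it would presumably be `huggingface_hub.login`.

```python
from typing import Callable, Optional


def login_if_configured(token: Optional[str], login_fn: Callable[..., None]) -> bool:
    """Call login_fn only when a non-empty token is configured.

    Returns True if a login was attempted, False if it was skipped.
    """
    if not token or not token.strip():
        # No token configured: skip login entirely instead of failing.
        return False
    # Only the token is passed; no new_session argument.
    login_fn(token=token.strip())
    return True


# Demonstration with a recording stub in place of the real login function.
recorded_tokens = []


def fake_login(token: str) -> None:
    recorded_tokens.append(token)


skipped = login_if_configured("", fake_login)
attempted = login_if_configured(" dummy-token ", fake_login)
```

Anonymous downloads of public repositories keep working when the login is skipped, which is the behaviour the second fix commit appears to aim for.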

@dxqb dxqb changed the base branch from master to merge June 4, 2026 19:26
@dxqb dxqb merged commit a64f579 into Nerogar:merge Jun 4, 2026
1 check passed
@dxqb dxqb deleted the transformers5 branch June 4, 2026 19:27
@dxqb

dxqb commented Jun 4, 2026

Collaborator Author

Test of XET vs. the multi-file download workaround that this PR removes, on a very fast RunPod:

without XET: ~700 MB/s
with XET: between 600 and 800 MB/s. Not as stable, but this is explainable because I was downloading Flux2 9B. At these speeds this model is small, and the files are now downloaded sequentially.

@O-J1

@dxqb dxqb restored the transformers5 branch June 5, 2026 17:58
@dxqb dxqb mentioned this pull request Jun 5, 2026
dxqb added a commit that referenced this pull request Jun 14, 2026
5.5.4 is the last release before CLIP flattening in 5.6, which avoids
the full CLIP-compat migration while still picking up the general v5
fixes from #1472 (Trie removal, thread-safety, hub 1.16/xet cleanup).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
BitcrushedHeart pushed a commit to BitcrushedHeart/OneTrainer that referenced this pull request Jun 20, 2026
* Upgrade transformers to 5.9 and huggingface-hub to 1.16

* fix: drop new_session arg removed from huggingface_hub.login()
* fix: skip huggingface login when no token is configured

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
BitcrushedHeart pushed a commit to BitcrushedHeart/OneTrainer that referenced this pull request Jun 20, 2026

Labels

preview merged in the preview branch

Development

Successfully merging this pull request may close these issues.

  • [Bug]: Request improved rate limit/worker limit for Hugging Face datasets
  • [Feat]: Renable Xet under hf_hub once they get their bugs sorted out.

2 participants