Update dependency transformers to v5 [SECURITY] #1907
Open
renovate[bot] wants to merge 1 commit into master from
This PR contains the following updates:
transformers: >=4.46.0 → >=5.0.0
transformers: >=4.45,<5 → >=5.0.0,<6

Transformers Regular Expression Denial of Service (ReDoS) vulnerability
CVE-2024-12720 / GHSA-6rvg-6v2m-4j46
More information
Details
A Regular Expression Denial of Service (ReDoS) vulnerability was identified in the huggingface/transformers library, specifically in the file tokenization_nougat_fast.py. The vulnerability occurs in the post_process_single() function, where a regular expression processes specially crafted input. The issue stems from the regex exhibiting exponential time complexity under certain conditions, leading to excessive backtracking. This can result in significantly high CPU usage and potential application downtime, effectively creating a Denial of Service (DoS) scenario. The affected version is v4.46.3.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
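To make the failure mode concrete, here is a minimal, self-contained sketch of catastrophic regex backtracking. It uses the textbook pattern (a+)+$ rather than the actual regex from tokenization_nougat_fast.py, which is not reproduced in this advisory:

```python
import re
import time

# Nested quantifiers plus an unmatchable suffix force exponential backtracking.
pattern = re.compile(r"(a+)+$")

for n in (20, 22, 24):
    payload = "a" * n + "!"  # the trailing "!" guarantees the match fails
    start = time.perf_counter()
    pattern.search(payload)
    # Runtime roughly doubles with every extra "a": tiny inputs, unbounded CPU time,
    # which is exactly the ReDoS behavior described above.
    print(n, f"{time.perf_counter() - start:.3f}s")
```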
Deserialization of Untrusted Data in Hugging Face Transformers
CVE-2024-11394 / GHSA-hxxf-235m-72v3 / PYSEC-2024-229
More information
Details
Hugging Face Transformers Trax Model Deserialization of Untrusted Data Remote Code Execution Vulnerability. This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file.
The specific flaw exists within the handling of model files. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current user. Was ZDI-CAN-25012.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Deserialization of Untrusted Data in Hugging Face Transformers
CVE-2024-11392 / GHSA-qxrp-vhvm-j765 / PYSEC-2024-227
More information
Details
Hugging Face Transformers MobileViTV2 Deserialization of Untrusted Data Remote Code Execution Vulnerability. This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file.
The specific flaw exists within the handling of configuration files. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current user. Was ZDI-CAN-24322.
Severity
CVSS:3.0/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Deserialization of Untrusted Data in Hugging Face Transformers
CVE-2024-11393 / GHSA-wrfc-pvp9-mr9g / PYSEC-2024-228
More information
Details
Hugging Face Transformers MaskFormer Model Deserialization of Untrusted Data Remote Code Execution Vulnerability. This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file.
The specific flaw exists within the parsing of model files. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current user. Was ZDI-CAN-25191.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
CVE-2025-2099 / GHSA-qq3j-4f4f-9583 / PYSEC-2025-40
More information
Details
A vulnerability in the preprocess_string() function of the transformers.testing_utils module in huggingface/transformers version v4.48.3 allows for a Regular Expression Denial of Service (ReDoS) attack. The regular expression used to process code blocks in docstrings contains nested quantifiers, leading to exponential backtracking when processing input with a large number of newline characters. An attacker can exploit this by providing a specially crafted payload, causing high CPU usage and potential application downtime, effectively resulting in a Denial of Service (DoS) scenario.
Severity
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
References
This data is provided by OSV and the PyPI Advisory Database (CC-BY 4.0).
Transformers Regular Expression Denial of Service (ReDoS) vulnerability
CVE-2025-1194 / GHSA-fpwr-67px-3qhx
More information
Details
A Regular Expression Denial of Service (ReDoS) vulnerability was identified in the huggingface/transformers library, specifically in the file tokenization_gpt_neox_japanese.py of the GPT-NeoX-Japanese model. The vulnerability occurs in the SubWordJapaneseTokenizer class, where regular expressions process specially crafted inputs. The issue stems from a regex exhibiting exponential complexity under certain conditions, leading to excessive backtracking. This can result in high CPU usage and potential application downtime, effectively creating a Denial of Service (DoS) scenario. The affected version is v4.48.1 (latest).
Severity
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Hugging Face Transformers Regular Expression Denial of Service
CVE-2025-2099 / GHSA-qq3j-4f4f-9583 / PYSEC-2025-40
More information
Details
A Regular Expression Denial of Service (ReDoS) exists in the preprocess_string() function of the transformers.testing_utils module. In versions before 4.50.0, the regex used to process code blocks in docstrings contains nested quantifiers that can trigger catastrophic backtracking when given inputs with many newline characters. An attacker who can supply such input to preprocess_string() (or code paths that call it) can force excessive CPU usage and degrade availability. The fix was released in 4.50.0, which rewrites the regex to avoid the inefficient pattern.
Affected versions: < 4.50.0. Patched version: 4.50.0.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Transformers vulnerable to ReDoS attack through its get_imports() function
CVE-2025-3264 / GHSA-jjph-296x-mrcr
More information
Details
A Regular Expression Denial of Service (ReDoS) vulnerability was discovered in the Hugging Face Transformers library, specifically in the get_imports() function within dynamic_module_utils.py. This vulnerability affects version 4.49.0 and is fixed in version 4.51.0. The issue arises from a regular expression pattern \s*try\s*:.*?except.*?: used to filter out try/except blocks from Python code, which can be exploited to cause excessive CPU consumption through crafted input strings due to catastrophic backtracking. This vulnerability can lead to remote code loading disruption, resource exhaustion in model serving, supply chain attack vectors, and development pipeline disruption.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Transformers's ReDoS vulnerability in get_configuration_file can lead to catastrophic backtracking
CVE-2025-3263 / GHSA-q2wp-rjmx-x6x9
More information
Details
A Regular Expression Denial of Service (ReDoS) vulnerability was discovered in the Hugging Face Transformers library, specifically in the get_configuration_file() function within the transformers.configuration_utils module. The affected version is 4.49.0, and the issue is resolved in version 4.51.0. The vulnerability arises from the use of a regular expression pattern config\.(.*)\.json that can be exploited to cause excessive CPU consumption through crafted input strings, leading to catastrophic backtracking. This can result in model serving disruption, resource exhaustion, and increased latency in applications using the library.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Transformers is vulnerable to ReDoS attack through its DonutProcessor class
CVE-2025-3933 / GHSA-37mw-44qp-f5jm
More information
Details
A Regular Expression Denial of Service (ReDoS) vulnerability was discovered in the Hugging Face Transformers library, specifically within the DonutProcessor class's token2json() method. This vulnerability affects versions 4.51.3 and earlier, and is fixed in version 4.52.1. The issue arises from the regex pattern <s_(.*?)> which can be exploited to cause excessive CPU consumption through crafted input strings due to catastrophic backtracking. This vulnerability can lead to service disruption, resource exhaustion, and potential API service vulnerabilities, impacting document processing tasks using the Donut model.
Severity
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Transformers's Improper Input Validation vulnerability can be exploited through username injection
CVE-2025-3777 / GHSA-phhr-52qp-3mj4
More information
Details
Hugging Face Transformers versions up to 4.49.0 are affected by an improper input validation vulnerability in the image_utils.py file. The vulnerability arises from insecure URL validation using the startswith() method, which can be bypassed through URL username injection. This allows attackers to craft URLs that appear to be from YouTube but resolve to malicious domains, potentially leading to phishing attacks, malware distribution, or data exfiltration. The issue is fixed in version 4.52.1.
Severity
CVSS:3.0/AV:N/AC:L/PR:L/UI:R/S:U/C:L/I:N/A:N
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
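To illustrate the bypass described above, here is a small sketch of why a startswith() prefix check is not a hostname check; the URL and allow-list are made up for the example and are not taken from image_utils.py:

```python
from urllib.parse import urlparse

url = "https://www.youtube.com@evil.example/watch?v=dQw4w9WgXcQ"

# Naive prefix check passes, because "www.youtube.com" is only the userinfo part of the URL.
print(url.startswith("https://www.youtube.com"))  # True

# The request would actually be sent to the attacker-controlled host.
print(urlparse(url).hostname)  # evil.example

# Safer: validate the parsed hostname against an explicit allow-list.
allowed = {"www.youtube.com", "youtube.com", "youtu.be"}
print(urlparse(url).hostname in allowed)  # False
```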
Hugging Face Transformers vulnerable to Regular Expression Denial of Service (ReDoS) in the AdamWeightDecay optimizer
CVE-2025-6921 / GHSA-4w7r-h757-3r74
More information
Details
The huggingface/transformers library, versions prior to 4.53.0, is vulnerable to Regular Expression Denial of Service (ReDoS) in the AdamWeightDecay optimizer. The vulnerability arises from the _do_use_weight_decay method, which processes user-controlled regular expressions in the include_in_weight_decay and exclude_from_weight_decay lists. Malicious regular expressions can cause catastrophic backtracking during the re.search call, leading to 100% CPU utilization and a denial of service. This issue can be exploited by attackers who can control the patterns in these lists, potentially causing the machine learning task to hang and rendering services unresponsive.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Hugging Face Transformers is vulnerable to ReDoS through its MarianTokenizer
CVE-2025-6638 / GHSA-59p9-h35m-wg4g
More information
Details
A Regular Expression Denial of Service (ReDoS) vulnerability was discovered in the Hugging Face Transformers library, specifically affecting the MarianTokenizer's remove_language_code() method. This vulnerability is present in version 4.52.4 and has been fixed in version 4.53.0. The issue arises from inefficient regex processing, which can be exploited by crafted input strings containing malformed language code patterns, leading to excessive CPU consumption and potential denial of service.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Hugging Face Transformers Regular Expression Denial of Service (ReDoS) vulnerability
CVE-2025-5197 / GHSA-9356-575x-2w9m
More information
Details
A Regular Expression Denial of Service (ReDoS) vulnerability exists in the Hugging Face Transformers library, specifically in the convert_tf_weight_name_to_pt_weight_name() function. This function, responsible for converting TensorFlow weight names to PyTorch format, uses a regex pattern /[^/]*___([^/]*)/ that can be exploited to cause excessive CPU consumption through crafted input strings due to catastrophic backtracking. The vulnerability affects versions up to 4.51.3 and is fixed in version 4.53.0. This issue can lead to service disruption, resource exhaustion, and potential API service vulnerabilities, impacting model conversion processes between TensorFlow and PyTorch formats.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
Hugging Face Transformers library has Regular Expression Denial of Service
CVE-2025-6051 / GHSA-rcv9-qm8p-9p6j
More information
Details
A Regular Expression Denial of Service (ReDoS) vulnerability was discovered in the Hugging Face Transformers library, specifically within the normalize_numbers() method of the EnglishNormalizer class. This vulnerability affects versions up to 4.52.4 and is fixed in version 4.53.0. The issue arises from the method's handling of numeric strings, which can be exploited using crafted input strings containing long sequences of digits, leading to excessive CPU consumption. This vulnerability impacts text-to-speech and number normalization tasks, potentially causing service disruption, resource exhaustion, and API vulnerabilities.
Severity
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:L
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
HuggingFace Transformers allows for arbitrary code execution in the Trainer class
CVE-2026-1839 / GHSA-69w3-r845-3855
More information
Details
A vulnerability in the HuggingFace Transformers library, specifically in the Trainer class, allows for arbitrary code execution. The _load_rng_state() method in src/transformers/trainer.py at line 3059 calls torch.load() without the weights_only=True parameter. This issue affects all versions of the library supporting torch>=2.2 when used with PyTorch versions below 2.6, as the safe_globals() context manager provides no protection in these versions. An attacker can exploit this vulnerability by supplying a malicious checkpoint file, such as rng_state.pth, which can execute arbitrary code when loaded. The issue is resolved in version v5.0.0rc3.
Severity
CVSS:3.0/AV:L/AC:H/PR:N/UI:R/S:U/C:H/I:L/A:H
References
This data is provided by OSV and the GitHub Advisory Database (CC-BY 4.0).
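A minimal sketch of the mitigation implied above: passing weights_only=True to torch.load() so that loading a checkpoint cannot run arbitrary pickled code. This illustrates the PyTorch option in isolation; it is not the patched trainer.py code:

```python
import torch

# Create a benign stand-in for the rng_state.pth checkpoint mentioned above.
torch.save({"cpu_rng_state": torch.get_rng_state()}, "rng_state.pth")

# Risky on PyTorch < 2.6 defaults: plain torch.load() runs the full pickle machinery,
# so a maliciously crafted checkpoint can execute code at load time.
# state = torch.load("rng_state.pth")

# Safer: restrict deserialization to tensors and simple containers.
state = torch.load("rng_state.pth", weights_only=True)
print(list(state.keys()))
```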
Release Notes
huggingface/transformers (transformers)
v5.0.0: Transformers v5
Transformers v5 release notes
We have a migration guide, continuously updated and available on the main branch; please check it out in case you're facing issues: migration guide.
Highlights
We are excited to announce the initial release of Transformers v5. This is the first major release in five years, and the release is significant: 1200 commits have been pushed to main since the latest minor release. This release removes a lot of long-due deprecations, introduces several refactors that significantly simplify our APIs and internals, and comes with a large number of bug fixes.

We give an overview of our focus for this release in the following blogpost. In these release notes, we'll focus directly on the refactors and new APIs coming with v5.
This release is the full V5 release. It sets in motion something bigger: going forward, starting with v5, we'll now release minor releases every week, rather than every 5 weeks. Expect v5.1 to follow next week, then v5.2 the week that follows, etc.
We're moving forward with this change to ensure you have access to models as soon as they're supported in the library, rather than a few weeks after.
In order to install this release, please do so with the following:
For us to deliver the best package possible, it is imperative that we have feedback on how the toolkit is currently working for you. Please try it out, and open an issue in case you're facing something inconsistent/a bug.
Transformers version 5 is a community endeavor, and we couldn't have shipped such a massive release without the help of the entire community.
Significant API changes
Dynamic weight loading
We introduce a new weight loading API in transformers, which significantly improves on the previous API. This weight loading API is designed to apply operations to the checkpoints loaded by transformers. Instead of loading the checkpoint exactly as it is serialized within the model, these operations can reshape, merge, and split the layers according to how they're defined in this new API. These operations are often a necessity when working with quantization or parallelism algorithms.
This new API is centered around the new WeightConverter class. The weight converter is designed to apply a list of operations on the source keys, resulting in target keys. A common operation done on the attention layers is to fuse the query, key, and value layers. Doing so with this API amounts to defining a conversion that applies the Concatenate operation, which accepts a list of layers as input and returns a single layer.
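As an illustration of what such a fusion means at the tensor level, here is a plain-PyTorch sketch. It shows the effect of a Concatenate-style operation on q/k/v projection weights; it is not the WeightConverter API itself, and the sizes are made up:

```python
import torch

hidden = 16  # toy hidden size
q_w = torch.randn(hidden, hidden)
k_w = torch.randn(hidden, hidden)
v_w = torch.randn(hidden, hidden)

# Concatenate-style fusion: three source tensors (e.g. q_proj, k_proj, v_proj
# checkpoint keys) become a single fused target tensor (e.g. a qkv_proj key).
qkv_w = torch.cat([q_w, k_w, v_w], dim=0)  # shape (3 * hidden, hidden)

x = torch.randn(2, hidden)
q, k, v = (x @ qkv_w.T).split(hidden, dim=-1)  # one matmul yields all three projections
assert torch.allclose(q, x @ q_w.T, atol=1e-5)
```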
This allows us to define a mapping from architecture to a list of weight conversions. Applying those weight conversions can apply arbitrary transformations to the layers themselves. This significantly simplified the from_pretrained method and helped us remove a lot of technical debt that we accumulated over the past few years. This results in several improvements.
Linked PR: #41580
Tokenization
Just as we moved towards a single backend library for model definition, we want our tokenizers, and the Tokenizer object, to be a lot more intuitive. With v5, tokenizer definition is much simpler; one can now initialize an empty LlamaTokenizer and train it directly on your corpus. Defining a new tokenizer object should be as simple as writing its class definition.

Once the tokenizer is defined, you can load it with the following: Llama5Tokenizer(). Doing this returns an empty, trainable tokenizer that follows the definition of the authors of Llama5 (it does not exist yet 😉).

The above is the main motivation towards refactoring tokenization: we want tokenizers to behave similarly to models, trained or empty, and with exactly what is defined in their class definition.
Backend Architecture Changes: moving away from the slow/fast tokenizer separation
Up to now, transformers maintained two parallel implementations for many tokenizers:
- Slow tokenizers (tokenization_<model>.py): Python-based implementations, often using SentencePiece as the backend.
- Fast tokenizers (tokenization_<model>_fast.py): Rust-based implementations using the 🤗 tokenizers library.

In v5, we consolidate to a single tokenizer file per model: tokenization_<model>.py. This file will use the most appropriate backend available:

- a backend based on the sentencepiece library, which inherits from PythonBackend;
- a backend based on the tokenizers library, which basically allows adding tokens;
- a backend based on MistralCommon's tokenization library (previously known as the MistralCommonTokenizer).

The AutoTokenizer automatically selects the appropriate backend based on available files and dependencies. This is transparent: you continue to use AutoTokenizer.from_pretrained() as before. This allows transformers to be future-proof and modular, to easily support future backends.

Defining a tokenizer outside of the existing backends
We enable users and tokenizer builders to define their own tokenizers from top to bottom. Tokenizers are usually defined using a backend such as tokenizers, sentencepiece or mistral-common, but we offer the possibility to design the tokenizer at a higher level, without relying on those backends.

To do so, you can import the PythonBackend (which was previously known as PreTrainedTokenizer). This class encapsulates all the logic related to added tokens, encoding, and decoding.

If you want something even higher up the stack, then PreTrainedTokenizerBase is what PythonBackend inherits from. It contains the very basic tokenizer API features: encode, decode, vocab_size, get_vocab, convert_tokens_to_ids, convert_ids_to_tokens, from_pretrained, save_pretrained.

API Changes
1. Direct tokenizer initialization with vocab and merges
Starting with v5, we now enable initializing blank, untrained tokenizers-backed tokenizers. Such a tokenizer will follow the definition of the LlamaTokenizer as defined in its class definition. It can then be trained on a corpus, as can be seen in the tokenizers documentation.

These tokenizers can also be initialized from vocab and merges (if necessary), like the previous "slow" tokenizers; see the sketch below. This tokenizer will behave as a Llama-like tokenizer, with an updated vocabulary. This allows comparing different tokenizer classes with the same vocab, therefore enabling the comparison of different pre-tokenizers, normalizers, etc.
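A sketch of what initialization from an in-memory vocab and merges might look like. The keyword names (vocab, merges) and the toy vocabulary are assumptions based on the description above, not confirmed signatures; consult the class definition for the exact arguments:

```python
from transformers import LlamaTokenizer

# Hypothetical tiny vocabulary and BPE merges, purely for illustration.
vocab = {"<unk>": 0, "<s>": 1, "</s>": 2, "he": 3, "llo": 4, "hello": 5}
merges = [("he", "llo")]

# Assumed v5-style constructor arguments (vocab=..., merges=...).
tok = LlamaTokenizer(vocab=vocab, merges=merges)
print(tok.convert_tokens_to_ids(["hello"]))
```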
vocab_file (as in, a path towards a file containing the vocabulary) cannot be used to initialize the LlamaTokenizer, as loading from files is reserved to the from_pretrained method.

2. Simplified decoding API
The batch_decode and decode methods have been unified to reflect the behavior of the encode method. Both single and batch decoding now use the same decode method; see an example of the new behavior below.
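A short sketch of the unified behavior described above, assuming any pretrained tokenizer (gpt2 is used purely as an example checkpoint):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
batch = tok(["Hello world", "Transformers v5"])["input_ids"]

print(tok.decode(batch[0]))  # a single sequence of ids -> one string
print(tok.decode(batch))     # a batch of sequences -> handled by the same method in v5, per the note above
```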
We expect encode and decode to behave as two sides of the same coin: encode, process, decode should work.

3. Unified encoding API
The encode_plus method is deprecated in favor of the single __call__ method.
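A minimal sketch of the replacement, using an arbitrary pretrained tokenizer for illustration:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

# v5: call the tokenizer directly instead of the deprecated encode_plus().
enc = tok("Hello world", return_tensors="pt")
print(enc["input_ids"].shape)
```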
4. apply_chat_template returns BatchEncoding

Previously, apply_chat_template returned input_ids for backward compatibility. Starting with v5, it now consistently returns a BatchEncoding dict like other tokenizer methods.
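A small sketch of the new return type, assuming a tokenizer that ships a chat template (the model id here is only an example):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")
messages = [{"role": "user", "content": "Hello!"}]

out = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
# In v4 this returned bare input_ids; per the note above, v5 returns a BatchEncoding dict.
print(type(out))
```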
5. Removed legacy configuration file saving

We simplify the serialization of tokenization attributes:
- special_tokens_map.json: special tokens are now stored in tokenizer_config.json.
- added_tokens.json: added tokens are now stored in tokenizer.json.
- added_tokens_decoder is only stored when there is no tokenizer.json.

When loading older tokenizers, these files are still read for backward compatibility, but new saves use the consolidated format. We're gradually moving towards consolidating attributes to fewer files so that other libraries and implementations may depend on them more reliably.
6. Model-Specific Changes
Several models that had identical tokenizers now import from their base implementation.
These modules will eventually be removed altogether.
Removed T5-specific workarounds
The internal _eventually_correct_t5_max_length method has been removed. T5 tokenizers now handle max length consistently with other models.

Testing Changes
A few testing changes specific to tokenizers have been applied:
- Tests for core methods (add_tokens, encode, decode) are now centralized and automatically applied across all tokenizers. This reduces test duplication and ensures consistent behavior.
- For legacy implementations, the original BERT Python tokenizer code (including WhitespaceTokenizer, BasicTokenizer, etc.) is preserved in bert_legacy.py for reference purposes.

7. Deprecated / Modified Features
Special Tokens Structure:
- SpecialTokensMixin: merged into PreTrainedTokenizerBase to simplify the tokenizer architecture.
- special_tokens_map: now only stores named special token attributes (e.g., bos_token, eos_token). Use extra_special_tokens for additional special tokens (formerly additional_special_tokens).
- all_special_tokens: includes both named and extra tokens.
- special_tokens_map_extended and all_special_tokens_extended: removed. Access AddedToken objects directly from _special_tokens_map or _extra_special_tokens if needed.
- additional_special_tokens: still accepted for backward compatibility but is automatically converted to extra_special_tokens.

Deprecated Methods:
- sanitize_special_tokens(): already deprecated in v4, removed in v5.
- prepare_seq2seq_batch(): deprecated; use __call__() with the text_target parameter instead.
- BatchEncoding.words(): deprecated; use word_ids() instead.

Removed Methods:
- create_token_type_ids_from_sequences(): removed from the base class. Subclasses that need custom token type ID creation should implement this method directly.
- prepare_for_model(), build_inputs_with_special_tokens(), truncate_sequences(): moved from tokenization_utils_base.py to tokenization_python.py for PythonBackend tokenizers. TokenizersBackend provides model-ready input via tokenize() and encode(), so these methods are no longer needed in the base class.
- _switch_to_input_mode(), _switch_to_target_mode(), as_target_tokenizer(): removed from the base class. Use __call__() with the text_target parameter instead.
- parse_response(): removed from the base class.

Performance
MoE Performance
The v5 release significantly improves the performance of MoE models, as can be seen in the accompanying graphs. We improve and optimize MoE performance through batched and grouped experts implementations, and we optimize them for decoding using batched_mm.
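To illustrate what a batched experts implementation buys, here is a toy PyTorch sketch comparing a per-expert Python loop with a single batched matmul; the shapes are made up, and this is not the transformers MoE code:

```python
import torch

num_experts, tokens_per_expert, d_model, d_ff = 8, 4, 16, 32
expert_w = torch.randn(num_experts, d_model, d_ff)         # stacked expert weights
x = torch.randn(num_experts, tokens_per_expert, d_model)   # tokens already grouped per expert

# One batched matmul over all experts replaces a Python loop of small matmuls.
fused = torch.bmm(x, expert_w)                              # (num_experts, tokens_per_expert, d_ff)

looped = torch.stack([x[i] @ expert_w[i] for i in range(num_experts)])
assert torch.allclose(fused, looped, atol=1e-5)
```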
Core performance

We focus on improving the performance of loading weights on device (which gives speedups of up to 6x in tensor parallel situations); this is preliminary work that we'll continue to build on in the coming weeks, with several notable improvements.
Library-wide changes with lesser impact
Default dtype update

We have updated the default dtype for all models loaded with from_pretrained to be auto. This will lead to model instantiations respecting the dtype in which the model was saved, rather than forcing it to load in float32.

You can, of course, still specify the dtype in which you want to load your model by specifying it as an argument to the from_pretrained method.
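A short sketch of the new default and the explicit override described above, using gpt2 only as an example checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM

# v5 default dtype="auto": the model loads in the dtype it was saved in.
model = AutoModelForCausalLM.from_pretrained("gpt2")
print(model.dtype)

# Explicitly choosing the load dtype is still supported.
model_bf16 = AutoModelForCausalLM.from_pretrained("gpt2", dtype=torch.bfloat16)
print(model_bf16.dtype)
```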
Shard size

The Hugging Face Hub infrastructure has gradually moved to a XET backend. This will significantly simplify uploads and downloads, with higher download and upload speeds, partial uploads, and, most notably, a higher threshold for accepted file sizes on the Hugging Face Hub.

To reflect this, we're increasing the default shard size of models serialized on the Hub to 50GB (up from the previous 5GB default).
Configuration
📅 Schedule: (UTC)
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about these updates again.
This PR was generated by Mend Renovate. View the repository job log.