[ROCm][CI] Gate incompatible HF references on Transformers v5 #41532
AndreasKaratzas wants to merge 14 commits into
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Code Review
This pull request updates the model testing framework to include model revisions and trust_remote_code flags. It also introduces version constraints for specific models in the registry. Feedback was provided regarding a logic bug where adding a version reason causes tests to be skipped regardless of version validity, and a concern that the specified Transformers version range for MiniCPM4 appears to be incorrect or overly restrictive.
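The skip-logic concern raised in the review can be sketched as follows. This is a minimal illustration, not the registry code from this PR: `HfReferenceGate`, its fields, and `skip_reason` are hypothetical names, the bound and reason are placeholders, and the parser only handles plain dotted releases (no `.dev` or `rc` suffixes). The point is that the reason string is attached to a skip only after the version check has failed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a plain dotted release such as '5.0.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class HfReferenceGate:
    """Hypothetical registry entry; none of these field names come from vLLM."""

    min_version: Optional[str] = None
    max_version: Optional[str] = None
    reason: Optional[str] = None

    def skip_reason(self, installed: str) -> Optional[str]:
        """Return a skip message only when `installed` is outside the range."""
        current = parse_version(installed)
        too_old = (self.min_version is not None
                   and current < parse_version(self.min_version))
        too_new = (self.max_version is not None
                   and current > parse_version(self.max_version))
        if not (too_old or too_new):
            # A reason on its own must never trigger a skip.
            return None
        message = f"transformers {installed} is outside the supported range"
        if self.reason:
            message += f": {self.reason}"
        return message


# Placeholder bound and reason, for illustration only.
gate = HfReferenceGate(max_version="4.99.0", reason="remote code needs a v4 helper")
print(gate.skip_reason("4.57.0"))
print(gate.skip_reason("5.0.0"))
```

With this shape, a test helper would call `skip_reason` with the installed version and skip only when the result is not `None`.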
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
cc @charlifu
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Notes on the AudioFlamingo3 / MusicFlamingo changes:
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
eustlb left a comment
transformers releases
AudioFlamingo3: v5.0.0
MusicFlamingo3: v5.5.0
There should be no reason not to use super()._call_hf_processor. Much of the confusion here seems to come from the vLLM PR being merged before the transformers PR it depended on, which then kept evolving after the vLLM one was merged.
Such fixes should be addressed in work already started in #39011, but happy to help if needed. @lashahub looping you in here
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
…processing Signed-off-by: Andreas Karatzas <akaratza@amd.com>
I pushed some more changes (quite a lot, actually). The main intent is to stop carrying vLLM-local copies of preprocessing that now belongs to the upstream HF processors. Previously, vLLM had its own manual chunking, feature extraction, and audio-token expansion logic. With AudioFlamingo3 at transformers In The remaining vLLM logic after
So I also removed MusicFlamingo's Also regarding MusicFlamingo's The The RoPE forward change from flattened
The fp32 RoPE buffer restoration is there because vLLM's model dtype/device application can cast non-persistent buffers to bf16. In the HF construction path we compare against, The cast in
Also, on ROCm the MusicFlamingo transcription was wrong, yielding a weird "four-on-the-floor". Direct HF generation matched the original fixture, while vLLM's URL audio path produced a different transcription. The root cause was audio resampling: HF/librosa uses soxr-style resampling, while vLLM was using the default parser resampler. Switching only MusicFlamingo's parser to
The MusicFlamingo generation test no longer skips missing fixtures, because a missing committed fixture should be a test error, not a skip. The small warmup in
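To illustrate why the fp32 RoPE buffer matters, here is a rough standalone sketch, not code from this PR. It assumes the textbook RoPE inverse-frequency formula with base 10000 (the actual MusicFlamingo rotary code may differ) and emulates bf16 rounding in pure Python; `to_bf16`, `inv_freq`, and `max_angle_error` are hypothetical helper names. The rotation angle is position times frequency, so any rounding error in the frequency grows with position.

```python
import struct
from typing import List


def to_bf16(value: float) -> float:
    """Round a float to bfloat16 precision (nearest-even) and return it as a float."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # Add the rounding bias, then drop the low 16 bits of the float32 pattern.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def inv_freq(dim: int, base: float = 10000.0) -> List[float]:
    """Textbook RoPE inverse frequencies (an assumption, not the PR's code)."""
    return [1.0 / (base ** (2 * i / dim)) for i in range(dim // 2)]


def max_angle_error(position: int, dim: int = 64) -> float:
    """Largest angle drift (radians) when frequencies are rounded to bf16."""
    return max(abs(position * f - position * to_bf16(f)) for f in inv_freq(dim))


for pos in (10, 1000, 100000):
    print(pos, max_angle_error(pos))
```

The drift scales linearly with position, which is why a buffer silently cast to bf16 can change long-sequence outputs even when short inputs look fine.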
Hi @AndreasKaratzas, the pre-commit checks have failed. Please run:
uv pip install pre-commit>=4.5.1
pre-commit install
pre-commit run --all-files
Then, commit the changes and push to your branch. For future commits, Tip Is
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Thanks for looping me in. AF-Next should not be added as a separate vLLM architecture anymore. The current HF checkpoints use the existing MusicFlamingo architecture ( I’ll wait for this PR to settle/land, then rebase #39011 on top and re-scope it to AF-Next checkpoint coverage through MusicFlamingo only. That should avoid duplicating the Flamingo processor/RoTE fixes here.
This updates generation-test metadata for models whose HF reference path is incompatible with the installed Transformers v5 runtime.
Test groups fixed:
mi355_1: Language Models Tests (Standard) for MiniCPM4.
mi355_1: Language Models Test (Extended Generation) for HyperCLOVAX.
MiniCPM4:
is_torch_fx_available, removed from the installed Transformers v5 runtime.
is_torch_fx_available in v5.0 breaks trust_remote_code models huggingface/transformers#44561
HyperCLOVAX:
ROPE_INIT_FUNCTIONS and unconditionally indexes ROPE_INIT_FUNCTIONS[self.rope_type]; for the default case, self.rope_type is "default": https://huggingface.co/naver-hyperclovax/HyperCLOVAX-SEED-Think-14B/blob/main/modeling_hyperclovax.py
"default" as a valid RoPE type. The incompatibility is narrower: v5 no longer exposes a "default" entry in ROPE_INIT_FUNCTIONS, while v4.57 did. In v5's own Llama code, default RoPE is handled by compute_default_rope_parameters; ROPE_INIT_FUNCTIONS is consulted only when rope_type != "default".
config.rope_parameters, which matches the new v5 model path.
cc @kenroche
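For reference, the HyperCLOVAX lookup failure described above can be reproduced with a small standalone sketch. Nothing here imports Transformers: `ROPE_INIT_SKETCH`, `compute_default_inv_freq`, `compute_linear_inv_freq`, and `resolve_rope_init` are hypothetical stand-ins, and the frequency formula is the textbook one. It only mirrors the pattern the description attributes to v5, where the default type is handled by a dedicated function and the registry is consulted for other types.

```python
from typing import Callable, Dict, List


def compute_default_inv_freq(head_dim: int, base: float = 10000.0) -> List[float]:
    """Textbook RoPE inverse frequencies, standing in for the default path."""
    return [1.0 / (base ** (2 * i / head_dim)) for i in range(head_dim // 2)]


def compute_linear_inv_freq(head_dim: int, base: float = 10000.0,
                            factor: float = 2.0) -> List[float]:
    """Linearly scaled variant, standing in for any non-default RoPE type."""
    return [f / factor for f in compute_default_inv_freq(head_dim, base)]


# Stand-in for a v5-style registry: scaled variants only, no "default" key.
ROPE_INIT_SKETCH: Dict[str, Callable[..., List[float]]] = {
    "linear": compute_linear_inv_freq,
}


def resolve_rope_init(rope_type: str) -> Callable[..., List[float]]:
    """Handle "default" explicitly; consult the registry for everything else."""
    if rope_type == "default":
        return compute_default_inv_freq
    return ROPE_INIT_SKETCH[rope_type]


# Unconditional indexing, as the remote code does, fails for the default type.
try:
    ROPE_INIT_SKETCH["default"]
except KeyError as exc:
    print("unconditional lookup failed:", exc)

print(resolve_rope_init("default")(8))
print(resolve_rope_init("linear")(8))
```

Remote code written against v4.57 takes the unconditional path, which is why gating the HF reference on the Transformers version is the narrower fix here.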