
Fix IndexError in HunyuanVideo I2V pipeline #13244

Merged: yiyixuxu merged 13 commits into huggingface:main from kaixuanliu:hunyuan-fix on Apr 6, 2026

Conversation

@kaixuanliu (Contributor)

@yiyixuxu @asomoza, please help review, thanks!

latest transformers

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu (Contributor, Author)

kaixuanliu commented Mar 10, 2026

Problem
HunyuanVideoImageToVideoPipeline fails with IndexError: index -1 is out of bounds for dimension 1 with size 0 when used with the latest version of transformers.

Cause
The code searches for [double_return_token_id] (271), the token representing \n\n, to locate the assistant section. In recent transformers releases, Llama tokenizers no longer produce a separate token for \n\n, so the search returns an empty tensor and the subsequent reshape fails.

Fix
Added a fallback that uses the position of the last <|end_header_id|> token + 1 when the double-newline token is not found:
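A minimal sketch of that fallback logic (not the exact diffusers source; the token ids are assumptions — 271 for "\n\n" and 128007 for <|end_header_id|> in Llama-3-style tokenizers):

```python
import torch

def find_assistant_start(text_input_ids: torch.Tensor,
                         double_return_token_id: int = 271,
                         end_header_token_id: int = 128007) -> torch.Tensor:
    # Hypothetical helper sketching the PR's fallback; default token ids
    # are illustrative, not pulled from the actual tokenizer config.
    batch_size = text_input_ids.shape[0]
    matches = (text_input_ids == double_return_token_id).nonzero()
    if matches.numel() > 0:
        # Original behavior: the last "\n\n" token per batch row
        # marks the start of the assistant section.
        return matches[:, 1].reshape(batch_size, -1)[:, -1]
    # Fallback: "\n\n" was merged into a neighboring token, so use the
    # position right after the last <|end_header_id|> token instead.
    header_matches = (text_input_ids == end_header_token_id).nonzero()
    return header_matches[:, 1].reshape(batch_size, -1)[:, -1] + 1
```

On old tokenizer output the first branch fires; on new output the empty match tensor routes to the fallback instead of crashing in reshape.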

@kaixuanliu (Contributor, Author)

Here is the sample code to reproduce:

import torch
from diffusers import HunyuanVideoImageToVideoPipeline, HunyuanVideoTransformer3DModel
from diffusers.utils import load_image, export_to_video

# Available checkpoints: "hunyuanvideo-community/HunyuanVideo-I2V" and "hunyuanvideo-community/HunyuanVideo-I2V-33ch"
model_id = "hunyuanvideo-community/HunyuanVideo-I2V"
transformer = HunyuanVideoTransformer3DModel.from_pretrained(
    model_id, subfolder="transformer", torch_dtype=torch.bfloat16
)
pipe = HunyuanVideoImageToVideoPipeline.from_pretrained(
    model_id, transformer=transformer, torch_dtype=torch.float16
)
pipe.vae.enable_tiling()
pipe.to("cuda")

prompt = "A man with short gray hair plays a red electric guitar."
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/guitar-man.png"
)

output = pipe(image=image, prompt=prompt).frames[0]
export_to_video(output, "output.mp4", fps=15)

@kaixuanliu changed the title from "add fallback logic for Hunyuan pipeline to make it compatible with" to "Fix IndexError in HunyuanVideo I2V pipeline" on Mar 10, 2026
@kaixuanliu (Contributor, Author)

Hi @yiyixuxu @asomoza, can you help review this PR? Thanks!

@kaixuanliu (Contributor, Author)

Hi @DN6, can you help review? Thanks!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@kaixuanliu
Copy link
Copy Markdown
Contributor Author

Hi @DN6, can we merge this PR?

@yiyixuxu (Collaborator) left a comment


Thanks for the PR!
I left one comment.


# Fallback for newer transformers versions where double newline is not tokenized as a separate token
# In this case, use the last <|end_header_id|> token position + 1 as the assistant section marker
if last_double_return_token_indices.numel() == 0:
@yiyixuxu (Collaborator)

Should we just replace the hard-coded numbers with this logic instead, regardless of the transformers version?

@kaixuanliu (Contributor, Author)

Thanks for the advice. I have updated the code.

…ction marker

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu (Contributor, Author)

Hi @yiyixuxu @DN6, can you help review it again? Thanks!

@yiyixuxu (Collaborator) left a comment

Thanks!
Do you know which version of transformers introduced this behavior?
Would you be able to run a test with an older transformers version to make sure we don't break backward compatibility?

# Get the last <|end_header_id|> position per batch, then +1 to get the position after it
last_double_return_token_indices = end_header_indices.reshape(text_input_ids.shape[0], -1)[:, -1] + 1
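As a toy illustration of that indexing (the token ids here are made up; 128007 merely stands in for the <|end_header_id|> id):

```python
import torch

# Toy batch of token ids; 128007 stands in for <|end_header_id|> (illustrative).
text_input_ids = torch.tensor([
    [9, 128007, 7, 128007, 5, 6],
    [9, 128007, 7, 8, 128007, 6],
])

# nonzero() returns one [batch, position] row per match; keep the positions.
end_header_indices = (text_input_ids == 128007).nonzero()[:, 1]

# The reshape assumes every batch row has the same number of matches,
# mirroring the pipeline code: last match per row, then +1 for the
# position right after it.
marker = end_header_indices.reshape(text_input_ids.shape[0], -1)[:, -1] + 1
print(marker)  # tensor([4, 5])
```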
@yiyixuxu (Collaborator)

Should we change the name of this variable now?

@kaixuanliu (Contributor, Author)

Oh, yes. Thanks for the advice. Done.

@kaixuanliu (Contributor, Author)

It appears to be an incompatibility introduced in transformers 5.0.0:

  • transformers 4.57.6: OK
  • transformers 5.0.0: Failed

With this fix applied, both transformers 4.57.6 and 5.0.0 work well.

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@yiyixuxu yiyixuxu merged commit 10ba0be into huggingface:main Apr 6, 2026
11 checks passed
@yiyixuxu (Collaborator)

yiyixuxu commented Apr 6, 2026

thanks so much for working on this! @kaixuanliu
