[LLADA2] Fix llada2 review #13598 #13698

Open
kashif wants to merge 2 commits into huggingface:main from kashif:fix-llada2-review-13598

Conversation

kashif commented May 8, 2026

What does this PR do?

Fix the issues raised in #13598

Fixes #13598

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

kashif added 2 commits May 8, 2026 09:12
Fixes the six in-scope issues raised in the llada2 model/pipeline review:

1. Carry tokenizer `attention_mask` through `_prepare_input_ids` and add an
   `attention_mask` arg to `__call__` for pre-tokenized inputs. The runtime
   mask now reflects prompt padding and zeros out the block-aligned tail
   past `prompt_length + gen_length` instead of treating those positions
   as valid context.

2. Thread the per-call `block_length` into `BlockRefinementScheduler.set_timesteps`
   so the transfer schedule matches the requested block size (previously the
   scheduler only read its constructor default).

3. Drop `x0`/`x0_p`/`confidence` from `_callback_tensor_inputs` (they were never
   bound as locals) and bind `sampled_tokens`, `sampled_probs`,
   `editing_transfer_index`, and `active_block` so all advertised callback keys resolve.

4. Allow EOS exactly at index `prompt_length` (the first generated position)
   to mark a row finished.

5. Freeze rows that have already emitted EOS so subsequent block refinement
   doesn't extend them, and trim per row at decode (previously gated on
   `batch_size == 1`) so post-EOS positions don't leak into decoded text.

6. Stop calling `self.set_progress_bar_config(...)` from inside `__call__`;
   build a local config dict for the inner block bar so user-supplied flags
   (in particular `disable=True`) survive the call.
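
The mask behaviour described in fix 1 can be pictured with a small framework-free sketch. The helper name `build_runtime_mask` and its arguments are hypothetical and are not taken from the pipeline. It assumes every row shares one padded `prompt_length`, and that the sequence is rounded up to a whole number of blocks:

```python
def build_runtime_mask(prompt_mask, gen_length, block_length):
    """Illustrative only: not the pipeline's actual implementation.

    prompt_mask: list of rows of 0/1 values as returned by a tokenizer,
                 all rows padded to the same prompt length.
    """
    prompt_length = len(prompt_mask[0])
    valid_length = prompt_length + gen_length
    # Round the sequence up to a whole number of blocks (ceiling division).
    num_blocks = -(-valid_length // block_length)
    total_length = num_blocks * block_length
    tail = total_length - valid_length
    # Keep the tokenizer's prompt padding, mark generated positions as valid,
    # and zero out the block-aligned tail past prompt_length + gen_length.
    return [list(row) + [1] * gen_length + [0] * tail for row in prompt_mask]


mask = build_runtime_mask([[0, 1, 1], [1, 1, 1]], gen_length=4, block_length=4)
```

With a prompt length of 3, `gen_length=4`, and `block_length=4`, the sequence is padded to 8 positions; the first row keeps its leading pad as 0 and both rows end in a single masked tail position.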

Adds regression tests pinning each of the six fixes.
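
A check in the spirit of fixes 4 and 5 might look like the sketch below. It is not one of the PR's tests: `trim_at_eos` is a hypothetical helper, and dropping the EOS token itself along with everything after it is an assumption made for illustration.

```python
def trim_at_eos(sequences, prompt_length, eos_token_id):
    """Illustrative per-row trim; hypothetical, not the pipeline's code."""
    trimmed = []
    for row in sequences:
        generated = row[prompt_length:]
        # The search starts at the first generated position, so an EOS
        # exactly at index prompt_length marks the row as finished.
        if eos_token_id in generated:
            generated = generated[: generated.index(eos_token_id)]
        trimmed.append(generated)
    return trimmed


# Row 0 emits EOS (id 2) at the first generated position; row 1 emits it later.
batch = [[5, 6, 2, 9, 9], [5, 6, 7, 2, 8]]
result = trim_at_eos(batch, prompt_length=2, eos_token_id=2)
```

Each row is trimmed independently, so the batch size does not affect whether post-EOS tokens are removed.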
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Development

Successfully merging this pull request may close these issues.

llada2 model/pipeline review
