fix: limit guided decoding loops to generation_size from logits shape

FillMask, ScheduleUpdate, and FinishUpdate previously iterated over
d.matchers.size() entries, but only the first generation_size
(= logits.shape(0)) slots are actively generating. Entries beyond
that index contain stale output_ids and unused bitmasks.
- FillMask: limit matcher iteration and reserve to gs = logits.shape(0)
- ScheduleUpdate: copy only gs output_ids entries for D2H transfer
- FinishUpdate: add TensorMap& env param, iterate only over gs slots
Fixes review comments on PR #4605 (3280137130, 3280137198).
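
A minimal sketch of the bounding described above, not the code from this commit. `Matcher`, `FillMaskBounded`, and `CopyActiveOutputIds` are hypothetical stand-ins invented for illustration; the only ideas taken from the message are that `gs` equals `logits.shape(0)` and that slots at index `gs` and beyond must not be touched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-slot grammar matcher (not the project's type).
struct Matcher {
    bool active = false;
    int  fills  = 0;  // how many times this slot's bitmask was filled
};

// Fill masks only for the slots that are generating in this step.
// `generation_size` plays the role of logits.shape(0) from the commit message.
// Returns the number of slots actually processed.
inline std::size_t FillMaskBounded(std::vector<Matcher>& matchers, std::size_t generation_size)
{
    // Clamp defensively so a too-large gs can never index past the slot array.
    const std::size_t gs = std::min(generation_size, matchers.size());
    assert(gs <= matchers.size());

    std::size_t processed = 0;
    for (std::size_t i = 0; i < gs; ++i) {  // previously: i < matchers.size()
        if (!matchers[i].active) {
            continue;
        }
        ++matchers[i].fills;
        ++processed;
    }
    return processed;
}

// Copy only the first gs output ids, mirroring the narrowed transfer in
// ScheduleUpdate; entries past gs are treated as stale and left behind.
inline std::vector<int> CopyActiveOutputIds(const std::vector<int>& output_ids,
                                            std::size_t             generation_size)
{
    const std::size_t gs = std::min(generation_size, output_ids.size());
    return std::vector<int>(output_ids.begin(),
                            output_ids.begin() + static_cast<std::ptrdiff_t>(gs));
}
```

With four allocated slots and `gs = 2`, only slots 0 and 1 are visited and only two output ids are copied, so stale data in slots 2 and 3 never reaches the matchers.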