fix: also patch model.generate() to inject cached pixel_values by abrichr · Pull Request #254 · OpenAdaptAI/openadapt-evals

abrichr · 2026-03-29T23:43:43Z

Patches both forward() and generate() on the model instance. TRL calls generate(input_ids=...) without pixel_values, so the generate() patch injects them from the cache.

The forward() patch handles training logprob recomputation, but TRL also calls model.generate(input_ids=...) without pixel_values. HF's generate() uses prepare_inputs_for_generation(), which builds a fresh kwargs dict, so injecting cached pixel_values in forward() alone is not enough: generate() needs them as a top-level argument in order to pass them through. This change patches both forward() and generate() on the model instance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
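The instance-level patching described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `DummyVLM`, `patch_with_cached_pixels`, and the `cache` dict are hypothetical names, and the dummy class stands in for a real vision-language model so the example runs without any dependencies.

```python
class DummyVLM:
    """Hypothetical stand-in for a vision-language model."""

    def forward(self, input_ids=None, pixel_values=None, **kwargs):
        return {"input_ids": input_ids, "pixel_values": pixel_values}

    def generate(self, input_ids=None, pixel_values=None, **kwargs):
        return {"input_ids": input_ids, "pixel_values": pixel_values}


def patch_with_cached_pixels(model, cache):
    """Wrap forward() and generate() on this one instance.

    `cache` is a hypothetical dict holding the most recent pixel_values.
    Assumes callers use keyword arguments, as in generate(input_ids=...).
    """

    def make_wrapper(original):
        def wrapper(*args, **kwargs):
            # Inject cached pixel_values only when the caller omitted them.
            if kwargs.get("pixel_values") is None and "pixel_values" in cache:
                kwargs["pixel_values"] = cache["pixel_values"]
            return original(*args, **kwargs)

        return wrapper

    for name in ("forward", "generate"):
        # getattr returns the bound method; setattr shadows it on the
        # instance, leaving the class (and other instances) untouched.
        setattr(model, name, make_wrapper(getattr(model, name)))
    return model


# Usage: the trainer-style call omits pixel_values, the patch fills them in.
cache = {"pixel_values": "cached-pixels"}  # placeholder for a real tensor
model = patch_with_cached_pixels(DummyVLM(), cache)
result = model.generate(input_ids=[1, 2, 3])
```

Wrapping both methods with the same helper keeps the injection logic in one place, and explicitly passed pixel_values still take precedence over the cache.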

abrichr merged commit 9612019 into main Mar 29, 2026
1 check passed

abrichr merged 1 commit into main from fix/patch-generate-too