prefill() now returns the sampled next token directly, which makes
the API more natural. Callers can use the returned token if they
need it, and the same token is also stored internally in
prefill_next_token_ for the generate("") workflow.
This eliminates the need for the private prefill_and_sample() helper:
since prefill() itself returns the token, generate(vector) can
call prefill() directly.
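
The shape of the change can be illustrated with a minimal sketch. This is not
the actual runner code: ToyRunner, toy_sample(), decode_from(),
has_prefill_token_, the token type, and the max_new_tokens parameter are all
hypothetical stand-ins, and the "model" is a toy that just counts upward. Only
prefill(), generate(), and prefill_next_token_ are names taken from the
description above.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the real runner class (assumption for illustration).
class ToyRunner {
 public:
  // Processes the prompt and returns the sampled next token. The same token
  // is cached in prefill_next_token_ so a later generate("") can pick it up.
  uint64_t prefill(const std::vector<uint64_t>& prompt_tokens) {
    prefill_next_token_ = toy_sample(prompt_tokens);
    has_prefill_token_ = true;
    return prefill_next_token_;
  }

  // generate(vector): calls prefill() directly and uses its return value,
  // so no separate prefill_and_sample() helper is needed.
  std::vector<uint64_t> generate(const std::vector<uint64_t>& prompt_tokens,
                                 int max_new_tokens) {
    const uint64_t first = prefill(prompt_tokens);
    return decode_from(first, max_new_tokens);
  }

  // generate(""): an empty prompt continues from the token cached by an
  // earlier prefill() call. Tokenizing a non-empty prompt is out of scope
  // for this sketch, so that case is rejected.
  std::vector<uint64_t> generate(const std::string& prompt,
                                 int max_new_tokens) {
    if (!prompt.empty()) {
      throw std::invalid_argument("tokenizer omitted in this sketch");
    }
    if (!has_prefill_token_) {
      throw std::runtime_error("generate(\"\") requires a prior prefill()");
    }
    return decode_from(prefill_next_token_, max_new_tokens);
  }

 private:
  // Toy "sampler": last prompt token plus one, or 0 for an empty prompt.
  static uint64_t toy_sample(const std::vector<uint64_t>& tokens) {
    return tokens.empty() ? 0 : tokens.back() + 1;
  }

  // Toy "decode loop": emits the first token, then counts upward.
  static std::vector<uint64_t> decode_from(uint64_t first, int max_new_tokens) {
    assert(max_new_tokens >= 0);
    std::vector<uint64_t> out;
    for (int i = 0; i < max_new_tokens; ++i) {
      out.push_back(first + static_cast<uint64_t>(i));
    }
    return out;
  }

  uint64_t prefill_next_token_ = 0;
  bool has_prefill_token_ = false;
};
```

In this sketch, calling prefill(prompt) followed by generate("", n) yields the
same tokens as a single generate(prompt, n) call, which is the equivalence the
change relies on.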
This PR was authored with the assistance of Claude.