Commit e0ee4dc

Fix misleading prefill docstring in _llm_runner.pyi
The old wording suggested calling generate("") after prefill, but the Python API takes List[MultimodalInput] and the C++ runner enforces a non-empty prompt. Updated to describe the correct post-prefill pattern. This PR was authored with the assistance of Claude.
Parent: 6ab9e7c

1 file changed: extension/llm/runner/_llm_runner.pyi (2 additions, 2 deletions)
@@ -479,8 +479,8 @@ class MultimodalRunner:
     def prefill(self, inputs: List[MultimodalInput]) -> None:
         """
         Prefill multimodal inputs (e.g., to rebuild KV cache from chat history)
-        without generating tokens. After prefill, call generate("") to start
-        decoding from the prefilled state.
+        without generating tokens. After prefill, call generate() with a
+        non-empty final text input to start decoding.

         Args:
             inputs: List of multimodal inputs to prefill
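The corrected docstring implies a two-step pattern: prefill the chat history first, then call generate() with a non-empty final text input. A minimal sketch of that pattern follows, using a hypothetical stand-in class rather than the real MultimodalRunner bindings (the stand-in's fields and return value are illustrative; only the prefill/generate contract is taken from the commit message):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MultimodalInput:
    # Stand-in for the real MultimodalInput; here it only wraps text.
    text: str


class FakeMultimodalRunner:
    """Hypothetical stand-in mimicking the documented prefill/generate contract."""

    def __init__(self) -> None:
        self._kv_cache: List[str] = []

    def prefill(self, inputs: List[MultimodalInput]) -> None:
        # Rebuild the "KV cache" from chat history without generating tokens.
        self._kv_cache.extend(inp.text for inp in inputs)

    def generate(self, inputs: List[MultimodalInput]) -> str:
        # Mirror the C++ runner's rule: the final text input must be non-empty,
        # so generate("")-style calls after prefill are rejected.
        if not inputs or not inputs[-1].text:
            raise ValueError("generate() requires a non-empty final text input")
        self._kv_cache.extend(inp.text for inp in inputs)
        return "<decoded from %d cached segments>" % len(self._kv_cache)


# Post-prefill pattern: prefill history, then generate with a non-empty input.
runner = FakeMultimodalRunner()
runner.prefill([MultimodalInput("system prompt"), MultimodalInput("earlier turn")])
out = runner.generate([MultimodalInput("user: what changed?")])
```

The stand-in only demonstrates the calling convention; real decoding, KV-cache layout, and the generate() signature belong to the actual ExecuTorch bindings.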
