Commit 67bc28b
Unify MultimodalRunner under IRunner with multimodal prefill (#17741)
Make `MultimodalRunner` inherit from `IRunner` so callers can hold a
single `IRunner*` regardless of model type. Add
`prefill(vector<MultimodalInput>, num_bos, num_eos)` to `IRunner`
returning `Result<uint64_t>` (the predicted next token), with a default
`NotSupported` implementation.
Resolves #17728
## Changes
**`irunner.h`** — Forward-declare `MultimodalInput`. Add virtual
`prefill(vector<MultimodalInput>)` returning `Result<uint64_t>` with
default `NotSupported`.
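
The default-`NotSupported` pattern described above can be sketched as follows. This is a minimal illustration, not the actual header: `Error`, `Result`, `LegacyRunner`, and `ToyMultimodalRunner` are hypothetical stand-ins, and the parameter types (`const std::vector<MultimodalInput>&`, `std::int32_t` with defaults of 0) are assumptions, since the commit message only names the parameters.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the runtime's Error and Result types.
enum class Error { Ok, NotSupported };

template <typename T>
class Result {
 public:
  Result(T value) : value_(value), error_(Error::Ok) {}
  Result(Error error) : value_(), error_(error) {}
  bool ok() const { return error_ == Error::Ok; }
  Error error() const { return error_; }
  const T& get() const { return value_; }

 private:
  T value_;
  Error error_;
};

struct MultimodalInput;  // forward declaration only, as described for irunner.h

class IRunner {
 public:
  virtual ~IRunner() = default;
  virtual bool is_loaded() const = 0;

  // Default implementation: runners that do not override this report
  // NotSupported instead of forcing every subclass to implement it.
  virtual Result<std::uint64_t> prefill(
      const std::vector<MultimodalInput>& /*inputs*/,
      std::int32_t /*num_bos*/ = 0,
      std::int32_t /*num_eos*/ = 0) {
    return Error::NotSupported;
  }
};

// The full definition is only needed where inputs are actually built or read.
struct MultimodalInput {
  std::string text;
};

// A runner that keeps the inherited default.
class LegacyRunner : public IRunner {
 public:
  bool is_loaded() const override { return true; }
};

// A runner that overrides prefill; the "predicted token" is a placeholder
// (the number of inputs), not a real model output.
class ToyMultimodalRunner : public IRunner {
 public:
  bool is_loaded() const override { return true; }
  Result<std::uint64_t> prefill(
      const std::vector<MultimodalInput>& inputs,
      std::int32_t /*num_bos*/,
      std::int32_t /*num_eos*/) override {
    return static_cast<std::uint64_t>(inputs.size());
  }
};
```

A caller holding an `IRunner*` can call `prefill(inputs)` on either runner and branch on `ok()`, which is the single-pointer usage the commit is aiming for.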
**`multimodal_runner.h/cpp`** — `MultimodalRunner` now inherits
`IRunner`. `generate(string)` override is a pure wrapper that delegates
to `generate(vector)`. `generate(vector)` handles both non-empty inputs
(prefill + decode) and empty inputs (consume `prefill_next_token_` from
a prior `prefill()` call). Decode loop extracted into private
`decode_from_token()` to avoid duplication. `is_loaded()` becomes `const
override`, `stop()`/`reset()`/`load()` gain `override`. String
convenience `prefill(string)` provided inline.
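
A rough sketch of the string-to-vector delegation described above, under the assumption that the string overload does nothing beyond packaging its argument. `MultimodalInput` is reduced to a hypothetical one-field struct, and the `Path` return value exists only to make the chosen branch observable; the real `generate` has a different signature.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in; the real MultimodalInput carries more than text.
struct MultimodalInput {
  std::string text;
};

// Illustrative only: reports which branch generate(vector) would take.
enum class Path { PrefillThenDecode, ResumeFromStoredToken };

class ToyMultimodalRunner {
 public:
  // String overload: a pure wrapper. An empty string becomes an empty
  // vector; a non-empty string is wrapped as a single MultimodalInput.
  Path generate(const std::string& prompt) {
    std::vector<MultimodalInput> inputs;
    if (!prompt.empty()) {
      inputs.push_back(MultimodalInput{prompt});
    }
    return generate(inputs);
  }

  // Vector overload: the only place that decides between the two paths.
  Path generate(const std::vector<MultimodalInput>& inputs) {
    return inputs.empty() ? Path::ResumeFromStoredToken
                          : Path::PrefillThenDecode;
  }
};
```

Keeping the decision in one overload means the empty-string and empty-vector cases cannot drift apart.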
**`text_llm_runner.h/cpp`** — New `prefill(vector<MultimodalInput>)`
override handles text inputs (encode + prefill KV cache), returns
predicted next token. `generate("")` allowed after `prefill()` —
consumes stored `prefill_next_token_`. Old `prefill(string,
GenerationConfig)` preserved as deprecated wrapper. String convenience
methods defined in .cpp to avoid header dependency on
`multimodal_input.h`.
**`pybindings.cpp`** — Adapts to `Result<uint64_t>` return type from
`prefill()`.
**`_llm_runner.pyi`** — Updated `MultimodalRunner.prefill` docstring.
## Design decisions
- `prefill()` returns `Result<uint64_t>` — the sampled next token from
the final forward pass. This is stored internally in
`prefill_next_token_` for the `prefill()` →
`generate("")`/`generate({})` workflow, and also returned to callers who
may want the token directly.
- `prefill()` takes `num_bos`/`num_eos` instead of `GenerationConfig` —
those are the only fields relevant to prefill (for tokenizer encoding).
- BOS is only applied when `pos_ == 0` (start of conversation) in
`MultimodalRunner`. `TextLLMRunner` trusts the caller's `num_bos` value.
- `generate(string)` is always a pure wrapper to `generate(vector)` in
`MultimodalRunner` — empty string passes an empty vector, non-empty
wraps as `MultimodalInput`.
- `text_llm_runner.h` avoids `#include multimodal_input.h` — only the
forward declaration from `irunner.h` is needed in the header. String
convenience methods are defined in the .cpp.
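
The `prefill()` → `generate({})` workflow and the `pos_ == 0` BOS rule can be sketched with a toy runner. Everything here is an assumption made for illustration: tokens are passed in directly instead of being encoded, the "model" predicts `last + 1`, `kBos` is a made-up token id, and `generate` with non-empty input always requests one BOS. Only the names `prefill_next_token_`, `pos_`, and `decode_from_token` come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

constexpr std::uint64_t kBos = 1;  // hypothetical BOS token id

class ToyRunner {
 public:
  // Feeds tokens into the toy cache, stores the predicted next token,
  // and also returns it. BOS is prepended only at the start (pos_ == 0).
  // Precondition for this sketch: tokens is non-empty.
  std::uint64_t prefill(const std::vector<std::uint64_t>& tokens,
                        std::int32_t num_bos = 0) {
    assert(!tokens.empty());
    std::vector<std::uint64_t> seq;
    if (pos_ == 0 && num_bos > 0) {
      seq.assign(static_cast<std::size_t>(num_bos), kBos);
    }
    seq.insert(seq.end(), tokens.begin(), tokens.end());
    for (std::uint64_t t : seq) {
      feed(t);
    }
    prefill_next_token_ = predict(seq.back());
    return *prefill_next_token_;
  }

  // Non-empty input: prefill, then decode. Empty input: consume the token
  // stored by an earlier prefill(). Returns an empty vector if neither applies.
  std::vector<std::uint64_t> generate(const std::vector<std::uint64_t>& tokens,
                                      std::int32_t max_new_tokens) {
    std::uint64_t first = 0;
    if (!tokens.empty()) {
      first = prefill(tokens, /*num_bos=*/1);
    } else if (prefill_next_token_.has_value()) {
      first = *prefill_next_token_;
    } else {
      return {};
    }
    prefill_next_token_.reset();  // the stored token is consumed exactly once
    return decode_from_token(first, max_new_tokens);
  }

  const std::vector<std::uint64_t>& cache() const { return cache_; }
  std::int64_t pos() const { return pos_; }

 private:
  static std::uint64_t predict(std::uint64_t last) { return last + 1; }

  void feed(std::uint64_t token) {
    cache_.push_back(token);
    ++pos_;
  }

  // Shared decode loop used by both generate() paths.
  std::vector<std::uint64_t> decode_from_token(std::uint64_t cur,
                                               std::int32_t n) {
    std::vector<std::uint64_t> out;
    for (std::int32_t i = 0; i < n; ++i) {
      out.push_back(cur);
      feed(cur);
      cur = predict(cur);
    }
    return out;
  }

  std::vector<std::uint64_t> cache_;
  std::int64_t pos_ = 0;
  std::optional<std::uint64_t> prefill_next_token_;
};
```

In this sketch, two consecutive `prefill` calls followed by `generate({}, n)` add BOS only once, and decoding starts from the token returned by the last `prefill`. This mirrors the `MultimodalRunner` behavior described above; a runner that trusts the caller's `num_bos`, as `TextLLMRunner` is said to, would drop the `pos_ == 0` check.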
## Backward compatibility
- **C++ source-compatible**: `MultimodalRunner::prefill(inputs)` still
compiles (new params have defaults). `TextLLMRunner::prefill(string,
GenerationConfig)` preserved as deprecated wrapper. Return type changed
from `Error` to `Result<uint64_t>` — callers checking `.error()` still
work.
- **ABI-breaking**: Expected for `ET_EXPERIMENTAL` APIs.
- **Python**: Fully compatible, interface signatures unchanged.
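
To illustrate the source-compatibility claim, here is a hedged sketch of a legacy-style overload that forwards only `num_bos` and `num_eos` to the new one, with a caller that still checks `.error()`. `Error`, `Result`, `GenerationConfig`, and `ToyTextRunner` are simplified stand-ins, the returned token is a placeholder, and the commit message does not state the wrapper's return type, so returning `Result` here is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the runtime's Error and Result types.
enum class Error { Ok, InvalidArgument };

template <typename T>
class Result {
 public:
  Result(T value) : value_(value), error_(Error::Ok) {}
  Result(Error error) : value_(), error_(error) {}
  bool ok() const { return error_ == Error::Ok; }
  Error error() const { return error_; }
  const T& get() const { return value_; }

 private:
  T value_;
  Error error_;
};

// Simplified stand-in: only the fields this sketch needs.
struct GenerationConfig {
  std::int32_t num_bos = 0;
  std::int32_t num_eos = 0;
  std::int32_t max_new_tokens = -1;
};

class ToyTextRunner {
 public:
  // New-style overload: takes only what prefill needs.
  // The returned "token" is a placeholder (prompt length plus BOS/EOS count).
  Result<std::uint64_t> prefill(const std::string& prompt,
                                std::int32_t num_bos = 0,
                                std::int32_t num_eos = 0) {
    if (prompt.empty() || num_bos < 0 || num_eos < 0) {
      return Error::InvalidArgument;
    }
    return static_cast<std::uint64_t>(prompt.size()) +
        static_cast<std::uint64_t>(num_bos) +
        static_cast<std::uint64_t>(num_eos);
  }

  // Legacy-style overload kept for existing callers; it forwards the two
  // relevant config fields. A real header would likely also carry a
  // deprecation marker.
  Result<std::uint64_t> prefill(const std::string& prompt,
                                const GenerationConfig& config) {
    return prefill(prompt, config.num_bos, config.num_eos);
  }
};

// Existing call sites written against an Error-returning prefill can keep
// their shape: they only look at .error().
inline bool legacy_call_site(ToyTextRunner& runner, const std::string& prompt) {
  GenerationConfig config;
  config.num_bos = 1;
  return runner.prefill(prompt, config).error() == Error::Ok;
}
```

New callers can additionally read the token with `get()` after checking `ok()`; old callers are unaffected as long as they only inspect the error.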
## Test plan
- Existing C++ tests: `test_text_llm_runner.cpp`,
`test_text_prefiller.cpp`, `test_generation_config.cpp` — verify no
regressions
- Build: `cmake --build` for `extension_llm_runner` target

1 parent 4528ae2 commit 67bc28b
8 files changed
Lines changed: 453 additions & 149 deletions