Commit b271aa6
fix(llm): auto-shape multimodal mediaPath messages in chat template (#1089)
## Description
`LLMController.generate()` collected `imagePaths` from messages with a
`mediaPath` set, but never transformed their `content` into the
`[{type:'image'}, {type:'text', text}]` form that the chat template
needs to emit the `<image>` placeholder. Calling `generate()` directly
with a vision-capable model (e.g. LFM2-VL) thus threw `"More images
paths provided than '<image>' placeholders in prompt"` from native, even
though `sendMessage()` worked because it built its own
`historyForTemplate` that did the transformation.
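The mismatch can be illustrated with a small sketch. Everything here is a hypothetical stand-in: the real check lives in native code that this description does not show, the helper names `countImagePlaceholders` and `assertPlaceholdersMatch` are made up, and the two rendered prompt strings are simplified examples rather than real template output. Only the error text is taken from the report above.

```typescript
// Hypothetical stand-in for the native-side check; not the library's code.
function countImagePlaceholders(prompt: string): number {
  // Number of literal '<image>' markers in the rendered prompt.
  return prompt.split('<image>').length - 1;
}

function assertPlaceholdersMatch(prompt: string, imagePaths: string[]): void {
  if (imagePaths.length > countImagePlaceholders(prompt)) {
    throw new Error("More images paths provided than '<image>' placeholders in prompt");
  }
}

// One image path is collected from the message's mediaPath in both cases.
const imagePaths = ['file:///path/to/image.jpg'];

// Assumed render when content is a plain string: no placeholder is emitted.
const unshapedPrompt = 'user: Describe this image.';
// Assumed render when content is the array form: one placeholder is emitted.
const shapedPrompt = 'user: <image>Describe this image.';

let unshapedThrew = false;
try {
  assertPlaceholdersMatch(unshapedPrompt, imagePaths);
} catch {
  unshapedThrew = true; // one path, zero placeholders
}

// One path, one placeholder: the check passes.
assertPlaceholdersMatch(shapedPrompt, imagePaths);
```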
This PR moves the transformation into `applyChatTemplate` so both call
sites (`generate` and `sendMessage`) get the correct behavior, and
removes the now-redundant `historyForTemplate` block from `sendMessage`.
The public `Message.content` type stays `string` — external callers
always pass plain strings; the controller handles the structured array
form internally.
The helper is idempotent: messages whose `content` is already an array
(e.g. callers who pre-shaped it as a workaround) are passed through
unchanged.
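A minimal sketch of what such a helper might look like, assuming a simplified `Message` shape with only `role`, `content`, and `mediaPath`. The `ContentPart` and `TemplateMessage` types are introduced here for illustration only (the actual helper returns `any[]`, as noted under Additional notes), and the body is a guess at the behavior described above, not the code from the diff.

```typescript
// Sketch only: simplified types, not the library's actual definitions.
type Role = 'system' | 'user' | 'assistant';

interface Message {
  role: Role;
  content: string; // the public type stays `string`
  mediaPath?: string;
}

// Hypothetical names for the structured form the chat template consumes.
type ContentPart = { type: 'image' } | { type: 'text'; text: string };
type TemplateMessage = Omit<Message, 'content'> & {
  content: string | ContentPart[];
};

function messagesForChatTemplate(messages: Message[]): TemplateMessage[] {
  return messages.map((message): TemplateMessage => {
    const content: unknown = message.content;
    // Pass through text-only messages, and messages a caller already
    // pre-shaped into the array form (this is what makes it idempotent).
    if (!message.mediaPath || Array.isArray(content)) {
      return message;
    }
    // Image first, then text, so the template emits the placeholder.
    return {
      ...message,
      content: [{ type: 'image' }, { type: 'text', text: message.content }],
    };
  });
}

const shaped = messagesForChatTemplate([
  { role: 'user', content: 'Describe this image.', mediaPath: 'file:///path/to/image.jpg' },
  { role: 'assistant', content: 'Sure.' },
]);
```

Because the helper does not read controller state, it can be called from both `generate` and `sendMessage` through `applyChatTemplate` without either caller shaping messages itself.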
### Introduces a breaking change?
- [ ] Yes
- [x] No
Public types are unchanged. `sendMessage` produces an identical rendered
chat-template string: the transformation just happens one step later in
the pipeline, so the rendered output is byte-identical and the token
count is unchanged. `generate` only changes behavior in cases that
previously threw, so this is a pure bug fix.
### Type of change
- [x] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing
documentation)
- [ ] Other (chores, tests, code style improvements etc.)
### Tested on
- [ ] iOS
- [ ] Android
The original bug was reproduced on a vision-capable model
(LFM2-VL-1.6B-quantized) on Android while building a downstream consumer
app. Re-verification of the fix on a real device is recommended before
merge — see Testing instructions below. ~I have not personally re-run
the failing scenario after the fix.~
### Testing instructions
To reproduce the original bug (without this PR):
```ts
import { LLMModule, LFM2_VL_1_6B_QUANTIZED } from 'react-native-executorch';
const llm = await LLMModule.fromModelName(LFM2_VL_1_6B_QUANTIZED);
await llm.generate([
  {
    role: 'user',
    content: 'Describe this image.',
    mediaPath: 'file:///path/to/image.jpg',
  },
]);
// Throws: "More images paths provided than '<image>' placeholders in prompt"
```
With this PR applied, the same call should succeed and return the
model's description.
Regression check: a vision-capable `sendMessage(text, { imagePath })`
flow should continue producing identical output.
### Screenshots
N/A (controller change, no UI).
### Related issues
Addresses items 1 and 2 of #1086. With item 1 fixed, item 2's
`Message.content` type mismatch no longer surfaces in practice because
external callers never need to construct the array form themselves (the
`as unknown as string` workaround that motivated item 2 becomes
unnecessary).
### Checklist
- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [ ] My changes generate no new warnings
### Additional notes
The `messagesForChatTemplate` helper lives at module scope rather than
as a static class method because it doesn't depend on controller state.
The internal `any[]` return type is a deliberate concession to the
dynamic shape the chat-template engine accepts; the public `Message[]`
input/output contract stays well-typed.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent e8d4305 commit b271aa6
1 file changed
Lines changed: 23 additions & 14 deletions