
Commit e8d4305

msluszniak and claude authored
fix(llm): normalize multimodal image paths to file:// URIs (#1090)
## Description

`LLMController.forward` passed `imagePaths` straight through to `nativeModule.generateMultimodal` with no normalization. The native side requires the `file://` prefix; without it, native throws `"Read image error: invalid argument"` with no further context. Callers can plausibly arrive with either form:

- `ResourceFetcher.fetch` returns raw paths *without* `file://` (per its own docstring on the `fetch` method).
- Platform image-picker APIs (e.g. `expo-image-picker`) typically return `file:///...` URIs.
- The same path string passed to a vision module's `forward(...)` works either way; the asymmetry between vision modules and the multimodal LLM is undocumented.

This PR normalizes each image path inside `LLMController.forward` so both forms work, and updates the JSDoc on `Message.mediaPath` and `LLMModule.forward`'s `imagePaths` to document the new contract.

### Introduces a breaking change?

- [ ] Yes
- [x] No

Strictly additive: previously-working calls (paths with `file://`) keep working unchanged. Previously-failing calls (paths without `file://`) now succeed.

### Type of change

- [x] Bug fix (change which fixes an issue)
- [ ] New feature
- [ ] Documentation update
- [ ] Other

### Tested on

- [ ] iOS
- [x] Android

The bare-path failure was reproduced on Android (Samsung Galaxy S24 Ultra) with LFM2-VL-1.6B while building a downstream consumer; both forms were tested manually post-fix on the same device. Re-verification of both forms on iOS is recommended.

### Testing instructions

```ts
import { LLMModule, LFM2_VL_1_6B_QUANTIZED } from 'react-native-executorch';

const llm = await LLMModule.fromModelName(LFM2_VL_1_6B_QUANTIZED);

// Both should now work; previously only the first did.
await llm.generate([
  { role: 'user', content: 'Describe.', mediaPath: 'file:///absolute/path/to/img.jpg' },
]);
await llm.generate([
  { role: 'user', content: 'Describe.', mediaPath: '/absolute/path/to/img.jpg' },
]);
```

### Related issues

Addresses item 3 of #1086.
### Checklist

- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [ ] My changes generate no new warnings

### Additional notes

The normalizer is module-scoped (matching `messagesForChatTemplate` from #1089) rather than a class method because it doesn't depend on controller state.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent da22171 commit e8d4305

3 files changed

Lines changed: 16 additions & 2 deletions


packages/react-native-executorch/src/controllers/LLMController.ts

Lines changed: 13 additions & 1 deletion
```diff
@@ -254,7 +254,7 @@ export class LLMController {
       imagePaths && imagePaths.length > 0
         ? await this.nativeModule.generateMultimodal(
             input,
-            imagePaths,
+            imagePaths.map(normalizeImagePath),
             this.getImageToken(),
             this.onToken
           )
@@ -456,3 +456,15 @@ export class LLMController {
     return result;
   }
 }
+
+/**
+ * The native multimodal pipeline expects image paths to be `file://` URIs.
+ * `ResourceFetcher.fetch` and most platform file APIs return raw filesystem
+ * paths without that prefix, so callers routinely pass either form. Accept
+ * both and normalize to the prefixed form here.
+ * @param path - Local image path, either with or without the `file://` prefix.
+ * @returns The same path with a `file://` prefix.
+ */
+function normalizeImagePath(path: string): string {
+  return path.startsWith('file://') ? path : `file://${path}`;
+}
```
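A quick standalone sanity check of the normalizer's behavior. The function is copied here for illustration, since in the package it is a module-scoped, non-exported helper; note that the mapping is idempotent, so double-normalizing is harmless:

```typescript
// Local copy of the normalizer for illustration; in the package it is a
// module-scoped (non-exported) helper in LLMController.ts.
function normalizeImagePath(path: string): string {
  return path.startsWith('file://') ? path : `file://${path}`;
}

// A bare absolute path gains the prefix:
console.log(normalizeImagePath('/data/user/0/app/cache/img.jpg'));
// file:///data/user/0/app/cache/img.jpg

// An already-prefixed URI passes through unchanged, so
// normalizeImagePath(normalizeImagePath(p)) === normalizeImagePath(p):
console.log(normalizeImagePath('file:///data/user/0/app/cache/img.jpg'));
// file:///data/user/0/app/cache/img.jpg
```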

packages/react-native-executorch/src/modules/natural_language_processing/LLMModule.ts

Lines changed: 1 addition & 1 deletion
```diff
@@ -139,7 +139,7 @@ export class LLMModule {
    * It doesn't manage conversation context. It is intended for users that need access to the model itself without any wrapper.
    * If you want a simple chat with model the consider using `sendMessage`
    * @param input - Raw input string containing the prompt and conversation history.
-   * @param imagePaths - Optional array of local image paths for multimodal inference.
+   * @param imagePaths - Optional array of local image paths for multimodal inference. Each entry may be either `file:///absolute/path` or `/absolute/path` — the controller normalizes the path before passing it to native code.
    * @returns The generated response as a string.
    */
   async forward(input: string, imagePaths?: string[]): Promise<string> {
```

packages/react-native-executorch/src/types/llm.ts

Lines changed: 2 additions & 0 deletions
```diff
@@ -270,6 +270,8 @@ export interface Message {
   /**
    * Optional local file path to media (image, audio, etc.).
    * Only valid on `user` messages.
+   * Either `file:///absolute/path` or `/absolute/path` is accepted; the
+   * controller normalizes the path before passing it to native code.
    */
   mediaPath?: string;
 }
```
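Under the documented contract, both spellings of `mediaPath` below are equivalent from the caller's point of view. A minimal sketch; the `Message` interface is mirrored locally rather than imported, and the image paths are made up:

```typescript
// Local mirror of the Message shape for illustration; the real interface
// lives in packages/react-native-executorch/src/types/llm.ts.
interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
  mediaPath?: string;
}

// Same logic the controller applies before calling into native code.
const normalize = (p: string): string =>
  p.startsWith('file://') ? p : `file://${p}`;

const withPrefix: Message = {
  role: 'user',
  content: 'Describe this image.',
  mediaPath: 'file:///sdcard/Pictures/cat.jpg', // hypothetical path
};
const withoutPrefix: Message = {
  role: 'user',
  content: 'Describe this image.',
  mediaPath: '/sdcard/Pictures/cat.jpg', // hypothetical path
};

// Both normalize to the same native input:
console.log(normalize(withPrefix.mediaPath!) === normalize(withoutPrefix.mediaPath!)); // true
```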
