feat: implementation of multimodal runner #892

Merged
NorbertKlockiewicz merged 58 commits into main from @nk/lfm-vlm
Mar 11, 2026
@NorbertKlockiewicz (Contributor) commented Mar 2, 2026

Description

Adds vision/multimodal support to useLLM: load a VLM by passing capabilities: ['vision'], then call sendMessage(text, { imagePath }) to send a message with an image. Under the hood, this introduces a pluggable encoder architecture (IEncoder / VisionEncoder), a dedicated MultimodalRunner, and a refactored BaseLLMRunner with cleaner ownership and shared state. It also exposes a getVisualTokenCount() JSI method for accurate token counting when images are present. The text-only path is unchanged.
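
To make the layering above easier to picture, here is a minimal TypeScript model of it. This is only a sketch: the actual runners in this PR are written in C++, and apart from the names IEncoder, VisionEncoder, MultimodalRunner and getVisualTokenCount, everything here (the `Capability` type, `tokensPerInput`, `register`, `countPromptTokens`, and the assumption that every image costs a fixed number of tokens) is hypothetical and chosen for illustration.

```typescript
// Hypothetical model of the encoder/runner split; not the PR's C++ code.
type Capability = 'vision';

interface IEncoder {
  readonly capability: Capability;
  // Assumption: each encoded input occupies a fixed number of context tokens.
  tokensPerInput(): number;
}

class VisionEncoder implements IEncoder {
  readonly capability: Capability = 'vision';
  private readonly tokensPerImage: number;

  constructor(tokensPerImage: number) {
    this.tokensPerImage = tokensPerImage;
  }

  tokensPerInput(): number {
    return this.tokensPerImage;
  }
}

class MultimodalRunner {
  // Encoders are looked up by capability, so new modalities can be plugged in.
  private readonly encoders = new Map<Capability, IEncoder>();

  register(encoder: IEncoder): void {
    this.encoders.set(encoder.capability, encoder);
  }

  getVisualTokenCount(): number {
    const encoder = this.encoders.get('vision');
    if (encoder === undefined) {
      throw new Error("no encoder registered for the 'vision' capability");
    }
    return encoder.tokensPerInput();
  }

  // Text-only prompts never touch an encoder; images add a fixed cost each.
  countPromptTokens(textTokenCount: number, imagePaths: string[]): number {
    if (imagePaths.length === 0) {
      return textTokenCount;
    }
    return textTokenCount + imagePaths.length * this.getVisualTokenCount();
  }
}

const runner = new MultimodalRunner();
runner.register(new VisionEncoder(256)); // 256 is a made-up value
const total = runner.countPromptTokens(12, ['a.jpg', 'b.jpg']);
console.log(total); // 12 + 2 * 256 = 524
```

In this model a runner with no registered encoder still counts text-only prompts, which mirrors the statement that the text-only path is unchanged.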

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

Run the llm example app and open the multimodal LLM screen. Select an image and prompt the model.

Screenshots

Related issues

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

@NorbertKlockiewicz changed the title from "feat: initial implementation of multimodal runner with lfm vlm" to "feat implementation of multimodal runner" Mar 5, 2026
@NorbertKlockiewicz NorbertKlockiewicz marked this pull request as ready for review March 5, 2026 16:19
@msluszniak changed the title from "feat implementation of multimodal runner" to "feat: implementation of multimodal runner" Mar 5, 2026
@msluszniak msluszniak added the feature PRs that implement a new feature label Mar 5, 2026
This was linked to issues Mar 6, 2026
@NorbertKlockiewicz NorbertKlockiewicz added this to the v0.8.0 milestone Mar 6, 2026
@chmjkb chmjkb removed this from the v0.8.0 milestone Mar 6, 2026
@NorbertKlockiewicz (Contributor, Author) commented Mar 10, 2026

The API reference will be generated after the PR is approved.

@msluszniak (Member) commented

On Hugging Face, you can add information about which version of ExecuTorch the model was exported with.

Comment thread packages/react-native-executorch/src/constants/modelUrls.ts
Comment on lines +288 to +291
  public async generate(
    messages: Message[],
-   tools?: LLMTool[]
+   tools?: LLMTool[],
+   imagePaths?: string[]
Collaborator commented

I'm not sure I understand this:
Why are we passing imagePaths if the Message type includes a mediaPath member? It seems like the user needs to pass the same thing twice.
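
One way the duplication raised in this comment could be avoided is to derive the image list from the messages themselves. The sketch below illustrates that idea only; it is not necessarily how the PR resolved the thread. The `Message` shape is an assumption (only the `mediaPath` member is confirmed by the comment), and `collectImagePaths` is a hypothetical helper name.

```typescript
// Assumed shape: only `mediaPath` is confirmed by the review comment.
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
  mediaPath?: string;
}

// Hypothetical helper: gather image paths in message order so callers
// do not have to pass them a second time alongside the messages.
function collectImagePaths(messages: Message[]): string[] {
  return messages
    .map((message) => message.mediaPath)
    .filter((path): path is string => typeof path === 'string' && path.length > 0);
}

const history: Message[] = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What is in this picture?', mediaPath: '/tmp/cat.jpg' },
  { role: 'assistant', content: 'A cat sitting on a sofa.' },
];

const imagePaths = collectImagePaths(history);
console.log(imagePaths); // [ '/tmp/cat.jpg' ]
```

With a helper like this, a `generate(messages, tools)` call could compute the paths internally instead of taking a separate `imagePaths` argument.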

NorbertKlockiewicz and others added 23 commits March 11, 2026 11:35
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… EOS IDs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…g cache

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…kenCount JSI

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… runner classes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ad image shape from model metadata

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…mage_token from config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@chmjkb (Collaborator) left a comment

good job 🥳

Co-authored-by: Jakub Chmura <92989966+chmjkb@users.noreply.github.com>
@NorbertKlockiewicz NorbertKlockiewicz enabled auto-merge (squash) March 11, 2026 11:09
@benITo47 (Contributor) left a comment

Great changes overall! Thanks!

@NorbertKlockiewicz NorbertKlockiewicz merged commit ce065d2 into main Mar 11, 2026
5 checks passed
@NorbertKlockiewicz NorbertKlockiewicz deleted the @nk/lfm-vlm branch March 11, 2026 11:12

Labels

feature PRs that implement a new feature

Development

Successfully merging this pull request may close these issues.

Add support for LFM2.5-VL-1.6B
Add VLM support

4 participants