The WebLlmProvider runs LLMs directly in the browser using the MLC WebLLM runtime. It exposes WebLLM's `MLCEngineConfig` and `GenerationConfig` for fine-grained control over engine initialization and chat completions.
ℹ️ Info:
This configuration guide assumes you have completed the setup for the LlmConnector plugin according to the guide here.
First, install the WebLLM runtime:

```bash
npm install @mlc-ai/web-llm
```

Then import the provider:

```javascript
import { WebLlmProvider } from "@rcb-plugins/llm-connector";
```

A minimal example for browser-based inference:

```javascript
const webllm = new WebLlmProvider({
  model: "Qwen2-0.5B-Instruct-q4f16_1-MLC", // model identifier from the MLC WebLLM model list
  responseFormat: "stream", // "stream" (default) or "json"
});
```
⚠️ Warning: Loading large models in the browser can significantly impact performance and memory usage, so select and test your choice of models carefully.
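Once constructed, the provider is passed to the LlmConnector plugin inside a flow block. The sketch below assumes the block-level `llmConnector` attribute and the `start` block name from the setup guide linked above; consult that guide for the exact wiring:

```javascript
import ChatBot from "react-chatbotify";
import LlmConnector, { WebLlmProvider } from "@rcb-plugins/llm-connector";

// Sketch: hand the provider to the plugin via a flow block's
// llmConnector attribute (block name and wiring per the setup guide).
const flow = {
  start: {
    llmConnector: {
      provider: new WebLlmProvider({
        model: "Qwen2-0.5B-Instruct-q4f16_1-MLC",
      }),
    },
  },
};

const MyChatBot = () => <ChatBot plugins={[LlmConnector()]} flow={flow} />;
```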
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| `model` | `string` | ✅ always | — | The model name or path to load via MLC WebLLM (e.g. `Qwen2-0.5B-Instruct-q4f16_1-MLC`). You can find the list of models here, or enumerate them at runtime (see the snippet after this table). |
| `systemMessage` | `string` | ❌ | `null` | Prepends a system prompt before user messages. |
| `responseFormat` | `"stream" \| "json"` | ❌ | `"stream"` | Determines whether responses are streamed from the engine or returned as a single full JSON output. |
| `engineConfig` | `MLCEngineConfig` | ❌ | `{}` | Custom engine initialization options, referenced from MLCEngineConfig. |
| `chatCompletionOptions` | `GenerationConfig` | ❌ | `{}` | Custom chat completion options, referenced from GenerationConfig. |
| `messageParser` | `(msgs: Message[]) => CustomMessage[]` | ❌ | `null` | Custom parser that converts React ChatBotify `Message[]` into the message format expected by the provider. |
| `debug` | `boolean` | ❌ | `false` | Enables debug logging for the provider. |
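Besides the linked list, the prebuilt model IDs can also be enumerated at runtime from the WebLLM package itself; this sketch assumes only `prebuiltAppConfig`, which `@mlc-ai/web-llm` exports:

```javascript
import { prebuiltAppConfig } from "@mlc-ai/web-llm";

// Log every prebuilt model ID bundled with the WebLLM runtime; any of
// these values can be passed as the `model` option above.
for (const { model_id } of prebuiltAppConfig.model_list) {
  console.log(model_id);
}
```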
A fuller configuration example:

```javascript
const webllm = new WebLlmProvider({
  model: "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  systemMessage: "You are a helpful assistant in the browser.",
  responseFormat: "stream",
  engineConfig: {
    // MLCEngineConfig field: reports model loading progress
    initProgressCallback: (report) => console.log(report.text),
  },
  chatCompletionOptions: { temperature: 0.7, top_p: 0.95 },
  messageParser: (msgs) =>
    msgs.map((m) => ({ role: m.sender.toLowerCase(), content: String(m.content) })),
});
```

Under the hood:

- **Constructor:** sets defaults such as `model`.
- **sendMessages():**
  - Invokes `engine.chat.completions.create()`.
  - If streaming, yields each chunk's text content (see the sketch below).
  - Otherwise, yields the full text response.
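The streaming behavior described above can be sketched roughly as follows. This is illustrative only, not the plugin's actual source: the standalone `sendMessages` signature and the `engine` parameter are assumptions, while `engine.chat.completions.create()` is WebLLM's OpenAI-style API.

```javascript
// Rough sketch of the flow described above (names are illustrative).
async function* sendMessages(engine, messages, options = {}) {
  if (options.responseFormat !== "json") {
    // Streaming path: create() returns an async iterable of chunks.
    const chunks = await engine.chat.completions.create({
      messages,
      stream: true,
      ...options.chatCompletionOptions,
    });
    for await (const chunk of chunks) {
      const text = chunk.choices[0]?.delta?.content;
      if (text) {
        yield text;
      }
    }
  } else {
    // JSON path: wait for the full completion, then yield it once.
    const completion = await engine.chat.completions.create({
      messages,
      ...options.chatCompletionOptions,
    });
    yield completion.choices[0]?.message?.content ?? "";
  }
}
```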
Check out other providers here.