
WebLlmProvider Configuration Guide

The WebLlmProvider runs language models directly in the browser using the MLC WebLLM runtime. It exposes the WebLLM MLCEngineConfig and GenerationConfig for engine and generation tuning.

ℹ️ Info:
This configuration guide assumes you have completed the setup for the LlmConnector plugin according to the guide here.


1. Install Dependency Package & Import Provider

npm install @mlc-ai/web-llm
import { WebLlmProvider } from "@rcb-plugins/llm-connector";

2. Basic Instantiation

A minimal example for browser-based inference:

const webllm = new WebLlmProvider({
  model: "Qwen2-0.5B-Instruct-q4f16_1-MLC", // a model ID from the MLC prebuilt list
  responseFormat: "stream",                 // "stream" (default) or "json"
});
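Once instantiated, the provider is passed into your chatbot flow. A minimal sketch, assuming the llmConnector block attribute and default plugin export from the LlmConnector setup guide linked above (verify the names against your plugin version):

import ChatBot from "react-chatbotify";
import LlmConnector, { WebLlmProvider } from "@rcb-plugins/llm-connector";

const flow = {
  start: {
    // hands the conversation over to the WebLLM provider
    // (attribute names assumed from the LlmConnector setup guide)
    llmConnector: {
      provider: new WebLlmProvider({ model: "Qwen2-0.5B-Instruct-q4f16_1-MLC" }),
    },
  },
};

const App = () => <ChatBot flow={flow} plugins={[LlmConnector()]} />;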

⚠️ Warning: Loading large models in the browser can significantly impact performance and memory usage, so select and test your choice of models carefully.
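To gauge what a model will cost before loading it, you can inspect WebLLM's prebuilt model list. A minimal sketch, assuming the vram_required_MB and low_resource_required fields present in recent @mlc-ai/web-llm versions (verify against the version you have installed):

import { prebuiltAppConfig } from "@mlc-ai/web-llm";

// List prebuilt models flagged as suitable for low-resource devices,
// together with their estimated VRAM requirements.
for (const record of prebuiltAppConfig.model_list) {
  if (record.low_resource_required) {
    console.log(`${record.model_id}: ~${record.vram_required_MB} MB VRAM`);
  }
}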


3. Configuration Options

| Option | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | ✅ | - | The model name or path to load via MLC WebLLM (e.g. Qwen2-0.5B-Instruct-q4f16_1-MLC). You can find the list of models here. |
| systemMessage | string | ❌ | null | Prepends a system prompt before user messages. |
| responseFormat | "stream" \| "json" | ❌ | "stream" | Determines whether the provider streams its response or returns a full JSON output. |
| engineConfig | MLCEngineConfig | ❌ | {} | Custom engine initialization options, referenced from MLCEngineConfig. |
| chatCompletionOptions | GenerationConfig | ❌ | {} | Custom chat completion options, referenced from GenerationConfig. |
| messageParser | (msgs: Message[]) => CustomMessage[] | ❌ | null | Custom parser converting React ChatBotify Message[] into the message format expected by the provider. |
| debug | boolean | ❌ | false | Enables debug logging for the provider. |
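The messageParser is the option that most often needs customizing. A minimal sketch of a parser mapping React ChatBotify messages to OpenAI-style roles (the sender values and Message fields are assumptions based on React ChatBotify's Message shape; adjust to your version):

import { Message } from "react-chatbotify";

// Maps each chat message to an OpenAI-style { role, content } pair.
// Assumes sender is "USER" for user messages and treats everything else
// as assistant output; non-string content (e.g. JSX) is skipped.
const messageParser = (msgs: Message[]) =>
  msgs
    .filter((m) => typeof m.content === "string")
    .map((m) => ({
      role: m.sender.toUpperCase() === "USER" ? "user" : "assistant",
      content: m.content as string,
    }));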

4. Advanced Example

const webllm = new WebLlmProvider({
  model: "Qwen2-0.5B-Instruct-q4f16_1-MLC",
  systemMessage: "You are a helpful assistant in the browser.",
  responseFormat: "stream",
  engineConfig: {
    // reports model download/compilation progress while the engine loads
    initProgressCallback: (report) => console.log(report.text),
  },
  chatCompletionOptions: { temperature: 0.7, top_p: 0.95 },
  messageParser: (msgs) =>
    msgs.map((m) => ({
      role: m.sender.toUpperCase() === "USER" ? "user" : "assistant",
      content: String(m.content),
    })),
});

5. How It Works Under the Hood

  1. Constructor: stores options such as model and applies defaults.
  2. sendMessages():
    • Invokes engine.chat.completions.create().
    • If streaming, yields each chunk's text content.
    • Otherwise, yields the full text response.
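A sketch of what the streaming path looks like, assuming WebLLM's OpenAI-style chat API (the surrounding provider plumbing is simplified away):

import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Simplified sketch of the provider's streaming path: create an engine,
// request a streamed completion, and yield each chunk's text content.
async function* sendMessages(
  model: string,
  messages: { role: "system" | "user" | "assistant"; content: string }[],
) {
  const engine = await CreateMLCEngine(model);
  const chunks = await engine.chat.completions.create({ messages, stream: true });
  for await (const chunk of chunks) {
    yield chunk.choices[0]?.delta?.content ?? "";
  }
}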

Check out other providers here.