Merged
1 change: 1 addition & 0 deletions .cspell-wordlist.txt
@@ -46,3 +46,4 @@ Infima
sublabel
Aeonik
Lexend
finetuned
6 changes: 3 additions & 3 deletions README.md
@@ -65,10 +65,10 @@ function MyComponent() {

```tsx
const handleGenerate = async () => {
const prompt = 'The meaning of life is';
const chat = [{ role: 'user', content: 'What is the meaning of life?' }];

// Generate text based on your desired prompt
await llama.runInference(prompt);
// Chat completion
await llama.generate(chat);
console.log('Llama says:', llama.response);
};
```
2 changes: 1 addition & 1 deletion android/src/main/java/com/swmansion/rnexecutorch/LLM.kt
@@ -38,7 +38,7 @@ class LLM(
}
}

override fun runInference(
override fun forward(
input: String,
promise: Promise,
) {
47 changes: 30 additions & 17 deletions docs/docs/natural-language-processing/useLLM.md
@@ -68,13 +68,14 @@ const useLLM: ({
}) => LLMType;

interface LLMType {
messageHistory: MessageType[];
messageHistory: Message[];
response: string;
isReady: boolean;
isGenerating: boolean;
downloadProgress: number;
error: string | null;
runInference: (input: string) => Promise<void>;
forward: (input: string) => Promise<void>;
generate: (messages: Message[], tools?: LLMTool[]) => Promise<void>;
sendMessage: (message: string) => Promise<void>;
deleteMessage: (index: number) => void;
interrupt: () => void;
@@ -84,12 +85,12 @@ type ResourceSource = string | number;

type MessageRole = 'user' | 'assistant' | 'system';

interface MessageType {
interface Message {
role: MessageRole;
content: string;
}
interface ChatConfig {
initialMessageHistory: MessageType[];
initialMessageHistory: Message[];
contextWindowLength: number;
systemPrompt: string;
}
@@ -135,7 +136,7 @@ Given computational constraints, our architecture is designed to support only on

- **`systemPrompt`** - Often used to tell the model what its purpose is, for example: "Be a helpful translator".

- **`initialMessageHistory`** - An array of `MessageType` objects that represent the conversation history. This can be used to provide initial context to the model.
- **`initialMessageHistory`** - An array of `Message` objects that represent the conversation history. This can be used to provide initial context to the model.

- **`contextWindowLength`** - The number of messages from the current conversation that the model will use to generate a response. The higher the number, the more context the model will have. Keep in mind that using larger context windows will result in longer inference time and higher memory usage.
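As an illustration of what a context window limit means (a minimal sketch, not the library's actual implementation; `applyContextWindow` is a hypothetical helper), a window of length `N` can be thought of as keeping only the last `N` messages before inference:

```typescript
interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

// Hypothetical sketch: keep only the most recent `contextWindowLength`
// messages. The library's internal trimming strategy may differ.
function applyContextWindow(
  history: Message[],
  contextWindowLength: number
): Message[] {
  return history.slice(-contextWindowLength);
}
```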

@@ -149,18 +150,19 @@ Given computational constraints, our architecture is designed to support only on

### Returns

| Field | Type | Description |
| ------------------ | ------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `messageHistory` | `MessageType[]` | State of the generated response. This field is updated with each token generated by the model |
| `response` | `string` | State of the generated response. This field is updated with each token generated by the model |
| `isReady` | `boolean` | Indicates whether the model is ready |
| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response |
| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. |
| `error` | <code>string &#124; null</code> | Contains the error message if the model failed to load |
| `sendMessage` | `(message: string, tools?: LLMTool[]) => Promise<void>` | Method to add user message to conversation. After model responds, `messageHistory` will be updated with both user message and model response. |
| `deleteMessage` | `(index: number) => void` | Deletes all messages starting with message on `index` position. |
| `runInference` | `(input: string) => Promise<void>` | Runs model inference with raw input string. You need to provide entire conversation and prompt (in correct format and with special tokens!) in input string to this method. It doesn't manage conversation context. It is intended for users that need access to the model itself without any wrapper. If you want simple chat with model consider using `sendMessage` |
| `interrupt` | `() => void` | Function to interrupt the current inference |
| Field              | Type                                                        | Description                                                                                                                                                                                                                                                                                           |
| ------------------ | ----------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `messageHistory`   | `Message[]`                                                 | History of the conversation as an array of `Message` objects. Updated with both the user message and the model response after each completed turn.                                                                                                                                                     |
| `response`         | `string`                                                    | State of the generated response. This field is updated with each token generated by the model.                                                                                                                                                                                                         |
| `isReady`          | `boolean`                                                   | Indicates whether the model is ready.                                                                                                                                                                                                                                                                  |
| `isGenerating`     | `boolean`                                                   | Indicates whether the model is currently generating a response.                                                                                                                                                                                                                                        |
| `downloadProgress` | `number`                                                    | Download progress as a value between 0 and 1, indicating how much of the model file has been retrieved.                                                                                                                                                                                                |
| `error`            | <code>string &#124; null</code>                             | Contains the error message if the model failed to load.                                                                                                                                                                                                                                                |
| `sendMessage`      | `(message: string, tools?: LLMTool[]) => Promise<void>`     | Adds a user message to the conversation. After the model responds, `messageHistory` is updated with both the user message and the model response.                                                                                                                                                      |
| `deleteMessage`    | `(index: number) => void`                                   | Deletes the message at position `index` together with all messages that follow it. `messageHistory` is updated after deletion.                                                                                                                                                                         |
| `generate`         | `(messages: Message[], tools?: LLMTool[]) => Promise<void>` | Runs the model to complete the chat passed in the `messages` argument. It does not manage conversation context.                                                                                                                                                                                        |
| `forward`          | `(input: string) => Promise<void>`                          | Runs model inference on a raw input string. You must provide the entire conversation and prompt (correctly formatted, including special tokens) in the input string; the method does not manage conversation context. Intended for users who need direct access to the model without any wrapper. For a simple chat, consider `sendMessage` instead. |
| `interrupt`        | `() => void`                                                | Interrupts the current inference.                                                                                                                                                                                                                                                                      |
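To make the `deleteMessage` semantics concrete, here is a minimal sketch (illustrative only, not the library's code; `deleteFrom` is a hypothetical helper): deleting at `index` truncates the history from that position onward.

```typescript
interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

// Illustrative only: deleteMessage(index) removes the message at `index`
// and every message after it, keeping only the messages before it.
function deleteFrom(history: Message[], index: number): Message[] {
  return history.slice(0, index);
}
```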

## Sending a message

@@ -262,6 +264,17 @@ const message = `Hi, what's the weather like in Cracow right now?`;
await llm.sendMessage(message);
```
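The difference from `forward` is that `sendMessage` formats the conversation for you. A generic sketch of why `forward` needs a fully formatted prompt (the `<|…|>` tags below are placeholders, not the special tokens of any particular model, and `toRawPrompt` is a hypothetical helper):

```typescript
interface Message {
  role: string;
  content: string;
}

// Generic illustration: flatten a chat into one raw prompt string and
// append a generation cue for the assistant. Real chat templates and
// special tokens are model-specific.
function toRawPrompt(messages: Message[]): string {
  return (
    messages.map((m) => `<|${m.role}|>\n${m.content}\n`).join('') +
    '<|assistant|>\n'
  );
}
```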

## Available models

| Model Family | Sizes | Quantized |
| ---------------------------------------------------------------------------------------- | :--------------: | :-------: |
| [Hammer 2.1](https://huggingface.co/software-mansion/react-native-executorch-hammer-2.1) | 0.5B, 1.5B, 3B | ✅ |
| [Qwen 2.5](https://huggingface.co/software-mansion/react-native-executorch-qwen-2.5) | 0.5B, 1.5B, 3B | ✅ |
| [Qwen 3](https://huggingface.co/software-mansion/react-native-executorch-qwen-3) | 0.6B, 1.7B, 4B | ✅ |
| [Phi 4 Mini](https://huggingface.co/software-mansion/react-native-executorch-phi-4-mini) | 4B | ✅ |
| [SmolLM 2](https://huggingface.co/software-mansion/react-native-executorch-smolLm-2) | 135M, 360M, 1.7B | ✅ |
| [LLaMA 3.2](https://huggingface.co/software-mansion/react-native-executorch-llama-3.2) | 1B, 3B | ✅ |

## Benchmarks

### Model size