
Commit 2bddae5

Mateusz Kopciński (mkopcins) authored
chore: 0.5.8 (#633)
## Description

### Introduces a breaking change?

- [ ] Yes
- [ ] No

### Type of change

- [ ] Bug fix (change which fixes an issue)
- [ ] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [ ] Other (chores, tests, code style improvements etc.)

### Tested on

- [ ] iOS
- [ ] Android

### Testing instructions

### Screenshots

### Related issues

### Checklist

- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [ ] My changes generate no new warnings

### Additional notes

---------

Co-authored-by: Mateusz Kopciński <mateusz.kopcinski@swmansnion.com>
1 parent 86b2a59 commit 2bddae5

File tree

7 files changed (+79, -39 lines)


docs/docs/02-hooks/01-natural-language-processing/useLLM.md

Lines changed: 6 additions & 5 deletions
@@ -113,6 +113,7 @@ interface LLMType {
     toolsConfig?: ToolsConfig;
     generationConfig?: GenerationConfig;
   }) => void;
+  getGeneratedTokenCount: () => number;
   generate: (messages: Message[], tools?: LLMTool[]) => Promise<void>;
   sendMessage: (message: string) => Promise<void>;
   deleteMessage: (index: number) => void;
@@ -133,6 +134,11 @@ interface ChatConfig {
   systemPrompt: string;
 }
 
+interface GenerationConfig {
+  outputTokenBatchSize: number;
+  batchTimeInterval: number;
+}
+
 // tool calling
 interface ToolsConfig {
   tools: LLMTool[];
@@ -145,11 +151,6 @@ interface ToolCall {
   arguments: Object;
 }
 
-interface GenerationConfig {
-  outputTokenBatchSize: number;
-  batchTimeInterval: number;
-}
-
 type LLMTool = Object;
 ```
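The hunks above add `getGeneratedTokenCount` to the hook's return type and move `GenerationConfig` ahead of the tool-calling types. A minimal sketch of exercising the updated surface, using an in-memory stand-in for the real hook (the stub, its one-token-per-word echo rule, and `makeLLMStub` are illustrative assumptions, not part of this commit — the real hook runs an on-device model):

```typescript
// Illustrative stand-in for the useLLM return value after this commit.
interface GenerationConfig {
  outputTokenBatchSize: number;
  batchTimeInterval: number;
}

interface LLMStub {
  configure: (opts: { generationConfig?: GenerationConfig }) => void;
  sendMessage: (message: string) => Promise<void>;
  getGeneratedTokenCount: () => number;
}

function makeLLMStub(): LLMStub {
  let generatedTokens = 0;
  return {
    configure: (opts) => {
      // The real hook would apply opts.generationConfig to token batching.
    },
    sendMessage: async (message) => {
      // Pretend the model produced one token per word of the prompt.
      generatedTokens = message.split(/\s+/).length;
    },
    getGeneratedTokenCount: () => generatedTokens,
  };
}

const llm = makeLLMStub();
llm.configure({ generationConfig: { outputTokenBatchSize: 10, batchTimeInterval: 80 } });
void llm.sendMessage("hello from the sketch"); // stub updates state synchronously
console.log(llm.getGeneratedTokenCount()); // 4
```

In the documented API these members come from `useLLM` itself; the stub only mirrors the shape introduced by this diff.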

docs/docs/03-typescript-api/01-natural-language-processing/LLMModule.md

Lines changed: 5 additions & 5 deletions
@@ -62,18 +62,18 @@ interface ChatConfig {
   systemPrompt: string;
 }
 
+interface GenerationConfig {
+  outputTokenBatchSize: number;
+  batchTimeInterval: number;
+}
+
 // tool calling
 interface ToolsConfig {
   tools: LLMTool[];
   executeToolCallback: (call: ToolCall) => Promise<string | null>;
   displayToolCalls?: boolean;
 }
 
-interface GenerationConfig {
-  outputTokenBatchSize: number;
-  batchTimeInterval: number;
-}
-
 interface ToolCall {
   toolName: string;
   arguments: Object;

docs/versioned_docs/version-0.5.x/02-hooks/01-natural-language-processing/useLLM.md

Lines changed: 34 additions & 15 deletions
@@ -64,20 +64,21 @@ For more information on loading resources, take a look at [loading models](../..
 
 ### Returns
 
-| Field | Type | Description |
-| --- | --- | --- |
-| `generate()` | `(messages: Message[], tools?: LLMTool[]) => Promise<void>` | Runs model to complete chat passed in `messages` argument. It doesn't manage conversation context. |
-| `interrupt()` | `() => void` | Function to interrupt the current inference. |
-| `response` | `string` | State of the generated response. This field is updated with each token generated by the model. |
-| `token` | `string` | The most recently generated token. |
-| `isReady` | `boolean` | Indicates whether the model is ready. |
-| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response. |
-| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. |
-| `error` | <code>string &#124; null</code> | Contains the error message if the model failed to load. |
-| `configure` | `({ chatConfig?: Partial<ChatConfig>, toolsConfig?: ToolsConfig }) => void` | Configures chat and tool calling. See more details in [configuring the model](#configuring-the-model). |
-| `sendMessage` | `(message: string) => Promise<void>` | Function to add user message to conversation. After model responds, `messageHistory` will be updated with both user message and model response. |
-| `deleteMessage` | `(index: number) => void` | Deletes all messages starting with message on `index` position. After deletion `messageHistory` will be updated. |
-| `messageHistory` | `Message[]` | History containing all messages in conversation. This field is updated after model responds to `sendMessage`. |
+| Field | Type | Description |
+| --- | --- | --- |
+| `generate()` | `(messages: Message[], tools?: LLMTool[]) => Promise<void>` | Runs model to complete chat passed in `messages` argument. It doesn't manage conversation context. |
+| `interrupt()` | `() => void` | Function to interrupt the current inference. |
+| `response` | `string` | State of the generated response. This field is updated with each token generated by the model. |
+| `token` | `string` | The most recently generated token. |
+| `isReady` | `boolean` | Indicates whether the model is ready. |
+| `isGenerating` | `boolean` | Indicates whether the model is currently generating a response. |
+| `downloadProgress` | `number` | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. |
+| `error` | <code>string &#124; null</code> | Contains the error message if the model failed to load. |
+| `configure` | `({chatConfig?: Partial<ChatConfig>, toolsConfig?: ToolsConfig, generationConfig?: GenerationConfig}) => void` | Configures chat and tool calling. See more details in [configuring the model](#configuring-the-model). |
+| `sendMessage` | `(message: string) => Promise<void>` | Function to add user message to conversation. After model responds, `messageHistory` will be updated with both user message and model response. |
+| `deleteMessage` | `(index: number) => void` | Deletes all messages starting with message on `index` position. After deletion `messageHistory` will be updated. |
+| `messageHistory` | `Message[]` | History containing all messages in conversation. This field is updated after model responds to `sendMessage`. |
+| `getGeneratedTokenCount` | `() => number` | Returns the number of tokens generated in the last response. |
 
 <details>
 <summary>Type definitions</summary>
@@ -106,10 +107,13 @@ interface LLMType {
   configure: ({
     chatConfig,
     toolsConfig,
+    generationConfig,
   }: {
     chatConfig?: Partial<ChatConfig>;
     toolsConfig?: ToolsConfig;
+    generationConfig?: GenerationConfig;
   }) => void;
+  getGeneratedTokenCount: () => number;
   generate: (messages: Message[], tools?: LLMTool[]) => Promise<void>;
   sendMessage: (message: string) => Promise<void>;
   deleteMessage: (index: number) => void;
@@ -130,6 +134,11 @@ interface ChatConfig {
   systemPrompt: string;
 }
 
+interface GenerationConfig {
+  outputTokenBatchSize: number;
+  batchTimeInterval: number;
+}
+
 // tool calling
 interface ToolsConfig {
   tools: LLMTool[];
@@ -151,7 +160,7 @@ type LLMTool = Object;
 
 You can use functions returned from this hooks in two manners:
 
-1. Functional/pure - we will not keep any state for you. You'll need to keep conversation history and handle function calling yourself. Use `generate` (and rarely `forward`) and `response`. Note that you don't need to run `configure` to use those. Furthermore, it will not have any effect on those functions.
+1. Functional/pure - we will not keep any state for you. You'll need to keep conversation history and handle function calling yourself. Use `generate` (and rarely `forward`) and `response`. Note that you don't need to run `configure` to use those. Furthermore, `chatConfig` and `toolsConfig` will not have any effect on those functions.
 
 2. Managed/stateful - we will manage conversation state. Tool calls will be parsed and called automatically after passing appropriate callbacks. See more at [managed LLM chat](#managed-llm-chat).
 
@@ -271,6 +280,12 @@ To configure model (i.e. change system prompt, load initial conversation history
 
 - **`displayToolCalls`** - If set to true, JSON tool calls will be displayed in chat. If false, only answers will be displayed.
 
+**`generationConfig`** - Object configuring generation settings, currently only output token batching.
+
+- **`outputTokenBatchSize`** - Soft upper limit on the number of tokens in each token batch (in certain cases there can be more tokens in given batch, i.e. when the batch would end with special emoji join character).
+
+- **`batchTimeInterval`** - Upper limit on the time interval between consecutive token batches.
+
 ### Sending a message
 
 In order to send a message to the model, one can use the following code:
@@ -463,6 +478,10 @@ The response should include JSON:
 }
 ```
 
+## Token Batching
+
+Depending on selected model and the user's device generation speed can be above 60 tokens per second. If the `tokenCallback` triggers rerenders and is invoked on every single token it can significantly decrease the app's performance. To alleviate this and help improve performance we've implemented token batching. To configure this you need to call `configure` method and pass `generationConfig`. Inside you can set two parameters `outputTokenBatchSize` and `batchTimeInterval`. They set the size of the batch before tokens are emitted and the maximum time interval between consecutive batches respectively. Each batch is emitted if either `timeInterval` elapses since last batch or `countInterval` number of tokens are generated. This allows for smooth generation even if model lags during generation. Default parameters are set to 10 tokens and 80ms for time interval (~12 batches per second).
+
 ## Available models
 
 | Model Family | Sizes | Quantized |
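The batching rule added in the `## Token Batching` section (flush once `outputTokenBatchSize` tokens accumulate, or once `batchTimeInterval` ms pass since the last flush) can be sketched as a pure function. This is a hypothetical illustration of the policy with the documented defaults of 10 tokens / 80 ms, not the library's implementation:

```typescript
interface GenerationConfig {
  outputTokenBatchSize: number; // soft cap on tokens per emitted batch
  batchTimeInterval: number;    // max ms between consecutive batches
}

// Group (token, timestamp) pairs into batches: a batch is flushed when it
// reaches outputTokenBatchSize tokens, or when batchTimeInterval ms have
// elapsed since the previous flush; leftovers are flushed at the end.
function batchTokens(
  tokens: { token: string; atMs: number }[],
  config: GenerationConfig
): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  let lastFlushMs = tokens.length > 0 ? tokens[0].atMs : 0;

  for (const { token, atMs } of tokens) {
    current.push(token);
    const sizeReached = current.length >= config.outputTokenBatchSize;
    const intervalElapsed = atMs - lastFlushMs >= config.batchTimeInterval;
    if (sizeReached || intervalElapsed) {
      batches.push(current);
      current = [];
      lastFlushMs = atMs;
    }
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// 25 fast tokens, 5 ms apart: the size cap triggers before the time cap,
// so batches of 10, 10, and a final 5 are emitted.
const fast = Array.from({ length: 25 }, (_, i) => ({ token: `t${i}`, atMs: i * 5 }));
const result = batchTokens(fast, { outputTokenBatchSize: 10, batchTimeInterval: 80 });
console.log(result.map((b) => b.length)); // [10, 10, 5]
```

With slow generation the time cap dominates instead, so the UI still receives a batch at least every `batchTimeInterval` ms, which is what keeps rendering smooth when the model lags.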
