@@ -123,16 +126,31 @@ When a `summarizer` is provided, each message goes through a three-level fallbac
#### Token budget
-Use `tokenBudget` to automatically find the least compression needed to fit a token limit. The engine binary-searches `recencyWindow` internally. Token counts are estimated at ~3.5 characters per token, a reasonable average across models, but not exact. For precise budgeting, use `tokenBudget` as an approximate guide and verify with your model's tokenizer.
+Use `tokenBudget` to automatically find the least compression needed to fit a token limit. The engine binary-searches `recencyWindow` internally.
+
+By default, tokens are estimated at ~3.5 characters per token. For accurate budgeting, pass a `tokenCounter` that uses your model's tokenizer. The counter is used for all budget decisions, binary search iterations, force-converge deltas, and `token_ratio` stats.
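As a rough illustration of the default estimate described above, the following sketch divides character count by 3.5 and rounds up. The `Message` shape, the function names, and the rounding choice are assumptions made for this example, not the library's actual internals.

```ts
// Hypothetical message shape, assumed for illustration only.
interface Message {
  role: string;
  content: unknown;
}

// Ratio taken from the "~3.5 characters per token" figure in the text.
const CHARS_PER_TOKEN = 3.5;

// Character-based estimate. Non-string content counts as zero tokens here,
// and rounding up is an assumption of this sketch.
function estimateTokens(msg: Message): number {
  const text = typeof msg.content === 'string' ? msg.content : '';
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Sum the per-message estimates for a whole conversation.
function estimateTotal(messages: Message[]): number {
  return messages.reduce((sum, m) => sum + estimateTokens(m), 0);
}
```

An estimate like this can drift noticeably for code or non-English text, which is the case where a tokenizer-backed `tokenCounter` would matter most.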
+    const text = typeof msg.content === 'string' ? msg.content : '';
+    return encode(text).length;
+  },
+});

// With LLM summarizer for tighter fits
const result = await compress(messages, {
@@ -141,6 +159,16 @@ const result = await compress(messages, {
});
```

+When `forceConverge` is enabled, the engine hard-truncates non-recency messages to 512 characters if the binary search bottoms out and the budget is still exceeded. This mirrors LCM's Level 3 `DeterministicTruncate`: no LLM involved, guaranteed convergence.
+
+```ts
+const result = compress(messages, {
+  tokenBudget: 4000,
+  forceConverge: true,
+});
+// result.fits is guaranteed true (unless only system/recency messages remain)
+```
+
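To make the search and the fallback concrete, here is a minimal sketch of the idea: binary-search for the largest recency window that fits the budget, then hard-truncate to 512 characters if even the smallest window does not fit. The `Msg` type, `compressOlder` (a stand-in that simply halves older messages), `fitToBudget`, and the minimum window of 0 are all hypothetical. The real engine's compression and search bounds may differ.

```ts
interface Msg {
  role: string;
  content: string;
}

// Character-based estimate, using the ~3.5 chars/token figure from the text.
const estimate = (msgs: Msg[]): number =>
  msgs.reduce((sum, m) => sum + Math.ceil(m.content.length / 3.5), 0);

// Stand-in compressor (hypothetical): keeps the last `recencyWindow` messages
// verbatim and halves the length of older non-system messages.
function compressOlder(msgs: Msg[], recencyWindow: number): Msg[] {
  const cut = msgs.length - recencyWindow;
  return msgs.map((m, i) =>
    i < cut && m.role !== 'system'
      ? { ...m, content: m.content.slice(0, Math.ceil(m.content.length / 2)) }
      : m,
  );
}

function fitToBudget(msgs: Msg[], tokenBudget: number, forceConverge = false) {
  let lo = 0;
  let hi = msgs.length;
  let best: { messages: Msg[]; recencyWindow: number } | null = null;

  // Binary search for the largest window (least compression) that still fits.
  // Assumes the token count grows as the window grows.
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    const candidate = compressOlder(msgs, mid);
    if (estimate(candidate) <= tokenBudget) {
      best = { messages: candidate, recencyWindow: mid };
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  if (best) return { ...best, fits: true };

  // The search bottomed out: even a window of 0 exceeds the budget.
  let messages = compressOlder(msgs, 0);
  if (forceConverge) {
    // Deterministic fallback: cap every non-system message at 512 characters.
    messages = messages.map((m) =>
      m.role === 'system' ? m : { ...m, content: m.content.slice(0, 512) },
    );
  }
  return { messages, recencyWindow: 0, fits: estimate(messages) <= tokenBudget };
}
```

In this sketch `fits` is recomputed after truncation rather than assumed, so a budget smaller than the untouched system messages would still report `false`.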
### uncompress
Restore originals from the verbatim store. Always sync. Supports recursive expansion for multi-layer compression (up to 10 levels deep).
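
The recursive expansion can be pictured with a small synchronous sketch. Everything here is an assumption for illustration: the `StoredMsg` shape, the `sourceIds` field linking a compressed message to its originals, the plain-object `VerbatimStore`, and the `expand` function name. Only the depth cap of 10 comes from the description above.

```ts
interface StoredMsg {
  id: string;
  content: string;
  // Hypothetical field: ids of the originals this message was compressed from.
  sourceIds?: string[];
}

// Hypothetical verbatim store: a plain lookup table from id to original message.
type VerbatimStore = Record<string, StoredMsg | undefined>;

const MAX_DEPTH = 10; // depth cap taken from the text

function expand(msg: StoredMsg, store: VerbatimStore, depth = 0): StoredMsg[] {
  const ids = msg.sourceIds ?? [];
  // Stop at messages that were never compressed, or once the cap is reached.
  if (ids.length === 0 || depth >= MAX_DEPTH) return [msg];

  const restored: StoredMsg[] = [];
  for (const id of ids) {
    const original = store[id];
    // If any original is missing, keep the compressed message rather than lose data.
    if (original === undefined) return [msg];
    // An original may itself be a compressed message, so recurse.
    restored.push(...expand(original, store, depth + 1));
  }
  return restored;
}
```

Keeping the compressed message when an original is missing is a design choice of this sketch, and the library may report such cases differently.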