README.md: 40 additions & 31 deletions (+40 -31)
@@ -5,24 +5,25 @@
<br><em>Squeeze out the juice, leave the pulp behind.</em>
</p>

-LLM coding agents waste **80-95% of context tokens** on irrelevant tool output. Squeez extracts only the lines that matter — compressing tool output by ~86% on average.
+LLM coding agents waste 80-95% of context tokens on irrelevant tool output. Squeez extracts only the lines that matter, compressing tool output by ~91% while keeping 86% of the relevant information.
cat output.txt | squeez "Find the failing traceback block"
+squeez "Fix the CSRF bug" --input-file output.txt
```

-Or via CLI flags:
+> **Note:** Local mode loads the model on every call. Fine for one-off use, but for repeated calls (e.g. an agent piping every tool through squeez), use vLLM.

-```bash
-squeez "Find the bug" \
---server-url http://localhost:8000/v1 \
---server-model KRLabsOrg/squeez-qwen3.5-2b \
---input-file output.txt
-```
+### Any OpenAI-compatible API

-Works with any OpenAI-compatible API (Groq, Together, etc.) — just set the URL, model name, and API key:
+Works with Groq, Together, or any OpenAI-compatible server. Set the URL, model name, and API key:
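
For reference, here is a minimal sketch of the kind of call the revised paragraph describes. It assumes the `--server-url`, `--server-model`, and `--input-file` flags from the removed "Or via CLI flags" example still exist. `SQUEEZ_URL`, `SQUEEZ_MODEL`, and `squeez_cmd` are hypothetical names used only for illustration. This diff does not show how the API key is passed, so the sketch leaves it out. It prints the command instead of running it:

```bash
# Hypothetical variable names; the defaults are the values from the removed example.
SQUEEZ_URL="${SQUEEZ_URL:-http://localhost:8000/v1}"
SQUEEZ_MODEL="${SQUEEZ_MODEL:-KRLabsOrg/squeez-qwen3.5-2b}"

# Hypothetical helper: builds the squeez command line as text (dry run).
# $1 = task description, $2 = file holding the raw tool output
squeez_cmd() {
  printf 'squeez "%s" --server-url %s --server-model %s --input-file %s\n' \
    "$1" "$SQUEEZ_URL" "$SQUEEZ_MODEL" "$2"
}

# Print the command for inspection instead of executing it.
squeez_cmd "Find the bug" output.txt
```

Swapping the defaults for a hosted provider's URL and model name should be enough once the flag names are confirmed against the current README.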