|
5 | 5 | <br><em>Squeeze out the juice, leave the pulp behind.</em> |
6 | 6 | </p> |
7 | 7 |
|
8 | | -LLM coding agents waste 80-95% of context tokens on irrelevant tool output. Squeez extracts only the lines that matter, compressing tool output by ~91% while keeping 86% of the relevant information. |
9 | | - |
10 | 8 | [PyPI](https://pypi.org/project/squeez/) |
11 | 9 | [Model: squeez-2b](https://huggingface.co/KRLabsOrg/squeez-2b) |
12 | 10 | [Dataset: tool-output-extraction-swebench](https://huggingface.co/datasets/KRLabsOrg/tool-output-extraction-swebench) |
13 | 11 | [License: Apache-2.0](https://opensource.org/licenses/Apache-2.0) |
14 | 12 |
|
15 | | -## How it works |
| 13 | +- Tool output pruner for LLM coding agents |
| 14 | +- Pipe any tool output (pytest, grep, git log, npm build, kubectl, ...) through squeez with a task description, get back only the relevant lines |
| 15 | +- Fine-tuned Qwen 3.5 2B, 0.79 F1, ~91% compression |
| 16 | +- CLI pipe, Python library, or vLLM server |
16 | 17 |
|
17 | | -Squeez uses a fine-tuned Qwen 3.5 2B model to read tool output alongside a task description and return only the relevant lines. |
| 18 | +```bash |
| 19 | +pip install squeez |
| 20 | +python -m pytest tests/ -v 2>&1 | squeez "find the test failure related to authentication" |
| 21 | +``` |
18 | 22 |
|
19 | | -### Example: filtering test output |
| 23 | +## Example |
20 | 24 |
|
21 | 25 | Task: *"Find the test failure related to authentication"* |
22 | 26 |
|
@@ -81,10 +85,6 @@ E Got: rejection after 15m (timeout changed?) |
81 | 85 | </tr> |
82 | 86 | </table> |
83 | 87 |
|
84 | | -```bash |
85 | | -$ python -m pytest tests/ -v 2>&1 | squeez "find the test failure related to authentication" |
86 | | -``` |
87 | | - |
88 | 88 | <details> |
89 | 89 | <summary><b>More examples</b></summary> |
90 | 90 |
|
@@ -134,28 +134,18 @@ Evaluated on 617 held-out test samples from SWE-bench, across 14 tool types: |
134 | 134 |
|
135 | 135 | Squeez-2B (2B params) outperforms a 35B MoE model at zero-shot and is 6x better than BM25 on Span F1. |
136 | 136 |
|
137 | | -## Install |
138 | | - |
139 | | -```bash |
140 | | -pip install squeez |
141 | | -``` |
142 | | - |
143 | 137 | ## Quick start |
144 | 138 |
|
145 | 139 | ### With vLLM (recommended) |
146 | 140 |
|
147 | 141 | ```bash |
148 | | -# Start the server |
149 | 142 | pip install vllm |
150 | 143 | vllm serve KRLabsOrg/squeez-2b --dtype bfloat16 --max-model-len 16384 |
151 | 144 |
|
152 | 145 | # Use from squeez CLI |
153 | 146 | pip install squeez |
154 | 147 | export SQUEEZ_SERVER_URL=http://localhost:8000/v1 |
155 | 148 | cat output.txt | squeez "find the bug" |
156 | | - |
157 | | -# Or pipe directly |
158 | | -python -m pytest tests/ -v 2>&1 | squeez "find the test failure related to authentication" |
159 | 149 | ``` |
160 | 150 |
|
161 | 151 | vLLM keeps the model warm in memory with batched inference and high throughput. |
@@ -192,9 +182,6 @@ extractor = ToolOutputExtractor() |
192 | 182 | # Or connect to a server |
193 | 183 | extractor = ToolOutputExtractor(base_url="http://localhost:8000/v1") |
194 | 184 |
|
195 | | -# Or use a custom local model |
196 | | -extractor = ToolOutputExtractor(model_path="./output/squeez_qwen") |
197 | | - |
198 | 185 | filtered = extractor.extract( |
199 | 186 | task="Find the referer validation block", |
200 | 187 | tool_output=raw_output, |
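
Reviewer note: the diff keeps the CLI pipe form (`<tool> 2>&1 | squeez "<task>"`) as the headline usage. A minimal sketch of driving that same pipe from Python is below, for agents that shell out rather than import the library. The only behavior taken from the README is that `squeez` reads tool output on stdin and takes the task as its single argument. The helper names `filter_tool_output` and `line_reduction` are hypothetical, and the line-count reduction is an assumed stand-in: the README does not say how its ~91% compression figure is measured.

```python
import subprocess
from typing import Sequence


def filter_tool_output(
    raw_output: str,
    task: str,
    command: Sequence[str] = ("squeez",),  # CLI name as shown in the README
) -> str:
    """Send raw_output to `<command> "<task>"` on stdin and return its stdout."""
    result = subprocess.run(
        [*command, task],
        input=raw_output,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return result.stdout


def line_reduction(raw_output: str, filtered: str) -> float:
    """Fraction of non-empty lines dropped (assumed metric, not the README's)."""
    raw_lines = [line for line in raw_output.splitlines() if line.strip()]
    kept_lines = [line for line in filtered.splitlines() if line.strip()]
    if not raw_lines:
        return 0.0
    return 1.0 - len(kept_lines) / len(raw_lines)
```

The child process inherits the caller's environment, so a `SQUEEZ_SERVER_URL` exported as in the vLLM quick start would still apply. The `command` parameter exists only so the pipe can be exercised with a stand-in executable when `squeez` is not installed.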
|