Commit d5c7c9d
fix: be patient with thinking models that output reasoning as plain text
llama-server with Qwen3.5/Claude-distilled models outputs thinking
as 'Let me analyze...' plain text in delta.content (no <think> tags,
no separate reasoning field). The JSON-expect abort was firing at
50 chars, killing the request after 8-10 tokens before the model
could output actual JSON.
Changes:
- Raised JSON content check threshold from 50 to 200 chars
- Strip common plain-text reasoning prefixes before checking
- Only abort if 200+ chars of non-JSON, non-reasoning content
1 parent d4b3550 commit d5c7c9d
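The three changes listed in the commit message can be sketched roughly as follows. This is a minimal illustration in Python, not the code from this commit: the function names, the `REASONING_PREFIXES` list, and the choice to strip everything up to the first `{` or `[` are assumptions made for the example. Only the 50 to 200 character threshold change comes from the commit message.

```python
import re

# Threshold from the commit message (raised from 50 to 200 chars).
JSON_CHECK_THRESHOLD = 200

# Hypothetical list of plain-text reasoning openers; the commit does not
# enumerate the prefixes it actually strips.
REASONING_PREFIXES = ("let me", "i need to", "okay", "first,", "thinking")


def strip_reasoning_prefix(content: str) -> str:
    """Drop leading plain-text reasoning, if the content starts with a known opener.

    Assumption: reasoning runs until the first '{' or '[' character. If no such
    character has arrived yet, everything seen so far counts as reasoning.
    """
    text = content.lstrip()
    if text.lower().startswith(REASONING_PREFIXES):
        match = re.search(r"[{\[]", text)
        return text[match.start():] if match else ""
    return text


def should_abort_json_expect(content: str, threshold: int = JSON_CHECK_THRESHOLD) -> bool:
    """Return True only when threshold+ chars of non-JSON, non-reasoning text have accumulated."""
    remaining = strip_reasoning_prefix(content)
    if len(remaining) < threshold:
        return False  # too early to judge; keep streaming
    # Accept raw JSON or a fenced JSON block as a valid start.
    return not remaining.startswith(("{", "[", "```"))
```

Called on the accumulated `delta.content` after each streamed chunk, a check like this would let a model that opens with 'Let me analyze...' keep going until JSON appears, while still cutting off a response that is plainly not JSON once it passes 200 characters.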
1 file changed: 12 additions & 4 deletions
skills/analysis/home-security-benchmark/scripts
Diff (line content not preserved): original lines 431-434 removed and replaced by new lines 431-442; context lines 428-430 and 435-437 (new 443-445) unchanged.