Commit 43870f7
fix: decompress gzip responses for Anthropic token extraction (#1550)
* fix: add token tracking for WebSocket streaming (Claude)
Claude Code CLI uses WebSocket streaming to the Anthropic API, which
routes through proxyWebSocket() instead of proxyRequest(). The
proxyWebSocket function did not call trackTokenUsage(), so all
Anthropic/Claude token usage went unrecorded.
This adds:
- parseWebSocketFrames(): lightweight server→client frame parser
- trackWebSocketTokenUsage(): sniffs upstream TLS socket data events,
  skips HTTP 101 header, parses WebSocket text frames, and extracts
  token usage using existing extractUsageFromSseLine()
- 12 new tests for frame parsing and WebSocket token extraction
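
For illustration, a server→client frame parser of the kind described above might look roughly like the following. This is a minimal sketch based on the RFC 6455 frame layout, not the parseWebSocketFrames() added in this commit; the name parseFramesSketch and the returned { frames, remainder } shape are assumptions.

```javascript
// Hypothetical sketch, not the commit's parseWebSocketFrames().
// Walks a buffer of WebSocket frames (RFC 6455, section 5.2) and returns the
// complete frames plus any trailing bytes that belong to an incomplete frame.
function parseFramesSketch(buf) {
  const frames = [];
  let offset = 0;
  while (buf.length - offset >= 2) {
    const b0 = buf[offset];
    const b1 = buf[offset + 1];
    const fin = (b0 & 0x80) !== 0;
    const opcode = b0 & 0x0f; // 0x1 = text, 0x2 = binary, 0x8 = close
    const masked = (b1 & 0x80) !== 0; // servers normally send unmasked frames
    let payloadLen = b1 & 0x7f;
    let headerLen = 2;
    if (payloadLen === 126) {
      if (buf.length - offset < 4) break; // extended length not fully received
      payloadLen = buf.readUInt16BE(offset + 2);
      headerLen = 4;
    } else if (payloadLen === 127) {
      if (buf.length - offset < 10) break;
      payloadLen = Number(buf.readBigUInt64BE(offset + 2));
      headerLen = 10;
    }
    if (masked) headerLen += 4; // skip the masking key; payload stays masked here
    const total = headerLen + payloadLen;
    if (buf.length - offset < total) break; // incomplete frame: wait for more data
    frames.push({
      fin,
      opcode,
      masked,
      payload: buf.subarray(offset + headerLen, offset + total),
    });
    offset += total;
  }
  return { frames, remainder: buf.subarray(offset) };
}

const frame = Buffer.concat([Buffer.from([0x81, 0x05]), Buffer.from('hello')]);
console.log(parseFramesSketch(frame).frames[0].payload.toString()); // prints "hello"
```

Returning the remainder lets a caller prepend it to the next socket chunk, since TCP chunk boundaries do not line up with frame boundaries.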
The fix is non-blocking: it adds a data listener alongside the existing
bidirectional pipe relay, with no impact on latency or throughput.
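
The listener-alongside-pipe pattern can be sketched as follows, using PassThrough streams in place of the real TLS sockets. The names tapStream, upstream, client, and seen are hypothetical and exist only for this example.

```javascript
const { PassThrough } = require('node:stream');

// Hypothetical sketch: observe chunks on a stream that is already being piped,
// without changing what the pipe destination receives.
function tapStream(source, onChunk) {
  source.on('data', (chunk) => {
    try {
      onChunk(chunk);
    } catch {
      // Tracking is best-effort: an observer error must not break the relay.
    }
  });
}

// Stand-ins for the upstream socket and the client socket.
const upstream = new PassThrough();
const client = new PassThrough();

upstream.pipe(client); // the existing relay

let seen = 0;
tapStream(upstream, (chunk) => {
  seen += chunk.length; // e.g. count bytes or hand the chunk to a parser
});
```

The observer runs synchronously on the same event loop as the relay, so the work done per chunk should stay small.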
Closes #1536
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* chore: add diagnostic logging for token tracking
Writes token-diag.log alongside token-usage.jsonl in the mounted log
volume. Since api-proxy container stdout is not captured in workflow
logs, this file provides visibility into:
- Whether trackTokenUsage (HTTP) or trackWebSocketTokenUsage (WS) is called
- Content-type, status code, streaming flag for each request
- Whether usage data was found and which fields were extracted
- Frame counts and message counts for WebSocket tracking
This will help diagnose why Claude/Anthropic produces no token records.
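
A diagnostic writer along these lines could be sketched as below. The JSON-lines format, the field names, and the makeDiag helper are assumptions for illustration; the commit message only states that token-diag.log is written next to token-usage.jsonl.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical sketch: returns a diag(event, fields) function that appends one
// JSON line per call to token-diag.log in the given directory.
function makeDiag(logDir) {
  const file = path.join(logDir, 'token-diag.log');
  return function diag(event, fields = {}) {
    const record = { ts: new Date().toISOString(), event, ...fields };
    try {
      // Synchronous append keeps the sketch short; a proxy would more likely
      // use an async write stream to avoid blocking the event loop.
      fs.appendFileSync(file, JSON.stringify(record) + '\n');
    } catch {
      // Diagnostics are best-effort: never fail a request over a log write.
    }
  };
}
```

A call such as `diag('http_response', { status: 200, streaming: true })` would then record one line per tracked request.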
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* chore: capture raw SSE sample in diagnostics
Add first 500 bytes of raw response data to token-diag.log entries.
This will reveal the actual SSE format from the Anthropic beta API
that the parser is failing to extract usage from.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: decompress gzip responses for Anthropic token extraction
The Anthropic API returns gzip-compressed SSE responses (content-encoding:
gzip). The token tracker was trying to parse compressed binary data as SSE
text, which silently failed to extract any usage information.
Changes:
- Add gzip/deflate/brotli decompression support in trackTokenUsage()
- Create decompression pipeline when content-encoding header is present
- Raw compressed bytes still flow to client unchanged via pipe()
- Gate diagnostic logging behind AWF_DEBUG_TOKENS=1 env var
- Add isCompressedResponse() and createDecompressor() helpers
- Add 8 new tests for compressed response handling (gzip SSE, gzip JSON,
  multi-chunk gzip, backward compat with uncompressed)
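
The decompression side path described in the changes above can be sketched as follows. isCompressedSketch, createDecompressorSketch, and observeDecoded are hypothetical stand-ins; the real isCompressedResponse() and createDecompressor() signatures are not shown in this commit message. Only Node's built-in zlib stream constructors are used.

```javascript
const zlib = require('node:zlib');

// Hypothetical stand-in for isCompressedResponse().
function isCompressedSketch(headers) {
  const enc = String(headers['content-encoding'] || '').trim().toLowerCase();
  return ['gzip', 'x-gzip', 'deflate', 'br'].includes(enc);
}

// Hypothetical stand-in for createDecompressor(): maps a content-encoding
// value to a zlib stream, or null when no decompression is needed.
function createDecompressorSketch(encoding) {
  switch (String(encoding || '').trim().toLowerCase()) {
    case 'gzip':
    case 'x-gzip':
      return zlib.createGunzip();
    case 'deflate':
      return zlib.createInflate();
    case 'br':
      return zlib.createBrotliDecompress();
    default:
      return null;
  }
}

// Feeds a copy of each upstream chunk to the decompressor and reports decoded
// chunks to onChunk. The raw bytes are not modified, so an existing
// upstream.pipe(client) keeps relaying the compressed response unchanged.
function observeDecoded(upstream, headers, onChunk, onDone = () => {}) {
  const decoder = isCompressedSketch(headers)
    ? createDecompressorSketch(headers['content-encoding'])
    : null;
  const target = decoder || upstream;
  target.on('data', onChunk);
  target.on('end', onDone);
  if (decoder) {
    // Corrupt or truncated input only ends tracking; it never reaches the client path.
    decoder.on('error', () => onDone());
    upstream.on('data', (chunk) => {
      if (!decoder.destroyed) decoder.write(chunk);
    });
    upstream.on('end', () => decoder.end());
  }
}
```

Because the decompressor is a stream, a gzip body split across several socket chunks decodes correctly, which is the multi-chunk case the new tests mention. This sketch ignores backpressure on the decompressor for brevity.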
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix: address PR review feedback on token tracker
- Set WebSocket record status to 101 instead of 200
- Track header bytes separately; report only WS payload in response_bytes
- Properly unmask masked WebSocket frames with XOR key
- Sanitize diag() to strip raw_sample before writing to disk (CodeQL)
- Add test for masked frame unmasking
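
For reference, unmasking per RFC 6455 section 5.3 XORs each payload byte with the 4-byte masking key, cycling through the key. The sketch below is illustrative only; unmaskPayload is a hypothetical name, not the function changed in this commit.

```javascript
// Hypothetical sketch of WebSocket payload unmasking (RFC 6455, section 5.3):
// output byte i = payload byte i XOR maskKey[i % 4].
function unmaskPayload(payload, maskKey) {
  const out = Buffer.alloc(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskKey[i % 4];
  }
  return out;
}

// Masked "Hello" example from RFC 6455, section 5.7.
const maskKey = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const maskedPayload = Buffer.from([0x7f, 0x9f, 0x4d, 0x51, 0x58]);
console.log(unmaskPayload(maskedPayload, maskKey).toString()); // prints "Hello"
```

XOR is its own inverse, so the same function also masks a payload.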
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 505e51e commit 43870f7
3 files changed
Lines changed: 897 additions & 26 deletions