⚡ Copilot Token Optimization 2026-05-17 - test-coverage-improver #3293

## Target Workflow: `test-coverage-improver`

**Source report:** See latest `token-usage-report` issue
**Estimated cost per run:** N/A (Copilot usage — not billed by token)
**Total tokens per run:** ~3.9M input / ~39M effective tokens (successful run)
**Cache hit rate (within-session):** 49% — but **0% cross-run** (ambient context never cached)
**LLM turns/requests:** 66 requests per successful run
**Model:** claude-sonnet-4.6
**Run frequency:** 2× daily (8am + 8pm UTC)

## Observed Run Data (last 7 days)

| Run | Status | Effective Tokens | Duration | Requests |
|-----|--------|-----------------|----------|----------|
| 25976952947 | ✅ success | 39.4M | 9.7m | 66 |
| 25976313126 | ❌ failure | 0 (skipped early) | 1.5m | — |
| 25974930350 | ❌ failure | 166.7M | 27.2m | ~2 turns |

**Episode total:** ~206M effective tokens across 3 episodes.

## Current Configuration

| Setting | Value |
|---------|-------|
| Tools loaded | `github: [repos, pull_requests]` + 14 bash commands |
| Network groups | `github` only |
| Pre-agent steps | ✅ Yes — build, coverage, summary extraction |
| Prompt size | ~30,679 input tokens (ambient context) |
| Ambient context cached | **0 tokens** (0% cross-run cache hit) |
| Bash tools | `npm run build`, `npm run test`, `npm run test:coverage`, `npm run lint`, `cat:src/*.ts`, `cat:tests/**`, etc. |

## Root Cause: Zero Cross-Run Prefix Caching

The prompt template injects three dynamic blocks at render time:

```
${{ steps.coverage-summary.outputs.COVERAGE_JSON }}   # full JSON, changes every run
${{ steps.coverage-md.outputs.COVERAGE_MD }}           # COVERAGE_SUMMARY.md, changes with PRs
${{ steps.low-coverage.outputs.LOW_COVERAGE }}         # filtered list, also dynamic
```

Because these blocks appear **inside the static prompt**, the entire 30K-token system prompt becomes unique on every run. Claude's prefix caching reuses only an identical prefix, so any change invalidates the cache from that point onward. Result: every one of the 66 LLM requests in a run processes the full 30K ambient context as fresh input.
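
The effect of block ordering on the reusable prefix can be illustrated with a small sketch. The section strings and function names below are hypothetical stand-ins, and the comparison is character-based, whereas a real cache works on tokens and provider-specific boundaries, so this only shows the principle.

```python
import os

# Hypothetical stand-ins for the prompt sections; sizes are not to scale.
STATIC_HEAD = "You are a test coverage agent.\n"
STATIC_TAIL = "Guidelines, phases, and examples.\n"


def render_current(coverage: str) -> str:
    """Dynamic data sits between two static sections (current layout)."""
    return STATIC_HEAD + coverage + STATIC_TAIL


def render_restructured(coverage: str) -> str:
    """All static content first, dynamic data last (proposed layout)."""
    return STATIC_HEAD + STATIC_TAIL + coverage


def shared_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of two rendered prompts,
    i.e. the most that a prefix cache could reuse between them."""
    return len(os.path.commonprefix([a, b]))


# Two runs with different coverage data (made-up values).
run_a = "coverage: 71.2%\n"
run_b = "coverage: 73.9%\n"

current = shared_prefix_len(render_current(run_a), render_current(run_b))
restructured = shared_prefix_len(render_restructured(run_a), render_restructured(run_b))
static_total = len(STATIC_HEAD) + len(STATIC_TAIL)

print(f"static content: {static_total} chars")
print(f"shared prefix, current layout: {current} chars")
print(f"shared prefix, restructured layout: {restructured} chars")
```

In the current layout the shared prefix stops inside the first dynamic block, so the static tail is never reusable. In the restructured layout the shared prefix covers all static content.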

## Recommendations

### 1. Move all dynamic `${{ }}` injections to the END of the prompt

**Estimated savings:** ~20K cache-eligible tokens × 66 requests × 2 runs/day = **~2.6M tokens/day** shifted to cheap cache reads (0.1× multiplier vs 1× full price)

Current layout (simplified):
```
[static context: 8K tokens]
[dynamic: COVERAGE_JSON block: ~5K tokens]   ← cache buster
[static: guidelines, phases, examples: 15K tokens]
[dynamic: LOW_COVERAGE, COVERAGE_MD]
```

Restructured layout:
```
[ALL static content first: ~25K tokens]      ← this prefix caches across runs
---
## Current Coverage Data (this run)
${{ steps.coverage-md.outputs.COVERAGE_MD }}
${{ steps.low-coverage.outputs.LOW_COVERAGE }}
```

Specific change to `.github/workflows/test-coverage-improver.md`: move the `## Current Coverage Status` section (lines 258–280) to the very end of the document body, after all static guidelines and examples.
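
The savings estimate for this recommendation can be reproduced as a rough calculation. It assumes every request in every run is served from cache, so it is best read as an upper bound. The constant names are illustrative and the figures are the ones quoted in this report.

```python
CACHE_ELIGIBLE_TOKENS = 20_000  # static prefix size estimated in this report
REQUESTS_PER_RUN = 66           # observed in the successful run
RUNS_PER_DAY = 2                # current schedule
CACHE_READ_MULTIPLIER = 0.1     # relative weight of a cache read, as quoted above

# Tokens per day that would move from full-price input to cache reads.
tokens_shifted_per_day = CACHE_ELIGIBLE_TOKENS * REQUESTS_PER_RUN * RUNS_PER_DAY

# Weighted cost of those tokens before and after the change.
weighted_before = tokens_shifted_per_day * 1.0
weighted_after = tokens_shifted_per_day * CACHE_READ_MULTIPLIER

print(f"{tokens_shifted_per_day:,} tokens/day moved to cache reads")
print(f"weighted cost: {weighted_before:,.0f} -> {weighted_after:,.0f}")
```

The first figure is 2,640,000 tokens/day, which matches the ~2.6M estimate above.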

### 2. Remove the full `COVERAGE_JSON` block from the prompt

**Estimated savings:** ~5K tokens × 66 requests = **~330K tokens/run**

The full `coverage-summary.json` is injected as a `json` code block, but the pre-step already extracts the actionable information as `LOW_COVERAGE` (files below 80%). The raw JSON adds noise and token cost without additional signal.

**Change:** Remove this block entirely from the prompt:

````
### Coverage JSON (full)

```json
${{ steps.coverage-summary.outputs.COVERAGE_JSON }}
```
````

Keep `COVERAGE_SUMMARY.md` (human-readable) and `LOW_COVERAGE` (prioritized list). The agent can `cat coverage/coverage-summary.json` via bash tool if it needs more detail.
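
The pre-step's extraction logic is not shown in this report. As a minimal sketch of why the filtered list carries the actionable signal, the following assumes the Istanbul `json-summary` layout (a `total` entry plus one entry per file, each with a `lines.pct` value) and a line-coverage threshold of 80%. The function name and the sample file names are hypothetical.

```python
import json

THRESHOLD_PCT = 80.0  # "files below 80%" per this report


def low_coverage_files(summary: dict, threshold: float = THRESHOLD_PCT) -> list[tuple[str, float]]:
    """Return (path, line coverage %) for files under the threshold, worst first."""
    rows = [
        (path, metrics["lines"]["pct"])
        for path, metrics in summary.items()
        if path != "total" and metrics["lines"]["pct"] < threshold
    ]
    return sorted(rows, key=lambda row: row[1])


# Inline sample data with made-up file names and numbers. A real run would
# load coverage/coverage-summary.json from disk instead.
sample = json.loads("""
{
  "total":         {"lines": {"total": 300, "covered": 231, "skipped": 0, "pct": 77.0}},
  "src/cli.ts":    {"lines": {"total": 100, "covered": 95,  "skipped": 0, "pct": 95.0}},
  "src/parser.ts": {"lines": {"total": 100, "covered": 62,  "skipped": 0, "pct": 62.0}},
  "src/logger.ts": {"lines": {"total": 100, "covered": 74,  "skipped": 0, "pct": 74.0}}
}
""")

for path, pct in low_coverage_files(sample):
    print(f"{path}: {pct:.1f}%")
```

With the sample data this lists `src/parser.ts` and then `src/logger.ts`, a few dozen characters in place of the full JSON document.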

### 3. Remove redundant bash tools already covered by pre-steps

**Estimated savings:** Prevents the agent from re-running expensive build/test commands, reducing session length by an estimated 20–30%

The pre-steps already run `npm ci`, `npm run build`, and `npm run test:coverage`. However, the `tools: bash:` section still allows the agent to re-run them:

```yaml
# CURRENT — allows redundant re-runs:
tools:
  bash:
    - "npm run build"        # ← already done in pre-steps
    - "npm run test"         # ← overlaps with test:coverage; kept for fast iteration
    - "npm run test:coverage" # ← already done in pre-steps
    - "npm run lint"
    - ...
```

**Proposed change:**
```yaml
tools:
  bash:
    - "npm run test"          # only for writing new tests iteratively
    - "npm run lint"
    - "cat:src/*.test.ts"
    - "cat:src/*.ts"
    - "cat:tests/**"
    - "cat:coverage/coverage-summary.json"
    - "cat:jest.config.js"
    - "cat:jest.config.ts"
    - "ls:src"
    - "ls:tests"
    - "ls:coverage"
    - "head:*"
    - "tail:*"
```

Remove `npm run build` (agent isn't modifying the build system) and `npm run test:coverage` (pre-steps already ran this; agent should use `npm run test` for fast iteration after writing new tests).

### 4. Reduce run frequency from 2× to 1× daily

**Estimated savings:** 50% reduction in total runs → halves all other token costs

The workflow runs at `cron: '0 8,20 * * *'` (twice daily). Coverage improvements are incremental and PRs require human review, so a second daily run is unlikely to find work that the first run missed.

**Change:**
```yaml
on:
  schedule:
    - cron: '0 8 * * *'   # once daily at 8am UTC
```
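
To make the schedule change concrete, here is a small sketch that counts daily firings for the two cron expressions in this report. The helper `runs_per_day` is hypothetical and handles only plain comma-separated minute and hour fields with `*` elsewhere. It is not a general cron parser.

```python
def runs_per_day(cron: str) -> int:
    """Count daily firings for a 5-field cron expression whose minute and hour
    fields are comma-separated numbers and whose other fields are '*'."""
    minute, hour, day_of_month, month, day_of_week = cron.split()
    if (day_of_month, month, day_of_week) != ("*", "*", "*"):
        raise ValueError("only daily schedules are supported in this sketch")
    return len(minute.split(",")) * len(hour.split(","))


current = runs_per_day("0 8,20 * * *")  # 08:00 and 20:00 UTC
proposed = runs_per_day("0 8 * * *")    # 08:00 UTC only

print(f"runs/day:  {current} -> {proposed}")
print(f"runs/week: {current * 7} -> {proposed * 7}")
```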

## Expected Impact

| Metric | Current | Projected | Savings |
|--------|---------|-----------|---------|
| Ambient cache hit rate | 0% | ~85%+ | Major |
| Effective tokens/successful run | ~39M | ~15M | −62% |
| Effective tokens/week | ~546M | ~105M | −81% |
| LLM requests/run | 66 | 50–55 (est.) | −17% to −24% |
| Session duration | 9.7m | 7–8m (est.) | −18% to −28% |
| Prompt input tokens | 30,679 | ~25,000 | −19% |
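
The token rows of this table can be cross-checked with a short calculation. It assumes every run is a successful run at the per-run figure shown, with 2 runs/day today and 1 run/day after recommendation 4. The helper name is illustrative.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change in percent; negative means a reduction."""
    return (after - before) / before * 100


per_run_now, per_run_projected = 39e6, 15e6   # effective tokens per successful run
weekly_now = per_run_now * 2 * 7              # 2 runs/day
weekly_projected = per_run_projected * 1 * 7  # 1 run/day

print(f"per run: {pct_change(per_run_now, per_run_projected):.0f}%")
print(f"weekly:  {weekly_now / 1e6:.0f}M -> {weekly_projected / 1e6:.0f}M "
      f"({pct_change(weekly_now, weekly_projected):.0f}%)")
print(f"prompt:  {pct_change(30_679, 25_000):.0f}%")
```

Most of the weekly reduction comes from combining the smaller per-run figure with the halved run count.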

## Implementation Checklist

- [ ] Move `## Current Coverage Status` section to end of prompt (fixes cross-run cache miss)
- [ ] Remove `### Coverage JSON (full)` block from prompt (reduces prompt ~5K tokens)
- [ ] Remove `npm run build` and `npm run test:coverage` from `tools: bash:` list
- [ ] Change cron from `0 8,20 * * *` to `0 8 * * *` (once daily)
- [ ] Recompile: `gh aw compile .github/workflows/test-coverage-improver.md`
- [ ] Trigger manual run and compare token usage to baseline
- [ ] Verify CI passes on updated workflow




> Generated by [Daily Copilot Token Optimization Advisor](https://github.com/github/gh-aw-firewall/actions/runs/25986753629/agentic_workflow) · ● 9.1M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw-firewall+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw-firewall%2Fcopilot-token-optimizer%22&type=issues)
