Commit b4fa1d2

Merge pull request #70 from dgenio/claude/triage-issues-WowaN
Add cross-invocation budget manager (issue #44)
2 parents 68f5691 + 8b54299 commit b4fa1d2

12 files changed

Lines changed: 1092 additions & 20 deletions

CHANGELOG.md

Lines changed: 20 additions & 0 deletions
@@ -8,6 +8,26 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
+- Cross-invocation context budget manager (`BudgetManager`) tracks cumulative token usage across
+multiple `Kernel.invoke()` calls within a session. When attached to a `Kernel` via the new
+`budget_manager` keyword argument, the kernel reserves a budget slice before each invocation
+and reconciles actual frame-payload usage afterwards. As the remaining budget shrinks the
+requested `response_mode` is auto-escalated to a more aggressive tier (> 50% remaining keeps
+the caller's mode; 20–50% downgrades `raw` to `table`; 5–20% floors at `summary`; < 5% forces
+`handle_only`). `Kernel.invoke(..., dry_run=True)` now also reports `budget_remaining` and the
+escalated `response_mode` when a manager is configured. The `BudgetManager` is optional and
+off by default — existing kernels are unchanged. (#44)
+- `TokenCounter` protocol and `default_token_counter` (character-based `len(json.dumps(...))//4`
+approximation) provide pluggable token counting without runtime dependencies. A new optional
+`[tiktoken]` extra is reserved for callers that want to plug in `tiktoken`-based counting.
+- `BudgetExhausted(AgentKernelError)` raised by `BudgetManager.allocate()` (and by
+`Kernel.invoke()` before driver execution) when the cumulative session budget is fully spent.
+- `BudgetConfigError(AgentKernelError)` raised by `BudgetManager` for invalid configuration or
+validation failures (non-positive budgets, negative allocate/record/release amounts), replacing
+bare `ValueError` so callers can catch budget mistakes via the `AgentKernelError` hierarchy
+per `AGENTS.md` ("never raise bare ValueError to callers").
+- New public exports: `BudgetManager`, `BudgetExhausted`, `BudgetConfigError`, `TokenCounter`,
+`default_token_counter`, and `Kernel.budget` accessor property.
 - LLM tool-format adapters and middleware (`agent_kernel.adapters`): `OpenAIMiddleware` (OpenAI
 Responses API + Chat Completions, auto-detected on input) and `AnthropicMiddleware` (Anthropic
 Messages with `cache_control` support). Both translate `Capability` objects to vendor tool

docs/context_firewall.md

Lines changed: 56 additions & 0 deletions
@@ -62,3 +62,59 @@ Summaries are produced deterministically:
 - **dict** → key list + per-value type/value
 - **string** → truncated to 500 chars
 - **other** → repr() truncated to 200 chars
+
+## Cross-invocation budgets
+
+The per-invocation `Budgets` above cap a single Frame. A separate
+`BudgetManager` tracks cumulative token usage *across* invocations within a
+session. It is optional — if you don't attach one, kernel behavior is
+unchanged.
+
+```python
+from agent_kernel import BudgetManager, Kernel
+
+manager = BudgetManager(total_budget=100_000)
+kernel = Kernel(registry, budget_manager=manager)
+```
+
+Per `invoke()` the kernel:
+
+1. Reserves a slice of the remaining budget (default 4,000 tokens). If the
+budget is empty, `BudgetExhausted` is raised before the driver runs.
+2. Consults `manager.suggested_mode(requested)` to escalate the requested
+`response_mode` to a more aggressive tier as the remaining budget shrinks.
+3. After the firewall produces a Frame, counts the actual tokens in the
+LLM-facing payload and reconciles them against the reservation.
+
+Escalation table:
+
+| Budget remaining | Suggested mode (effective `response_mode`) |
+|-----------------:|------------------------------------------------|
+| > 50% | Caller's requested mode (no change) |
+| 20% – 50% | `table` (when caller requested `raw`) |
+| 5% – 20% (≥ 5%) | `summary` (floor — never *relaxes* to `table`) |
+| < 5% | `handle_only` |
+
+Boundaries land in the more-conservative tier — exactly 50% remaining
+downgrades `raw` to `table`, exactly 20% floors at `summary`, and only when
+remaining drops *below* 5% does `handle_only` take over.
+
+`Kernel.invoke(..., dry_run=True)` mirrors the escalation and reports
+`budget_remaining` in the returned `DryRunResult`, so callers can preview
+what their next live invocation would actually return.
+
+Plug a different token counter (for example, a `tiktoken`-based one) via the
+`TokenCounter` protocol:
+
+```python
+import tiktoken # pip install weaver-kernel[tiktoken]
+enc = tiktoken.encoding_for_model("gpt-4o")
+
+def tiktoken_counter(value):
+    return len(enc.encode(str(value)))
+
+manager = BudgetManager(total_budget=128_000, token_counter=tiktoken_counter)
+```
+
+The default counter (`default_token_counter`) is a character-based
+`len(json.dumps(value)) // 4` approximation with no extra dependencies.
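
Review note: the two behaviours this documentation describes, the character-based default counter and the escalation table, are small enough to sketch in isolation. The block below is an illustrative approximation, not the code from `budget_manager.py` or `token_counting.py` (neither file appears in this excerpt). The names `approx_token_count`, `suggested_mode_sketch`, and `MODE_ORDER` are hypothetical, and the rule "take whichever of the requested mode and the tier floor is more aggressive" is an assumption inferred from the table and its boundary notes.

```python
import json
from typing import Any

# Hypothetical ranking from least to most aggressive, inferred from the table.
MODE_ORDER = ["raw", "table", "summary", "handle_only"]


def approx_token_count(value: Any) -> int:
    """Character-based estimate: serialized length divided by four."""
    return len(json.dumps(value)) // 4


def suggested_mode_sketch(requested: str, remaining: int, total: int) -> str:
    """Escalate `requested` following the documented tiers.

    Integer arithmetic is used so the 50%, 20% and 5% boundaries are exact.
    """
    if total <= 0:
        # Sketch only: the changelog says the real manager raises BudgetConfigError.
        raise ValueError("total must be positive")
    if remaining * 100 > 50 * total:       # more than 50% left: no change
        floor = "raw"
    elif remaining * 100 > 20 * total:     # above 20%, up to and including 50%
        floor = "table"
    elif remaining * 100 >= 5 * total:     # 5% up to and including 20%
        floor = "summary"
    else:                                  # below 5%
        floor = "handle_only"
    # Never relax: keep whichever mode ranks as more aggressive.
    return max(requested, floor, key=MODE_ORDER.index)
```

Under these assumptions, a caller that asks for `raw` with exactly half the budget left gets `table`, while a caller that already asked for `summary` is left alone until less than 5% remains.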

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ policy = [
     "pyyaml>=6.0",
     "tomli>=2.0; python_version<'3.11'",
 ]
+tiktoken = ["tiktoken>=0.6"]
 
 [tool.hatch.build.targets.wheel]
 packages = ["src/agent_kernel"]

src/agent_kernel/__init__.py

Lines changed: 12 additions & 2 deletions
@@ -19,7 +19,7 @@
 
 Firewall::
 
-    from agent_kernel import Firewall, Budgets
+    from agent_kernel import Firewall, Budgets, BudgetManager
 
 Handles & traces::
 
@@ -35,6 +35,7 @@
         AgentKernelError,
         TokenExpired, TokenInvalid, TokenScopeError,
         PolicyDenied, PolicyConfigError, DriverError, FirewallError,
+        BudgetExhausted, BudgetConfigError,
         CapabilityNotFound, HandleNotFound, HandleExpired,
     )
 """
@@ -48,6 +49,8 @@
 from .errors import (
     AdapterParseError,
     AgentKernelError,
+    BudgetConfigError,
+    BudgetExhausted,
     CapabilityAlreadyRegistered,
     CapabilityNotFound,
     DriverError,
@@ -61,7 +64,9 @@
     TokenRevoked,
     TokenScopeError,
 )
+from .firewall.budget_manager import BudgetManager
 from .firewall.budgets import Budgets
+from .firewall.token_counting import TokenCounter, default_token_counter
 from .firewall.transform import Firewall
 from .handles import HandleStore
 from .kernel import Kernel
@@ -125,6 +130,8 @@
     # errors
     "AdapterParseError",
     "AgentKernelError",
+    "BudgetConfigError",
+    "BudgetExhausted",
     "CapabilityAlreadyRegistered",
     "CapabilityNotFound",
     "DriverError",
@@ -156,8 +163,11 @@
     "MCPDriver",
     "make_billing_driver",
     # firewall
-    "Firewall",
+    "BudgetManager",
     "Budgets",
+    "Firewall",
+    "TokenCounter",
+    "default_token_counter",
     # stores
     "HandleStore",
     "TraceStore",

src/agent_kernel/errors.py

Lines changed: 21 additions & 0 deletions
@@ -49,6 +49,27 @@ class FirewallError(AgentKernelError):
     """Raised when the context firewall cannot transform a raw result."""
 
 
+class BudgetExhausted(AgentKernelError):
+    """Raised when a :class:`~agent_kernel.firewall.budget_manager.BudgetManager` has
+    no remaining cross-invocation context budget.
+
+    Distinct from :class:`FirewallError`: this error fires *before* the
+    firewall transforms data, signalling that the caller has consumed the
+    entire session-level context budget. The current invocation never runs
+    the driver.
+    """
+
+
+class BudgetConfigError(AgentKernelError):
+    """Raised when a :class:`~agent_kernel.firewall.budget_manager.BudgetManager` is
+    constructed with invalid parameters, or asked to allocate/record/release
+    a negative amount.
+
+    Used in place of bare :class:`ValueError` so callers can catch budget
+    configuration mistakes without swallowing unrelated stdlib errors.
+    """
+
+
 # ── Adapter errors ────────────────────────────────────────────────────────────
 
 
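
Review note: to illustrate why both new exceptions derive from `AgentKernelError`, here is a self-contained sketch of caller-side handling. The exception classes are re-declared locally as stand-ins so the snippet runs without the package installed. `ToyBudget` and `run_step` are hypothetical helpers, not part of `agent_kernel`, and the real `BudgetManager.allocate()` signature is not shown in this excerpt, so the one used here is an assumption.

```python
class AgentKernelError(Exception):
    """Stand-in for the package's base exception."""


class BudgetExhausted(AgentKernelError):
    """Stand-in: no cross-invocation budget remains."""


class BudgetConfigError(AgentKernelError):
    """Stand-in: invalid configuration or a negative amount."""


class ToyBudget:
    """Hypothetical, minimal budget tracker used only for this illustration."""

    def __init__(self, total: int) -> None:
        if total <= 0:
            raise BudgetConfigError("total must be positive")
        self.remaining = total

    def allocate(self, amount: int) -> int:
        """Reserve up to `amount` tokens and return what was granted."""
        if amount < 0:
            raise BudgetConfigError("amount must be non-negative")
        if self.remaining == 0:
            raise BudgetExhausted("session budget is fully spent")
        granted = min(amount, self.remaining)
        self.remaining -= granted
        return granted


def run_step(budget: ToyBudget, amount: int) -> str:
    """Handle exhaustion specifically, then fall back to the shared base class."""
    try:
        granted = budget.allocate(amount)
    except BudgetExhausted:
        return "stop: budget exhausted"
    except AgentKernelError as exc:
        # Catches BudgetConfigError without swallowing unrelated stdlib errors.
        return f"error: {exc}"
    return f"granted {granted}"
```

Because neither class inherits from `ValueError`, an `except ValueError` elsewhere in the caller would not intercept them, which matches the motivation given in the docstrings above.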

Lines changed: 11 additions & 1 deletion
@@ -1,8 +1,18 @@
 """Firewall sub-package exports."""
 
+from .budget_manager import BudgetManager
 from .budgets import Budgets
 from .redaction import redact
 from .summarize import summarize
+from .token_counting import TokenCounter, default_token_counter
 from .transform import Firewall
 
-__all__ = ["Budgets", "Firewall", "redact", "summarize"]
+__all__ = [
+    "BudgetManager",
+    "Budgets",
+    "Firewall",
+    "TokenCounter",
+    "default_token_counter",
+    "redact",
+    "summarize",
+]
