Commit 43021a7

abrichr and claude authored
feat: add comprehensive API and infrastructure cost tracking (#192)
Add a centralized, thread-safe CostTracker that records token usage from every VLM/LLM API call and infrastructure time (GPU/VM hours). The tracker is integrated at the vlm_call() level so all 15+ callers automatically get cost tracking without any changes. Key integration points have cost_label tags for per-component breakdown (planner, grounder, vlm_judge, demo_verify, etc.).

- New openadapt_evals/cost_tracker.py with global singleton, pricing tables, JSON persistence, and human-readable summary output
- vlm.py extracts response.usage tokens from both OpenAI and Anthropic responses and reports to the tracker
- 18 unit tests covering pricing lookup, aggregation, thread safety, persistence, and vlm.py integration

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
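The contents of cost_tracker.py are not shown in this view, so the following is only a minimal sketch of the kind of tracker the commit message describes: a lock-protected accumulator keyed by cost label, with a pricing lookup, JSON output, and a module-level singleton. All names here (PRICING, Usage, record, total_cost, to_json, get_tracker) and the prices are hypothetical and are not taken from the commit.

```python
import json
import threading
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens as (input, output); not real pricing.
PRICING = {"example-model": (1.00, 2.00)}


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: float = 0.0


class CostTracker:
    """Illustrative thread-safe accumulator keyed by cost label."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_label = defaultdict(Usage)

    def record(self, model, input_tokens, output_tokens, label="unlabeled"):
        # Unknown models are priced at zero in this sketch.
        in_price, out_price = PRICING.get(model, (0.0, 0.0))
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        # The lock keeps the read-modify-write updates atomic across threads.
        with self._lock:
            entry = self._by_label[label]
            entry.input_tokens += input_tokens
            entry.output_tokens += output_tokens
            entry.cost_usd += cost

    def total_cost(self):
        with self._lock:
            return sum(u.cost_usd for u in self._by_label.values())

    def to_json(self):
        # Serialize the per-label breakdown, e.g. for writing to disk.
        with self._lock:
            return json.dumps(
                {label: vars(u) for label, u in self._by_label.items()},
                sort_keys=True,
            )


# Module-level singleton, so every caller reports to the same tracker.
_TRACKER = CostTracker()


def get_tracker():
    return _TRACKER
```

Under these assumptions, a caller would report usage with something like get_tracker().record("example-model", 1000, 500, label="planner"), and the per-label totals would then be available from to_json().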
1 parent b89a15b commit 43021a7

12 files changed

Lines changed: 706 additions & 0 deletions

openadapt_evals/agents/demo_guided_agent.py

Lines changed: 1 addition & 0 deletions
@@ -315,6 +315,7 @@ def verify_step(
     model=self._verify_model,
     provider=self._verify_provider,
     max_tokens=256,
+    cost_label="demo_verify",
 )

 parsed = extract_json(raw)

openadapt_evals/agents/planner_grounder_agent.py

Lines changed: 3 additions & 0 deletions
@@ -670,6 +670,7 @@ def _call_planner(
     model=self._planner,
     provider=self._planner_provider,
     max_tokens=512,
+    cost_label="planner",
 )

 logger.debug("Planner raw output: %s", raw[:500])
@@ -733,6 +734,7 @@ def _call_grounder(
     model=self._grounder,
     provider=self._grounder_provider,
     max_tokens=256,
+    cost_label="grounder",
 )

 logger.debug("Grounder raw output: %s", raw[:500])
@@ -756,6 +758,7 @@ def _call_grounder(
     model=self._grounder,
     provider=self._grounder_provider,
     max_tokens=128,
+    cost_label="grounder_retry",
 )
 action = parse_action_json(raw2)
 if action.type != "done":

openadapt_evals/correction_parser.py

Lines changed: 1 addition & 0 deletions
@@ -56,6 +56,7 @@ def parse_correction(
     model=model,
     provider=provider,
     max_tokens=512,
+    cost_label="correction_parser",
 )

 # Extract JSON from response
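
The call sites above only pass a cost_label; the token extraction itself lives in vlm.py, which is not shown in this view. As a rough sketch of how usage might be read from both response shapes, the hypothetical helper below (extract_usage is an assumed name, not one from the commit) relies on the usage field names the two SDKs expose: input_tokens/output_tokens for Anthropic Messages responses and prompt_tokens/completion_tokens for OpenAI Chat Completions responses.

```python
from types import SimpleNamespace


def extract_usage(response):
    """Return (input_tokens, output_tokens) from an OpenAI- or Anthropic-style response.

    Hypothetical helper for illustration; the real vlm.py logic may differ.
    """
    usage = getattr(response, "usage", None)
    if usage is None:
        # No usage reported: count nothing rather than fail the call.
        return 0, 0
    # Anthropic Messages API: usage.input_tokens / usage.output_tokens
    in_tok = getattr(usage, "input_tokens", None)
    out_tok = getattr(usage, "output_tokens", None)
    if in_tok is None and out_tok is None:
        # OpenAI Chat Completions: usage.prompt_tokens / usage.completion_tokens
        in_tok = getattr(usage, "prompt_tokens", None)
        out_tok = getattr(usage, "completion_tokens", None)
    return int(in_tok or 0), int(out_tok or 0)


# Stand-in response objects; real SDK responses expose the same attribute names.
openai_like = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=120, completion_tokens=30)
)
anthropic_like = SimpleNamespace(
    usage=SimpleNamespace(input_tokens=200, output_tokens=50)
)
```

A wrapper such as vlm_call() could then forward the resulting pair, together with the model name and the caller's cost_label, to the tracker, which is presumably why the call sites need no other changes.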
