Commit ac19311

frankbria and Test User authored
feat(costs): per-task and per-agent breakdowns + task board cost badge (#558) (#591)
* feat(costs): per-task and per-agent breakdowns + task board cost badge (#558)

Extends the /costs page (added in #557) with two analytics sections and adds an inline cost badge to each task card on the /tasks board.

Backend (Python):
- New TokenRepository.get_top_tasks_by_cost(days, limit) and get_costs_by_agent(days), aggregating the workspace token_usage table.
- New endpoints under /api/v2/costs:
    GET /tasks    -> top 10 tasks with titles, agent, tokens, cost
    GET /by-agent -> per-agent rollup + total input/output tokens
- Title resolution falls back to a placeholder when token_usage references a task no longer present in the workspace, so the table never blanks out.
- TokenUsage.task_id widened to Optional[Union[int, str]] so v2 UUID task IDs are preserved end-to-end (react_agent.py was int()-casting and storing NULL for every v2 record, blocking per-task analytics).

Frontend (TypeScript / Next.js):
- New types TaskCostEntry, TaskCostsResponse, AgentCostEntry, AgentCostsResponse and matching costsApi.getTopTasks / getByAgent.
- New CostsView sections: TopTasksTable and AgentCostBars (pure-Tailwind horizontal bars + input/output token split row, no charting library).
- TaskCard renders a small MoneyBag02Icon + cost badge with a tooltip showing input/output token breakdown when costMap has a positive entry for that task. costMap threads through TaskBoardView -> Content -> Column as an optional prop; non-breaking.

* fix(costs): address review feedback on #558 PR

- TaskCard badge: format sub-cent costs at 4dp instead of collapsing to $0.00. Adds a regression test against the $0.0042 case.
- /api/v2/costs/tasks: expose a `limit` query param (1..1000, default 10). Analytics view keeps the top-10 default; TaskBoardView now requests limit=1000 so the badge map covers every task with spend, not just the top 10 (a board task outside the top-10 would otherwise never show).
- react_agent: simplify task_id persistence — the upstream caller never passes None, so drop the unreachable branch.
- TokenRepository: TODO note on the N+1 dominant-agent lookup. Fine at current scale; flagged for a future CTE refactor.
- New direct unit tests for TopTasksTable and AgentCostBars covering empty/loading/data states and the zero-total token-split edge case.

* docs: mark Phase 5.2 (cost analytics) complete (#558)

---------

Co-authored-by: Test User <test@example.com>
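
The sub-cent badge fix described above can be illustrated with a small sketch. The real change lives in the TypeScript `TaskCard` component, which is not shown on this page; the Python function below is a hypothetical analogue, and the exact threshold (strictly below $0.01) is an assumption.

```python
def format_cost_badge(cost_usd: float) -> str:
    """Format a cost for a compact badge (hypothetical analogue of the TaskCard logic)."""
    # Assumption: "sub-cent" means strictly between $0 and $0.01.
    if 0 < cost_usd < 0.01:
        # Four decimal places keep small spends visible.
        return f"${cost_usd:.4f}"
    return f"${cost_usd:.2f}"


# Two-decimal formatting alone would collapse the regression-test value to "$0.00".
collapsed = f"${0.0042:.2f}"
fixed = format_cost_badge(0.0042)
```

With this rule, the $0.0042 case from the regression test renders as `$0.0042`, while ordinary amounts keep the usual two decimals.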
1 parent a4f7e66 commit ac19311

22 files changed

Lines changed: 1492 additions & 50 deletions

CLAUDE.md

Lines changed: 2 additions & 0 deletions
@@ -36,6 +36,8 @@ If you are an agent working in this repo: **do not improvise architecture**. Fol
 
 ### Current Focus: Phase 4A
 
+**Phase 5.2 is complete** — Costs page now ships per-task and per-agent breakdowns (#558) on top of the spend summary (#557). Backend: `GET /api/v2/costs/tasks?days=N&limit=M` (top-N tasks with titles, agent, tokens, cost) and `GET /api/v2/costs/by-agent?days=N` (per-agent rollup + total input/output tokens), both via `TokenRepository.get_top_tasks_by_cost` and `get_costs_by_agent`. Task board cards show an inline `MoneyBag02Icon` cost badge with token-breakdown tooltip when cost data exists. Fixed a v2 data-loss bug where `react_agent` int-cast UUID task IDs and stored NULL in `token_usage`.
+
 **Phase 5.1 is complete** — Settings page now ships three working tabs: Agent (#554), API Keys (#555), and PROOF9 Defaults + Workspace Config (#556). Backend: `GET/PUT /api/v2/proof/config` and `/api/v2/workspaces/config`, plus `run_proof()` now honors `enabled_gates` filtering and `strictness` (`strict` vs `warn`). Atomic JSON writes via `codeframe/ui/routers/_helpers.atomic_write_json`. The 9-gate canonical order and `proof_config.json` filename live in `codeframe/core/proof/models.py`.
 
 **Phase 3.5C is complete**`CaptureGlitchModal` form (description/markdown, source, scope, gate obligations, severity, expiry) reachable from the PROOF9 page and the persistent sidebar "Capture Glitch" button. REQ detail view (`/proof/[req_id]`) ships markdown description rendering, `ProofScope` metadata display, obligations table with `Latest Run` column, sortable/filterable evidence history, and empty-state CTA. Backend: `ScopeOut` model on `RequirementResponse`. Issues #568, #569.

codeframe/core/models.py

Lines changed: 3 additions & 1 deletion
@@ -879,7 +879,9 @@ class TokenUsage(BaseModel):
     """Token usage record for a single LLM call (Sprint 10)."""
 
     id: Optional[int] = None
-    task_id: Optional[int] = None  # None for non-task calls
+    # Tasks use integer PKs in the v1 schema and UUID strings in v2 workspaces;
+    # SQLite is type-flexible, so we accept either at the model boundary.
+    task_id: Optional[Union[int, str]] = None  # None for non-task calls
     agent_id: str
     project_id: int
     model_name: str = Field(..., description="e.g., claude-sonnet-4-5")
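
The "SQLite is type-flexible" comment above can be checked with a short, self-contained sketch. The table definition here is an assumption made for illustration (the real `token_usage` schema is not shown in this commit); it declares `task_id` as `INTEGER` to show that even with integer affinity a non-numeric UUID string is stored as text rather than rejected or nulled.

```python
import sqlite3

# Hypothetical minimal schema; only the task_id column matters for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE token_usage (id INTEGER PRIMARY KEY, task_id INTEGER)")

uuid_task_id = "123e4567-e89b-12d3-a456-426614174000"  # illustrative v2-style ID
conn.executemany(
    "INSERT INTO token_usage (task_id) VALUES (?)",
    [(42,), (uuid_task_id,), (None,)],
)

# typeof() reports the storage class SQLite actually used for each value.
stored = conn.execute(
    "SELECT task_id, typeof(task_id) FROM token_usage ORDER BY id"
).fetchall()
```

Note that this relies on SQLite's default (non-`STRICT`) typing; a `STRICT` table would refuse the string.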

codeframe/core/react_agent.py

Lines changed: 8 additions & 4 deletions
@@ -359,15 +359,19 @@ def _persist_token_usage(self, task_id: str) -> None:
         db.initialize()
         tracker = MetricsTracker(db=db)
 
-        # Cast task_id to int for the persistence layer (core uses str, DB uses int).
+        # v1 tasks have integer PKs; v2 workspaces use UUID strings.
+        # Pass the raw value — SQLite preserves the type, and downstream
+        # analytics (issue #558) group by whatever was stored. Forcing
+        # int() here used to drop every v2 record's task linkage.
+        persist_task_id: int | str
         try:
-            task_id_int: int | None = int(task_id)
+            persist_task_id = int(task_id)
         except (ValueError, TypeError):
-            task_id_int = None
+            persist_task_id = str(task_id)
 
         for record in self._token_records:
             tracker.record_token_usage_sync(
-                task_id=task_id_int,
+                task_id=persist_task_id,
                 agent_id="react-agent",
                 project_id=0,
                 model_name=record["model"],
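
To make the behavioral change in this hunk concrete, here is a minimal sketch that isolates the old and new coercion rules as standalone functions. The function names are hypothetical; in the commit the logic is inline in `_persist_token_usage`.

```python
from typing import Optional, Union


def legacy_task_id(task_id: str) -> Optional[int]:
    """Old behavior: anything that is not an integer string becomes None."""
    try:
        return int(task_id)
    except (ValueError, TypeError):
        return None


def coerce_task_id(task_id: str) -> Union[int, str]:
    """New behavior: keep integer IDs as ints, pass everything else through as str."""
    try:
        return int(task_id)
    except (ValueError, TypeError):
        return str(task_id)


v2_id = "123e4567-e89b-12d3-a456-426614174000"  # illustrative UUID
before = legacy_task_id(v2_id)
after = coerce_task_id(v2_id)
```

Under the old rule the UUID maps to `None`, which is how every v2 record lost its task linkage; the new rule returns the string unchanged while v1-style values such as `"42"` still become integers.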

codeframe/lib/metrics_tracker.py

Lines changed: 3 additions & 3 deletions
@@ -40,7 +40,7 @@
 import logging
 import re
 from datetime import datetime, timedelta, timezone
-from typing import Dict, Any, List, Optional
+from typing import Any, Dict, List, Optional, Union
 from codeframe.core.models import CallType, TokenUsage
 from codeframe.persistence.database import Database
 
@@ -163,7 +163,7 @@ def calculate_cost(model_name: str, input_tokens: int, output_tokens: int) -> fl
 
     async def record_token_usage(
         self,
-        task_id: Optional[int],
+        task_id: Optional[Union[int, str]],
         agent_id: str,
         project_id: int,
         model_name: str,
@@ -238,7 +238,7 @@ async def record_token_usage(
 
     def record_token_usage_sync(
         self,
-        task_id: Optional[int],
+        task_id: Optional[Union[int, str]],
         agent_id: str,
         project_id: int,
         model_name: str,

codeframe/persistence/repositories/token_repository.py

Lines changed: 165 additions & 0 deletions
@@ -351,6 +351,171 @@ def get_costs_summary(self, days: int) -> Dict[str, Any]:
             "daily": daily,
         }
 
+    def _window_iso_bounds(self, days: int) -> tuple[str, str]:
+        """Return inclusive start / exclusive end ISO strings for a `days` window.
+
+        Mirrors get_costs_summary's bounds so the per-task and per-agent
+        aggregations cover the same rows. Space-separated, offset-free format
+        works against both ``CURRENT_TIMESTAMP`` defaults and ``.isoformat()``.
+        """
+        if days <= 0:
+            raise ValueError("days must be a positive integer")
+        end_date = datetime.now(timezone.utc).date()
+        start_date = end_date - timedelta(days=days - 1)
+        start_iso = start_date.strftime("%Y-%m-%d %H:%M:%S")
+        end_iso = (end_date + timedelta(days=1)).strftime("%Y-%m-%d %H:%M:%S")
+        return start_iso, end_iso
+
+    def get_top_tasks_by_cost(
+        self,
+        days: int,
+        limit: int = 10,
+    ) -> List[Dict[str, Any]]:
+        """Aggregate spend per task and return the top N by cost.
+
+        Args:
+            days: Trailing window in days.
+            limit: Maximum number of tasks to return.
+
+        Returns:
+            List of dicts, sorted by total_cost_usd DESC:
+                {
+                    "task_id": <native value from token_usage.task_id>,
+                    "agent_id": str,
+                    "input_tokens": int,
+                    "output_tokens": int,
+                    "total_cost_usd": float,
+                }
+            Excludes rows where task_id IS NULL. The reported ``agent_id`` is
+            the agent that made the most calls for that task (ties broken
+            arbitrarily). ``task_id`` is returned as stored — SQLite preserves
+            the inserted type, so v2 UUID strings come back as strings and v1
+            integers come back as integers.
+        """
+        if limit <= 0:
+            raise ValueError("limit must be a positive integer")
+        start_iso, end_iso = self._window_iso_bounds(days)
+
+        cursor = self.conn.cursor()
+        cursor.execute(
+            """
+            SELECT
+                task_id,
+                COALESCE(SUM(input_tokens), 0) AS input_tokens,
+                COALESCE(SUM(output_tokens), 0) AS output_tokens,
+                COALESCE(SUM(estimated_cost_usd), 0.0) AS total_cost_usd
+            FROM token_usage
+            WHERE task_id IS NOT NULL
+              AND timestamp >= ?
+              AND timestamp < ?
+            GROUP BY task_id
+            ORDER BY total_cost_usd DESC
+            LIMIT ?
+            """,
+            (start_iso, end_iso, limit),
+        )
+        rows = cursor.fetchall()
+
+        # TODO(perf): the dominant-agent lookup is N+1 against the limit.
+        # Acceptable at limit=10 (analytics view) and even limit=1000 (badge
+        # map for a board). Fold into a single CTE if the cap grows further.
+        result: List[Dict[str, Any]] = []
+        for row in rows:
+            task_id = row["task_id"]
+            # Find the most-used agent for this task in the same window.
+            cursor.execute(
+                """
+                SELECT agent_id, COUNT(*) AS calls
+                FROM token_usage
+                WHERE task_id = ?
+                  AND timestamp >= ?
+                  AND timestamp < ?
+                GROUP BY agent_id
+                ORDER BY calls DESC
+                LIMIT 1
+                """,
+                (task_id, start_iso, end_iso),
+            )
+            agent_row = cursor.fetchone()
+            agent_id = agent_row["agent_id"] if agent_row else ""
+
+            result.append({
+                "task_id": task_id,
+                "agent_id": agent_id,
+                "input_tokens": int(row["input_tokens"] or 0),
+                "output_tokens": int(row["output_tokens"] or 0),
+                "total_cost_usd": float(row["total_cost_usd"] or 0.0),
+            })
+
+        return result
+
+    def get_costs_by_agent(self, days: int) -> Dict[str, Any]:
+        """Aggregate spend per agent over a trailing `days` window.
+
+        Args:
+            days: Trailing window in days.
+
+        Returns:
+            {
+                "by_agent": [
+                    {
+                        "agent_id": str,
+                        "input_tokens": int,
+                        "output_tokens": int,
+                        "total_cost_usd": float,
+                        "call_count": int,
+                    },
+                    ...
+                ],
+                "total_input_tokens": int,
+                "total_output_tokens": int,
+            }
+
+        Includes records with NULL ``task_id`` — calls without a task still
+        attribute to an agent. Sorted by total_cost_usd DESC.
+        """
+        start_iso, end_iso = self._window_iso_bounds(days)
+
+        cursor = self.conn.cursor()
+        cursor.execute(
+            """
+            SELECT
+                agent_id,
+                COALESCE(SUM(input_tokens), 0) AS input_tokens,
+                COALESCE(SUM(output_tokens), 0) AS output_tokens,
+                COALESCE(SUM(estimated_cost_usd), 0.0) AS total_cost_usd,
+                COUNT(*) AS call_count
+            FROM token_usage
+            WHERE timestamp >= ? AND timestamp < ?
+            GROUP BY agent_id
+            ORDER BY total_cost_usd DESC
+            """,
+            (start_iso, end_iso),
+        )
+        rows = cursor.fetchall()
+
+        by_agent: List[Dict[str, Any]] = []
+        total_input = 0
+        total_output = 0
+        for row in rows:
+            inp = int(row["input_tokens"] or 0)
+            out = int(row["output_tokens"] or 0)
+            by_agent.append({
+                "agent_id": row["agent_id"],
+                "input_tokens": inp,
+                "output_tokens": out,
+                "total_cost_usd": float(row["total_cost_usd"] or 0.0),
+                "call_count": int(row["call_count"] or 0),
+            })
+            total_input += inp
+            total_output += out
+
+        return {
+            "by_agent": by_agent,
+            "total_input_tokens": total_input,
+            "total_output_tokens": total_output,
+        }
+
     def get_project_costs_aggregate(self, project_id: int) -> Dict[str, Any]:
         """Get aggregated cost statistics for a project.
 
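
The window arithmetic in `_window_iso_bounds` is easiest to see with a fixed date. The sketch below is a standalone variant of that helper; the `today` parameter is an addition made here so the result is deterministic, whereas the method in the diff always uses the current UTC date.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Optional, Tuple

FMT = "%Y-%m-%d %H:%M:%S"


def window_iso_bounds(days: int, today: Optional[date] = None) -> Tuple[str, str]:
    """Return (inclusive start, exclusive end) strings for a trailing window."""
    if days <= 0:
        raise ValueError("days must be a positive integer")
    # `today` is a hypothetical injection point for testing.
    end_date = today or datetime.now(timezone.utc).date()
    # The window includes today, so it reaches back days - 1 full days.
    start_date = end_date - timedelta(days=days - 1)
    # Formatting a date with time codes yields midnight (00:00:00).
    return start_date.strftime(FMT), (end_date + timedelta(days=1)).strftime(FMT)


week = window_iso_bounds(7, today=date(2024, 5, 10))
single_day = window_iso_bounds(1, today=date(2024, 5, 10))
```

A 7-day window ending on 2024-05-10 therefore spans midnight on 2024-05-04 up to, but not including, midnight on 2024-05-11, and a 1-day window covers only 2024-05-10.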

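
The `TODO(perf)` note in `get_top_tasks_by_cost` mentions folding the per-task dominant-agent lookup into a single CTE. One possible shape for that query is sketched below; it is not part of this commit. It assumes SQLite 3.25 or newer (for `ROW_NUMBER()`), uses a hypothetical minimal `token_usage` table, and breaks call-count ties by `agent_id`, whereas the committed code leaves ties arbitrary.

```python
import sqlite3
from typing import Any, Dict, List

SINGLE_QUERY = """
WITH windowed AS (
    SELECT task_id, agent_id, input_tokens, output_tokens, estimated_cost_usd
    FROM token_usage
    WHERE task_id IS NOT NULL AND timestamp >= :start AND timestamp < :end
),
per_task AS (
    SELECT task_id,
           SUM(input_tokens) AS input_tokens,
           SUM(output_tokens) AS output_tokens,
           SUM(estimated_cost_usd) AS total_cost_usd
    FROM windowed GROUP BY task_id
),
agent_calls AS (
    SELECT task_id, agent_id, COUNT(*) AS calls
    FROM windowed GROUP BY task_id, agent_id
),
ranked AS (
    SELECT task_id, agent_id,
           ROW_NUMBER() OVER (
               PARTITION BY task_id ORDER BY calls DESC, agent_id
           ) AS rn
    FROM agent_calls
)
SELECT p.task_id, r.agent_id, p.input_tokens, p.output_tokens, p.total_cost_usd
FROM per_task AS p
JOIN ranked AS r ON r.task_id = p.task_id AND r.rn = 1
ORDER BY p.total_cost_usd DESC
LIMIT :limit
"""


def top_tasks_single_query(
    conn: sqlite3.Connection, start_iso: str, end_iso: str, limit: int = 10
) -> List[Dict[str, Any]]:
    """Return top tasks by cost with their dominant agent in one round trip."""
    params = {"start": start_iso, "end": end_iso, "limit": limit}
    keys = ("task_id", "agent_id", "input_tokens", "output_tokens", "total_cost_usd")
    return [dict(zip(keys, row)) for row in conn.execute(SINGLE_QUERY, params)]


# Hypothetical demo data: one v1 integer task, one v2 string task, plus rows
# that must be excluded (NULL task_id, timestamp outside the window).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE token_usage (task_id, agent_id TEXT, input_tokens INTEGER, "
    "output_tokens INTEGER, estimated_cost_usd REAL, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO token_usage VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "a", 100, 10, 0.5, "2024-05-10 09:00:00"),
        (1, "a", 100, 10, 0.5, "2024-05-10 10:00:00"),
        (1, "b", 50, 5, 0.25, "2024-05-10 11:00:00"),
        ("uuid-x", "b", 400, 40, 2.0, "2024-05-10 12:00:00"),
        (None, "a", 1, 1, 9.0, "2024-05-10 12:00:00"),
        (2, "a", 1, 1, 9.0, "2024-04-01 00:00:00"),
    ],
)
top = top_tasks_single_query(conn, "2024-05-04 00:00:00", "2024-05-11 00:00:00")
```

The trade-off is one round trip instead of one query per returned task, at the cost of a window function and a slightly denser statement; at the current caps (10 and 1000) the commit's simpler loop may well be adequate.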