Skip to content

Commit 789dbd4

Browse files
committed
v0.51.0: vstack-import — bring-your-own-traces (chat logs / OTel spans -> AgentTrace)
- vstack.ingest: from_chat_messages (OpenAI/Anthropic logs -> AgentTrace; maps roles, infers goal/outcome, flattens multimodal) + from_otel_spans (best-effort OTel; gen_ai.* attrs, OTLP key/value lists, start-time ordering). - vstack-import CLI: --format {messages,otel}, pipes into vstack-diagnose --trace -. 59 -> 60 CLIs. _ingest wired into CI (pytest+ruff+mypy --strict) + testpaths. Removes the #1 adoption friction (getting real traces into vstack). 3,245 tests.
1 parent 0f4f289 commit 789dbd4

8 files changed

Lines changed: 460 additions & 5 deletions

File tree

.github/workflows/ci.yml

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -59,6 +59,7 @@ jobs:
5959
_findings_router/ _alerting/ _eval_gates/ \
6060
_intervention_tracker/ _redaction/ _health/ \
6161
_priority_queue/ _snippet/ _aggregate/ \
62+
_ingest/ \
6263
.github/action/ \
6364
-v --tb=short \
6465
--cov=vstack \
@@ -100,7 +101,7 @@ jobs:
100101
_budgeter/ _tracer/ _export/ _findings_db/ _trace_diff/ _timeline/ \
101102
_cost_sim/ _findings_router/ _alerting/ _eval_gates/ \
102103
_intervention_tracker/ _redaction/ _health/ _priority_queue/ \
103-
_snippet/ _aggregate/ .github/action/
104+
_snippet/ _aggregate/ _ingest/ .github/action/
104105
105106
- name: Run ruff format check
106107
run: |
@@ -114,7 +115,7 @@ jobs:
114115
_budgeter/ _tracer/ _export/ _findings_db/ _trace_diff/ _timeline/ \
115116
_cost_sim/ _findings_router/ _alerting/ _eval_gates/ \
116117
_intervention_tracker/ _redaction/ _health/ _priority_queue/ \
117-
_snippet/ _aggregate/ .github/action/
118+
_snippet/ _aggregate/ _ingest/ .github/action/
118119
119120
typecheck:
120121
name: Typecheck (mypy)
@@ -174,6 +175,7 @@ jobs:
174175
_priority_queue \
175176
_snippet \
176177
_aggregate \
178+
_ingest \
177179
_diagnose \
178180
_dashboard \
179181
_scorecard \

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,30 @@ project adheres to [Semantic Versioning](https://semver.org/) from
66
`1.0.0` onward. During the `0.x` series, minor bumps may include
77
breaking changes (see API stability promise in `vstack/__init__.py`).
88

9+
## [0.51.0] — 2026-06-23
10+
11+
Bring-your-own-traces: import the logs you already have into a vstack trace.
12+
13+
### Added
14+
15+
- **`vstack.ingest`** — trace importers that build the canonical
16+
`AgentTrace` from real data:
17+
- `from_chat_messages(messages, …)` — OpenAI/Anthropic chat logs
18+
(`{role, content, tool_calls}`); maps system/user/assistant/tool to
19+
trace steps, infers goal (first user message) + outcome (last
20+
assistant message), flattens multimodal content.
21+
- `from_otel_spans(spans, …)` — OpenTelemetry spans (best-effort; reads
22+
`gen_ai.*` attributes as a dict or OTLP `{key,value}` list, orders by
23+
start time, classifies GenAI vs other spans).
24+
- **`vstack-import` CLI**`vstack-import --format {messages,otel} input.json`
25+
emits an `AgentTrace` JSON, designed to pipe straight into
26+
`vstack-diagnose --trace -`. (CLI surface: 59 → 60.)
27+
- `_ingest` wired into CI (pytest + ruff + `mypy --strict`).
28+
29+
### Compatibility
30+
31+
- All tests pass. Additive only; no breaking changes.
32+
933
## [0.50.0] — 2026-06-23
1034

1135
GitHub Marketplace launch of the Action.

README.md

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -130,6 +130,13 @@ vstack-recipes # browse named bundles (stuck_i
130130
vstack-diagnose --trace trace.json --recipe stuck_in_loop --client anthropic
131131
```
132132

133+
Don't have a vstack trace yet? Import the logs you already have — `vstack-import` converts OpenAI/Anthropic chat-message logs or OpenTelemetry spans into a trace, ready to pipe straight in:
134+
135+
```bash
136+
vstack-import --format messages chat.json | vstack-diagnose --trace - --client anthropic
137+
vstack-import --format otel spans.json --goal "ship auth" | vstack-diagnose --trace -
138+
```
139+
133140
## Install
134141

135142
> [!TIP]
@@ -326,7 +333,7 @@ vstack ships **13 invocation surfaces**. Same patterns, same data shape, differe
326333
| # | Surface | Get it with | Use when |
327334
|---|---|---|---|
328335
| 1 | **Python imports** | `pip install valanistack` | You're building in Python and want patterns as library calls |
329-
| 2 | **59 CLIs** | `vstack-<pattern>` + workflow CLIs (`vstack-diagnose`, `vstack-recipes`, `vstack-scorecard`, `vstack-redaction`, `vstack-export`, `vstack-aggregate`, `vstack-findings-db`, `vstack-trace-diff`, `vstack-heatmap`, `vstack-timeline`, `vstack-synth`, `vstack-vdiff`, …) | Shell scripts, CI checks, one-shot diagnoses |
336+
| 2 | **60 CLIs** | `vstack-<pattern>` + workflow CLIs (`vstack-diagnose`, `vstack-import`, `vstack-recipes`, `vstack-scorecard`, `vstack-redaction`, `vstack-export`, `vstack-aggregate`, `vstack-findings-db`, `vstack-trace-diff`, `vstack-heatmap`, `vstack-timeline`, `vstack-synth`, `vstack-vdiff`, …) | Shell scripts, CI checks, one-shot diagnoses |
330337
| 3 | **MCP server** | `pip install "valanistack[mcp]"` · `vstack-mcp serve` | Any MCP-speaking AI client (see table below) |
331338
| 4 | **REST API (FastAPI)** | `pip install "valanistack[api]"` · `vstack-api serve` | Production multi-tenant deploys; auth + rate-limit baked in |
332339
| 5 | **Docker** | `docker pull ghcr.io/valani9/vstack:0.37.0` | Kubernetes deploys; multi-arch (amd64 + arm64) |

_ingest/lib/__init__.py

Lines changed: 189 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,189 @@
1+
"""Import real-world traces into vstack's :class:`~vstack.aar.AgentTrace`.
2+
3+
Your agent runs already produce traces — as chat-completion message logs
4+
(OpenAI / Anthropic style) or as OpenTelemetry spans. These converters turn
5+
those into the canonical ``AgentTrace`` that every vstack pattern consumes, so
6+
you can pipe real data straight into ``vstack-diagnose`` without hand-writing a
7+
trace.
8+
9+
Public API:
10+
11+
* :func:`from_chat_messages` — a list of ``{role, content, tool_calls?}`` dicts.
12+
* :func:`from_otel_spans` — a list of OpenTelemetry span dicts (best-effort,
13+
reads ``gen_ai.*`` attributes).
14+
"""
15+
16+
from __future__ import annotations
17+
18+
from datetime import datetime, timedelta, timezone
19+
from typing import Any
20+
21+
from vstack.aar import AgentTrace, TraceStep
22+
23+
__all__ = ["from_chat_messages", "from_otel_spans"]
24+
25+
_BASE_TS = datetime(2026, 1, 1, tzinfo=timezone.utc)
26+
27+
28+
def _coerce_content(content: Any) -> str:
29+
"""Flatten a message ``content`` (str or multimodal list) to text."""
30+
if content is None:
31+
return ""
32+
if isinstance(content, str):
33+
return content
34+
if isinstance(content, list):
35+
parts: list[str] = []
36+
for block in content:
37+
if isinstance(block, dict):
38+
parts.append(
39+
str(block.get("text") or block.get("content") or block.get("type") or "")
40+
)
41+
else:
42+
parts.append(str(block))
43+
return "\n".join(p for p in parts if p)
44+
return str(content)
45+
46+
47+
def _ts(index: int) -> datetime:
48+
return _BASE_TS + timedelta(seconds=index)
49+
50+
51+
def from_chat_messages(
52+
messages: list[dict[str, Any]],
53+
*,
54+
goal: str = "",
55+
outcome: str = "",
56+
success: bool = False,
57+
agent_id: str | None = None,
58+
agent_framework: str = "chat",
59+
metadata: dict[str, Any] | None = None,
60+
) -> AgentTrace:
61+
"""Build an ``AgentTrace`` from chat-completion messages.
62+
63+
Role → step mapping: ``system``/``user`` → ``message``; ``assistant``
64+
text → ``message`` and any ``tool_calls`` → ``tool_call``; ``tool`` →
65+
``observation``. ``goal`` defaults to the first user message and
66+
``outcome`` to the last assistant message when not given.
67+
"""
68+
steps: list[TraceStep] = []
69+
idx = 0
70+
first_user = ""
71+
last_assistant = ""
72+
73+
for msg in messages:
74+
role = str(msg.get("role", "")).lower()
75+
content = _coerce_content(msg.get("content"))
76+
tool_calls = msg.get("tool_calls") or []
77+
78+
if role in ("system", "user", "developer"):
79+
if content:
80+
steps.append(TraceStep(timestamp=_ts(idx), type="message", content=content))
81+
idx += 1
82+
if role == "user" and content and not first_user:
83+
first_user = content
84+
elif role == "assistant":
85+
if content:
86+
steps.append(TraceStep(timestamp=_ts(idx), type="message", content=content))
87+
idx += 1
88+
last_assistant = content
89+
for call in tool_calls:
90+
fn = call.get("function", call) if isinstance(call, dict) else {}
91+
name = fn.get("name", "tool")
92+
args = fn.get("arguments", "")
93+
steps.append(
94+
TraceStep(timestamp=_ts(idx), type="tool_call", content=f"{name}({args})")
95+
)
96+
idx += 1
97+
elif role == "tool":
98+
steps.append(
99+
TraceStep(
100+
timestamp=_ts(idx), type="observation", content=content or "(tool result)"
101+
)
102+
)
103+
idx += 1
104+
elif content:
105+
steps.append(TraceStep(timestamp=_ts(idx), type="message", content=content))
106+
idx += 1
107+
108+
return AgentTrace(
109+
agent_id=agent_id,
110+
agent_framework=agent_framework,
111+
goal=goal or first_user or "(goal not provided)",
112+
steps=steps,
113+
outcome=outcome or last_assistant or "(outcome not provided)",
114+
success=success,
115+
metadata=metadata or {},
116+
)
117+
118+
119+
def _span_start(span: dict[str, Any]) -> Any:
120+
return span.get("start_time") or span.get("startTime") or span.get("startTimeUnixNano") or 0
121+
122+
123+
def from_otel_spans(
124+
spans: list[dict[str, Any]],
125+
*,
126+
goal: str = "",
127+
outcome: str = "",
128+
success: bool = False,
129+
agent_id: str | None = None,
130+
agent_framework: str = "otel",
131+
metadata: dict[str, Any] | None = None,
132+
) -> AgentTrace:
133+
"""Build an ``AgentTrace`` from OpenTelemetry spans (best-effort).
134+
135+
Spans are ordered by start time. Each span becomes a step: GenAI/LLM spans
136+
(a ``gen_ai.*`` attribute or an ``llm``/``chat`` name) → ``tool_call``;
137+
others → ``observation``. Reads ``attributes`` either as a dict or as a
138+
list of ``{key, value}`` (OTLP-JSON form).
139+
"""
140+
ordered = sorted(spans, key=_span_start)
141+
steps: list[TraceStep] = []
142+
143+
for i, span in enumerate(ordered):
144+
attrs = _otel_attrs(span)
145+
name = str(span.get("name", "span"))
146+
is_genai = name.lower().startswith(("llm", "chat", "gen_ai")) or any(
147+
k.startswith("gen_ai") for k in attrs
148+
)
149+
completion = (
150+
attrs.get("gen_ai.completion")
151+
or attrs.get("gen_ai.response.content")
152+
or attrs.get("llm.output")
153+
)
154+
prompt = attrs.get("gen_ai.prompt") or attrs.get("llm.input")
155+
detail = str(completion or prompt or "")
156+
content = f"{name}: {detail}" if detail else name
157+
steps.append(
158+
TraceStep(
159+
timestamp=_ts(i),
160+
type="tool_call" if is_genai else "observation",
161+
content=content,
162+
metadata={"span_name": name},
163+
)
164+
)
165+
166+
return AgentTrace(
167+
agent_id=agent_id,
168+
agent_framework=agent_framework,
169+
goal=goal or "(goal not provided)",
170+
steps=steps,
171+
outcome=outcome or "(outcome not provided)",
172+
success=success,
173+
metadata=metadata or {},
174+
)
175+
176+
177+
def _otel_attrs(span: dict[str, Any]) -> dict[str, Any]:
178+
raw = span.get("attributes", {})
179+
if isinstance(raw, dict):
180+
return raw
181+
out: dict[str, Any] = {}
182+
if isinstance(raw, list):
183+
for item in raw:
184+
if isinstance(item, dict) and "key" in item:
185+
val = item.get("value")
186+
if isinstance(val, dict):
187+
val = next(iter(val.values()), val)
188+
out[str(item["key"])] = val
189+
return out

_ingest/lib/_cli.py

Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,103 @@
1+
"""``vstack-import`` — convert real traces into a vstack ``AgentTrace``.
2+
3+
Reads chat-completion message logs or OpenTelemetry spans (JSON, from a file or
4+
stdin) and writes an ``AgentTrace`` JSON ready for ``vstack-diagnose``::
5+
6+
vstack-import --format messages chat.json --goal "ship auth" | \\
7+
vstack-diagnose --trace - --client anthropic --fail-on high
8+
"""
9+
10+
from __future__ import annotations
11+
12+
import argparse
13+
import json
14+
import sys
15+
from pathlib import Path
16+
from typing import Any, Sequence
17+
18+
from . import from_chat_messages, from_otel_spans
19+
20+
21+
def _load(path: str | None) -> Any:
22+
raw = (
23+
sys.stdin.read()
24+
if (path is None or path == "-")
25+
else Path(path).read_text(encoding="utf-8")
26+
)
27+
return json.loads(raw)
28+
29+
30+
def _items(payload: Any, key: str) -> list[dict[str, Any]]:
31+
"""Accept a bare list, or a dict wrapping the list under ``key``."""
32+
if isinstance(payload, list):
33+
return list(payload)
34+
if isinstance(payload, dict) and isinstance(payload.get(key), list):
35+
return list(payload[key])
36+
raise ValueError(f"expected a JSON list of {key} (or {{'{key}': [...]}}).")
37+
38+
39+
def main(argv: Sequence[str] | None = None) -> int:
40+
parser = argparse.ArgumentParser(
41+
prog="vstack-import",
42+
description=(
43+
"Convert chat-completion message logs or OpenTelemetry spans into a "
44+
"vstack AgentTrace JSON (pipe into vstack-diagnose)."
45+
),
46+
)
47+
parser.add_argument(
48+
"input",
49+
nargs="?",
50+
default="-",
51+
help="Input JSON file (omit or '-' for stdin).",
52+
)
53+
parser.add_argument(
54+
"--format",
55+
"-f",
56+
choices=("messages", "otel"),
57+
required=True,
58+
help="messages = OpenAI/Anthropic chat log; otel = OpenTelemetry spans.",
59+
)
60+
parser.add_argument("--goal", default="", help="The agent's goal (else inferred).")
61+
parser.add_argument("--outcome", default="", help="What happened (else inferred).")
62+
parser.add_argument(
63+
"--success",
64+
action="store_true",
65+
help="Mark the run successful (default: failed — you usually diagnose failures).",
66+
)
67+
parser.add_argument("--agent-id", default=None, help="Agent id to record.")
68+
parser.add_argument("--out", "-o", default=None, help="Write to a file (default: stdout).")
69+
args = parser.parse_args(argv)
70+
71+
try:
72+
payload = _load(args.input)
73+
if args.format == "messages":
74+
trace = from_chat_messages(
75+
_items(payload, "messages"),
76+
goal=args.goal,
77+
outcome=args.outcome,
78+
success=args.success,
79+
agent_id=args.agent_id,
80+
)
81+
else:
82+
trace = from_otel_spans(
83+
_items(payload, "spans"),
84+
goal=args.goal,
85+
outcome=args.outcome,
86+
success=args.success,
87+
agent_id=args.agent_id,
88+
)
89+
except (OSError, ValueError, json.JSONDecodeError) as e:
90+
print(f"vstack-import: {e}", file=sys.stderr)
91+
return 2
92+
93+
out_json = trace.model_dump_json(indent=2)
94+
if args.out and args.out != "-":
95+
Path(args.out).write_text(out_json + "\n", encoding="utf-8")
96+
print(f"Wrote {args.out} ({len(trace.steps)} steps)", file=sys.stderr)
97+
else:
98+
sys.stdout.write(out_json + "\n")
99+
return 0
100+
101+
102+
if __name__ == "__main__": # pragma: no cover
103+
raise SystemExit(main())

0 commit comments

Comments
 (0)