valani9
diff --git a/‎.github/workflows/ci.yml‎
Lines changed: 4 additions & 2 deletions b/‎.github/workflows/ci.yml‎
Lines changed: 4 additions & 2 deletions
diff --git a/‎CHANGELOG.md‎
Lines changed: 24 additions & 0 deletions b/‎CHANGELOG.md‎
Lines changed: 24 additions & 0 deletions
diff --git a/‎README.md‎
Lines changed: 8 additions & 1 deletion b/‎README.md‎
Lines changed: 8 additions & 1 deletion
diff --git a/‎_ingest/lib/__init__.py‎
Lines changed: 189 additions & 0 deletions b/‎_ingest/lib/__init__.py‎
Lines changed: 189 additions & 0 deletions
diff --git a/‎_ingest/lib/_cli.py‎
Lines changed: 103 additions & 0 deletions b/‎_ingest/lib/_cli.py‎
Lines changed: 103 additions & 0 deletions
@@ -59,6 +59,7 @@ jobs:
                  _findings_router/ _alerting/ _eval_gates/ \
                  _intervention_tracker/ _redaction/ _health/ \
                  _priority_queue/ _snippet/ _aggregate/ \
+                 _ingest/ \
                  .github/action/ \
             -v --tb=short \
             --cov=vstack \
@@ -100,7 +101,7 @@ jobs:
                      _budgeter/ _tracer/ _export/ _findings_db/ _trace_diff/ _timeline/ \
                      _cost_sim/ _findings_router/ _alerting/ _eval_gates/ \
                      _intervention_tracker/ _redaction/ _health/ _priority_queue/ \
-                     _snippet/ _aggregate/ .github/action/
+                     _snippet/ _aggregate/ _ingest/ .github/action/
 
       - name: Run ruff format check
         run: |
@@ -114,7 +115,7 @@ jobs:
                               _budgeter/ _tracer/ _export/ _findings_db/ _trace_diff/ _timeline/ \
                               _cost_sim/ _findings_router/ _alerting/ _eval_gates/ \
                               _intervention_tracker/ _redaction/ _health/ _priority_queue/ \
-                              _snippet/ _aggregate/ .github/action/
+                              _snippet/ _aggregate/ _ingest/ .github/action/
 
   typecheck:
     name: Typecheck (mypy)
@@ -174,6 +175,7 @@ jobs:
               _priority_queue \
               _snippet \
               _aggregate \
+              _ingest \
               _diagnose \
               _dashboard \
               _scorecard \
 
@@ -6,6 +6,30 @@ project adheres to [Semantic Versioning](https://semver.org/) from
 `1.0.0` onward. During the `0.x` series, minor bumps may include
 breaking changes (see API stability promise in `vstack/__init__.py`).
 
+## [0.51.0] — 2026-06-23
+
+Bring-your-own-traces: import the logs you already have into a vstack trace.
+
+### Added
+
+- **`vstack.ingest`** — trace importers that build the canonical
+  `AgentTrace` from real data:
+  - `from_chat_messages(messages, …)` — OpenAI/Anthropic chat logs
+    (`{role, content, tool_calls}`); maps system/user/assistant/tool to
+    trace steps, infers goal (first user message) + outcome (last
+    assistant message), flattens multimodal content.
+  - `from_otel_spans(spans, …)` — OpenTelemetry spans (best-effort; reads
+    `gen_ai.*` attributes as a dict or OTLP `{key,value}` list, orders by
+    start time, classifies GenAI vs other spans).
+- **`vstack-import` CLI** — `vstack-import --format {messages,otel} input.json`
+  emits an `AgentTrace` JSON, designed to pipe straight into
+  `vstack-diagnose --trace -`. (CLI surface: 59 → 60.)
+- `_ingest` wired into CI (pytest + ruff + `mypy --strict`).
+
+### Compatibility
+
+- All tests pass. Additive only; no breaking changes.
+
 ## [0.50.0] — 2026-06-23
 
 GitHub Marketplace launch of the Action.
 
@@ -130,6 +130,13 @@ vstack-recipes                                   # browse named bundles (stuck_i
 vstack-diagnose --trace trace.json --recipe stuck_in_loop --client anthropic
 ```
 
+Don't have a vstack trace yet? Import the logs you already have — `vstack-import` converts OpenAI/Anthropic chat-message logs or OpenTelemetry spans into a trace, ready to pipe straight in:
+
+```bash
+vstack-import --format messages chat.json | vstack-diagnose --trace - --client anthropic
+vstack-import --format otel spans.json --goal "ship auth" | vstack-diagnose --trace -
+```
+
 ## Install
 
 > [!TIP]
@@ -326,7 +333,7 @@ vstack ships **13 invocation surfaces**. Same patterns, same data shape, differe
 | # | Surface | Get it with | Use when |
 |---|---|---|---|
 | 1 | **Python imports** | `pip install valanistack` | You're building in Python and want patterns as library calls |
-| 2 | **59 CLIs** | `vstack-<pattern>` + workflow CLIs (`vstack-diagnose`, `vstack-recipes`, `vstack-scorecard`, `vstack-redaction`, `vstack-export`, `vstack-aggregate`, `vstack-findings-db`, `vstack-trace-diff`, `vstack-heatmap`, `vstack-timeline`, `vstack-synth`, `vstack-vdiff`, …) | Shell scripts, CI checks, one-shot diagnoses |
+| 2 | **60 CLIs** | `vstack-<pattern>` + workflow CLIs (`vstack-diagnose`, `vstack-import`, `vstack-recipes`, `vstack-scorecard`, `vstack-redaction`, `vstack-export`, `vstack-aggregate`, `vstack-findings-db`, `vstack-trace-diff`, `vstack-heatmap`, `vstack-timeline`, `vstack-synth`, `vstack-vdiff`, …) | Shell scripts, CI checks, one-shot diagnoses |
 | 3 | **MCP server** | `pip install "valanistack[mcp]"` · `vstack-mcp serve` | Any MCP-speaking AI client (see table below) |
 | 4 | **REST API (FastAPI)** | `pip install "valanistack[api]"` · `vstack-api serve` | Production multi-tenant deploys; auth + rate-limit baked in |
 | 5 | **Docker** | `docker pull ghcr.io/valani9/vstack:0.37.0` | Kubernetes deploys; multi-arch (amd64 + arm64) |
 
@@ -0,0 +1,189 @@
+"""Import real-world traces into vstack's :class:`~vstack.aar.AgentTrace`.
+
+Your agent runs already produce traces — as chat-completion message logs
+(OpenAI / Anthropic style) or as OpenTelemetry spans. These converters turn
+those into the canonical ``AgentTrace`` that every vstack pattern consumes, so
+you can pipe real data straight into ``vstack-diagnose`` without hand-writing a
+trace.
+
+Public API:
+
+* :func:`from_chat_messages` — a list of ``{role, content, tool_calls?}`` dicts.
+* :func:`from_otel_spans` — a list of OpenTelemetry span dicts (best-effort,
+  reads ``gen_ai.*`` attributes).
+"""
+
+from __future__ import annotations
+
+from datetime import datetime, timedelta, timezone
+from typing import Any
+
+from vstack.aar import AgentTrace, TraceStep
+
+__all__ = ["from_chat_messages", "from_otel_spans"]
+
+_BASE_TS = datetime(2026, 1, 1, tzinfo=timezone.utc)
+
+
+def _coerce_content(content: Any) -> str:
+    """Flatten a message ``content`` (str or multimodal list) to text."""
+    if content is None:
+        return ""
+    if isinstance(content, str):
+        return content
+    if isinstance(content, list):
+        parts: list[str] = []
+        for block in content:
+            if isinstance(block, dict):
+                parts.append(
+                    str(block.get("text") or block.get("content") or block.get("type") or "")
+                )
+            else:
+                parts.append(str(block))
+        return "\n".join(p for p in parts if p)
+    return str(content)
+
+
+def _ts(index: int) -> datetime:
+    return _BASE_TS + timedelta(seconds=index)
+
+
+def from_chat_messages(
+    messages: list[dict[str, Any]],
+    *,
+    goal: str = "",
+    outcome: str = "",
+    success: bool = False,
+    agent_id: str | None = None,
+    agent_framework: str = "chat",
+    metadata: dict[str, Any] | None = None,
+) -> AgentTrace:
+    """Build an ``AgentTrace`` from chat-completion messages.
+
+    Role → step mapping: ``system``/``user`` → ``message``; ``assistant``
+    text → ``message`` and any ``tool_calls`` → ``tool_call``; ``tool`` →
+    ``observation``. ``goal`` defaults to the first user message and
+    ``outcome`` to the last assistant message when not given.
+    """
+    steps: list[TraceStep] = []
+    idx = 0
+    first_user = ""
+    last_assistant = ""
+
+    for msg in messages:
+        role = str(msg.get("role", "")).lower()
+        content = _coerce_content(msg.get("content"))
+        tool_calls = msg.get("tool_calls") or []
+
+        if role in ("system", "user", "developer"):
+            if content:
+                steps.append(TraceStep(timestamp=_ts(idx), type="message", content=content))
+                idx += 1
+            if role == "user" and content and not first_user:
+                first_user = content
+        elif role == "assistant":
+            if content:
+                steps.append(TraceStep(timestamp=_ts(idx), type="message", content=content))
+                idx += 1
+                last_assistant = content
+            for call in tool_calls:
+                fn = call.get("function", call) if isinstance(call, dict) else {}
+                name = fn.get("name", "tool")
+                args = fn.get("arguments", "")
+                steps.append(
+                    TraceStep(timestamp=_ts(idx), type="tool_call", content=f"{name}({args})")
+                )
+                idx += 1
+        elif role == "tool":
+            steps.append(
+                TraceStep(
+                    timestamp=_ts(idx), type="observation", content=content or "(tool result)"
+                )
+            )
+            idx += 1
+        elif content:
+            steps.append(TraceStep(timestamp=_ts(idx), type="message", content=content))
+            idx += 1
+
+    return AgentTrace(
+        agent_id=agent_id,
+        agent_framework=agent_framework,
+        goal=goal or first_user or "(goal not provided)",
+        steps=steps,
+        outcome=outcome or last_assistant or "(outcome not provided)",
+        success=success,
+        metadata=metadata or {},
+    )
+
+
+def _span_start(span: dict[str, Any]) -> Any:
+    return span.get("start_time") or span.get("startTime") or span.get("startTimeUnixNano") or 0
+
+
+def from_otel_spans(
+    spans: list[dict[str, Any]],
+    *,
+    goal: str = "",
+    outcome: str = "",
+    success: bool = False,
+    agent_id: str | None = None,
+    agent_framework: str = "otel",
+    metadata: dict[str, Any] | None = None,
+) -> AgentTrace:
+    """Build an ``AgentTrace`` from OpenTelemetry spans (best-effort).
+
+    Spans are ordered by start time. Each span becomes a step: GenAI/LLM spans
+    (a ``gen_ai.*`` attribute or an ``llm``/``chat`` name) → ``tool_call``;
+    others → ``observation``. Reads ``attributes`` either as a dict or as a
+    list of ``{key, value}`` (OTLP-JSON form).
+    """
+    ordered = sorted(spans, key=_span_start)
+    steps: list[TraceStep] = []
+
+    for i, span in enumerate(ordered):
+        attrs = _otel_attrs(span)
+        name = str(span.get("name", "span"))
+        is_genai = name.lower().startswith(("llm", "chat", "gen_ai")) or any(
+            k.startswith("gen_ai") for k in attrs
+        )
+        completion = (
+            attrs.get("gen_ai.completion")
+            or attrs.get("gen_ai.response.content")
+            or attrs.get("llm.output")
+        )
+        prompt = attrs.get("gen_ai.prompt") or attrs.get("llm.input")
+        detail = str(completion or prompt or "")
+        content = f"{name}: {detail}" if detail else name
+        steps.append(
+            TraceStep(
+                timestamp=_ts(i),
+                type="tool_call" if is_genai else "observation",
+                content=content,
+                metadata={"span_name": name},
+            )
+        )
+
+    return AgentTrace(
+        agent_id=agent_id,
+        agent_framework=agent_framework,
+        goal=goal or "(goal not provided)",
+        steps=steps,
+        outcome=outcome or "(outcome not provided)",
+        success=success,
+        metadata=metadata or {},
+    )
+
+
+def _otel_attrs(span: dict[str, Any]) -> dict[str, Any]:
+    raw = span.get("attributes", {})
+    if isinstance(raw, dict):
+        return raw
+    out: dict[str, Any] = {}
+    if isinstance(raw, list):
+        for item in raw:
+            if isinstance(item, dict) and "key" in item:
+                val = item.get("value")
+                if isinstance(val, dict):
+                    val = next(iter(val.values()), val)
+                out[str(item["key"])] = val
+    return out
@@ -0,0 +1,103 @@
+"""``vstack-import`` — convert real traces into a vstack ``AgentTrace``.
+
+Reads chat-completion message logs or OpenTelemetry spans (JSON, from a file or
+stdin) and writes an ``AgentTrace`` JSON ready for ``vstack-diagnose``::
+
+    vstack-import --format messages chat.json --goal "ship auth" | \\
+        vstack-diagnose --trace - --client anthropic --fail-on high
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import sys
+from pathlib import Path
+from typing import Any, Sequence
+
+from . import from_chat_messages, from_otel_spans
+
+
+def _load(path: str | None) -> Any:
+    raw = (
+        sys.stdin.read()
+        if (path is None or path == "-")
+        else Path(path).read_text(encoding="utf-8")
+    )
+    return json.loads(raw)
+
+
+def _items(payload: Any, key: str) -> list[dict[str, Any]]:
+    """Accept a bare list, or a dict wrapping the list under ``key``."""
+    if isinstance(payload, list):
+        return list(payload)
+    if isinstance(payload, dict) and isinstance(payload.get(key), list):
+        return list(payload[key])
+    raise ValueError(f"expected a JSON list of {key} (or {{'{key}': [...]}}).")
+
+
+def main(argv: Sequence[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(
+        prog="vstack-import",
+        description=(
+            "Convert chat-completion message logs or OpenTelemetry spans into a "
+            "vstack AgentTrace JSON (pipe into vstack-diagnose)."
+        ),
+    )
+    parser.add_argument(
+        "input",
+        nargs="?",
+        default="-",
+        help="Input JSON file (omit or '-' for stdin).",
+    )
+    parser.add_argument(
+        "--format",
+        "-f",
+        choices=("messages", "otel"),
+        required=True,
+        help="messages = OpenAI/Anthropic chat log; otel = OpenTelemetry spans.",
+    )
+    parser.add_argument("--goal", default="", help="The agent's goal (else inferred).")
+    parser.add_argument("--outcome", default="", help="What happened (else inferred).")
+    parser.add_argument(
+        "--success",
+        action="store_true",
+        help="Mark the run successful (default: failed — you usually diagnose failures).",
+    )
+    parser.add_argument("--agent-id", default=None, help="Agent id to record.")
+    parser.add_argument("--out", "-o", default=None, help="Write to a file (default: stdout).")
+    args = parser.parse_args(argv)
+
+    try:
+        payload = _load(args.input)
+        if args.format == "messages":
+            trace = from_chat_messages(
+                _items(payload, "messages"),
+                goal=args.goal,
+                outcome=args.outcome,
+                success=args.success,
+                agent_id=args.agent_id,
+            )
+        else:
+            trace = from_otel_spans(
+                _items(payload, "spans"),
+                goal=args.goal,
+                outcome=args.outcome,
+                success=args.success,
+                agent_id=args.agent_id,
+            )
+    except (OSError, ValueError, json.JSONDecodeError) as e:
+        print(f"vstack-import: {e}", file=sys.stderr)
+        return 2
+
+    out_json = trace.model_dump_json(indent=2)
+    if args.out and args.out != "-":
+        Path(args.out).write_text(out_json + "\n", encoding="utf-8")
+        print(f"Wrote {args.out} ({len(trace.steps)} steps)", file=sys.stderr)
+    else:
+        sys.stdout.write(out_json + "\n")
+    return 0
+
+
+if __name__ == "__main__":  # pragma: no cover
+    raise SystemExit(main())