awesome-pro
diff --git a/‎.github/workflows/ci.yml‎
Lines changed: 41 additions & 0 deletions b/‎.github/workflows/ci.yml‎
Lines changed: 41 additions & 0 deletions
diff --git a/‎CHANGELOG.md‎
Lines changed: 33 additions & 0 deletions b/‎CHANGELOG.md‎
Lines changed: 33 additions & 0 deletions
diff --git a/‎README.md‎
Lines changed: 63 additions & 8 deletions b/‎README.md‎
Lines changed: 63 additions & 8 deletions
diff --git a/‎docs/design.md‎
Lines changed: 40 additions & 0 deletions b/‎docs/design.md‎
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,41 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+
+permissions:
+  contents: read
+
+jobs:
+  test:
+    name: Test (Python ${{ matrix.python-version }})
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.11", "3.12", "3.13"]
+
+    steps:
+      - name: Check out repository
+        uses: actions/checkout@v6
+
+      - name: Set up uv
+        uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
+        with:
+          python-version: ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: uv sync --all-extras --group dev
+
+      - name: Lint (ruff)
+        run: |
+          uv run ruff check .
+          uv run ruff format --check .
+
+      - name: Type-check (pyright)
+        run: uv run pyright
+
+      - name: Tests
+        run: uv run pytest --cov=guardloop
@@ -5,6 +5,38 @@ All notable changes to GuardLoop are documented here. The format is based on
 follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html) (pre-1.0:
 minor releases may include breaking changes).
 
+## [0.4.0] - 2026-05-11
+
+### Added
+
+- **LangGraph adapter (`guardloop.adapters.langgraph`).** `guarded_graph(graph)`
+  returns a GuardLoop-compatible agent callable you pass to `GuardLoop.run(...)`;
+  a `GuardLoopCallbackHandler` (a synchronous LangChain `BaseCallbackHandler`)
+  bound to the `RunContext` runs the pre-flight budget check before each LLM call,
+  records actual usage afterward, and routes tool calls through the per-tool
+  circuit breaker and the tool-call budget — so cost / token / time caps, breakers,
+  and `llm_call` / `tool_call` OpenTelemetry spans all apply *inside* a LangGraph
+  run. The verifier retry loop wraps the whole graph run, with verifier feedback
+  injected into a copy of the input state (`feedback_to_state` to customise).
+  `guarded_graph(..., reserved_output_tokens=N)` sets the output-token reservation
+  for the pre-flight check (default `1024`), since LangChain chat models often omit
+  `max_tokens`. Behind the new `langgraph` optional extra
+  (`pip install "guardloop[langgraph]"`).
+- `guardloop.adapters` subpackage; `guardloop.adapters.langgraph` exports
+  `guarded_graph` and `GuardLoopCallbackHandler`. (Adapters are intentionally not
+  re-exported from the top-level `guardloop` package, so `import guardloop` stays
+  dependency-light.)
+- `RunContext.circuit_breakers` — public read-only access to the per-tool circuit
+  breaker registry (used by adapters; also handy for inspecting breaker state).
+- No-key demo `examples/langgraph_guarded.py`.
+- `.github/workflows/ci.yml` — runs pytest + ruff + pyright on push / pull request
+  across Python 3.11–3.13.
+
+### Changed
+
+- `pyproject.toml`: new `langgraph` optional-dependency extra; `langgraph` /
+  `langchain-core` added to the dev dependency group; `langgraph` keyword.
+
 ## [0.3.0] - 2026-05-10
 
 ### Added
@@ -77,6 +109,7 @@ minor releases may include breaking changes).
 - No-key demo `examples/runaway_cost_prevention.py`; packaged and published to
   PyPI via GitHub Actions OIDC Trusted Publishing.
 
+[0.4.0]: https://github.com/awesome-pro/guardloop/releases/tag/v0.4.0
 [0.3.0]: https://github.com/awesome-pro/guardloop/releases/tag/v0.3.0
 [0.2.0]: https://github.com/awesome-pro/guardloop/releases/tag/v0.2.0
 [0.1.0]: https://github.com/awesome-pro/guardloop/releases/tag/v0.1.0
@@ -8,9 +8,10 @@ loops can be stopped before they burn through money, flaky tools can be cut off
 before an agent retries them into a bigger incident, and confidently-wrong
 answers get a second pass.
 
-The v0.3 focus is intentionally sharp: **runtime guardrails for async Python
-agents** — direct OpenAI and Anthropic wrappers, protected tool calls, per-tool
-circuit breakers, and a verify-fix-retry loop.
+The v0.4 focus: **runtime guardrails for async Python agents, including agents
+built with LangGraph** — direct OpenAI and Anthropic wrappers, protected tool
+calls, per-tool circuit breakers, a verify-fix-retry loop, and a LangGraph
+adapter that puts all of it *under* an existing graph without rewriting it.
 
 ```python
 from guardloop import (
@@ -65,15 +66,16 @@ clear trace. GuardLoop puts an explicit execution layer around that loop:
 
 ```mermaid
 flowchart LR
-    U["User code"] --> R["GuardLoop"]
+    LG["LangGraph graph"] -. "guarded_graph(...)" .-> U
+    U["Your agent"] --> R["GuardLoop"]
     R --> B["BudgetController"]
     R --> CB["CircuitBreakerRegistry"]
     R --> V["VerifierChain"]
     R --> T["OpenTelemetry spans"]
     R --> C["RunContext"]
     C --> O["Wrapped OpenAI client"]
     C --> A["Wrapped Anthropic client"]
-    C --> W["Wrapped tools"]
+    C --> W["Wrapped tools / LangChain callbacks"]
     V -. "feedback on retry" .-> C
 ```
 
@@ -113,6 +115,44 @@ that fails every retry comes back as `success=False` with
 `terminated_reason="verification_failed"` but with `output` still populated;
 set `VerifierConfig(raise_on_failure=True)` for a hard stop.
 
+## Framework Adapters
+
+GuardLoop is a wrapper, not a framework — so it slots *under* the agent
+frameworks you already use. The first adapter is for **LangGraph** (behind the
+`langgraph` extra):
+
+```bash
+pip install "guardloop[langgraph]"
+```
+
+```python
+from langchain_core.messages import HumanMessage
+
+from guardloop import GuardLoop, BudgetConfig
+from guardloop.adapters.langgraph import guarded_graph
+
+runtime = GuardLoop(
+    budget=BudgetConfig(cost_limit_usd="0.10", token_limit=10_000, tool_call_limit=20),
+    verifiers=[...],  # optional: the verifier retry loop wraps the whole graph run
+)
+
+agent = guarded_graph(my_compiled_graph, input_key="messages")
+result = await runtime.run(agent, {"messages": [HumanMessage("research agent runtime safety")]})
+print(result.success, result.cost_usd, result.tokens_used, result.terminated_reason)
+```
+
+`guarded_graph` returns a GuardLoop-compatible agent, so you keep calling
+`runtime.run(...)` as usual. A LangChain callback handler bound to the
+`RunContext` runs the pre-flight budget check before each LLM node, records usage
+afterward, and routes tool calls through the per-tool circuit breaker and the
+tool-call budget — so the cost / token / time caps, breakers, and `llm_call` /
+`tool_call` OpenTelemetry spans all apply *inside* the graph. A budget breach
+inside the graph terminates the run. On a verifier retry the feedback is injected
+into a copy of the input state (override `feedback_to_state` for non-standard
+state shapes). Because LangChain chat models often omit `max_tokens`,
+`guarded_graph(..., reserved_output_tokens=N)` sets the output-token reservation
+used by the pre-flight check (default `1024`).
+
 ## Project Guide
 
 For a deeper walkthrough of what has been implemented, how the code is
@@ -170,6 +210,15 @@ malformed JSON). A verifier chain rejects it with feedback, the agent reads
 `ctx.retry_feedback` and self-corrects, and the run ends with
 `verification_passed: true` after three attempts.
 
+```bash
+uv run python examples/langgraph_guarded.py
+```
+
+This demo runs a small LangGraph graph (with an in-process fake chat model, so
+no API key) under `guarded_graph`. The first run succeeds with cost and tokens
+recorded; the second runs under a tiny token budget and is stopped before the
+model call.
+
 ## Live Provider Smoke Tests
 
 ```bash
@@ -192,7 +241,7 @@ uv run ruff format --check .
 uv run pyright
 ```
 
-## v0.3 Scope
+## v0.4 Scope
 
 - Async Python runtime with `src/` package layout.
 - Hard caps for cost, tokens, time, and tool calls.
@@ -201,16 +250,22 @@ uv run pyright
 - Verify-fix-retry loop: sync or async output verifiers, fail-fast chains,
   built-in rule-based verifiers, feedback into `ctx.retry_feedback`, and an
   opt-in strict mode — all attempts share one budget and the run timeout.
+- LangGraph adapter (`guardloop.adapters.langgraph.guarded_graph`, behind the
+  `langgraph` extra): budget caps, circuit breakers, and `llm_call` / `tool_call`
+  OpenTelemetry spans applied inside a LangGraph run via a LangChain callback
+  handler; the verifier loop wraps the whole graph run.
 - Direct wrappers for `AsyncOpenAI.responses.create` and
   `AsyncAnthropic.messages.create`.
 - OpenTelemetry spans for agent runs, LLM calls, tools, and verifiers.
-- Fake-client tests and demos that do not require API keys.
+- Fake-client tests and demos that do not require API keys; CI on push/PR
+  (pytest + ruff + pyright, Python 3.11–3.13).
 
 ## Roadmap
 
 - v0.2: per-tool circuit breakers. ✅
 - v0.3: verify-fix-retry loop. ✅
-- v0.4: LangGraph and OpenAI Agents SDK adapters.
+- v0.4: LangGraph adapter. ✅
+- v0.4.1: OpenAI Agents SDK adapter.
 - v0.5: Jaeger/Phoenix trace screenshots, demo video, and blog post.
 - v0.6: persistent breaker state, YAML/TOML policy, multi-model pricing, loop detection.
 - v1.0: stable API, changelog, docs site, release checklist.
 
@@ -66,6 +66,46 @@ agent produced an answer, it just isn't trusted. With
 `VerificationFailed` (same `terminated_reason`, `output=None`, attempt count and
 feedback in `metadata`).
 
+## Framework Adapters
+
+GuardLoop is not an agent framework, so it does not "support" frameworks — it
+*wraps* them. An adapter is just a thing that produces a GuardLoop-compatible
+`async def agent(ctx, ...)` callable; you still call `runtime.run(agent, ...)`. The
+adapters live in `guardloop.adapters`, each behind its own optional extra, and the
+core `GuardLoop` class never references a framework.
+
+The LangGraph adapter (`guardloop.adapters.langgraph.guarded_graph`) is the first.
+LangGraph nodes call LangChain chat models, which do not go through GuardLoop's
+`ctx.openai` / `ctx.anthropic` wrappers, so the adapter hooks the cross-cutting seam
+LangChain *does* expose: a callback handler. `GuardLoopCallbackHandler` is a
+*synchronous* `BaseCallbackHandler` (LangChain only honours `raise_error` for sync
+handlers, and the handler does no I/O) with `raise_error = True` and
+`run_inline = True`. On `on_chat_model_start` / `on_llm_start` it estimates input
+tokens and runs `BudgetController.check_llm_call` (the pre-flight cost/token cap);
+on `on_llm_end` it records actual usage from the response's `usage_metadata` (or
+`llm_output["token_usage"]`); on `on_tool_start` / `on_tool_end` / `on_tool_error`
+it routes through `before_call` / `record_tool_call_started` / `record_success` /
+`record_failure`. Each LLM and tool call gets an `llm_call` / `tool_call` span that
+is a child of the active `agent_run` span. Guardrail exceptions raised inside the
+callbacks propagate out of `graph.ainvoke()` and are caught by `runtime.run`'s
+existing arms, so a budget breach inside the graph terminates the run.
+
+Two consequences worth knowing. First, `check_llm_call` always needs an output-token
+reservation, but LangChain chat models frequently do not declare a `max_tokens`, so
+`guarded_graph(...)` exposes `reserved_output_tokens` (default 1024) as the fallback
+reservation. Second, tool-side enforcement is only as "hard" as the graph's own error
+handling: a `ToolNode` with its default `handle_tool_errors=True` will catch a
+`ToolCallLimitExceeded` / `CircuitBreakerOpen` raised by the callback and turn it
+into a `ToolMessage`, so the graph continues (the breaker still records the event and
+the LLM-side caps still terminate the run). Pass `handle_tool_errors=False` for hard
+tool-call enforcement. Streaming (`astream` / `astream_events`) is out of scope for
+v0.4.
+
+The adapter needs `before_call` / `record_success` / `record_failure` on the per-tool
+breaker registry, so `RunContext.circuit_breakers` exposes it as a public read-only
+property (it persists on the `GuardLoop` instance across runs, like the breaker state
+itself).
+
 ## Telemetry
 
 Provider wrappers emit OpenTelemetry spans through a small conventions module.