Add local-first agent template example #475
Adds `examples/local-first-template/`: a framework-free agent loop that calls a local OpenAI-compatible model server (llama.cpp, Ollama) while AgentGuard enforces a budget cap, a rate limit, and a tool allowlist, writing a JSONL audit line for every decision.

- `agent.py`: hand-rolled loop, stdlib-only `LocalChatClient`, tool allowlist
- `config.py`: one place for budget/rate/allowlist policy
- `README.md`: offline-default run path, model-server swap, honest limits
- `sdk/tests/test_local_first_template.py`: offline smoke test (no GPU, no network)

Per the processed decision request, this is an in-repo example, not a new repository. No new dependency, package, or service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
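For readers who have not opened the diff, the pattern described above (check the guard, decide, write one audit line per decision) can be sketched with the standard library alone. This is a minimal illustration, not the code in `agent.py` and not AgentGuard's API: `guarded_tool_call`, `audit`, `GuardStop`, `MAX_CALLS`, and the `guard.stop` event name are all hypothetical names chosen for the sketch.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import List, Tuple

# Hypothetical policy values; in the PR the real policy lives in config.py.
ALLOWED_TOOLS: Tuple[str, ...] = ("read_file",)
MAX_CALLS = 3


class GuardStop(Exception):
    """Raised when the (hypothetical) call budget is exhausted."""


def audit(log_path: Path, event: str, **fields) -> None:
    """Append one JSON object per line (JSONL) describing a decision."""
    record = {"ts": time.time(), "event": event, **fields}
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def guarded_tool_call(log_path: Path, tool: str, calls_so_far: int) -> bool:
    """Return True if the tool may run; audit the decision either way."""
    if calls_so_far >= MAX_CALLS:
        audit(log_path, "guard.stop", reason="call budget exhausted")
        raise GuardStop("call budget exhausted")
    if tool not in ALLOWED_TOOLS:
        audit(log_path, "guard.tool_denied", tool=tool)
        return False
    audit(log_path, "tool.call", tool=tool)
    return True


def demo() -> List[str]:
    """Make three decisions against a temporary log; return the event names."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "audit.jsonl"
        guarded_tool_call(log_path, "read_file", calls_so_far=0)  # allowed
        guarded_tool_call(log_path, "shell", calls_so_far=1)      # not on the allowlist
        try:
            guarded_tool_call(log_path, "read_file", calls_so_far=3)  # over budget
        except GuardStop:
            pass
        lines = log_path.read_text(encoding="utf-8").splitlines()
        return [json.loads(line)["event"] for line in lines]
```

Calling `demo()` exercises an allowed call, a denied call, and a budget stop, and each one leaves exactly one line in the log. Denials are audited rather than silently dropped, which is the property the description emphasizes.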
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9679206326
Pull request overview
Adds a new local-first, framework-free agent template under examples/ that demonstrates running against an OpenAI-compatible local model server (with an offline default for CI) while using AgentGuard to enforce budgets, rate limiting, and a tool allowlist, with full JSONL auditing.
Changes:
- Adds `examples/local-first-template/` (agent loop + policy config + docs + sample task), defaulting to deterministic offline mode.
- Adds `sdk/tests/test_local_first_template.py` to smoke-test offline execution and audit-log event coverage.
- Updates `examples/README.md` to list and show how to run the new example.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `sdk/tests/test_local_first_template.py` | Runs the new local-first example offline in a subprocess and asserts expected audit events are written. |
| `examples/README.md` | Documents the new example and its CLI usage. |
| `examples/local-first-template/sample_task.txt` | Provides a deterministic local task file for the template. |
| `examples/local-first-template/README.md` | Documents offline vs live usage and what the template demonstrates. |
| `examples/local-first-template/config.py` | Centralizes policy knobs (budget, rate limit, allowlist, audit log path). |
| `examples/local-first-template/agent.py` | Implements the guarded agent loop and a minimal stdlib OpenAI-compatible client with offline mode. |
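The audit-event coverage check that the smoke test performs can be approximated as follows. This is a sketch under assumptions, not the contents of `sdk/tests/test_local_first_template.py`: the event names are the ones listed in this PR's test plan, while the `missing_events` helper and the `"event"` key are hypothetical, since the actual record layout is not shown in this conversation.

```python
import json
from typing import Iterable, Set

# Event names taken from this PR's test plan.
EXPECTED_EVENTS = {
    "guard.rate_check",
    "guard.tool_denied",
    "llm.call",
    "tool.call",
    "agent.final",
}


def missing_events(jsonl_lines: Iterable[str], key: str = "event") -> Set[str]:
    """Return the expected event names that never appear in the audit log.

    `key` is an assumption about the field that holds the event name.
    """
    seen: Set[str] = set()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        name = record.get(key)
        if isinstance(name, str):
            seen.add(name)
    return EXPECTED_EVENTS - seen
```

A test would then run the example in a subprocess, read the log file line by line, and assert that `missing_events(...)` is empty. Returning the missing set rather than a boolean makes a failing assertion say which event was not written.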
Resolves Copilot and Codex review feedback on PR #475:

- Use `estimate_cost()` per call so the dollar budget enforces when the template points at a paid OpenAI-compatible endpoint (was hardcoded 0.0).
- Fall back to a token estimate when a local server omits the `usage` field, so the token budget is never a silent no-op in live mode.
- `parse_tool_call` flags malformed JSON args; the loop treats them as a failed tool call instead of crashing on an empty path.
- `read_file_tool` refuses empty paths and non-file targets.
- Replace PEP 604 union with `typing.Optional` for Python 3.9 clarity.
- Neutral "[guard STOP]" label since `RateLimitGuard` also raises `BudgetExceeded`.
- `allowed_tools` is a `Tuple` so a frozen `AgentPolicy` is genuinely immutable.
- Add a unit test for the bad-input helper paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
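Three of these fixes are easy to illustrate in isolation. The sketch below is an assumption-laden approximation, not the code from the commit: `Policy`, `parse_args`, and `tokens_used` are hypothetical stand-ins for `AgentPolicy`, `parse_tool_call`, and the usage fallback, and the four-characters-per-token ratio is a rough heuristic chosen here for illustration. The `usage.total_tokens` field follows the OpenAI chat-completions response shape.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class Policy:
    """Hypothetical stand-in for AgentPolicy.

    frozen=True blocks attribute reassignment; using a tuple also blocks
    in-place mutation, which a list field would still allow.
    """
    max_tokens: int = 2000
    allowed_tools: Tuple[str, ...] = ("read_file",)


def parse_args(raw: str) -> Tuple[Dict[str, Any], bool]:
    """Return (args, malformed) so the caller can fail the tool call cleanly."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {}, True
    if not isinstance(parsed, dict):
        return {}, True  # valid JSON, but not an argument object
    return parsed, False


def tokens_used(usage: Optional[Dict[str, Any]], prompt: str, reply: str) -> int:
    """Prefer the server-reported count; otherwise estimate from text length."""
    if usage is not None and isinstance(usage.get("total_tokens"), int):
        return usage["total_tokens"]
    # Assumed heuristic: roughly 4 characters per token, never less than 1,
    # so the token budget still advances when `usage` is missing.
    return max(1, (len(prompt) + len(reply)) // 4)
```

Returning a `malformed` flag instead of raising keeps the loop's control flow in one place: the caller records a failed tool call and continues, rather than reaching the file tool with an empty path.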
All Copilot and Codex review points addressed in d9fe3c8:
Added a unit test covering the bad-input helper paths. All checks green on Python 3.9-3.12. Holding for Patrick's approving review (main branch protection requires one approval; the queue worker cannot self-approve).
Summary
- `examples/local-first-template/`: a framework-free agent loop that calls a local OpenAI-compatible model server (llama.cpp `llama-server`, Ollama, LM Studio) while AgentGuard enforces a budget cap, a rate limit, and a tool allowlist.
- Every decision is written as a JSONL audit line via `JsonlFileSink`.
- Offline by default; set `AGENTGUARD_LOCAL_DEMO=live` to point at a real server.

New-repo decision: the originating task considered a standalone `agentguard-local-template` repo. Per the processed decision request, this was narrowed to an in-repo example: no new repository, package, dependency, or service.

Test plan
- `python examples/local-first-template/agent.py` runs offline, exits 0, no network
- Audit log contains `guard.rate_check`, `guard.tool_denied`, `llm.call`, `tool.call`, `agent.final`
- `sdk/tests/test_local_first_template.py`: 3 tests pass (0.30s), no GPU
- `test_example_starters.py` + `test_architecture.py` (18) still pass
- `ruff check` clean

Artifacts
WORK_PLAN, RESEARCH, and QA_REPORT were produced by the queue worker; QA verdict was ✅ (no secrets, no denylist paths, no coverage regression). Available on request.
Risk
Low. Additive only: one new example directory, one new test, one table row in `examples/README.md`. No production code, no dependencies, no denylist paths touched.