
Add local-first agent template example#475

Open
bmdhodl wants to merge 3 commits into main from feat/local-first-template

@bmdhodl bmdhodl commented May 16, 2026

Summary

  • Adds examples/local-first-template/ — a framework-free agent loop that calls a local OpenAI-compatible model server (llama.cpp llama-server, Ollama, LM Studio) while AgentGuard enforces a budget cap, a rate limit, and a tool allowlist.
  • Every model call, tool call, and guard decision is written to a JSONL audit log via JsonlFileSink.
  • Defaults to offline mode (canned model reply) so it runs in CI with no model server, no GPU, and no network. Set AGENTGUARD_LOCAL_DEMO=live to point at a real server.
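
The offline-by-default switch described above can be sketched roughly as follows. This is an illustration, not the code in agent.py: `is_live_mode`, `chat`, `CANNED_REPLY`, the default `base_url`, and the `model` name are all assumptions made for the example. Only the `AGENTGUARD_LOCAL_DEMO=live` setting and the OpenAI-compatible server come from this PR.

```python
import json
import os
import urllib.request

# Hypothetical canned reply returned when no model server is used.
CANNED_REPLY = "offline demo reply"


def is_live_mode(env=None):
    """Return True only when AGENTGUARD_LOCAL_DEMO is set to 'live'."""
    env = os.environ if env is None else env
    return env.get("AGENTGUARD_LOCAL_DEMO", "").strip().lower() == "live"


def chat(messages, base_url="http://localhost:8080/v1", model="local-model", env=None):
    """Return the assistant text, touching the network only in live mode.

    base_url and model are assumed defaults; adjust them for your server.
    """
    if not is_live_mode(env):
        # Offline path: no sockets are opened, so CI needs no server or GPU.
        return CANNED_REPLY
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Treating every value other than `live` as offline keeps a typo in the environment variable from silently opening network connections in CI.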

New-repo decision: the originating task considered a standalone agentguard-local-template repo. Per the processed decision request, this was narrowed to an in-repo example — no new repository, package, dependency, or service.

Test plan

  • python examples/local-first-template/agent.py runs offline, exits 0, no network
  • Output shows budget tracking, a rate-limit check, and a tool-allowlist denial
  • JSONL audit log contains guard.rate_check, guard.tool_denied, llm.call, tool.call, agent.final
  • sdk/tests/test_local_first_template.py — 3 tests pass (0.30s), no GPU
  • No regression: test_example_starters.py + test_architecture.py (18) still pass
  • ruff check clean
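
A reviewer who wants to repeat the audit-log check by hand could use something like the sketch below. The five event names are the ones listed in the test plan. The record field that holds the event name is an assumption (`"name"` here, overridable through `key`), and `event_names` and `missing_events` are hypothetical helpers, not part of the SDK or the test file.

```python
import json

# Event names taken from the test plan above.
REQUIRED_EVENTS = {
    "guard.rate_check",
    "guard.tool_denied",
    "llm.call",
    "tool.call",
    "agent.final",
}


def event_names(jsonl_text, key="name"):
    """Collect event names from JSONL text, skipping blank lines.

    The field name given by `key` is an assumption about the log schema.
    """
    names = set()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if isinstance(record, dict) and key in record:
            names.add(record[key])
    return names


def missing_events(jsonl_text, key="name"):
    """Return the required events absent from the log, sorted by name."""
    return sorted(REQUIRED_EVENTS - event_names(jsonl_text, key))
```

An empty list from `missing_events` means every expected event appeared at least once in the log text that was passed in.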

Artifacts

WORK_PLAN, RESEARCH, and QA_REPORT were produced by the queue worker; QA verdict was ✅ (no secrets, no denylist paths, no coverage regression). Available on request.

Risk

Low. Additive only — one new example directory, one new test, one table row in examples/README.md. No production code, no dependencies, no denylist paths touched.

Adds examples/local-first-template/: a framework-free agent loop that calls a
local OpenAI-compatible model server (llama.cpp, Ollama) while AgentGuard
enforces a budget cap, a rate limit, and a tool allowlist, writing a JSONL
audit line for every decision.

- agent.py: hand-rolled loop, stdlib-only LocalChatClient, tool allowlist
- config.py: one place for budget/rate/allowlist policy
- README.md: offline-default run path, model-server swap, honest limits
- sdk/tests/test_local_first_template.py: offline smoke test (no GPU, no network)

Per the processed decision request, this is an in-repo example, not a new
repository. No new dependency, package, or service.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 16, 2026 05:08

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9679206326

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread examples/local-first-template/agent.py Outdated
Comment thread examples/local-first-template/agent.py Outdated

Copilot AI left a comment

Pull request overview

Adds a new local-first, framework-free agent template under examples/ that demonstrates running against an OpenAI-compatible local model server (with an offline default for CI) while using AgentGuard to enforce budgets, rate limiting, and a tool allowlist, with full JSONL auditing.

Changes:

  • Adds examples/local-first-template/ (agent loop + policy config + docs + sample task), defaulting to deterministic offline mode.
  • Adds sdk/tests/test_local_first_template.py to smoke-test offline execution and audit-log event coverage.
  • Updates examples/README.md to list and show how to run the new example.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| sdk/tests/test_local_first_template.py | Runs the new local-first example offline in a subprocess and asserts expected audit events are written. |
| examples/README.md | Documents the new example and its CLI usage. |
| examples/local-first-template/sample_task.txt | Provides a deterministic local task file for the template. |
| examples/local-first-template/README.md | Documents offline vs live usage and what the template demonstrates. |
| examples/local-first-template/config.py | Centralizes policy knobs (budget, rate limit, allowlist, audit log path). |
| examples/local-first-template/agent.py | Implements the guarded agent loop and a minimal stdlib OpenAI-compatible client with offline mode. |

Comment thread examples/local-first-template/agent.py Outdated
Comment thread examples/local-first-template/agent.py
Comment thread examples/local-first-template/agent.py Outdated
Comment thread examples/local-first-template/agent.py Outdated
Comment thread examples/local-first-template/config.py
Resolves Copilot and Codex review feedback on PR #475:
- Use estimate_cost() per call so the dollar budget enforces when the
  template points at a paid OpenAI-compatible endpoint (was hardcoded 0.0).
- Fall back to a token estimate when a local server omits the usage field,
  so the token budget is never a silent no-op in live mode.
- parse_tool_call flags malformed JSON args; the loop treats them as a
  failed tool call instead of crashing on an empty path.
- read_file_tool refuses empty paths and non-file targets.
- Replace PEP 604 union with typing.Optional for Python 3.9 clarity.
- Neutral "[guard STOP]" label since RateLimitGuard also raises BudgetExceeded.
- allowed_tools is a Tuple so a frozen AgentPolicy is genuinely immutable.
- Add a unit test for the bad-input helper paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
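
The frozen-policy fix in the commit above can be illustrated with a small sketch. `AgentPolicy` and `allowed_tools` are named in this PR; every other field, all default values, and the `allows` helper are assumptions made for the example and may not match config.py.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AgentPolicy:
    """Illustrative policy object; field names other than allowed_tools are hypothetical."""

    max_tokens: int = 4000
    max_cost_usd: Optional[float] = None  # typing.Optional instead of `float | None`
    max_calls_per_minute: int = 30
    # A tuple has no append/extend, so the frozen dataclass cannot be mutated
    # through this attribute the way it could with a list.
    allowed_tools: Tuple[str, ...] = ("read_file",)

    def allows(self, tool_name: str) -> bool:
        """Return True when the tool is on the allowlist."""
        return tool_name in self.allowed_tools
```

`frozen=True` only blocks attribute reassignment. A list field could still be changed in place with `append`, which is why the tuple matters.
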
bmdhodl added the needs:patrick-review label (Requires Patrick personal review) on May 16, 2026

bmdhodl commented May 16, 2026

All Copilot and Codex review points addressed in d9fe3c8:

  • Cost budget never triggered (cost_usd=0.0) - now uses estimate_cost() per call, so the dollar cap enforces on paid OpenAI-compatible endpoints.
  • Missing usage -> 0 tokens - _live_reply now falls back to a ~4-char/token estimate so the token budget is never a silent no-op.
  • Malformed/missing tool args crash - parse_tool_call flags valid_args: False; the loop logs guard.tool_failed instead of crashing. read_file_tool refuses empty/non-file paths.
  • PEP 604 | None - replaced with typing.Optional.
  • Rate limit printed as [budget STOP] - now a neutral [guard STOP] label.
  • Frozen dataclass with mutable list - allowed_tools is now a Tuple[str, ...].
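
For reviewers, the missing-usage fallback amounts to something like the sketch below. The 4-characters-per-token ratio is the one quoted above. `estimate_tokens` and `tokens_used` are hypothetical names, and the real `_live_reply` may structure this differently. The dollar side is left out on purpose, because the signature of `estimate_cost()` is not shown in this PR.

```python
import math
from typing import Any, Dict

# Rough ratio quoted in the comment above; real tokenizers vary.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length (at least 1 for non-empty text)."""
    if not text:
        return 0
    return max(1, math.ceil(len(text) / CHARS_PER_TOKEN))


def tokens_used(response: Dict[str, Any], prompt_text: str, reply_text: str) -> int:
    """Prefer the server-reported total; otherwise fall back to the estimate."""
    usage = response.get("usage") or {}
    total = usage.get("total_tokens")
    if isinstance(total, int) and total > 0:
        return total
    # The server omitted usage (or reported zero): estimate so the token
    # budget still advances instead of silently recording 0.
    return estimate_tokens(prompt_text) + estimate_tokens(reply_text)
```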

Added a unit test covering the bad-input helper paths. All checks green on Python 3.9-3.12.
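
The bad-input paths that test exercises look roughly like this. `parse_tool_call`, `valid_args`, and `read_file_tool` are named in this PR, but the signatures, return shape, and error types below are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict


def parse_tool_call(name: str, raw_args: Any) -> Dict[str, Any]:
    """Parse JSON tool arguments, flagging bad input instead of raising."""
    try:
        args = json.loads(raw_args)
    except (TypeError, ValueError):
        # json.JSONDecodeError is a ValueError; a None payload raises TypeError.
        return {"name": name, "args": {}, "valid_args": False}
    if not isinstance(args, dict):
        # Valid JSON that is not an object cannot carry named arguments.
        return {"name": name, "args": {}, "valid_args": False}
    return {"name": name, "args": args, "valid_args": True}


def read_file_tool(path: str) -> str:
    """Read a text file, refusing empty paths and non-file targets."""
    if not path or not path.strip():
        raise ValueError("read_file requires a non-empty path")
    target = Path(path)
    if not target.is_file():
        raise ValueError(f"not a regular file: {path}")
    return target.read_text(encoding="utf-8")
```

With this shape the loop can check `valid_args` and log `guard.tool_failed` rather than passing an empty path through to the tool.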

Holding for Patrick's approving review (main branch protection requires one approval; the queue worker cannot self-approve).

Labels

needs:patrick-review Requires Patrick personal review

2 participants