Commit a1e9fcb

bgagent committed
refactor(agent): decompose entrypoint.py into modular src/ package with Cedar policy engine
Decompose the monolithic agent/entrypoint.py (~2,100 lines) into 13 focused modules under agent/src/, add a Cedar-based policy engine for tool-call governance, and fix 15 review findings across Python and CDK TypeScript code.

Agent decomposition:
- config.py, models.py (TaskType enum), pipeline.py, runner.py, context.py, prompt_builder.py, hooks.py, policy.py, post_hooks.py, repo.py, shell.py, telemetry.py
- entrypoint.py retained as re-export shim for backward compatibility

Cedar policy engine (agent/src/policy.py + hooks.py):
- In-process cedarpy evaluation with deny-list model (fail-closed)
- pr_review agents denied Write/Edit; protected path and destructive command blocking for all agents
- Per-repo custom Cedar policies via Blueprint security.cedarPolicies
- PreToolUse hook integration with Claude Agent SDK
- POLICY_DECISION telemetry events on denied decisions

Critical fixes:
- log() was silently discarding message text
- PolicyEngine changed from fail-open to fail-closed
- Hook fallbacks now deny (not silently allow) on invalid inputs

CDK changes:
- Blueprint cedarPolicies resolved to readonly property
- context-hydration: POLICY_EXTRACTORS mapping table, managedWordLists support, formatGuardrailBlocked helper, tightened filter_type union
- cedar_policies passthrough in orchestrator and repo-config

Tests: 139 Python (6 new test files), 604 CDK (4 files updated)

Documentation: 7 docs updated for new module structure and Cedar status
1 parent 06002e2 commit a1e9fcb

55 files changed

Lines changed: 3681 additions & 2175 deletions

AGENTS.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ Use this routing before editing so the right package and tests get updated:
 | REST API, Lambdas, task validation, orchestration | `cdk/src/handlers/`, `cdk/src/stacks/`, `cdk/src/constructs/` | Matching tests under `cdk/test/` |
 | Shared API request/response shapes | `cdk/src/handlers/shared/types.ts` | **`cli/src/types.ts`** (must stay in sync) |
 | `bgagent` CLI commands and HTTP client | `cli/src/`, `cli/test/` | `cli/src/types.ts` if API types change |
-| Agent runtime (clone, tools, prompts, container) | `agent/` (`entrypoint.py`, `prompts/`, `Dockerfile`, etc.) | `agent/tests/`, `agent/README.md` for env/PAT |
+| Agent runtime (clone, tools, prompts, container) | `agent/src/` (`pipeline.py`, `runner.py`, `config.py`, `hooks.py`, `policy.py`, `prompts/`, Dockerfile, etc.) | `agent/tests/`, `agent/README.md` for env/PAT |
 | User-facing or design prose | `docs/guides/`, `docs/design/` | Run **`mise //docs:sync`** or **`mise //docs:build`** (do not edit `docs/src/content/docs/` by hand) |
 | Monorepo tasks, CI glue | Root `mise.toml`, `scripts/`, `.github/workflows/` ||

agent/Dockerfile

Lines changed: 2 additions & 3 deletions
@@ -50,8 +50,7 @@ RUN uv sync --frozen --no-dev --directory /app

 # Copy agent code (ARG busts cache so file edits are always picked up)
 ARG CACHE_BUST=0
-COPY entrypoint.py system_prompt.py server.py task_state.py observability.py memory.py /app/
-COPY prompts/ /app/prompts/
+COPY src/ /app/src/
 COPY prepare-commit-msg.sh /app/
 COPY test_sdk_smoke.py test_subprocess_threading.py /app/

@@ -69,4 +68,4 @@ WORKDIR /workspace

 EXPOSE 8080

-CMD ["opentelemetry-instrument", "uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8080", "--app-dir", "/app", "--loop", "asyncio"]
+CMD ["opentelemetry-instrument", "uvicorn", "server:app", "--host", "0.0.0.0", "--port", "8080", "--app-dir", "/app/src", "--loop", "asyncio"]

agent/README.md

Lines changed: 36 additions & 15 deletions
@@ -81,7 +81,7 @@ The second argument is auto-detected:

 When an issue number is given, the optional third argument provides additional instructions on top of the issue context.

-The `run.sh` script overrides the container's default CMD to run `python /app/entrypoint.py` (batch mode) instead of the uvicorn server.
+The `run.sh` script overrides the container's default CMD to run `python /app/src/entrypoint.py` (batch mode) instead of the uvicorn server.

 ### Environment Variables

@@ -320,24 +320,45 @@ docker images bgagent-local --format "{{.Size}}"
 agent/
 ├── Dockerfile                    Python 3.13 + Node.js 20 + Claude Code CLI + git + gh + mise (default platform linux/arm64)
 ├── .dockerignore
-├── pyproject.toml                App dependencies (claude-agent-sdk, FastAPI, boto3, OpenTelemetry distro, MCP, …)
+├── pyproject.toml                App dependencies (claude-agent-sdk, FastAPI, boto3, OpenTelemetry distro, MCP, cedarpy, …)
 ├── uv.lock                       Locked deps for reproducible `uv sync` in the image
 ├── mise.toml                     Tool versions / tasks used when the target repo relies on mise
-├── entrypoint.py                 Config, context hydration, ClaudeSDKClient pipeline, metrics, run_task()
-├── server.py                     FastAPI — async /invocations (background thread) and /ping; OTEL session correlation
-├── task_state.py                 Best-effort DynamoDB task status (no-op if TASK_TABLE_NAME unset)
-├── observability.py              OpenTelemetry helpers (e.g. AgentCore session id)
-├── memory.py                     Optional memory / episode integration for the agent
-├── prompts/                      Per-task-type system prompt workflows
-│   ├── __init__.py               Prompt registry — assembles base template + workflow for each task type
-│   ├── base.py                   Shared base template (environment, rules, placeholders)
-│   ├── new_task.py               Workflow for new_task (create branch, implement, open PR)
-│   ├── pr_iteration.py           Workflow for pr_iteration (read feedback, address, push)
-│   └── pr_review.py              Workflow for pr_review (read-only analysis, structured review comments)
-├── system_prompt.py              Behavioral contract (PRD Section 11)
+├── src/                          Agent source modules (pythonpath configured in pyproject.toml)
+│   ├── __init__.py
+│   ├── entrypoint.py             Re-export shim for backward compatibility (tests); delegates to specific modules
+│   ├── config.py                 Configuration: build_config(), get_config(), resolve_github_token(), TaskType validation
+│   ├── models.py                 Data models and enumerations (TaskType StrEnum with is_pr_task property)
+│   ├── pipeline.py               Top-level pipeline: main() CLI entry, run_task() orchestration
+│   ├── runner.py                 Agent runner: run_agent() — ClaudeSDKClient connect/query/receive_response
+│   ├── context.py                Context hydration: fetch_github_issue(), assemble_prompt() (local/dry-run only)
+│   ├── prompt_builder.py         System prompt assembly + memory context, repo config scanning
+│   ├── hooks.py                  PreToolUse hook callback for Cedar policy enforcement (Claude Agent SDK hooks)
+│   ├── policy.py                 Cedar policy engine — in-process cedarpy evaluation, fail-closed, deny-list model
+│   ├── post_hooks.py             Deterministic post-hooks: ensure_committed, ensure_pushed, ensure_pr, verify_build, verify_lint
+│   ├── repo.py                   Repository setup: clone, branch, git auth, mise trust/install/build/lint
+│   ├── shell.py                  Shell utilities: log(), run_cmd(), redact_secrets(), slugify(), truncate()
+│   ├── telemetry.py              Metrics, disk usage, trajectory writer (_TrajectoryWriter with write_policy_decision)
+│   ├── server.py                 FastAPI — async /invocations (background thread) and /ping; OTEL session correlation
+│   ├── task_state.py             Best-effort DynamoDB task status (no-op if TASK_TABLE_NAME unset)
+│   ├── observability.py          OpenTelemetry helpers (e.g. AgentCore session id)
+│   ├── memory.py                 Optional memory / episode integration for the agent
+│   ├── system_prompt.py          Behavioral contract (PRD Section 11)
+│   └── prompts/                  Per-task-type system prompt workflows
+│       ├── __init__.py           Prompt registry — assembles base template + workflow for each task type
+│       ├── base.py               Shared base template (environment, rules, placeholders)
+│       ├── new_task.py           Workflow for new_task (create branch, implement, open PR)
+│       ├── pr_iteration.py       Workflow for pr_iteration (read feedback, address, push)
+│       └── pr_review.py          Workflow for pr_review (read-only analysis, structured review comments)
 ├── prepare-commit-msg.sh         Git hook (Task-Id / Prompt-Version trailers on commits)
 ├── run.sh                        Build + run helper for local/server mode with AgentCore constraints
-├── tests/                        pytest unit tests for pure functions and prompt assembly
+├── tests/                        pytest unit tests (pythonpath: src/)
+│   ├── test_config.py            Config validation and TaskType tests
+│   ├── test_hooks.py             PreToolUse hook and hook matcher tests
+│   ├── test_models.py            TaskType enum tests
+│   ├── test_policy.py            Cedar policy engine tests (fail-closed, deny-list)
+│   ├── test_pipeline.py          Pipeline orchestration tests (cedar_policies injection)
+│   ├── test_shell.py             Shell utility tests (slugify, redact_secrets, truncate, format_bytes)
+│   └── ...
 ├── test_sdk_smoke.py             Diagnostic: minimal SDK smoke test (ClaudeSDKClient → CLI → Bedrock)
 └── test_subprocess_threading.py  Diagnostic: subprocess-in-background-thread verification
 ```
