Skip to content

Commit 7d10254

Browse files
krokokobgagent
andauthored
feat(tasks): workflow-driven tasks — declarative steps + repo-less workflows (#248) (#296)
* Initial design docs * docs(workflows): address principal-review findings on ADR-014 - Validator parity: declare JSON Schema as the single canonical shape contract (consumed, not re-implemented), keep exactly one cross-field validator impl in Phases 1-3, and add a contracts/workflow-validation/ golden corpus to lock any Phase-4 second impl to parity (mirrors the cedar-parity mechanism). - Cedar principal migration: split into its own isolated PR (Phase 2a) reviewed alone with regenerated parity fixtures, landing ahead of the pr_iteration/pr_review workflow migrations (Phase 2b) that depend on it. - Repo-optionality: reframe web_research as a not-yet-runnable schema fixture (not an acceptance proof) and require the two blocking open questions (memory actorId, artifact delivery) to be resolved as a recorded ADR addendum before the Phase-0 schema is frozen. Regenerated Starlight mirrors. * feat(agent): workflow models + loader (#248 Phase 1) Add the agent/src/workflow/ package: Pydantic models mirroring workflow.schema.json and a loader that resolves a first-party workflow id, parses YAML, and shape-validates against the canonical JSON Schema. Additive only — does not yet touch task_type (the breaking cutover lands later in Phase 1). - models.py: Workflow + sub-models; resolved_requires_repo applies the domain-derived default (coding=true, knowledge/hybrid=false). - loader.py: JSON-Schema shape validation (one canonical contract, consumed not re-implemented), YAML load, id/path agreement, and path-traversal defense. Cross-field rules deliberately live in the separate validator (Task #2), so there is one cross-field impl. - Promote pyyaml + jsonschema to direct deps (were only transitive); the loader must not depend on another package's transitive pin. - Fix a latent schema bug found by the new tests: the requires_repo/ read_only allOf conditionals fired vacuously when the property was absent, wrongly applying repo-less constraints to a coding workflow relying on the domain default. Guard each `if` with `required`. 17 new tests; agent ruff + ty + full suite (836) green. * feat(agent): cross-field workflow validator + golden corpus (#248 Phase 1) The single CI-time implementation of the WORKFLOWS.md cross-field rules the JSON Schema cannot express (rules 1-9, 11, 12, 14). The runtime loader stays shape-only and trusts this verdict, so there is exactly one cross-field implementation in Phases 1-3 (avoids the cedar-style two-language drift). contracts/workflow-validation/ ships the golden parity corpus from Phase 1 so the expected-verdict contract is fixed before #246 adds a second validator, mirroring contracts/cedar-parity/. test_workflow_tree_valid.py gates every first-party workflow file through the validator at CI time. * feat(agent): workflow step runner + handler registry (#248 Phase 1) The agent-side step runner that interprets workflow.steps inside the container (ADR-014): a StepKind→handler registry, in-order execution honouring each step's on_failure policy (fail / continue / skip_remaining), per-step step:<name>:start/<status> progress milestones, and resume-aware checkpointing (deterministic side-effect-free steps recorded to workflow_state.json on the persistent mount are skipped on resume; agentic/side-effecting steps re-run idempotently per WORKFLOWS.md §Step execution semantics). Handlers are thin wrappers over the existing helpers (setup_repo, run_agent, verify_build/lint, ensure_pr) so this is maximal structural change, minimal logic change. post_review/deliver_artifact are registered but raise NotImplementedError (Phase 2b / Phase 3) — fail-loud, not silent no-op, keeping validator rule-8 handler parity honest. Orchestration core is unit-tested with fakes; wiring into pipeline.run_task is task 5. * feat(agent): first-party workflow files — coding/new-task-v1 + default/agent-v1 (#248 Phase 1) Migrates the new_task task type to coding/new-task-v1.yaml (the heavyweight clone→build→open-PR coding path) and ships the platform default workflow default/agent-v1.yaml — the conservative last rung of the resolution ladder used when no workflow_ref and no Blueprint default apply (requires_repo:false, read-leaning tool set, deliver-as-comment). pr_iteration/pr_review migrations are Phase 2b (depend on the Cedar principal migration). Both pass the cross-field validator and the test_workflow_tree_valid CI gate (file path agrees with declared id; zero violations). * fix(agent): address code-review findings in workflow validator + runner (#248 Phase 1) Fixes the group-A bugs surfaced by the high-recall code review of the Phase 1 commits: runner.py - Resume-skip product loss: clone_repo/hydrate_context populate in-memory products (ctx.setup, ctx.user_prompt) that can't be rebuilt from the JSON checkpoint, so skipping them on resume left ctx.setup None (breaking ensure_pr) and ctx.user_prompt empty (unguided agent). Narrow _RESUMABLE_SKIP_KINDS to verify_build/verify_lint — steps whose ENTIRE product is the checkpointed boolean re-applied via ctx.artifacts. clone_repo/hydrate_context now re-run on resume (handler-level idempotency), matching WORKFLOWS.md's intent. - Skipped steps now emit step:<name>:start / :skipped milestones so a watcher sees them accounted for instead of a gap. validator.py - _HANDLER_KINDS (rule 8) is now derived from runner.STEP_HANDLERS — the single source of truth — instead of a hand-copied list that could silently drift. - rule-6 tier ceiling now applies the elevated-only-fields check to read-only (the strictest tier) as well as standard; previously read-only could declare mcp_servers/plugins/skills unchecked. - rule-11 now verifies a deliver_artifact step's target actually produces the declared comment/artifact outcome, not merely that the step kind is present. Tests + corpus updated; rule3/rule7 fixtures switched to s3_and_comment target to stay isolated to their intended rule, plus a new rule11 target-mismatch fixture. Full agent suite green (909 passed), ruff + ty clean. * feat(agent): wire workflow runner into pipeline behind a gate (#248 Phase 1, task 5) Builds the runner→pipeline seam and gates it off (default production behavior unchanged). pipeline.run_task's agent invocation now goes through _execute_agent_step: when WORKFLOW_RUNNER_ENABLED is set AND task_type maps to a shipped workflow (only new_task→coding/new-task-v1 in Phase 1), the single run_agent step is dispatched through workflow.run_workflow — exercising the real handler registry, step milestones, and result threading — while clone, context assembly, and post-hooks stay on the proven inline path. Otherwise it is exactly the legacy asyncio.run(run_agent(...)) call. pr_iteration/pr_review have no workflow file until Phase 2b, so they fall back to inline. run_workflow gains an only_kinds filter so the seam drives just the run_agent step (no double clone / double PR against the inline post-hooks). The flag is flipped to default-on in the tasks 6-8 cutover. Folds in the group-B code-review fixes (effective when the seam is enabled): - system prompt: hydrate_context builds ctx.system_prompt via the existing build_system_prompt (was empty), reusing pipeline's logic rather than reimplementing it. - verify gate: new shared _gate_status gives verify_build/verify_lint matching semantics — informational/read_only never gate, regression_only consults build_before/lint_before (mirrors pipeline's build_ok = passed or not build_before), strict always gates. Fixes the regression_only build bug and the verify_lint read_only asymmetry. - clone_repo is idempotent (reuses a pre-populated ctx.setup). - StepContext threads the --trace trajectory into run_agent (was dropped). ensure_pr strategy reconciliation stays deferred to task 8 (it requires removing the task_type branch inside ensure_pr); recorded in local-docs. Full agent suite green (930 passed), ruff + ty clean. * fix(agent): address second-round review findings on the workflow seam (#248 Phase 1) - Error-handling contract (highest severity): when the run_agent step's handler raises, run_workflow captures it into a failed StepOutcome rather than propagating. _execute_agent_step now RE-RAISES in that case so run_task's except block restores full fidelity — the log_error_cw APPLICATION_LOGS mirror (read by TaskDashboard / bgagent status), the span error, and the real exception text — instead of silently downgrading to a generic AgentResult. This also makes the previously-dead agent_result-is-None branch live. - _gate_status: an unset gate now defaults to regression_only semantics (gate only a regression; a pre-existing failure does not gate), matching pipeline.py which is unconditionally 'build_ok = passed or not build_before'. Previously an unset gate fell through to strict, contradicting the docstring's 'mirrors pipeline.py' claim and would wrongly fail a broken-before repo. - run_agent handler fails loud on an empty system prompt rather than running an unguided agent loop (repo-less prompt assembly is Phase 3; this turns the gap into an attributable failed step instead of a silent context-free run). - _workflow_id_for_task_type: documented as a transitional bridge tied to the canonical task_type→workflow table in WORKFLOWS.md / API_CONTRACT.md, to be kept in sync until tasks 6-8 delete it. Tests added for each behavior change. Full agent suite green (934 passed), ruff + ty clean. * feat(api,cli): replace task_type with workflow_ref/resolved_workflow (#248 tasks 6-7) Breaking API change (no alias): task_type is removed end-to-end; workflow_ref selects a workflow and resolves to a pinned {id, version} at the create-task boundary. CDK/CLI types (task 6, in lockstep — check:types-sync): - Remove TaskType + isPrTaskType; add ResolvedWorkflow. - CreateTaskRequest.workflow_ref?; TaskRecord/TaskDetail/TaskSummary resolved_workflow (+ mappers). - CLI: --pr/--review-pr now set workflow_ref (coding/pr-iteration-v1, coding/pr-review-v1); new --workflow flag; format reads resolved_workflow.id. CDK resolution boundary (task 7): - New workflows.ts: CDK-side descriptor mirror of agent/workflows/** + resolveWorkflowRef (ladder: explicit ref → default/agent-v1), isValidWorkflowRef, requires_repo/read_only/uses_pr accessors. Drift-guard test keeps the table in sync with the shipped YAML. - validation.ts: drop VALID_TASK_TYPES/isValidTaskType; hasTaskSpec now takes the resolved workflow's required_inputs contract. - create-task-core.ts: resolve workflow, validate inputs against the contract, persist workflow_ref + resolved_workflow. Repo stays required for ALL workflows in Phase 1 (repo-less admission is Phase 3). - preflight.ts: taskType param → readOnly + requiresRepo; repo-less short-circuits to passed (seam for Phase 3). - orchestrate-task/context-hydration/orchestrator/ecs-strategy: derive read_only/requires_repo/PR-ness from resolved_workflow; payload swaps task_type → resolved_workflow. ecs bootCommand passes resolved_workflow (lockstep with agent run_task in task 8). CLI 294 tests + CDK suite green; check:types-sync OK. Agent cutover (task 8) lands next to consume resolved_workflow. * feat(agent): complete task_type→workflow cutover in the agent (#248 task 8) Removes the Python TaskType enum / PR_TASK_TYPES / _PROMPTS-by-task_type and the gated seam; the workflow runner is now the sole agentic path, driven by the resolved_workflow {id, version} threaded from the create-task boundary. - models.py: delete TaskType enum; TaskConfig gains resolved_workflow, policy_principal, is_pr_workflow (drops task_type). - config.py: build_config takes resolved_workflow (not task_type); derives policy_principal via workflow.policy_principal_for and is_pr_workflow; PR_TASK_TYPES → PR_WORKFLOW_IDS. - Cedar UNCHANGED (Phase 2a owns the principal migration): the Agent::TaskAgent::"<id>" scheme and contracts/cedar-parity/ are untouched. policy_principal_for maps read_only⇒pr_review, else id→legacy {new_task, pr_iteration, pr_review}. Only policy.py edit: drop the WARN-only TaskType() validation. runner.py passes config.policy_principal as the principal. All test_policy* pass unchanged. - prompts rekeyed by workflow id (get_system_prompt falls back to the default coding prompt for an unknown id rather than raising). - pipeline.py: _execute_agent_step loads from resolved_workflow, always-on; the 3 task_type=='pr_review' branches → workflow.read_only; ensure_pr takes an explicit strategy (create/push_resolve/resolve) from the workflow step, resolving the deferred code-review item. - post_hooks.ensure_pr: strategy param replaces task_type self-inspection. repo.py: PR-branch resume keys off config.is_pr_workflow. - server.py/entrypoint.py: thread resolved_workflow; validation keyed on PR workflow ids. - New faithful Phase-1 workflow files coding/pr-iteration-v1 (push_resolve) + coding/pr-review-v1 (read_only, resolve, no Write/Edit per rule 4). All 4 first-party files validate; CDK descriptor drift-guard green. - docs: API_CONTRACT.md migration table + resolved_workflow responses; Starlight mirror synced. Agent suite 922 passed, ruff + ty clean. With tasks 6-7 (e829c4f) this completes the breaking cutover across API + CLI + agent. * fix(agent): use ty ignore syntax + annotate workflow test handlers (#248) The full build was never completed on this branch; ty type-checking and ruff format gates were red. No src logic changes — type/format hygiene only. - Tests used mypy-style `# type: ignore[arg-type]`, but the project type checker is ty (`# ty: ignore[rule]`), so the suppression silently no-op'd. - test_workflow_runner.py: annotate inline step handlers as StepHandler-compatible `(step: Step, ctx: StepContext) -> StepOutcome`; type the `handlers` dicts as `dict[str, StepHandler]` (ty treats the alias params as positional-only → dict-value invariance rejects named-param funcs); convert the stale ignore to `# ty: ignore[invalid-argument-type]`. - test_entrypoint.py: assert `resolved_workflow is not None` before subscripting (TaskConfig.resolved_workflow is `dict | None`). - Apply ruff format reflows (runner.py, validator.py, test_workflow_tree_valid.py) the branch was committed without. * feat(agent): Cedar read-only enforcement keys off context.read_only (#248 Phase 2a) Migrate read-only enforcement from the literal `Agent::TaskAgent::"pr_review"` principal match to the `context.read_only` attribute, so the Write/Edit hard-deny attaches to the *property* and fires for every read-only workflow uniformly — not just coding/pr-review. Removes the Phase-2b principal bridge. This is the security-load-bearing step: an error here *silently weakens* enforcement (the rule stops matching) rather than failing loudly. Guarded by new contracts/cedar-parity/ golden fixtures verified on BOTH the cedarpy (Python) and cedar-wasm (TypeScript) engines. Policy (both bindings, kept byte-for-byte identical — drift guard enforces): - agent/policies/hard_deny.cedar + cdk/.../builtin-policies.ts: pr_review_forbid_write/edit (principal == "pr_review") → read_only_forbid_write/edit (when context.read_only == true, any principal). Wiring: - policy.py: PolicyEngine gains read_only kwarg; threads read_only into every Cedar request context (probe + base_context). - models.py: TaskConfig.read_only; config.py derives it from the resolved workflow; runner.py passes config.read_only to PolicyEngine. - workflow/loader.py: remove the read_only ⇒ "pr_review" bridge in policy_principal_for — the principal is now an identity/audit tag only; pr-review keeps its own id-derived principal. Parity fixtures (new): read-only-forbid-write, read-only-forbid-edit (read_only=true ⇒ deny), read-only-false-permits-write (read_only=false ⇒ base_permit — proves the deny is gated on the property, not always-on). Tests: TestPrReviewPermissions → TestReadOnlyPermissions, now asserting the deny fires for any read-only principal AND that read_only=false permits Write; test_hooks + test_config updated. agent 927 / cdk 1822 green; ty + ruff clean. Docs: ADR-014 addendum (2026-06-08) records dropping the isolated-PR/ordering requirement (2b shipped first behind the bridge, so read-only was never unprotected); WORKFLOWS.md §"Replacing the Cedar principal" + phasing table updated to past tense. * docs(workflows): resolve repo-optional open questions + freeze schema (#248) ADR-014 addendum (2026-06-08) settles the two open questions that gated the Phase-0 schema freeze (WORKFLOWS.md open questions #1, #2). With the one implied schema reshape applied, the schema is frozen — later phases add handlers and plumbing, not schema fields. Decision 1 — Memory actorId for repo-less tasks: per-user user:{cognito_sub}. Caller-scoped (no cross-tenant bleed; mirrors the per-user trace prefix); cross- workflow pooling explicitly not adopted. NO schema field — a fixed platform fallback, applied in memory.py in Phase 3 when repo is absent. Decision 2 — Artifact delivery: named Python deliverers (open target string), shared S3 plumbing pinned now. deliver_artifact.target resolves to a registered deliverer (workflow/deliverers.py → DELIVERERS), same pattern as STEP_HANDLERS — a new delivery method is a registered deliverer, not a schema change. Plumbing frozen: task-scoped key artifacts/{task_id}/, prefix-scoped SessionRole grant, size limit, TaskDetail URL surfacing, workflow:{id} repo-less tenant tag. Schema reshape applied: - workflow.schema.json: steps[].target drops its enum (stays type:string, minLength:1); the closed set moves into the deliverer registry. - models.py: DeliverTarget Literal → str. - New workflow/deliverers.py: Deliverer dataclass + DELIVERERS registry + produced_outcomes(); single source of truth for "what each target produces." - validator.py rule 11: consults the registry (_deliverer_produces) instead of the hardcoded _DELIVER_TARGET_OUTCOMES map. The three first-party names keep their exact produced-outcome sets, so NO existing workflow / fixture / golden vector changes verdict; an unknown deliverer name now produces nothing (new guard, tested). Tests: test_workflow_deliverers.py (registry contract) + a rule-11 unknown-deliverer case. agent 932 green; ty + ruff clean. deliver_artifact remains a NotImplementedError stub until Phase 3 — only the contract is frozen. * feat(api,agent): repo-optional admission for repo-less workflows (#248 Phase 3) First slice of Phase 3 (repo-optional tasks): the admission path now accepts a submission with no repo when the resolved workflow does not require one, so a repo-less knowledge workflow (e.g. default/agent-v1) can be admitted, hydrated, and run from task_description alone. The deliverer/S3 + agent-runtime clone-skip slices follow separately. Semantics: requires_repo decides whether a repo is MANDATORY; requires_repo:false means repo-OPTIONAL (a repo-less workflow may still target a repo). The repo-less behavior keys off repo *absence*, not the workflow flag — a repo-optional workflow given a repo still takes the repo-bound path. Types (CDK + CLI mirror, kept in sync — check:types-sync green): - CreateTaskRequest.repo, TaskRecord.repo → optional; TaskDetail.repo, TaskSummary.repo → string | null. Mappers null-coalesce. Admission (create-task-core.ts): resolve the workflow FIRST, then gate repo on workflow.requiresRepo (missing repo on a repo-bound workflow → 400). Onboarding + blueprint-cap lookup runs only when a repo is present; repo-less takes the platform default cap. Record omits repo when absent; replay-required-fields drops 'repo'; event metadata null-coalesces. Hydration (context-hydration.ts): new repo-less branch (keyed on !task.repo) that assembles from task_description + per-user memory, returning before the repo-bound fetch path; assembleUserPrompt drops the Repository: line when repo is absent. Memory (memory.ts + orchestrator.ts): loadMemoryContext/writeMinimalEpisode take an actorNamespace (repo for coding, user:{user_id} for repo-less — ADR-014 addendum). Preflight already honored requiresRepo (Phase-1 seam) — widened its repo param to string|undefined. fanout-task-events skips the GitHub channel for a repo-less task. loadBlueprintConfig skips the RepoTable lookup when repo is absent. Agent (config.py): build_config loads the workflow up-front and requires repo_url/github_token only when requires_repo; repo-less runs from task_description. CLI shows '—' for a repo-less task. Tests: repo-less acceptance (create-task-core, create-task, context-hydration, CLI format) + repo-bound-missing-repo rejection; memory test log key repo → actor_namespace. agent 932 / cdk 1825 / cli 296 green; ty + ruff + eslint clean. deliver_artifact is still a NotImplementedError stub — a repo-less task can be admitted and hydrated but not yet deliver an artifact (next slice). * feat(agent): repo-less pipeline path via the workflow runner (#248 Phase 3) Second Phase-3 slice: a repo-less workflow now runs the agent loop in-container with no clone / build / PR. When config.requires_repo is false, run_task delegates to _run_repoless_task, which drives the workflow's steps (hydrate_context → run_agent) through the workflow step runner — coding vs knowledge now differ by the workflow's steps list, per ADR-014's end-state — and assembles a terminal TaskResult with pr_url=None. Scope boundary (deliberate): only hydrate_context + run_agent are driven here. deliver_artifact is the workflow's terminal step but its handler is a stub until the artifacts-bucket S3/IAM contract ships (next slice), so it is excluded via only_kinds; delivery is deferred, the agent run is not. The coding path is left byte-identical — its hard-won inline cancel-safety (skip ensure_pr on a cancelled task) is NOT yet runner-driven, so routing the full coding step list through the runner is a separate cancel-aware change, intentionally not done here. - models.py: TaskConfig.requires_repo (defaults True; coding unaffected). config.py build_config sets it from the resolved workflow. - pipeline.py: requires_repo branch before the repo-coupled segment; _run_repoless_task runs the runner + assembles/persists the terminal result. run_task repo_url now defaults to "" (repo-less callers omit it). - prompts/default_agent.py: repo-less system prompt (no git/branch/PR placeholders); registered for default/agent-v1. - prompt_builder.build_repoless_system_prompt + shared _render_memory_context. Tests: repo-less pipeline run (no setup_repo, no PR, success, repo-less prompt); repo-less prompt has no repo placeholders. agent 934 green; ty + ruff clean. * fix(agent): address PR-review findings on Phase 2a/3 (#248) Findings from the plugin review (code-reviewer, silent-failure-hunter, pr-test-analyzer, type-design-analyzer) on 4c34c8b..HEAD. HIGH — repo-less task reported COMPLETED while delivering nothing (silent-failure-hunter). _run_repoless_task skips deliver_artifact (stub until the artifacts contract ships), but the workflow's terminal outcome (e.g. `comment`) was never produced, yet status was success. Now: if the primary terminal outcome is delivery-backed and delivery was skipped, the task is a loud FAILED with an explicit error + delivery_deferred milestone naming the gap. MEDIUM — config.py load-failure fallback bug + fail-open (code-reviewer, silent-failure-hunter). The WorkflowValidationError branch compared workflow_id against DEFAULT_WORKFLOW_ID (the *coding* default) for requires_repo, contradicting its comment, and defaulted read_only=False (fail-open) for unknown ids. Now: log the failure (was silent); fail CLOSED — read_only=True for any id not in _KNOWN_WRITEABLE_WORKFLOW_IDS; requires_repo keyed off the real repo-less default (REPO_LESS_DEFAULT_WORKFLOW_ID = default/agent-v1). MEDIUM — TaskConfig missing cross-field invariant (type-design-analyzer). Added _validate_requires_repo_has_repo (mirrors _validate_trace_requires_user_id): requires_repo=True ⇒ non-empty repo_url, enforced at construction. repo_url and github_token now default to "" so a repo-less TaskConfig is constructible; the validator preserves the repo-bound invariant. LOW/secondary — rule-8 now also checks deliver_artifact.target resolves in DELIVERERS (type-design-analyzer), so an unknown target is caught universally, not only when it collides with the primary outcome. Test gaps closed (pr-test-analyzer): repo-OPTIONAL workflow given a valid repo (admission + hydration); repo-less memory keyed user:{user_id} (asserted); malformed-repo-on-repo-bound rejection; preflight repo-less short-circuit + repo-bound-no-repo invariant; repo-less agent_result=None → FAILED; config load-failure fail-closed; TaskConfig validator; rule-8 deliver target. agent 941 / cdk 1831 / cli 296 green; ty + ruff + type-sync clean. * feat(agent,api): deliver_artifact + repo-less memory — close criterion #4 (#248 Phase 3) Final Phase-3 slice: a repo-less knowledge task now runs end-to-end — admitted, hydrated, agent loop, AND delivers its output. Closes acceptance criterion #4 (a non-coding task runs without a repo) for the s3/comment deliverers. Infra (IAM + env): - agent-session-role.ts: s3:PutObject grant scoped to artifacts/${aws:PrincipalTag/task_id}/* on the trace-artifacts bucket (mirrors the traces/ pattern; task_id-scoped per ADR-014 addendum). cdk-nag reason updated. - agent.ts: ARTIFACTS_BUCKET_NAME runtime env (same bucket as traces). Deliverers (agent/src/workflow/deliverers.py): - DELIVERERS gains a deliver(target, ctx) dispatcher + DeliveryResult. The s3 deliverer uploads the agent's result text to artifacts/{task_id}/result.md via the tenant-scoped S3 client (5 MiB cap); the comment deliverer records a delivered_comment milestone for the channel fanout; s3_and_comment does both. - AgentResult.result_text captures ResultMessage.result (the knowledge-task deliverable); runner.py populates it on success. - runner._handle_deliver_artifact: now implemented (was NotImplementedError), routes through deliver(); a delivery failure is a failed step (terminal FAILED), never a silent skip. Pipeline (pipeline.py): - _run_repoless_task drives the FULL step list (deliver_artifact included); replaces the delivery-deferred FAILED gate with the real runner verdict (agent succeeded but workflow failed ⇒ attributed FAILED). Sets TaskResult.artifact_uri. - Repo-less episodic memory write keyed user:{user_id} (fail-open). Memory (memory.py): write_task_episode repo→actor param; _validate_actor accepts owner/repo OR user:{id} so the same write path serves coding + knowledge tasks. Types: TaskResult/TaskRecord/TaskDetail + CLI mirror gain artifact_uri (mirrors trace_s3_uri); CLI detail view shows an Artifact: line. check:types-sync green. Tests: deliverer dispatch (s3 upload + key/body, comment milestone, s3_and_comment, missing bucket, empty text, oversize); repo-less pipeline now succeeds + sets artifact_uri=null for comment-only; session-role artifacts grant; deliver_artifact dropped from the fail-loud stub list. agent 947 / cdk 1832 / cli 296 green; synth + cdk-nag clean; ty + ruff + type-sync clean. * feat(agent): make repo-less delivery retrievable + ship a knowledge workflow (#248 Phase 3) Three should-fix items closing the gap where "criterion #4" was met by machinery but the default repo-less path delivered to a channel that isn't wired. 1. Stale docstring: _run_repoless_task said deliver_artifact was a deferred stub — it ships now. Updated to describe the real full-step-list run. 2. Default workflow delivers retrievably. The channel fan-out only knows slack/email/github; an api-origin repo-less task (the common default case) has no channel, so a `comment`-only deliverer surfaced nothing the user could get. default/agent-v1 now uses `target: s3_and_comment` (primary outcome `artifact`): the S3 upload to artifacts/{task_id}/ is always retrievable, the comment milestone is still rendered by the fan-out when a channel exists. 3. Ship a runnable knowledge workflow: agent/workflows/knowledge/web-research-v1.yaml — the concrete repo-less demonstration (research → S3 artifact, no repo). Built-in capabilities only (registry MCP/skills are Phase 4); CDK descriptor mirror + drift-guard + resolver tests updated. Also fixes a latent bug the knowledge workflow surfaced: a repo-less id with no registered prompt fell back to the CODING prompt (leaking unsubstitutable {repo_url}/{branch_name}). get_system_prompt gains repo_less= so repo-less workflows fall back to the repo-less default-agent prompt instead. Tests: repo-less pipeline now asserts artifact_uri set (s3_and_comment); repo-less prompt fallback; knowledge workflow validates/resolves. agent 949 / cdk 1833 green; ty + ruff + type-sync clean; docs synced. * fix(agent): address full-branch review — 2 repo-less blockers + cleanups (#248) Findings from the four-reviewer full-branch pass. Two were blockers that would have shipped the repo-less feature broken despite green unit tests (the tests didn't cross the server/task_state boundaries). BLOCKER 1 — repo-less task rejected on the AgentCore backend. server.py _validate_required_params required repo_url unconditionally, so /invocations returned 400 for every repo-less task before the pipeline ran (ECS backend dodged it by calling run_task directly). Now gates repo_url on the workflow's requires_repo (resolved via load_workflow; fails SAFE — assume required — on a load error). Test: repo-less payload accepted, repo-bound still requires repo. BLOCKER 2 — artifact_uri computed but never persisted. task_state.write_terminal's field allowlist had no artifact_uri branch, so the URI was dropped before DynamoDB and TaskDetail.artifact_uri was always null (delivery not retrievable despite the S3 object existing). Added the persist branch + tests. Secondary: - comment deliverer no longer overstates: _post_comment docstring + WORKFLOWS.md state plainly that delivered_comment is recorded for the event stream (visible in `bgagent watch`) but is NOT yet routed to an external channel (not in the fan-out ROUTABLE_MILESTONES). comment_posted=True only when actually recorded. - test pins the config→engine read_only seam (runner.py PolicyEngine(read_only=)): dropping it now fails a test instead of silently disabling the Write/Edit deny. - stale phase-boundary comments corrected (post_review docstring, runner module docstring + only_kinds + run_agent guard, pr-review/pr-iteration YAML headers) — they described Phase 2a/2b/3 as pending when those shipped on this branch. - ADR-014: replaced the dangling pre-rebase commit hash 838c72e (unreachable after rebase) with a descriptive reference. agent 954 / cdk 1833 green; ty + ruff + type-sync clean; docs synced. * test(cli): cover artifact_uri render lines (#248 Phase 3) The Artifact: display lines added to formatTaskDetail / formatStatusSnapshot introduced uncovered truthy branches that dropped CLI branch coverage below the 84% gate (full `mise run build` caught it; the per-package run had passed on the prior content). Adds non-null + null cases for both renderers, mirroring the existing trace_s3_uri tests. cli 300 passed; full build green. * fix(agent): address PR review findings + green CI (#248) CI: ruff-format the 4 files that tripped the build self-mutation (config.py, deliverers.py, test_pipeline.py, test_task_state.py) so "Fail build on mutation" passes. Review findings: - prompts/__init__.py: get_system_prompt now logs WARN when an id has no registered prompt and falls back, restoring the signal the old build_system_prompt emitted (silent-failure-hunter). - pipeline._run_repoless_task: chain wf_result.failed_step.error into the synthesized AgentResult so a run_agent failure on the repo-less path is diagnosable instead of a generic message (code-reviewer). - server._validate_required_params: narrow the broad `except Exception` around load_workflow to WorkflowValidationError/ImportError so a genuine programming error surfaces loudly instead of being masked as "could not resolve requires_repo" (still fails safe — repo required). - workflows.ts workflowIsReadOnly: document why the unknown-id default is `false` (conservative admission: needsWrite=!readOnly) and must NOT be "aligned" to the agent-side fail-closed read-only default. - runner.py / pr-review-v1.yaml: comment fixes — artifact_key→artifact_uri, drop never-written review_posted, rule 4→6, post_review handler is registered-but-unimplemented not "no shipped handler" (comment-analyzer). Tests (pr-test-analyzer gaps): - test_memory.py: cover _validate_actor (user:{id} namespace + rejects) and a write_task_episode(actor="user:...") reaching create_event. - test_pipeline.py: cover the repo-less delivery-gate-failure branch (agent succeeds but deliver_artifact fails → terminal FAILED naming the step). Docs: ROADMAP.md "Task types" now reflects workflow-driven tasks + repo-less knowledge workflows; Starlight mirror regenerated. agent 962 passed, ruff+ty clean, cdk compile+workflows tests green. * fix(agent): ship first-party workflow files in the image + sync user docs (#248) The Dockerfile copied agent/src, agent/policies, and contracts but never agent/workflows, so the runtime loader (which resolves /app/workflows/<domain>/ <name>-vN.yaml) failed every workflow load with WorkflowValidationError. Add COPY agent/workflows/ /app/workflows/ so the five shipped workflows + the JSON schema land in the image. Verified in a local build: load_workflow( 'coding/new-task-v1') resolves to 1.0.0. Also bring the user-facing guides in line with the workflow-driven model: USER_GUIDE/QUICK_START now document workflow_ref, the --workflow flag, and resolved_workflow responses (the retired task_type is shown only as a migration mapping). Rename the "Task types" section to "Workflows", update the Starlight sync map + sidebar slug, and regenerate the mirrors. * fix(agent,api,cli): address PR-review findings on the workflow seam (#248) Resolve five verified findings from the #248 review: HIGH — thread agent_config.allowed_tools to the SDK. runner.py hardcoded the full tool surface, so a workflow's declared allowed_tools never reached ClaudeAgentOptions and the read_only:false read-leaning lanes (default/agent-v1, web-research-v1) could still get Write/Bash. Add TaskConfig.allowed_tools (populated from the resolved workflow), resolve it in run_agent via _resolve_allowed_tools (drops Write/Edit when read_only), and add SDK-layer tests. Cedar missing-read_only fail-open — add the ADR-014-promised parity vector. With context omitting read_only, both engines fail OPEN (Allow + base_permit); safety rests on policy.py::_eval_tier re-raising on diagnostics.errors. Add contracts/cedar-parity/read-only-missing-attribute.json (verified on cedarpy AND cedar-wasm) plus test_policy.py tests pinning the re-raise + always-inject discipline. MEDIUM — repo-less artifact success gate. pipeline.py now fails the task when the primary outcome is `artifact` but no artifact_uri was produced (matches the WORKFLOWS.md "agent-success AND S3 key present" contract), not just when the deliver step raised. MEDIUM — rule 13 model allow-list admission. Add WORKFLOW_MODEL_ALLOWLIST + disallowedWorkflowModel() and enforce it in create-task-core (unpermitted model => 400, no silent downgrade); descriptor sync-test asserts modelId matches the YAML and stays on the allow-list. LOW — CLI repo-less submission. --repo is no longer a hard requiredOption: required unless --workflow selects a repo-less workflow (still required with --pr/--review-pr). Lets bgagent reach knowledge/web-research-v1. Rule 10 (one production version per lineage) remains intentionally unenforced at the single-file layer — it is a registry/promotion-time property. * fix(agent,api,cli): address PR-review #296 resolution-ladder + deliverer findings (#248) Resolve the remaining findings from isadeks's review (findings #2/#4 were already fixed in the prior commit): #1 — bare submit silently stopped opening PRs. The server ladder resolves an absent workflow_ref to the repo-less default/agent-v1 (by design), so a CLI `submit --repo X --task Y` regressed from "opens a PR" to "S3 artifact". The CLI now sends coding/new-task-v1 explicitly when a repo is present and no workflow/ pr flag is given, preserving the old new_task default. #3 — repo-optional + repo disagreed across the boundary. The agent took the repo-less path off config.requires_repo alone, while the orchestrator built a repo-bound prompt off repo presence. pipeline.run_task now takes the repo-less path only when requires_repo is false AND no repo was supplied; a repo-optional workflow given a repo runs the repo-bound path (clone/build/PR), matching CDK admission and the prompt. #6 — resolveWorkflowRef silently discarded a @Version constraint. It now honors the constraint (Phase 1: must match the single shipped version) and returns null otherwise; create-task-core 400s with a distinct "version not available" message via resolveWorkflowRefError, instead of quietly running the shipped version. #7 — deliver_artifact unset-target default disagreed between validator (lenient full set) and runtime (s3). Introduce DEFAULT_DELIVER_TARGET as the single source of truth; produced_outcomes(None) now models that exact default, so a primary:comment workflow that omits target is correctly flagged by rule 11. #10 — descriptor/YAML drift had no parity test. The sync test now asserts required_inputs (allOf/oneOf) parity in addition to id/version/requires_repo/ read_only/model, and a new test cross-checks the agent-side _KNOWN_WRITEABLE_WORKFLOW_IDS (the third hand-maintained copy) against the writeable repo-bound descriptors so drift fails CI. * fix(agent): close PR-review #296 follow-ups — post-hook fail-soft, prompt, cap, parity (#248) #5 (High) — post-hook load_workflow had no fail-soft fallback, so a reload failure AFTER run_agent mutated/committed the tree stranded the work as FAILED with no PR. read_only now comes from config (already fallback-computed and the verdict that drove Cedar); the reload — needed only for the ensure_pr strategy — is wrapped in the same WorkflowValidationError fallback build_config uses, defaulting to "create" so the PR still opens. #8 — knowledge/web-research-v1 had no registered prompt and silently degraded to the generic default-agent prompt. Add a research-specialized, repo-less prompt (prompts/web_research.py) and register it. #9 — the 5 MiB artifact cap was checked AFTER encoding to UTF-8 (a second full copy). Reject on the character count up front (chars ≤ bytes in UTF-8) so the cap actually bounds memory on the MicroVM before materializing the bytes. #10 residual — add a parity test for the agent-side REPO_LESS_DEFAULT_WORKFLOW_ID / DEFAULT_WORKFLOW_ID constants: the repo-less default must match the CDK platform default and be requires_repo:false in the YAML; the coding default must be repo-bound. Closes the last hand-maintained-copy axis #10 named. Perf — memoize load_workflow (@cache): files are image-baked and immutable per process and Workflow is frozen, so the 3-4 loads per task now parse once. Leaves the coding-lane decorative-`gate` item (run_workflow only drives run_agent while verify_build/ensure_pr stay inline) as a tracked follow-up — it is a larger half-migrated-runner refactor both reviewers flagged as non-blocking. --------- Co-authored-by: bgagent <bgagent@noreply.github.com>
1 parent 1a1b6cc commit 7d10254

128 files changed

Lines changed: 10042 additions & 700 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

agent/Dockerfile

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -74,6 +74,12 @@ COPY agent/policies/ /app/policies/
7474
# by ``cdk/src/handlers/shared/types.ts`` at synth time. See
7575
# ``contracts/README.md`` for the contract.
7676
COPY contracts/ /app/contracts/
77+
# First-party workflow files (#248). ``agent/src/workflow/loader.py`` resolves
78+
# ``_WORKFLOWS_ROOT`` to ``/app/workflows`` (parents[2] of /app/src/workflow/)
79+
# and loads ``<domain>/<name>-vN.yaml`` plus ``schema/workflow.schema.json`` at
80+
# task time; without these files every workflow load fails with
81+
# ``WorkflowValidationError: workflow '...' not found at /app/workflows/...``.
82+
COPY agent/workflows/ /app/workflows/
7783
COPY agent/prepare-commit-msg.sh /app/
7884
COPY agent/test_sdk_smoke.py agent/test_subprocess_threading.py /app/
7985

agent/policies/hard_deny.cedar

Lines changed: 15 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -12,24 +12,30 @@
1212
@rule_id("base_permit")
1313
permit (principal, action, resource);
1414

15-
// pr_review tasks may never invoke Write. Absolute; cannot be overridden
16-
// by per-blueprint customization or --pre-approve.
15+
// Read-only workflows may never invoke Write. Absolute; cannot be overridden
16+
// by per-blueprint customization or --pre-approve. Keyed on the read_only
17+
// context attribute (not a principal literal) so the deny attaches to the
18+
// *property* and fires for every read-only workflow uniformly — not just
19+
// coding/pr-review. (#248 Phase 2a — replaces the literal
20+
// Agent::TaskAgent::"pr_review" match; see ADR-014 addendum 2026-06-08.)
1721
@tier("hard")
18-
@rule_id("pr_review_forbid_write")
22+
@rule_id("read_only_forbid_write")
1923
forbid (
20-
principal == Agent::TaskAgent::"pr_review",
24+
principal,
2125
action == Agent::Action::"invoke_tool",
2226
resource == Agent::Tool::"Write"
23-
);
27+
)
28+
when { context.read_only == true };
2429

25-
// pr_review tasks may never invoke Edit.
30+
// Read-only workflows may never invoke Edit.
2631
@tier("hard")
27-
@rule_id("pr_review_forbid_edit")
32+
@rule_id("read_only_forbid_edit")
2833
forbid (
29-
principal == Agent::TaskAgent::"pr_review",
34+
principal,
3035
action == Agent::Action::"invoke_tool",
3136
resource == Agent::Tool::"Edit"
32-
);
37+
)
38+
when { context.read_only == true };
3339

3440
// Reject `rm -rf /` and similar absolute-root destructive commands.
3541
@tier("hard")

agent/pyproject.toml

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -34,6 +34,12 @@ dependencies = [
3434
# commit. See docs/design/CEDAR_HITL_GATES.md §15.6 (decision #23) and
3535
# the parity-contract banner in mise.toml.
3636
"cedarpy==4.8.4", #https://github.com/k9securityio/cedar-py — EXACT pin (no ^/~), parity with @cedar-policy/cedar-wasm@4.8.2 (both Cedar Rust 4.8.2)
37+
# Workflow-driven tasks (#248): the step runner loads YAML workflow files
38+
# and validates them against agent/workflows/schema/workflow.schema.json.
39+
# Both were previously only transitively present; declared directly so the
40+
# workflow loader does not depend on another package's transitive pin.
41+
"pyyaml==6.0.3", #https://pypi.org/project/PyYAML/
42+
"jsonschema==4.26.0", #https://pypi.org/project/jsonschema/
3743
]
3844

3945
[tool.uv]

agent/src/config.py

Lines changed: 73 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -5,13 +5,26 @@
55
import uuid
66
from datetime import UTC
77

8-
from models import AttachmentConfig, TaskConfig, TaskType
8+
from models import AttachmentConfig, TaskConfig
99
from shell import log
1010

1111
AGENT_WORKSPACE = os.environ.get("AGENT_WORKSPACE", "/workspace")
1212

13-
# Task types that operate on an existing pull request.
14-
PR_TASK_TYPES = frozenset(("pr_iteration", "pr_review"))
13+
# The platform default workflow id used when a payload omits resolved_workflow
14+
# (local/batch runs). Mirrors the create-task boundary's coding default.
15+
DEFAULT_WORKFLOW_ID = "coding/new-task-v1"
16+
# The repo-less platform default workflow (#248 Phase 3) — the one first-party
17+
# id whose ``requires_repo`` is false. Used by the load-failure fallback to
18+
# decide repo-optionality without loading the file.
19+
REPO_LESS_DEFAULT_WORKFLOW_ID = "default/agent-v1"
20+
# First-party workflow ids that operate on an existing pull request.
21+
PR_WORKFLOW_IDS = frozenset(("coding/pr-iteration-v1", "coding/pr-review-v1"))
22+
# First-party workflow ids that are writeable (NOT read-only). Used only by the
23+
# load-failure fallback to bias an unrecognised id toward read-only (fail closed
24+
# on the write-deny invariant). pr-review-v1 is intentionally excluded (it is
25+
# read-only); default/agent-v1 is excluded because its conservative posture
26+
# should fail closed too.
27+
_KNOWN_WRITEABLE_WORKFLOW_IDS = frozenset(("coding/new-task-v1", "coding/pr-iteration-v1"))
1528

1629

1730
def resolve_github_token() -> str:
@@ -314,7 +327,7 @@ def _refresh(current: dict) -> dict | None:
314327

315328

316329
def build_config(
317-
repo_url: str,
330+
repo_url: str = "",
318331
task_description: str = "",
319332
issue_number: str = "",
320333
github_token: str = "",
@@ -325,7 +338,7 @@ def build_config(
325338
dry_run: bool = False,
326339
task_id: str = "",
327340
system_prompt_overrides: str = "",
328-
task_type: str = "new_task",
341+
resolved_workflow: dict | None = None,
329342
branch_name: str = "",
330343
pr_number: str = "",
331344
channel_source: str = "",
@@ -351,22 +364,59 @@ def build_config(
351364
"ANTHROPIC_MODEL", "us.anthropic.claude-sonnet-4-6"
352365
)
353366

367+
# Resolve the workflow id (the create-task boundary already pinned it; local
368+
# batch runs default to the coding workflow). Required-input validation is
369+
# owned by the create-task boundary now; the agent re-checks only the
370+
# pr_number/issue/description shape needed to run.
371+
workflow = resolved_workflow or {"id": DEFAULT_WORKFLOW_ID, "version": "1.0.0"}
372+
workflow_id = workflow.get("id", DEFAULT_WORKFLOW_ID)
373+
is_pr_workflow = workflow_id in PR_WORKFLOW_IDS
374+
375+
# Load the workflow up-front: it drives the Cedar principal, the read_only
376+
# flag, AND whether a repo is required (#248 Phase 3). Fall back to id-based
377+
# mapping when the file can't be loaded (e.g. a registry-only id in a future
378+
# phase) — a repo-less default is the safe assumption only for non-coding.
379+
from workflow import WorkflowValidationError, load_workflow, policy_principal_for
380+
381+
try:
382+
workflow_obj = load_workflow(workflow_id)
383+
policy_principal = policy_principal_for(workflow_obj)
384+
workflow_read_only = workflow_obj.read_only
385+
workflow_requires_repo = workflow_obj.resolved_requires_repo
386+
workflow_allowed_tools = list(workflow_obj.agent_config.allowed_tools)
387+
except WorkflowValidationError as exc:
388+
# The pinned workflow file failed to load (corrupt YAML, schema drift, a
389+
# future registry-only id). This is the one place read_only/requires_repo
390+
# can be wrong without a loud failure, so: (1) log it, and (2) fail
391+
# *closed* — assume read-only (deny writes) for any id we don't recognise
392+
# as a known writeable coding workflow, rather than fail-open to writeable.
393+
log("ERROR", f"workflow {workflow_id!r} failed to load ({exc}); using fallback policy")
394+
policy_principal = "pr_review" if workflow_id == "coding/pr-review-v1" else "new_task"
395+
# Known writeable coding workflows are the only ids that fall back to
396+
# writeable; everything else (incl. an unrecognised id) is read-only.
397+
workflow_read_only = workflow_id not in _KNOWN_WRITEABLE_WORKFLOW_IDS
398+
# requires_repo: the repo-less platform default is the only id that does
399+
# NOT require a repo; every other id (coding or unknown) requires one.
400+
workflow_requires_repo = workflow_id != REPO_LESS_DEFAULT_WORKFLOW_ID
401+
# Tool surface is unknown without the file; empty = the runner falls back
402+
# to its built-in full surface. read_only (above, fail-closed) still drops
403+
# Write/Edit, so the write-deny invariant holds even on this path.
404+
workflow_allowed_tools = []
405+
354406
errors = []
355-
if not resolved_repo_url:
356-
errors.append("repo_url is required (e.g., 'owner/repo')")
357-
if not resolved_github_token:
358-
errors.append("github_token is required")
407+
# Repo + GitHub token are required only for repo-bound workflows; a repo-less
408+
# workflow (requires_repo:false) runs from task_description/attachments alone.
409+
if workflow_requires_repo:
410+
if not resolved_repo_url:
411+
errors.append("repo_url is required (e.g., 'owner/repo')")
412+
if not resolved_github_token:
413+
errors.append("github_token is required")
359414
if not resolved_aws_region:
360415
errors.append("aws_region is required for Bedrock")
361-
try:
362-
task = TaskType(task_type)
363-
except ValueError:
364-
errors.append(f"Invalid task_type: '{task_type}'")
365-
task = None
366-
if task and task.is_pr_task:
416+
if is_pr_workflow:
367417
if not pr_number:
368-
errors.append("pr_number is required for pr_iteration/pr_review task type")
369-
elif task and not resolved_issue_number and not resolved_task_description:
418+
errors.append(f"pr_number is required for the {workflow_id!r} workflow")
419+
elif not resolved_issue_number and not resolved_task_description:
370420
errors.append("Either issue_number or task_description is required")
371421

372422
if errors:
@@ -394,7 +444,12 @@ def build_config(
394444
max_turns=max_turns,
395445
max_budget_usd=max_budget_usd,
396446
system_prompt_overrides=system_prompt_overrides,
397-
task_type=task_type,
447+
resolved_workflow=workflow,
448+
policy_principal=policy_principal,
449+
read_only=workflow_read_only,
450+
allowed_tools=workflow_allowed_tools,
451+
requires_repo=workflow_requires_repo,
452+
is_pr_workflow=is_pr_workflow,
398453
branch_name=branch_name,
399454
pr_number=pr_number,
400455
task_id=task_id or uuid.uuid4().hex[:12],

agent/src/entrypoint.py

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@
1414

1515
from config import ( # noqa: F401
1616
AGENT_WORKSPACE,
17-
PR_TASK_TYPES,
17+
PR_WORKFLOW_IDS,
1818
build_config,
1919
get_config,
2020
resolve_github_token,
@@ -29,7 +29,6 @@
2929
RepoSetup,
3030
TaskConfig,
3131
TaskResult,
32-
TaskType,
3332
TokenUsage,
3433
)
3534
from pipeline import main, run_task # noqa: F401

agent/src/memory.py

Lines changed: 21 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -53,6 +53,20 @@ def _validate_repo(repo: str) -> None:
5353
)
5454

5555

56+
def _validate_actor(actor: str) -> None:
57+
"""Validate a memory actorId: an ``owner/repo`` OR a ``user:{id}`` namespace.
58+
59+
Repo-less knowledge tasks (#248 Phase 3) key memory on ``user:{user_id}``
60+
instead of a repo (ADR-014 addendum). Accept both forms so the same write
61+
path serves coding and knowledge tasks; reject anything else.
62+
"""
63+
if actor.startswith("user:"):
64+
if not actor[len("user:") :]:
65+
raise ValueError("actor 'user:' namespace requires a non-empty user id")
66+
return
67+
_validate_repo(actor)
68+
69+
5670
def _log_error(func_name: str, err: Exception, memory_id: str, task_id: str) -> None:
5771
"""Log memory write failure with severity based on exception type."""
5872
is_programming_error = isinstance(err, (TypeError, ValueError, AttributeError, KeyError))
@@ -67,7 +81,7 @@ def _log_error(func_name: str, err: Exception, memory_id: str, task_id: str) ->
6781

6882
def write_task_episode(
6983
memory_id: str,
70-
repo: str,
84+
actor: str,
7185
task_id: str,
7286
status: str,
7387
pr_url: str | None = None,
@@ -81,17 +95,18 @@ def write_task_episode(
8195
status, PR URL, cost, duration, and any self-feedback from the
8296
agent's "## Agent notes" section in the PR body.
8397
84-
Uses actorId=repo and sessionId=task_id so the extraction strategy
85-
namespace templates (/{actorId}/episodes/{sessionId}/) place records
86-
into the correct per-repo, per-task namespace.
98+
Uses actorId=``actor`` and sessionId=task_id so the extraction strategy
99+
namespace templates (/{actorId}/episodes/{sessionId}/) place records into
100+
the correct namespace. ``actor`` is an ``owner/repo`` for coding tasks or a
101+
``user:{user_id}`` namespace for repo-less knowledge tasks (#248 Phase 3).
87102
88103
Metadata includes source_type='agent_episode' for provenance tracking
89104
and content_sha256 for integrity auditing on read (schema v3).
90105
91106
Returns True on success, False on failure (fail-open).
92107
"""
93108
try:
94-
_validate_repo(repo)
109+
_validate_actor(actor)
95110
client = _get_client()
96111

97112
parts = [
@@ -124,7 +139,7 @@ def write_task_episode(
124139

125140
client.create_event(
126141
memoryId=memory_id,
127-
actorId=repo,
142+
actorId=actor,
128143
sessionId=task_id,
129144
eventTimestamp=_iso_now(),
130145
payload=[

0 commit comments

Comments
 (0)