Commit 3fde6dc

Merge pull request #82 from Blazity/AIW-71-pre-steps
Aiw 71 pre steps
2 parents 60e553f + 36258b9 commit 3fde6dc

15 files changed

Lines changed: 1672 additions & 91 deletions

docs/pre-sandbox-plan.md

Lines changed: 387 additions & 0 deletions
@@ -0,0 +1,387 @@
# Pre-Sandbox Phase Plan

## Goal

Add a configurable pre-sandbox phase to `ai-workflow` that runs before any Vercel Sandbox is provisioned. The phase lets the service execute pluggable server-side steps, such as an AI SDK ticket check, and use their output to either enrich downstream agent prompts or halt before sandbox creation.

## Decisions

- Config lives in the `ai-workflow` service repo.
- The config file is required at the repo root as `pre-sandbox.yaml`.
- Minimal valid config:

```yaml
preSandbox:
  runOn:
    newTicket: true
    existingPr: true
    mergeConflict: true
  steps: []
```

- Missing or invalid config fails the build.
- Step implementations live in `src/pre-sandbox/steps/*.ts`.
- Config references registered step names. Adding a new step requires code changes and redeploy.
- Steps run sequentially.
- Steps run server-side in Workflow step functions, not inside the Vercel Sandbox.
- Steps may use AI SDK and tools.
- Steps may halt sandbox creation.
- Workflow remains responsible for standard Jira and Slack communication.
- Step output can be injected into research, implementation, and review prompts.
- No retries in the first version. Support timeout and failure behavior only.

## Success Criteria

- `pnpm build` fails when `pre-sandbox.yaml` is missing or invalid.
- Config cannot reference an unknown pre-sandbox step.
- A configured pre-sandbox step runs before `provisionSandbox(...)`.
- A halting step prevents sandbox provisioning.
- A halting step can trigger the existing clarification or failure notification path.
- Prompt additions from a step appear only in the selected downstream prompts.
- The showcase AI SDK step can evaluate ticket complexity without repo knowledge.

## Config Shape

Initial config example:

```yaml
preSandbox:
  runOn:
    newTicket: true
    existingPr: true
    mergeConflict: true

  steps:
    - uses: ticket-complexity-check
      name: Ticket Complexity Check
      timeoutMs: 120000
      onFailure: fail
      with:
        input:
          ticket:
            - identifier
            - title
            - description
            - acceptanceCriteria
            - comments
```

### Fields

- `preSandbox.runOn.newTicket`: run when no PR exists yet for the ticket branch.
- `preSandbox.runOn.existingPr`: run when a PR already exists for the ticket branch.
- `preSandbox.runOn.mergeConflict`: run when an existing PR has conflicts.
- `steps[].uses`: registered step id from `src/pre-sandbox/steps/index.ts`.
- `steps[].name`: display name used in logs and prompt sections.
- `steps[].timeoutMs`: maximum duration for the step.
- `steps[].onFailure`: one of `continue`, `fail`, or `move_to_backlog`.
- `steps[].with`: step-specific config passed to the step implementation.

## Runtime Contract

Create shared types in `src/pre-sandbox/types.ts`.

```ts
export type PreSandboxPromptTarget = "research" | "implementation" | "review";

export interface PreSandboxPromptAddition {
  target: PreSandboxPromptTarget[];
  title: string;
  content: string;
}

export type PreSandboxStepResult =
  | {
      status: "continue";
      promptAdditions?: PreSandboxPromptAddition[];
    }
  | {
      status: "halt";
      outcome: "needs_clarification" | "failed";
      message: string;
      questions?: string[];
      promptAdditions?: PreSandboxPromptAddition[];
    };
```

`message` is the human-readable reason used for logs and workflow notifications. It is not a separate control path.

## Step Input Contract

The runner builds a controlled input object and passes only the fields selected by config.

```ts
export interface PreSandboxStepContext {
  ticket: {
    identifier?: string;
    title?: string;
    description?: string;
    acceptanceCriteria?: string;
    comments?: Array<{ author: string; body: string; createdAt?: string }>;
    labels?: string[];
  };
  run: {
    branchName: string;
    isNewTicket: boolean;
    hasExistingPr: boolean;
    hasMergeConflict: boolean;
  };
}
```

For the first version, input field selection only needs to support ticket fields. Additional fields can be added later without changing the step result contract.

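A minimal sketch of the ticket field selection described above, assuming the `with.input.ticket` list has already been validated as a list of ticket field names. `selectTicketFields` and `copyField` are hypothetical helper names and the sample ticket is made up; neither comes from the plan.

```ts
// Mirrors the `ticket` part of PreSandboxStepContext.
interface TicketInput {
  identifier?: string;
  title?: string;
  description?: string;
  acceptanceCriteria?: string;
  comments?: Array<{ author: string; body: string; createdAt?: string }>;
  labels?: string[];
}

type TicketField = keyof TicketInput;

// Hypothetical helper: copies one field if it is set on the source ticket.
function copyField<K extends TicketField>(from: TicketInput, to: TicketInput, key: K): void {
  const value = from[key];
  if (value !== undefined) {
    to[key] = value;
  }
}

// Hypothetical helper: builds the controlled ticket object from the configured field list.
function selectTicketFields(ticket: TicketInput, fields: TicketField[]): TicketInput {
  const selected: TicketInput = {};
  for (const field of fields) {
    copyField(ticket, selected, field);
  }
  return selected;
}

// Made-up ticket used only for illustration.
const fullTicket: TicketInput = {
  identifier: "DEMO-1",
  title: "Add empty state",
  description: "Show a placeholder when the list is empty.",
  labels: ["frontend"],
};

// Only `identifier` and `title` are passed on; `description` and `labels` are dropped.
const stepTicket = selectTicketFields(fullTicket, ["identifier", "title"]);
```

Copying onto a fresh object, rather than deleting fields from the full ticket, keeps unselected data out of the step input by construction.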
## Build-Time Validation

Add:

- `src/pre-sandbox/config.ts`
- `src/pre-sandbox/steps/index.ts`
- `scripts/validate-pre-sandbox-config.ts`

Validation rules:

- `pre-sandbox.yaml` must exist.
- Root key must be `preSandbox`.
- `runOn` booleans must be present.
- `steps` must be an array.
- Each `steps[].uses` must exist in the step registry.
- `timeoutMs`, when present, must be a positive integer.
- `onFailure` must be `continue`, `fail`, or `move_to_backlog`.
- `name`, when present, must be non-empty.

Update `package.json`:

```json
{
  "scripts": {
    "validate:pre-sandbox": "tsx scripts/validate-pre-sandbox-config.ts",
    "build": "pnpm validate:pre-sandbox && rm -rf .nitro/workflow && NODE_OPTIONS=--max-old-space-size=8192 nitro build"
  }
}
```

The repo does not currently include a YAML parser dependency. Add a focused YAML parser dependency, then validate the parsed object with Zod.

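The validation rules can be illustrated with a dependency-free sketch that checks an already-parsed object. The plan calls for Zod, so this is a hand-written stand-in meant only to make the rules concrete. The function names and the hard-coded registry set are hypothetical, and `onFailure` is treated as required because its rule is not qualified with "when present".

```ts
// Hypothetical registry ids; in the plan these come from src/pre-sandbox/steps/index.ts.
const registeredStepIds = new Set<string>(["ticket-complexity-check"]);
const failureModes: string[] = ["continue", "fail", "move_to_backlog"];
const runOnKeys: string[] = ["newTicket", "existingPr", "mergeConflict"];

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Checks one entry of `steps` and returns its problems.
function validateStep(step: unknown, index: number): string[] {
  if (!isRecord(step)) return [`steps[${index}] must be an object`];
  const errors: string[] = [];
  const uses = step["uses"];
  const timeoutMs = step["timeoutMs"];
  const onFailure = step["onFailure"];
  const name = step["name"];
  if (typeof uses !== "string" || !registeredStepIds.has(uses)) {
    errors.push(`steps[${index}].uses must be a registered step id`);
  }
  const validTimeout =
    typeof timeoutMs === "number" && Number.isInteger(timeoutMs) && timeoutMs > 0;
  if (timeoutMs !== undefined && !validTimeout) {
    errors.push(`steps[${index}].timeoutMs must be a positive integer`);
  }
  // Assumption: required, since the rule does not say "when present".
  if (typeof onFailure !== "string" || !failureModes.includes(onFailure)) {
    errors.push(`steps[${index}].onFailure must be continue, fail, or move_to_backlog`);
  }
  if (name !== undefined && (typeof name !== "string" || name.trim() === "")) {
    errors.push(`steps[${index}].name must be non-empty`);
  }
  return errors;
}

// Returns every problem found; an empty array means the parsed config is valid.
function validatePreSandboxConfig(parsed: unknown): string[] {
  const root = isRecord(parsed) ? parsed["preSandbox"] : undefined;
  if (!isRecord(root)) return ["root key must be `preSandbox`"];
  const errors: string[] = [];
  const runOn = root["runOn"];
  for (const key of runOnKeys) {
    if (!isRecord(runOn) || typeof runOn[key] !== "boolean") {
      errors.push(`runOn.${key} must be a boolean`);
    }
  }
  const steps = root["steps"];
  if (!Array.isArray(steps)) {
    errors.push("steps must be an array");
  } else {
    steps.forEach((step: unknown, index: number) => {
      errors.push(...validateStep(step, index));
    });
  }
  return errors;
}
```

A validation script built on this could print each returned message and exit with a non-zero code when the array is not empty, which is what makes `pnpm build` fail.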
## Workflow Integration

Current flow in `src/workflows/agent.ts`:

1. Fetch and validate ticket.
2. Load prompts.
3. Notify started.
4. Resolve branch and PR context.
5. Create branch if needed.
6. Fetch attachments.
7. Ensure Arthur task.
8. Resolve agent kind.
9. Provision sandbox.

New flow:

1. Fetch and validate ticket.
2. Load prompts.
3. Notify started.
4. Resolve branch and PR context.
5. Create branch if needed.
6. Fetch attachments.
7. Run pre-sandbox phase.
8. If halted, use existing workflow communication and terminal handling.
9. Ensure Arthur task.
10. Resolve agent kind.
11. Provision sandbox.

The pre-sandbox phase should run after PR context is known, because `runOn` depends on whether the branch already has a PR and whether it has conflicts. It should run before Arthur task creation and before sandbox provisioning.

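The dependency on PR context can be sketched as a small predicate. The plan does not say which `runOn` flag applies when an existing PR also has conflicts, so the precedence below (most specific case first) is an assumption, and `shouldRunPreSandbox` is a hypothetical name.

```ts
interface RunOnConfig {
  newTicket: boolean;
  existingPr: boolean;
  mergeConflict: boolean;
}

// Subset of the `run` context that the decision needs.
interface RunState {
  hasExistingPr: boolean;
  hasMergeConflict: boolean;
}

// Hypothetical helper: decides whether the phase runs for this workflow run.
function shouldRunPreSandbox(runOn: RunOnConfig, run: RunState): boolean {
  // Assumption: a conflicted PR is governed by `mergeConflict`, not `existingPr`.
  if (run.hasExistingPr && run.hasMergeConflict) return runOn.mergeConflict;
  if (run.hasExistingPr) return runOn.existingPr;
  return runOn.newTicket;
}

// Example: the phase is enabled only for brand-new tickets.
const onlyNewTickets: RunOnConfig = { newTicket: true, existingPr: false, mergeConflict: false };
const runsForNewTicket = shouldRunPreSandbox(onlyNewTickets, {
  hasExistingPr: false,
  hasMergeConflict: false,
});
```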
## Prompt Injection

Extend context assembly functions in `src/sandbox/context.ts` to accept pre-sandbox prompt additions.

Research prompt section format:

```md
## Pre-Sandbox: Ticket Complexity Check

This information was produced before sandbox creation.

<step output>
```

Apply the same section format to implementation and review prompts when selected by step output.

Suggested API changes:

```ts
interface ResearchPlanContextInput {
  // existing fields
  preSandboxAdditions?: PreSandboxPromptAddition[];
}

interface ImplementationContextInput {
  // existing fields
  preSandboxAdditions?: PreSandboxPromptAddition[];
}

interface ReviewContextInput {
  // existing fields
  preSandboxAdditions?: PreSandboxPromptAddition[];
}
```

The runner groups additions by target:

```ts
{
  research: [...],
  implementation: [...],
  review: [...]
}
```

241+
## Failure And Halt Behavior
242+
243+
Step execution failure:
244+
245+
- `onFailure: continue`: log the failure, continue to the next step, do not inject output.
246+
- `onFailure: fail`: halt workflow as failed, unregister run, move ticket to Backlog, notify through existing `failed` event.
247+
- `onFailure: move_to_backlog`: same terminal ticket movement as failure, but keep the message oriented around pre-sandbox rejection.
248+
249+
Step returns `halt`:
250+
251+
- `outcome: needs_clarification`: unregister run, post clarification questions, move ticket to Backlog, notify through existing `needs_clarification` event.
252+
- `outcome: failed`: unregister run, move ticket to Backlog, notify through existing `failed` event.
253+
254+
The workflow should own Jira and Slack communication so behavior stays consistent with research, implementation, and review phases.
255+
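The two lists above can be read as a mapping from a stop reason to the terminal work the workflow performs. The sketch below encodes that reading; `TerminalAction` and both function names are hypothetical, and it assumes `move_to_backlog` reuses the `failed` event, which the plan does not state explicitly.

```ts
type OnFailure = "continue" | "fail" | "move_to_backlog";
type HaltOutcome = "needs_clarification" | "failed";

// Hypothetical shape describing what the workflow does when the phase stops.
interface TerminalAction {
  event: "needs_clarification" | "failed";
  unregisterRun: boolean;
  moveToBacklog: boolean;
  postQuestions: boolean;
}

// A step that threw or timed out. `continue` yields no terminal action.
function actionForStepFailure(onFailure: OnFailure): TerminalAction | null {
  if (onFailure === "continue") return null;
  // Assumption: `move_to_backlog` differs from `fail` only in message wording.
  return { event: "failed", unregisterRun: true, moveToBacklog: true, postQuestions: false };
}

// A step that returned `status: "halt"`.
function actionForHalt(outcome: HaltOutcome): TerminalAction {
  return {
    event: outcome,
    unregisterRun: true,
    moveToBacklog: true,
    postQuestions: outcome === "needs_clarification",
  };
}
```

Returning a plain description, rather than calling Jira or Slack directly, keeps the communication itself in the workflow, in line with the ownership rule above.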
## Showcase Step

Add `src/pre-sandbox/steps/ticket-complexity-check.ts`.

Purpose:

- Use AI SDK to review only the ticket text.
- Decide whether the ticket is small enough and clear enough to send into sandbox execution.
- No repo access.
- No internal docs access.

Expected behavior:

- Continue when the ticket is clear enough.
- Halt with `needs_clarification` when the ticket is too broad, too vague, or missing essential acceptance criteria.
- Return prompt additions for `research` and `implementation` when continuing.

Example output when continuing:

```ts
{
  status: "continue",
  promptAdditions: [
    {
      target: ["research", "implementation"],
      title: "Ticket Complexity Check",
      content: "The ticket looks implementable without additional clarification. Main risk: acceptance criteria do not mention empty states."
    }
  ]
}
```

Example output when halting:

```ts
{
  status: "halt",
  outcome: "needs_clarification",
  message: "Ticket is too broad to implement safely without repo knowledge.",
  questions: [
    "Which user journey is in scope for the first implementation?",
    "What acceptance criteria define completion?"
  ]
}
```

302+
## Implementation Steps
303+
304+
1. Add `pre-sandbox.yaml`
305+
- Create the required root config file.
306+
- Start with an empty `steps` array or the showcase `ticket-complexity-check` disabled until its env requirements are settled.
307+
- Verify with config parser tests.
308+
309+
2. Add config schema and loader
310+
- Parse YAML.
311+
- Validate with Zod.
312+
- Validate step ids against registry.
313+
- Verify invalid config cases in unit tests.
314+
315+
3. Add build validation script
316+
- Add `scripts/validate-pre-sandbox-config.ts`.
317+
- Add `validate:pre-sandbox` script.
318+
- Run it before `nitro build`.
319+
- Verify missing file and unknown step fail.
320+
321+
4. Add step registry
322+
- Add `src/pre-sandbox/steps/index.ts`.
323+
- Export a typed registry keyed by `uses`.
324+
- Verify registry ids match config validation.
325+
326+
5. Add runner
327+
- Add `src/pre-sandbox/runner.ts`.
328+
- Apply `runOn` conditions.
329+
- Execute steps sequentially.
330+
- Enforce timeout.
331+
- Normalize prompt additions by target.
332+
- Verify continue, halt, timeout, and failure behavior.
333+
334+
6. Add prompt injection
335+
- Update `src/sandbox/context.ts`.
336+
- Add tests in `src/sandbox/context.test.ts`.
337+
- Verify additions appear in selected prompts only.
338+
339+
7. Add showcase AI SDK step
340+
- Add `ticket-complexity-check`.
341+
- Use structured AI output.
342+
- Keep tools limited to ticket communication decisions for the first version.
343+
- Mock AI SDK in tests.
344+
345+
8. Wire into `agentWorkflow`
346+
- Run after PR context and attachments are available.
347+
- Halt before Arthur task creation and sandbox provisioning.
348+
- Pass grouped prompt additions into research, implementation, and review context assembly.
349+
- Verify halted pre-sandbox path never calls `provisionSandbox`.
350+
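The core of the runner in step 5 (sequential execution, timeout enforcement, and `onFailure` handling) might be sketched as follows. `ConfiguredStep`, `withTimeout`, and `runSteps` are hypothetical names, the result type is trimmed to the fields the loop needs, and steps without `timeoutMs` are assumed to run without a limit.

```ts
type StepResult =
  | { status: "continue" }
  | { status: "halt"; outcome: "needs_clarification" | "failed"; message: string };

// Hypothetical runtime view of one configured step.
interface ConfiguredStep {
  name: string;
  timeoutMs?: number;
  onFailure: "continue" | "fail" | "move_to_backlog";
  run: () => Promise<StepResult>;
}

// Rejects if `work` does not settle in time. The underlying work is not cancelled.
async function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeoutMs}ms`)), timeoutMs);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Runs steps one at a time and stops at the first halt or non-tolerated failure.
async function runSteps(steps: ConfiguredStep[]): Promise<StepResult> {
  for (const step of steps) {
    let result: StepResult;
    try {
      const pending = step.run();
      result = step.timeoutMs === undefined ? await pending : await withTimeout(pending, step.timeoutMs);
    } catch (error) {
      // `onFailure: continue` skips this step's output and moves on.
      if (step.onFailure === "continue") continue;
      const reason = error instanceof Error ? error.message : String(error);
      return { status: "halt", outcome: "failed", message: `${step.name}: ${reason}` };
    }
    if (result.status === "halt") return result;
  }
  return { status: "continue" };
}
```

`Promise.race` only stops waiting for a slow step and does not abort it. A real runner would likely also pass an abort signal into the step so a timed-out AI call does not keep running.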
## Test Plan

Unit tests:

- Config loader accepts the minimal file.
- Config loader rejects missing `preSandbox`.
- Config loader rejects unknown `uses`.
- Config loader rejects invalid `onFailure`.
- Runner skips based on `runOn`.
- Runner executes steps sequentially.
- Runner groups prompt additions by target.
- Runner halts on `needs_clarification`.
- Runner handles `onFailure: continue`.
- Runner handles `onFailure: fail`.
- Prompt assembly includes pre-sandbox blocks in selected phases only.

Workflow-level tests:

- Continuing pre-sandbox run reaches sandbox provisioning.
- Halting pre-sandbox run unregisters the run, moves the ticket to Backlog, and sends the standard notification.
- Halting pre-sandbox run does not provision a sandbox.

Build validation:

- `pnpm validate:pre-sandbox` passes with valid config.
- `pnpm validate:pre-sandbox` fails with missing file.
- `pnpm validate:pre-sandbox` fails with unknown step id.

## Deferred

- Parallel step groups.
- Retries.
- HTTP/plugin step loading.
- Target repo supplied config.
- Rich input selection beyond ticket fields.
- Persisting pre-sandbox artifacts outside workflow state.
- Internal docs/resource fetching steps.
