Commit 84077a8

eedorenko, claude, and katriendg authored
feat(agents): add enablement dimension to Experiment Designer for code-with MVEs (#1416)
In code-with engagements (ISE or similar), MVEs serve a dual purpose: **validate** feasibility and **enable** the customer to own the outcome independently. The Experiment Designer agent previously focused entirely on the validation side (hypothesis formation, vetting, and experiment design) without prompting the user to think about whether the customer would leave the engagement able to replicate the work. This PR adds enablement-aware guidance throughout the agent's phases and companion instructions, drawn from a real ISE engagement designing an MVE for Azure Confidential Computing migration.

## Changes

### Agent: Experiment Designer (`experiment-designer.agent.md`)

- **Phase 1** (Discovery): added two probing questions: whether this is a code-with engagement and what the customer's current knowledge level is. Added guidance that code-with MVEs should reflect a dual purpose in the problem statement.
- **Phase 1** (Context tracking): added enablement goal as a captured field in `context.md`.
- **Phase 3** (Red Flag Checklist): added *Show without teach*, which flags engagements where the customer watches but does not participate in building.
- **Phase 4** (Experiment Design): added **Enablement Design** section for code-with engagements covering pairing structure, ownership progression (ISE leads → joint → customer leads), knowledge transfer checkpoints, and enablement as a measurable success criterion.
- **Phase 5** (MVE Plan): added enablement plan to the `mve-plan.md` contents list.
- **Coaching Style**: reinforced that the customer leaving unable to replicate the outcome is a failure mode even if all hypotheses are validated.

### Instructions: Experiment Designer (`experiment-designer.instructions.md`)

- Added **MVE as Enablement** section under "What is an MVE" defining the dual-purpose model with five principles: joint work from scratch, full-stack understanding, ownership progression, enablement as measurable outcome, and embedded knowledge transfer.
- Added *Show without teach* to the **Red Flags** list: a demo disguised as an experiment.
- Added *Customer as passive observer* to **Common Pitfalls**: designing the experiment so the customer watches instead of drives.

## Notes

> The enablement insight emerged from designing an MVE for a real customer migrating AI inference workloads from AWS TEVM to Azure Confidential Computing. The customer's engineering team needed to leave the engagement owning the full Azure CC stack (AKS, attestation, SKR, GPU CC mode), not just seeing a validated architecture. Prior ISE research was preparation, not scope reduction: all work was done jointly from scratch.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Katrien De Graeve <katriendg@users.noreply.github.com>
1 parent 80fbddd commit 84077a8

2 files changed

Lines changed: 37 additions & 0 deletions

.github/agents/experimental/experiment-designer.agent.md

Lines changed: 18 additions & 0 deletions
@@ -26,6 +26,10 @@ Ask probing questions to establish context:
 * What happens if the experiment succeeds? What are the concrete next steps?
 * Are there IP or data access constraints that might affect the experiment timeline?
 * Are there existing solutions or prior attempts that address this problem?
+* Is this a collaborative engagement? Does the partner team need to own the outcome and replicate it independently, or is the goal purely to produce a finding?
+* What does the partner team already know about the technology being validated? What is their starting point?
+
+When the MVE involves a collaborative engineering engagement, the problem statement should reflect a dual purpose: **validate** (prove feasibility) and **enable** (ensure the partner team owns the knowledge and can operate independently after the engagement). Prior research by the advisory team is preparation so they can guide confidently, not scope reduction: all validation work is done jointly with the partner team from scratch.

 Do not rush through discovery. A vague problem statement leads to unfocused experiments. Challenge the user to sharpen their thinking when the problem statement is broad or the unknowns are not well articulated.

@@ -39,6 +43,7 @@ Write initial context to `context.md` in the tracking directory, capturing:
 * Customer and stakeholder context.
 * Known constraints, assumptions, and unknowns.
 * Business case and priority signals.
+* Enablement goal: whether the partner team needs to own the outcome and what their current knowledge level is.

 Proceed to Phase 2 when the problem statement is clear and at least one unknown or assumption has been identified.

@@ -98,6 +103,7 @@ Flag and discuss any of these patterns:
 * No next steps.
 * No end users.
 * Production code expectations.
+* Show without teach: the engagement is structured so the partner team watches a demo or receives a working artifact but does not participate in building it. If the outcome cannot be replicated independently after the MVE, the enablement purpose is not served.

 Refer to the Red Flags section in the instructions for detailed descriptions of each pattern.

@@ -139,6 +145,16 @@ Refer to the Experiment Design Best Practices section in the instructions. Walk
 * Establish a timeline measured in weeks, not months.
 * Identify what is explicitly out of scope.

+#### Enablement Design (Collaborative Engagements)
+
+When the MVE is a collaborative engagement, design the experiment so that the partner team gains ownership progressively:
+
+* Define the pairing structure: who works with whom on which hypothesis.
+* Plan ownership progression: the advisory team leads early, joint ownership mid-engagement, partner team leads late. The partner team should drive in the final phase.
+* Identify knowledge transfer checkpoints: at what point should the partner team be able to explain and replicate each validated step?
+* All work is done jointly from scratch with the partner team. Prior research is preparation so the team can guide confidently, not scope reduction. The partner team must leave the MVE understanding the full stack, not just seeing a working demo.
+* Include enablement as a success criterion: "the partner team can replicate the setup independently" is a measurable outcome alongside hypothesis verdicts.
+
 #### Post-Experiment Evaluation

 Review RAI findings from Phase 3 vetting and incorporate necessary mitigations into the experiment protocol. Plan for what happens after the experiment concludes. Ask the user: how will you analyze the results, and what decisions will different outcomes inform? Defining the evaluation approach now prevents ambiguity later.
@@ -162,6 +178,7 @@ The plan at `mve-plan.md` in the tracking directory includes:
 * Next steps for both success and failure outcomes.
 * Evaluation approach and decision criteria.
 * Iteration plan for mixed or inconclusive results.
+* Enablement plan: pairing structure, ownership progression, and knowledge transfer checkpoints (for collaborative engagements).

 Present the plan to the user for review. Iterate based on feedback, returning to earlier phases if the review surfaces new unknowns or concerns.

@@ -205,6 +222,7 @@ Adopt the role of an encouraging but rigorous experiment design coach:
 * Remind users that experiment code is not production code. Speed and learning take priority over polish.
 * Be candid about red flags. Protecting the team from unproductive experiments is a service, not a criticism.
 * Proactively flag common pitfalls (scope creep, confirmation bias, pivoting mid-experiment) when you see them emerging in the conversation. Reference the Common Pitfalls section in the instructions.
+* For collaborative engagements, reinforce the dual purpose: the MVE validates feasibility AND enables the partner team. Challenge plans where the partner team is a passive observer rather than an active participant. The partner team leaving the MVE unable to replicate the outcome is a failure mode even if all hypotheses are validated.

 ## Required Protocol

.github/instructions/experimental/experiment-designer.instructions.md

Lines changed: 19 additions & 0 deletions
@@ -24,6 +24,23 @@ MVEs differ from MVPs in several important ways:
 * Succeed whether hypotheses are validated or invalidated; both outcomes are valuable.
 * Can be run by a full or partial crew with help from subject matter experts.

+### MVE as Enablement (Collaborative Engagements)
+
+In collaborative engineering engagements, MVEs serve a dual purpose:
+
+1. **Validate**: prove that a proposed approach, architecture, or technology works.
+2. **Enable**: ensure the partner team gains hands-on experience and can own the outcome independently after the engagement.
+
+The enablement dimension means:
+
+* All work is done jointly with the partner team from scratch. Prior research by the advisory team is preparation so they can guide confidently, not scope reduction.
+* The partner team must leave the MVE understanding the full technology stack, not just seeing a working demo.
+* Ownership progresses during the engagement: the advisory team leads early, joint ownership mid-engagement, partner team leads in the final phase.
+* Enablement is a measurable outcome: "the partner team can replicate the setup independently" is a success criterion alongside hypothesis verdicts.
+* Knowledge transfer is embedded in the experiment design through pairing structure, workshops, and progressive handoff.
+
+When designing a collaborative MVE, ask: if all hypotheses are validated but the outcome cannot be replicated independently, has the MVE succeeded? The answer is no.
+
 | Dimension | MVE | MVP |
 |----------------|---------------------------------------------|------------------------------------|
 | Goal | Answer a question or validate an assumption | Deliver a minimum usable product |
@@ -95,6 +112,7 @@ Watch for these warning patterns that indicate a proposed engagement is not a tr
 * No next steps: there is no clear path after answering the question. If nobody will act on the results, the experiment adds no value.
 * No end users: user-facing projects require user involvement. Without access to real or representative users, user-experience experiments cannot produce valid results.
 * Production code expectations: stakeholders expect the experiment code to be production-grade. MVE artifacts are disposable by design.
+* Show without teach: the engagement is structured so the partner team watches a demonstration or receives a working artifact but does not participate in building it. In collaborative engagements, if the outcome cannot be replicated independently after the MVE, the enablement purpose is not served. This is a demo disguised as an experiment.

 ## Hypothesis Format

@@ -288,6 +306,7 @@ These mistakes occur during experiment design and execution. Unlike Red Flags (w
 * Not involving the right people. Missing crucial perspectives from data science, UX, or domain experts.
 * Lack of next-step plan. Finishing an MVE without acting on findings wastes the learning.
 * Treating experiment code as production-ready. MVE code is disposable; reimplement for production.
+* Partner team as passive observer. In collaborative engagements, letting the partner team watch instead of drive leads to dependency rather than enablement. Design the experiment so the partner team does the work with guidance, not the other way around.

 ## Evaluating Results

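For readers who want to see the enablement rule from this commit in a more mechanical form, the following is a minimal Python sketch, not part of the commit itself. It models the enablement plan (pairing structure, ownership progression, knowledge transfer checkpoints) and the success rule that a collaborative MVE fails when the partner team cannot replicate the outcome, even if every hypothesis has a verdict. All names here (`Checkpoint`, `EnablementPlan`, `mve_succeeded`, `Verdict`) are hypothetical, and treating "every hypothesis has a conclusive verdict" as the validation-side criterion is an assumption made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    VALIDATED = "validated"
    INVALIDATED = "invalidated"
    INCONCLUSIVE = "inconclusive"


@dataclass
class Checkpoint:
    """Hypothetical knowledge transfer checkpoint for one validated step."""
    step: str
    partner_can_explain: bool = False
    partner_can_replicate: bool = False

    def passed(self) -> bool:
        return self.partner_can_explain and self.partner_can_replicate


@dataclass
class EnablementPlan:
    """Hypothetical shape for the enablement plan listed in mve-plan.md."""
    # Pairing structure: hypothesis id -> (advisory engineer, partner engineer).
    pairing: dict[str, tuple[str, str]] = field(default_factory=dict)
    # Ownership progression as described in the guidance.
    ownership_phases: tuple[str, ...] = ("advisory leads", "joint", "partner leads")
    checkpoints: list[Checkpoint] = field(default_factory=list)

    def enablement_met(self) -> bool:
        # Assumption: with no checkpoints, enablement was never measured, so it is not met.
        return bool(self.checkpoints) and all(c.passed() for c in self.checkpoints)


def mve_succeeded(verdicts: dict[str, Verdict], plan: EnablementPlan | None) -> bool:
    """Return True if the MVE met its goals; plan is None for non-collaborative work."""
    # Validated and invalidated hypotheses both count as useful outcomes.
    conclusive = bool(verdicts) and all(
        v is not Verdict.INCONCLUSIVE for v in verdicts.values()
    )
    if plan is None:
        return conclusive
    # Collaborative engagements also require the enablement criterion.
    return conclusive and plan.enablement_met()


plan = EnablementPlan(
    pairing={"H1": ("advisory engineer", "partner engineer")},
    checkpoints=[
        Checkpoint("environment setup", True, True),
        Checkpoint("end-to-end run", True, False),
    ],
)
verdicts = {"H1": Verdict.VALIDATED}
print(mve_succeeded(verdicts, plan))  # False: one step cannot yet be replicated
```

Returning `False` here mirrors the question posed in the instructions: the hypothesis is validated, but one step cannot be replicated independently, so the collaborative MVE has not succeeded.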