Commit b1c3f4b

Merge pull request #21 from CopilotKit/claude/improve-mechanical-diagrams-9pafB
Improve complex mechanical/illustrative diagram quality
2 parents fada403 + f3901bd commit b1c3f4b

9 files changed: +1840 -6 lines changed
Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
# Improve complex mechanical/illustrative diagram quality

## Problem

When users ask questions like "how do cars work?" or "how do airplanes work?", the agent generates SVG/HTML diagrams via the `widgetRenderer` component that are **functional but lack precision and intentionality**.

### Screenshots

**Car diagram** (rough silhouette, floating component boxes, crude annotations, oversimplified pedal sub-diagram):

![car-diagram](./assets/car-diagram-screenshot.png)

**Airplane diagram** (imprecise plane shape, rough control-surfaces mini-diagram, adequate force arrows but weak airfoil representation):

![airplane-diagram](./assets/airplane-diagram-screenshot.png)

### Root causes

1. **No planning step**: the agent generates complex illustrative diagrams in a single pass, with no composition planning (what to emphasize, where components sit spatially, how controls map to visuals)
2. **Thin illustrative guidance**: the existing skill files give strong rules for flowcharts and structural diagrams but devote only **9 lines** to illustrative diagrams (`svg-diagram-skill.txt` lines 157-165)
3. **No shape construction method**: the LLM generates freehand SVG paths that produce rough, unrecognizable silhouettes
4. **No control-to-visual binding pattern**: interactive elements (sliders, presets) are architecturally disconnected from the SVG elements they should modify

### Specific deficiencies

- Object silhouettes (car body, airplane fuselage) are rough freehand paths rather than recognizable shapes
- Internal components (engine, transmission, control surfaces) are colored boxes floating over the outline without spatial accuracy
- Annotations use crude arrows/lines without a clear visual hierarchy (primary vs secondary labels)
- Sub-diagrams (pedal arrangement, control surface detail) are oversimplified rectangles
- Interactive controls don't visually update the diagram, so there is no feedback loop

---

## Proposed PRs

All changes are to **skill/prompt files only**; no runtime code changes are needed. New `.txt` files are auto-discovered by `load_all_skills()` in `apps/agent/skills/__init__.py`.

### PR 1: Expand illustrative diagram rules in `svg-diagram-skill.txt`

**Scope:** `apps/agent/skills/svg-diagram-skill.txt` lines 157-165 (+ MCP mirror)

The current "Illustrative Diagram" section is only 9 lines. Expand with:
- **Composition grid rule**: divide the 680px-wide viewBox into zones: main illustration (x=40-480), annotation margin (x=490-640), optional sub-diagram zone (bottom 25%)
- **Depth layering order**: background -> silhouette -> internals -> connectors -> labels
- **Shape construction rule**: build recognizable objects from 4-8 geometric primitives, not a single complex path
- **Force/motion arrow conventions**: larger arrowheads (`markerWidth=8`), color-coded by type (warm = resistance, cool = propulsion, gray = gravity)
- **Cross-section vs side-view guidance**: when to use each and how to indicate cut planes

**Impact:** Immediate quality improvement for all illustrative diagrams. Smallest change, do first.

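
The composition grid rule can be made concrete as a small zone calculator. The sketch below is illustrative only: `compositionZones` and `fitsInZone` are hypothetical helper names, the 400px default height is borrowed from the playbook's example viewBox, and the horizontal extent of the sub-diagram zone (x=40-640) is an assumption, since the rule only fixes its height at the bottom 25%. A check like `fitsInZone` could flag a component that spills out of the main illustration area into the annotation margin.

```javascript
// Hypothetical helper: x-ranges come from the composition grid rule;
// the default height and the sub-diagram x-range are assumptions.
function compositionZones(viewBoxHeight = 400, includeSubDiagram = true) {
  const subHeight = includeSubDiagram ? viewBoxHeight * 0.25 : 0; // bottom 25%
  const mainHeight = viewBoxHeight - subHeight;
  return {
    illustration: { x: 40, y: 0, width: 480 - 40, height: mainHeight },
    annotations: { x: 490, y: 0, width: 640 - 490, height: mainHeight },
    subDiagram: includeSubDiagram
      ? { x: 40, y: mainHeight, width: 640 - 40, height: subHeight }
      : null,
  };
}

// True when a component's bounding box lies entirely inside a zone.
function fitsInZone(box, zone) {
  return (
    box.x >= zone.x &&
    box.y >= zone.y &&
    box.x + box.width <= zone.x + zone.width &&
    box.y + box.height <= zone.y + zone.height
  );
}
```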
---

### PR 2: Add diagram planning protocol (pre-generation thinking step)

**Scope:** `apps/agent/skills/svg-diagram-skill.txt` (+ MCP mirror)

Before generating any illustrative diagram, the agent must complete a structured plan:

1. **Subject decomposition**: identify 3-5 key subsystems and classify each as primary (prominent) or secondary (annotation-level)
2. **Spatial layout**: assign components to a coordinate grid before writing SVG
3. **Educational priority**: what is the single most important thing to convey?
4. **Composition sketch**: define bounding rectangles for illustration, annotations, controls, sub-diagrams
5. **Control-to-visual mapping**: list which SVG element IDs each interactive control will modify
6. **Shape fidelity check**: list 4-6 defining visual features that make the subject recognizable (e.g., for a car: wheel wells, hood slope, windshield angle, roof line)

This is a pure prompt enhancement with no code changes. **Highest single-item impact.**

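
To make the plan's numeric constraints tangible, here is a minimal sketch of the plan as a data object with a checker. The protocol itself is prompt text, not code, so everything here is hypothetical: the field names (`subsystems`, `role`, `box`, `keyMessage`, `definingFeatures`, `controlMap`) and the function `validateDiagramPlan` are invented for illustration. The sketch covers steps 1, 2, 3, 5 and 6 and leaves the composition sketch (step 4) out.

```javascript
// Hypothetical plan checker; field names are illustrative, not from the skill files.
function validateDiagramPlan(plan) {
  const problems = [];
  const subsystems = plan.subsystems ?? [];
  if (subsystems.length < 3 || subsystems.length > 5) {
    problems.push(`expected 3-5 subsystems, got ${subsystems.length}`);
  }
  if (!subsystems.some((s) => s.role === 'primary')) {
    problems.push('no subsystem is marked primary');
  }
  for (const s of subsystems) {
    if (!s.box) problems.push(`subsystem "${s.name}" has no grid position`);
  }
  if (!plan.keyMessage) problems.push('educational priority is missing');
  const features = plan.definingFeatures ?? [];
  if (features.length < 4 || features.length > 6) {
    problems.push(`expected 4-6 defining features, got ${features.length}`);
  }
  for (const [control, ids] of Object.entries(plan.controlMap ?? {})) {
    if (ids.length === 0) problems.push(`control "${control}" modifies no SVG element`);
  }
  return problems;
}

// Example plan for the "how do cars work?" prompt (values are made up).
const carPlan = {
  keyMessage: 'Engine power reaches the wheels through the transmission',
  subsystems: [
    { name: 'engine', role: 'primary', box: { x: 60, y: 140, width: 120, height: 80 } },
    { name: 'transmission', role: 'primary', box: { x: 190, y: 160, width: 90, height: 50 } },
    { name: 'wheels', role: 'secondary', box: { x: 80, y: 230, width: 360, height: 60 } },
  ],
  definingFeatures: ['wheel wells', 'hood slope', 'windshield angle', 'roof line'],
  controlMap: {
    throttle: ['viz-engine-fill', 'viz-speed-text'],
    gear: ['viz-force-arrow'],
  },
};
```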
---

### PR 3: Add progressive rendering pattern for interactive diagrams

**Scope:** `apps/agent/skills/master-agent-playbook.txt`, `apps/agent/skills/svg-diagram-skill.txt` (+ MCP mirrors)

Replace scattered `oninput` handlers with a structured architecture:

- **Explicit HTML sections** via comments: `<!-- SECTION 1: Main SVG -->`, `<!-- SECTION 2: Sub-diagrams -->`, `<!-- SECTION 3: Stats -->`, `<!-- SECTION 4: Controls -->`, `<!-- SECTION 5: JS state + bindings -->`
- **Centralized state-and-render pattern**:
  ```js
  const state = { throttle: 50, gear: 3, mode: 'city' };
  function updateState(key, value) { state[key] = value; render(); }
  function render() { /* update ALL visual elements from state */ }
  ```
- **Element ID convention**: `viz-{component}-{property}` (e.g., `viz-engine-fill`, `viz-speed-text`)
- **Preset pattern**: presets as calls to `updateState` with multiple values

This ensures controls and visuals are architecturally connected rather than accidentally coupled.

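
One way to read "presets as calls to `updateState` with multiple values" is to let `updateState` accept a patch object as well as a single key/value pair, so a preset costs exactly one state change and one render. The sketch below assumes that reading; `createStore` and `vizId` are hypothetical names, the render function is injected so the example runs without a DOM, and the preset values are taken from the playbook example in this commit.

```javascript
// Sketch: updateState accepts (key, value) or a patch object, then renders once.
function createStore(initialState, render) {
  const state = { ...initialState };
  function updateState(keyOrPatch, value) {
    const patch =
      typeof keyOrPatch === 'string' ? { [keyOrPatch]: value } : keyOrPatch;
    Object.assign(state, patch);
    render(state); // every change funnels through a single render call
  }
  return { state, updateState };
}

// Builds an element ID following the viz-{component}-{property} convention.
function vizId(component, property) {
  return `viz-${component}-${property}`;
}

// Presets are plain patches, so applying one is just updateState(presets.city).
const presets = {
  city: { throttle: 30, gear: 2, mode: 'city' },
  highway: { throttle: 70, gear: 5, mode: 'highway' },
};
```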
---

### PR 4: Create dedicated `mechanical-illustration-skill.txt`

**Scope:** New file `apps/agent/skills/mechanical-illustration-skill.txt` (+ MCP mirror)

Comprehensive skill for mechanical/illustrative diagrams:

- **Shape construction method**: decompose objects into geometric primitives (body = rounded rect, wheels = circles, windows = trapezoids)
- **Spatial accuracy rules**: grid overlay technique for component placement
- **Annotation hierarchy**: primary (14px/500 weight, solid leader lines), secondary (12px/400, dashed leaders), tertiary (11px/400, inline)
- **Sub-diagram quality standards**: minimum 150x100px, visually connected to main diagram via callout
- **Interactive control binding template**: reusable JS pattern for slider/button -> SVG element updates
- **Two worked reference compositions**: car drivetrain and airfoil cross-section with full SVG examples

Auto-loaded by the existing `load_all_skills()` glob, so no code changes are needed.

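
The annotation hierarchy can be expressed as a small style table plus a label builder. This is a sketch under assumptions, not the skill's actual template: `ANNOTATION_TIERS`, `escapeXml` and `annotationSvg` are hypothetical names, and the dash pattern (`4 3`) and `currentColor` stroke are illustrative choices the proposal does not specify. Only the font sizes, weights and leader styles come from the list above.

```javascript
// Tier values from the annotation hierarchy; everything else is an assumption.
const ANNOTATION_TIERS = {
  primary: { fontSize: 14, fontWeight: 500, leader: 'solid' },
  secondary: { fontSize: 12, fontWeight: 400, leader: 'dashed' },
  tertiary: { fontSize: 11, fontWeight: 400, leader: 'inline' },
};

// Escapes the characters that would break SVG text content.
function escapeXml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Returns SVG markup for a label at labelPos, with a leader line from anchor
// for primary and secondary tiers (tertiary labels sit inline, without a line).
function annotationSvg(tier, label, anchor, labelPos) {
  const style = ANNOTATION_TIERS[tier];
  if (!style) throw new Error(`unknown annotation tier: ${tier}`);
  const text =
    `<text x="${labelPos.x}" y="${labelPos.y}" font-size="${style.fontSize}" ` +
    `font-weight="${style.fontWeight}">${escapeXml(label)}</text>`;
  if (style.leader === 'inline') return text;
  const dash = style.leader === 'dashed' ? ' stroke-dasharray="4 3"' : '';
  const line =
    `<line x1="${anchor.x}" y1="${anchor.y}" x2="${labelPos.x}" y2="${labelPos.y}" ` +
    `stroke="currentColor"${dash}/>`;
  return line + text;
}
```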
---

### PR 5: Add SVG shape library with reusable path fragments

**Scope:** New file `apps/agent/skills/svg-shape-library.txt` (+ MCP mirror)

Pre-designed SVG path fragments the agent can reference and adapt:

- **Vehicles**: sedan side profile, airplane side profile, airplane top-down
- **Mechanical parts**: gear/cog, piston, spring, valve, wheel with spokes
- **Physics shapes**: airfoil cross-section (NACA-style), force arrow with proper head
- **Annotation elements**: zoom callout box, dimension line with tick marks
- **Control surfaces**: rudder, aileron, elevator (neutral and deflected)

Each entry includes normalized coordinates (0-100 units), transform instructions for positioning, and customization notes.

**Note:** Adds ~30-50KB to the system prompt. Monitor the context budget. Consider including only the 6-8 most common shapes initially.

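
Positioning a normalized fragment comes down to a translate followed by a scale from the 0-100 unit box to the target rectangle. The sketch below shows that mapping; the `wheel` entry and the `placeShape` helper are hypothetical stand-ins, since the library file does not exist yet. Because SVG applies the rightmost transform first, `translate(...) scale(...)` scales the normalized coordinates and then moves them, so normalized (0, 0) lands at the target's top-left corner and (100, 100) at its bottom-right.

```javascript
// Hypothetical library entry drawn inside a 0-100 unit box.
const wheel = {
  name: 'wheel',
  body: '<circle cx="50" cy="50" r="48"/><circle cx="50" cy="50" r="8"/>',
};

// Wraps a normalized shape in a group that maps the 0-100 box onto target.
function placeShape(shape, target) {
  const sx = target.width / 100;
  const sy = target.height / 100;
  return (
    `<g id="${target.id}" ` +
    `transform="translate(${target.x} ${target.y}) scale(${sx} ${sy})">` +
    `${shape.body}</g>`
  );
}
```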
---

## Recommended sequencing

| Order | PR | Effort | Impact |
|-------|-----|--------|--------|
| 1 | PR 1: Expand illustrative rules | Small | Medium |
| 2 | PR 2: Diagram planning protocol | Small | **High** |
| 3 | PR 3: Progressive rendering | Medium | Medium |
| 4 | PR 4: Mechanical illustration skill | Medium | High |
| 5 | PR 5: SVG shape library | Large | High |

PRs 1 and 2 can be merged independently. PR 4 builds on conventions from PR 1. PR 3 and PR 5 are independent of each other.

## Verification

After each PR, test with these prompts and compare before/after:
- "how do cars work?"
- "can you explain how airplanes fly? I'm a visual person"
- "explain how a combustion engine works"

Also:
- Verify dark mode compatibility
- Check that the system prompt token count stays within the model's context budget

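
For the token-count check, a rough pre-flight estimate can catch obvious overruns before running the real tokenizer. The sketch below assumes the common rule of thumb of about 4 characters per token for English text; that ratio, the helper names `estimateTokens` and `checkPromptBudget`, and the `reservedTokens` parameter are all assumptions, not part of the agent code.

```javascript
// Rough heuristic (assumption): about 4 characters per token for English text.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Sums the estimate over all skill texts and compares it with the budget,
// after setting aside reservedTokens for the conversation and the response.
function checkPromptBudget(skillTexts, contextBudgetTokens, reservedTokens = 0) {
  const used = skillTexts.reduce((sum, text) => sum + estimateTokens(text), 0);
  const available = contextBudgetTokens - reservedTokens;
  return { used, available, withinBudget: used <= available };
}
```

Under this heuristic, the 30-50KB shape library from PR 5 would land somewhere around 7,500-12,500 tokens. Actual counts depend on the model's tokenizer and on how much SVG markup the text contains, so the estimate is only a first filter.
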
## Key files

- `apps/agent/skills/svg-diagram-skill.txt`: primary skill to expand (illustrative section at lines 157-165)
- `apps/agent/skills/master-agent-playbook.txt`: interactive widget templates (Part 2)
- `apps/agent/skills/__init__.py`: auto-discovers new `.txt` skill files (line 23, glob pattern)
- `apps/agent/main.py`: injects skills into system prompt (line 49)
- `apps/mcp/skills/`: MCP mirrors of all skill files

apps/agent/skills/master-agent-playbook.txt

Lines changed: 100 additions & 0 deletions
@@ -218,6 +218,106 @@ render();
}
```

### Architecture: Progressive Rendering for Complex Interactive Diagrams

When building interactive diagrams with controls (sliders, buttons, presets), use this structured architecture instead of scattered inline handlers. This ensures controls and visuals stay connected.

**HTML Section Structure**: organize your HTML in explicit, commented sections:

```html
<!-- SECTION 1: Main illustration SVG -->
<svg id="viz-main" width="100%" viewBox="0 0 680 400" xmlns="http://www.w3.org/2000/svg">
  <!-- Use viz- prefixed IDs for all dynamic elements -->
  <rect id="viz-engine-fill" .../>
  <text id="viz-speed-text" ...>0 rpm</text>
  <line id="viz-force-arrow" .../>
</svg>

<!-- SECTION 2: Sub-diagram(s) -->
<div style="display:flex;gap:16px;margin:12px 0">
  <svg id="viz-sub-controls" viewBox="0 0 200 120">...</svg>
  <div id="viz-stats">
    <div>Engine speed: <strong id="viz-rpm">0</strong> rpm</div>
    <div>Wheel force: <strong id="viz-force">0</strong></div>
  </div>
</div>

<!-- SECTION 3: Interactive controls -->
<div class="controls">
  <label>Throttle
    <input type="range" min="0" max="100" value="30"
           oninput="updateState('throttle', +this.value)">
  </label>
  <label>Gear
    <input type="range" min="1" max="6" value="2"
           oninput="updateState('gear', +this.value)">
  </label>
  <button onclick="applyPreset('city')">City</button>
  <button onclick="applyPreset('highway')">Highway</button>
</div>

<!-- SECTION 4: JavaScript state + bindings -->
<script>
  // Centralized state: single source of truth
  const state = { throttle: 30, gear: 2, mode: 'city' };

  // Single update function: all controls call this
  function updateState(key, value) {
    state[key] = value;
    render();
  }

  // Presets: just state objects applied at once
  const presets = {
    city: { throttle: 30, gear: 2 },
    highway: { throttle: 70, gear: 5 },
    braking: { throttle: 0, gear: 3 }
  };
  function applyPreset(name) {
    Object.assign(state, presets[name]);
    state.mode = name;
    // Also update slider positions to match; each slider's state key is
    // derived from its label text ("Throttle" -> "throttle")
    document.querySelectorAll('input[type=range]').forEach(el => {
      const key = el.closest('label')?.textContent.trim().toLowerCase();
      if (key && state[key] !== undefined) el.value = state[key];
    });
    render();
  }

  // Render: reads state, updates ALL visual elements
  function render() {
    const rpm = Math.round(state.throttle * 40 + state.gear * 200);
    const force = Math.round(state.throttle * (7 - state.gear) * 0.5);

    // Update SVG elements by ID
    document.getElementById('viz-rpm').textContent = rpm.toLocaleString();
    document.getElementById('viz-force').textContent = force;
    document.getElementById('viz-speed-text').textContent = rpm + ' rpm';

    // Visual feedback: change fills, transforms, etc.
    const el = document.getElementById('viz-engine-fill');
    if (el) el.setAttribute('opacity', 0.3 + state.throttle * 0.007);
  }

  // Initial render
  render();
</script>
```

**Element ID Convention**: `viz-{component}-{property}`
- `viz-engine-fill`: engine block fill color/opacity
- `viz-speed-text`: engine-speed readout (text content, shown in rpm)
- `viz-force-arrow`: force arrow being resized
- `viz-rpm`: RPM stat display

This convention makes it easy to trace which control affects which visual element. Every `updateState` call triggers `render()`, which updates every dynamic element from the centralized state.

**Key rules**:
- Never use anonymous `oninput="document.getElementById(...)"` handlers. Always go through `updateState` → `render`.
- Every slider/button must produce a visible change in the SVG. If a control doesn't update anything visual, remove it.
- Preset buttons must also update slider positions to stay in sync.
- The `render()` function should be idempotent: calling it twice with the same state produces the same output.

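
One way to keep `render()` idempotent is to split it into a pure step that computes every value from state and a thin step that writes those values to elements. The sketch below reuses the formulas from the playbook example above; `computeView` and `applyView` are hypothetical names, the element lookup is injected so the code can run outside a browser, and the explicit `'en-US'` locale is an assumption made for predictable output. In a page, the call would be `applyView(computeView(state), (id) => document.getElementById(id))`.

```javascript
// Pure step: derives every dynamic value from state, keyed by element ID.
function computeView(state) {
  const rpm = Math.round(state.throttle * 40 + state.gear * 200);
  const force = Math.round(state.throttle * (7 - state.gear) * 0.5);
  return {
    'viz-rpm': { text: rpm.toLocaleString('en-US') },
    'viz-force': { text: String(force) },
    'viz-speed-text': { text: `${rpm} rpm` },
    'viz-engine-fill': { attrs: { opacity: 0.3 + state.throttle * 0.007 } },
  };
}

// Write step: assigns values rather than accumulating them, so applying the
// same view twice leaves the elements unchanged.
function applyView(view, getElement) {
  for (const [id, change] of Object.entries(view)) {
    const el = getElement(id);
    if (!el) continue; // tolerate elements that a given diagram omits
    if (change.text !== undefined) el.textContent = change.text;
    for (const [name, value] of Object.entries(change.attrs ?? {})) {
      el.setAttribute(name, String(value));
    }
  }
}
```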
---

## Part 3: Skill — Data Visualization
