|
| 1 | +# Improve complex mechanical/illustrative diagram quality |
| 2 | + |
| 3 | +## Problem |
| 4 | + |
| 5 | +When users ask questions like "how do cars work?" or "how do airplanes work?", the agent generates SVG/HTML diagrams via the `widgetRenderer` component that are **functional but lack precision and intentionality**. |
| 6 | + |
| 7 | +### Screenshots |
| 8 | + |
| 9 | +**Car diagram** — rough silhouette, floating component boxes, crude annotations, oversimplified pedal sub-diagram: |
| 10 | + |
| 11 | + |
| 12 | + |
| 13 | +**Airplane diagram** — imprecise plane shape, rough control surfaces mini-diagram, adequate force arrows but weak airfoil representation: |
| 14 | + |
| 15 | + |
| 16 | + |
| 17 | +### Root causes |
| 18 | + |
| 19 | +1. **No planning step** — the agent generates complex illustrative diagrams in a single pass with no composition planning (what to emphasize, where components sit spatially, how controls map to visuals) |
| 20 | +2. **Thin illustrative guidance** — the existing skill files give strong rules for flowcharts and structural diagrams but only **9 lines** to illustrative diagrams (`svg-diagram-skill.txt` lines 157-165) |
| 21 | +3. **No shape construction method** — the LLM generates freehand SVG paths that produce rough, unrecognizable silhouettes |
| 22 | +4. **No control-to-visual binding pattern** — interactive elements (sliders, presets) are architecturally disconnected from the SVG elements they should modify |
| 23 | + |
| 24 | +### Specific deficiencies |
| 25 | + |
| 26 | +- Object silhouettes (car body, airplane fuselage) are rough freehand paths rather than recognizable shapes |
| 27 | +- Internal components (engine, transmission, control surfaces) are colored boxes floating over the outline without spatial accuracy |
| 28 | +- Annotations use crude arrows/lines without a clear visual hierarchy (primary vs secondary labels) |
| 29 | +- Sub-diagrams (pedal arrangement, control surface detail) are oversimplified rectangles |
| 30 | +- Interactive controls don't visually update the diagram — no feedback loop |
| 31 | + |
| 32 | +--- |
| 33 | + |
| 34 | +## Proposed PRs |
| 35 | + |
| 36 | +All changes are to **skill/prompt files only** — no runtime code changes needed. New `.txt` files are auto-discovered by `load_all_skills()` in `apps/agent/skills/__init__.py`. |
| 37 | + |
| 38 | +### PR 1: Expand illustrative diagram rules in `svg-diagram-skill.txt` |
| 39 | + |
| 40 | +**Scope:** `apps/agent/skills/svg-diagram-skill.txt` lines 157-165 (+ MCP mirror) |
| 41 | + |
| 42 | +The current "Illustrative Diagram" section is only 9 lines. Expand with: |
| 43 | +- **Composition grid rule**: divide the 680px-wide viewBox into a main illustration zone (x=40-480), an annotation margin (x=490-640), and an optional sub-diagram zone (bottom 25%)
| 44 | +- **Depth layering order**: background -> silhouette -> internals -> connectors -> labels |
| 45 | +- **Shape construction rule**: build recognizable objects from 4-8 geometric primitives, not a single complex path |
| 46 | +- **Force/motion arrow conventions**: larger arrowheads (`markerWidth=8`), color-coded by type (warm = resistance, cool = propulsion, gray = gravity) |
| 47 | +- **Cross-section vs side-view guidance**: when to use each and how to indicate cut planes |
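
The composition grid and layering rules above could be expressed as data that a skill file shows by example. The following is a minimal sketch, not existing project code: the function and constant names are hypothetical, the viewBox height is a caller-supplied assumption (the proposal only fixes the 680px width), and the x-range given to the sub-diagram zone is also assumed.

```js
// Width of the viewBox named in the composition grid rule.
const VIEWBOX_WIDTH = 680;

// Returns the zone rectangles for a diagram of the given height.
// Assumption: the optional sub-diagram zone spans x=40-640.
function compositionZones(viewBoxHeight, withSubDiagram = false) {
  // The sub-diagram zone, when present, takes the bottom 25% of the viewBox.
  const subTop = withSubDiagram ? viewBoxHeight * 0.75 : viewBoxHeight;
  const zones = {
    illustration: { x: 40, y: 0, width: 480 - 40, height: subTop },
    annotations: { x: 490, y: 0, width: 640 - 490, height: subTop },
  };
  if (withSubDiagram) {
    zones.subDiagram = { x: 40, y: subTop, width: 640 - 40, height: viewBoxHeight - subTop };
  }
  return zones;
}

// Depth layering order, back to front. SVG paints later siblings on top,
// so the groups are emitted in exactly this order.
const LAYER_ORDER = ['background', 'silhouette', 'internals', 'connectors', 'labels'];

function layerGroups(contentByLayer) {
  return LAYER_ORDER
    .map((name) => `<g id="layer-${name}">${contentByLayer[name] ?? ''}</g>`)
    .join('\n');
}
```

Emitting one `<g>` per layer keeps the paint order explicit, so labels cannot end up underneath internals by accident.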
| 48 | + |
| 49 | +**Impact:** Immediate quality improvement for all illustrative diagrams. Smallest change, do first. |
| 50 | + |
| 51 | +--- |
| 52 | + |
| 53 | +### PR 2: Add diagram planning protocol (pre-generation thinking step) |
| 54 | + |
| 55 | +**Scope:** `apps/agent/skills/svg-diagram-skill.txt` (+ MCP mirror) |
| 56 | + |
| 57 | +Before generating any illustrative diagram, the agent must complete a structured plan: |
| 58 | + |
| 59 | +1. **Subject decomposition** — identify 3-5 key subsystems, classify as primary (prominent) vs secondary (annotation-level) |
| 60 | +2. **Spatial layout** — assign components to a coordinate grid before writing SVG |
| 61 | +3. **Educational priority** — what is the single most important thing to convey? |
| 62 | +4. **Composition sketch** — define bounding rectangles for illustration, annotations, controls, sub-diagrams |
| 63 | +5. **Control-to-visual mapping** — list which SVG element IDs each interactive control will modify |
| 64 | +6. **Shape fidelity check** — list 4-6 defining visual features that make the subject recognizable (e.g., for a car: wheel wells, hood slope, windshield angle, roof line) |
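
To make the six planning steps concrete, the plan could be written as a small structured object and checked before any SVG is generated. This is only a sketch: the field names, the example values, and `validatePlan` are hypothetical and do not correspond to an existing schema in the repository.

```js
// Hypothetical plan for "how do cars work?"; all values are illustrative.
const carPlan = {
  subsystems: [
    { name: 'engine', priority: 'primary' },
    { name: 'transmission', priority: 'primary' },
    { name: 'wheels', priority: 'secondary' },
  ],
  layout: {
    engine: { x: 60, y: 120 },
    transmission: { x: 200, y: 140 },
    wheels: { x: 90, y: 220 },
  },
  educationalPriority: 'Power flows from the engine through the transmission to the wheels',
  zones: ['illustration', 'annotations', 'controls'],
  controlBindings: { throttle: ['viz-engine-fill', 'viz-speed-text'] },
  definingFeatures: ['wheel wells', 'hood slope', 'windshield angle', 'roof line'],
};

// Returns a list of problems; an empty list means the plan meets the protocol's counts.
function validatePlan(plan) {
  const problems = [];
  const n = plan.subsystems.length;
  if (n < 3 || n > 5) problems.push(`expected 3-5 subsystems, got ${n}`);
  for (const s of plan.subsystems) {
    if (s.priority !== 'primary' && s.priority !== 'secondary') {
      problems.push(`subsystem ${s.name} is not classified`);
    }
    if (!(s.name in plan.layout)) problems.push(`no grid position for ${s.name}`);
  }
  if (!plan.educationalPriority) problems.push('missing educational priority');
  const f = plan.definingFeatures.length;
  if (f < 4 || f > 6) problems.push(`expected 4-6 defining features, got ${f}`);
  for (const [control, ids] of Object.entries(plan.controlBindings)) {
    if (ids.length === 0) problems.push(`control ${control} modifies no SVG elements`);
  }
  return problems;
}
```

Since the PR is a prompt-only change, a structure like this would appear in the skill file as an example of the expected plan rather than run at generation time.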
| 65 | + |
| 66 | +Pure prompt enhancement — no code changes. **Highest single-item impact.** |
| 67 | + |
| 68 | +--- |
| 69 | + |
| 70 | +### PR 3: Add progressive rendering pattern for interactive diagrams |
| 71 | + |
| 72 | +**Scope:** `apps/agent/skills/master-agent-playbook.txt`, `apps/agent/skills/svg-diagram-skill.txt` (+ MCP mirrors) |
| 73 | + |
| 74 | +Replace scattered `oninput` handlers with a structured architecture: |
| 75 | + |
| 76 | +- **Explicit HTML sections** via comments: `<!-- SECTION 1: Main SVG -->`, `<!-- SECTION 2: Sub-diagrams -->`, `<!-- SECTION 3: Stats -->`, `<!-- SECTION 4: Controls -->`, `<!-- SECTION 5: JS state + bindings -->` |
| 77 | +- **Centralized state-and-render pattern**: |
| 78 | + ```js |
| 79 | + const state = { throttle: 50, gear: 3, mode: 'city' }; |
| 80 | + function updateState(key, value) { state[key] = value; render(); } |
| 81 | + function render() { /* update ALL visual elements from state */ } |
| 82 | + ``` |
| 83 | +- **Element ID convention**: `viz-{component}-{property}` (e.g., `viz-engine-fill`, `viz-speed-text`) |
| 84 | +- **Preset pattern**: each preset applies several state values through `updateState`, so all dependent visuals update together
| 85 | + |
| 86 | +Ensures controls and visuals are architecturally connected rather than accidentally coupled. |
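
One possible reading of the state, ID, and preset conventions above is sketched below. `createDiagramState`, `vizId`, and `PRESETS` are hypothetical names, and the render function is injected so the sketch runs without a DOM. Here `updateState` accepts either a single key/value pair or an object of several values, which is an assumption about how presets would pass "multiple values".

```js
// Element ID convention: viz-{component}-{property}.
function vizId(component, property) {
  return `viz-${component}-${property}`;
}

// Centralized state with a single render entry point.
function createDiagramState(initial, render) {
  const state = { ...initial };
  function updateState(keyOrPatch, value) {
    // Accept updateState('gear', 4) from a slider or updateState({ ... }) from a preset.
    const patch = typeof keyOrPatch === 'string' ? { [keyOrPatch]: value } : keyOrPatch;
    Object.assign(state, patch);
    render(state); // exactly one render per update, however many keys changed
  }
  return { state, updateState };
}

// Illustrative presets; the values are made up.
const PRESETS = {
  city: { throttle: 30, gear: 2, mode: 'city' },
  highway: { throttle: 70, gear: 5, mode: 'highway' },
};

// Usage: the render callback here only records snapshots. In a widget it would
// write to elements such as the one with id vizId('speed', 'text').
const renders = [];
const diagram = createDiagramState({ throttle: 50, gear: 3, mode: 'city' }, (s) => {
  renders.push({ ...s });
});
diagram.updateState('throttle', 80);  // slider input
diagram.updateState(PRESETS.highway); // preset button: one render for three values
```

Routing presets through the same function as sliders means there is no second code path that can leave the diagram out of sync with the controls.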
| 87 | + |
| 88 | +--- |
| 89 | + |
| 90 | +### PR 4: Create dedicated `mechanical-illustration-skill.txt` |
| 91 | + |
| 92 | +**Scope:** New file `apps/agent/skills/mechanical-illustration-skill.txt` (+ MCP mirror) |
| 93 | + |
| 94 | +Comprehensive skill for mechanical/illustrative diagrams: |
| 95 | + |
| 96 | +- **Shape construction method** — decompose objects into geometric primitives (body = rounded rect, wheels = circles, windows = trapezoids) |
| 97 | +- **Spatial accuracy rules** — grid overlay technique for component placement |
| 98 | +- **Annotation hierarchy** — primary (14px/500 weight, solid leader lines), secondary (12px/400, dashed leaders), tertiary (11px/400, inline) |
| 99 | +- **Sub-diagram quality standards** — minimum 150x100px, visually connected to main diagram via callout |
| 100 | +- **Interactive control binding template** — reusable JS pattern for slider/button -> SVG element updates |
| 101 | +- **Two worked reference compositions** — car drivetrain and airfoil cross-section with full SVG examples |
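
As an illustration of the shape construction method, a car side view can be assembled from a handful of primitives instead of one freehand path. This is a rough sketch with made-up coordinates, not one of the proposed reference compositions; `toSvgElement` and `buildSilhouette` are hypothetical helpers.

```js
// Six primitives: body = rounded rect, cabin and windows = trapezoids, wheels = circles.
// All coordinates are illustrative.
const carPrimitives = [
  { tag: 'rect', attrs: { x: 60, y: 150, width: 360, height: 60, rx: 18 } }, // lower body
  { tag: 'polygon', attrs: { points: '140,150 180,100 310,100 350,150' } },  // cabin
  { tag: 'polygon', attrs: { points: '188,108 240,108 240,146 160,146' } },  // left window
  { tag: 'polygon', attrs: { points: '250,108 302,108 330,146 250,146' } },  // right window
  { tag: 'circle', attrs: { cx: 140, cy: 210, r: 32 } },                     // left wheel
  { tag: 'circle', attrs: { cx: 340, cy: 210, r: 32 } },                     // right wheel
];

// Serializes one primitive to a self-closing SVG element.
function toSvgElement({ tag, attrs }) {
  const attrText = Object.entries(attrs)
    .map(([key, value]) => `${key}="${value}"`)
    .join(' ');
  return `<${tag} ${attrText}/>`;
}

// Wraps the primitives in a single group so the silhouette can be styled or moved as a unit.
function buildSilhouette(id, primitives) {
  const body = primitives.map((p) => '  ' + toSvgElement(p)).join('\n');
  return `<g id="${id}">\n${body}\n</g>`;
}

const carSvg = buildSilhouette('viz-car-silhouette', carPrimitives);
```

Because each primitive is addressable on its own, internal components can later be positioned relative to the body rectangle rather than floated over an opaque path.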
| 102 | + |
| 103 | +Auto-loaded by the existing `load_all_skills()` glob — no code changes. |
| 104 | + |
| 105 | +--- |
| 106 | + |
| 107 | +### PR 5: Add SVG shape library with reusable path fragments |
| 108 | + |
| 109 | +**Scope:** New file `apps/agent/skills/svg-shape-library.txt` (+ MCP mirror) |
| 110 | + |
| 111 | +Pre-designed SVG path fragments the agent can reference and adapt: |
| 112 | + |
| 113 | +- **Vehicles**: sedan side profile, airplane side profile, airplane top-down |
| 114 | +- **Mechanical parts**: gear/cog, piston, spring, valve, wheel with spokes |
| 115 | +- **Physics shapes**: airfoil cross-section (NACA-style), force arrow with proper head |
| 116 | +- **Annotation elements**: zoom callout box, dimension line with tick marks |
| 117 | +- **Control surfaces**: rudder, aileron, elevator (neutral and deflected) |
| 118 | + |
| 119 | +Each entry includes normalized coordinates (0-100 units), transform instructions for positioning, and customization notes. |
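
The "transform instructions" for a normalized entry might amount to a translate followed by a scale. The sketch below assumes each shape is drawn inside a 0-100 unit box, as described above; `placementTransform`, `placeShape`, and the triangle entry are hypothetical stand-ins, not actual library content.

```js
// Builds the SVG transform that maps a 0-100 unit shape into a target rectangle.
function placementTransform(target, { preserveAspect = true } = {}) {
  let sx = target.width / 100;
  let sy = target.height / 100;
  if (preserveAspect) {
    // Uniform scale: fit inside the target without distorting the shape.
    sx = sy = Math.min(sx, sy);
  }
  // Transforms apply right to left: scale the unit shape first, then move it into place.
  return `translate(${target.x} ${target.y}) scale(${sx} ${sy})`;
}

// Wraps a library entry's path in a positioned group.
function placeShape(entry, target) {
  return `<g transform="${placementTransform(target)}"><path d="${entry.path}"/></g>`;
}

// Hypothetical entry: a plain triangle standing in for a designed shape.
const arrowHead = { name: 'force-arrow-head', path: 'M0 0 L100 50 L0 100 Z' };
const placed = placeShape(arrowHead, { x: 40, y: 60, width: 200, height: 100 });
```

Scaling a group also scales its stroke widths, so entries meant to keep a constant outline may want `vector-effect="non-scaling-stroke"` on their paths.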
| 120 | + |
| 121 | +**Note:** Adds ~30-50KB to the system prompt, so monitor the context budget. Consider shipping only the 6-8 most common shapes initially.
| 122 | + |
| 123 | +--- |
| 124 | + |
| 125 | +## Recommended sequencing |
| 126 | + |
| 127 | +| Order | PR | Effort | Impact | |
| 128 | +|-------|-----|--------|--------| |
| 129 | +| 1 | PR 1: Expand illustrative rules | Small | Medium | |
| 130 | +| 2 | PR 2: Diagram planning protocol | Small | **High** | |
| 131 | +| 3 | PR 3: Progressive rendering | Medium | Medium | |
| 132 | +| 4 | PR 4: Mechanical illustration skill | Medium | High | |
| 133 | +| 5 | PR 5: SVG shape library | Large | High | |
| 134 | + |
| 135 | +PRs 1 and 2 can be merged independently. PR 4 builds on conventions from PR 1. PR 3 and PR 5 are independent of each other. |
| 136 | + |
| 137 | +## Verification |
| 138 | + |
| 139 | +After each PR, compare before/after output for the prompts below, then run the two checks that follow them:
| 140 | +- "how do cars work?" |
| 141 | +- "can you explain how airplanes fly? I'm a visual person" |
| 142 | +- "explain how a combustion engine works" |
| 143 | +- Verify dark mode compatibility |
| 144 | +- Check system prompt token count stays within model context budget |
| 145 | + |
| 146 | +## Key files |
| 147 | + |
| 148 | +- `apps/agent/skills/svg-diagram-skill.txt` — primary skill to expand (illustrative section at lines 157-165) |
| 149 | +- `apps/agent/skills/master-agent-playbook.txt` — interactive widget templates (Part 2) |
| 150 | +- `apps/agent/skills/__init__.py` — auto-discovers new `.txt` skill files (line 23, glob pattern) |
| 151 | +- `apps/agent/main.py` — injects skills into system prompt (line 49) |
| 152 | +- `apps/mcp/skills/` — MCP mirrors of all skill files |