docs: add section 6 and slide 7 — DSL generation as industry-validated pattern
Explains why MDL generation is cognitively simpler than MCP tool orchestration:
multi-layer sequential dependency vs. single coherent inference pass, the
"generate code to compute" analogy, and industry validation (Blender/bpy,
Terraform, dbt, Codex). Renumbers downstream sections 6-9 → 7-10 and slides
7-10 → 8-11. Updates slide summary to reflect eleven-slide deck.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
docs/01-project/MXCLI_STRATEGIC_POSITIONING.md
90 additions & 9 deletions
@@ -192,7 +192,57 @@ The strategic case above focuses on the execution phase (generating and applying

---

-### 6. Model Tier Requirements and the Compound Cost Multiplier
+### 6. Why DSL Generation Is Cognitively Simpler Than Tool Orchestration

The preceding sections quantify the token and cost difference. There is a deeper explanation for why the gap is structural — one that connects to a well-understood principle in how LLMs best interact with complex systems.

**Multi-layer sequential dependency vs. single-layer token prediction**

MCP tool orchestration requires the LLM to maintain coherence across multiple dependent sequential layers for every operation:
1. **Intent** — parse user intent
2. **Plan** — sequence of operations required
3. **Tool selection** — which tool for this step
4. **Parameter construction** — exact parameters, given current state
5. **State update** — track UUIDs, array indices, reference paths after each tool response
6. **Dependency management** — keep operations in an order the target system will accept
7. **Error recovery** — diagnose and repair partial state when a call fails
8. **Repeat** — each subsequent operation restarts at layer 3 with updated state

Each layer is stochastic, and errors compound upward. A UUID mis-tracked in layer 5 silently corrupts the parameter construction in layer 4 of the next call. The LLM is simulating a stateful interpreter across many sequential, mutually dependent steps — which is not what transformer next-token prediction is trained to do reliably.
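A back-of-envelope illustration of that compounding, with assumed numbers rather than measurements:

```python
# Illustrative arithmetic only: assumes each layer is an independent pass/fail event.
per_layer_reliability = 0.99      # assumed probability that one layer is handled correctly
layers_per_operation = 8          # the eight layers listed above
operations_per_session = 50       # a moderately sized build

p_clean_session = per_layer_reliability ** (layers_per_operation * operations_per_session)
print(f"{p_clean_session:.1%}")   # about 1.8%: even 99%-reliable layers rarely survive a whole session
```

Single-pass generation collapses the exponent: one inference, then a deterministic compile.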
MDL generation reduces this to a single coherent inference pass:

**Intent → tokens**

State is already embedded in the generated text. After writing `create entity Customer`, the entity name is directly in context — no UUID tracking, no index recomputation, no state extracted from a history of tool responses. The compiler handles all the deterministic layers: reference resolution, UUID assignment, dependency ordering.
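To make the contrast concrete, here is a purely illustrative sketch; the tool names, parameters, and MDL statements are hypothetical, not the actual mxcli or Mendix MCP surface:

```python
import uuid

def call_tool(name: str, params: dict) -> dict:
    """Stand-in for an MCP tool response: returns an opaque id the model must carry forward."""
    return {"uuid": str(uuid.uuid4()), "tool": name, **params}

# Orchestration mode: every later call depends on state extracted from earlier responses.
customer = call_tool("create_entity", {"name": "Customer"})
order = call_tool("create_entity", {"name": "Order"})
call_tool("create_association", {
    "from_entity": order["uuid"],     # the model must quote this id back verbatim
    "to_entity": customer["uuid"],    # one mis-copied UUID silently corrupts the call
})

# Generation mode: the references live in the emitted text itself; the compiler
# resolves names to identifiers deterministically after the fact.
mdl_script = """
create entity Customer
create entity Order
create association Order -> Customer
"""
```

The first block is state the agent must carry across turns; the second is a single block of text whose cross-references the compiler resolves.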
**The "generate code to compute" analogy**
221
+
222
+
When an LLM is asked to compute the 1000th prime or perform complex arithmetic, the reliable approach is to write Python and execute it — not to iterate the computation inside the context window. LLMs are trained to generate correct code; they are not trained to maintain arithmetic state across sequential steps.
223
+
224
+
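A minimal illustration of that mode, in plain Python with nothing Mendix-specific:

```python
def nth_prime(n: int) -> int:
    """Return the n-th prime by trial division: the kind of script an LLM
    emits once, instead of simulating the arithmetic turn by turn."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes[-1]

print(nth_prime(1000))  # prints 7919, computed by the executed code rather than by the model
```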
MCP orchestration puts LLMs in "compute it yourself" mode: track UUIDs, recompute array indices, maintain cross-document consistency across 50+ turns. MDL puts LLMs in "generate the script" mode: understand intent, emit declarative statements, let the compiler execute. The same principle applies in both cases: externalise the deterministic computation to the right tool, keep the LLM in the regime it was trained for.

**The pattern appears across the industry**

This is not a novel insight. DSL or scripting layers have emerged independently in every domain where LLMs have been applied to complex stateful systems:

- **3D and CAD**: Blender generates `bpy` Python scripts, not sequences of UI tool calls (a short sketch follows this list). Maya uses MEL scripts. AutoCAD uses AutoLISP.
- **Infrastructure**: Terraform HCL and CloudFormation YAML, not sequential cloud API calls. Each resource references IDs from previous ones — the same cross-reference problem MDL externalises to the compiler.
- **Data transformation**: dbt SQL models, not step-by-step ETL calls. The dependency DAG is declared; the engine resolves execution order.
- **General computation**: GitHub Copilot, Codex, AlphaCode — all "generate code," not "call tools." The canonical case.
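A minimal sketch of the Blender case (illustrative; it assumes Blender's bundled Python environment, and the object name is made up):

```python
import bpy  # Blender's Python API, available only inside Blender

# The model emits a script; it never drives the UI tool by tool.
bpy.ops.mesh.primitive_cube_add(size=2.0, location=(0.0, 0.0, 1.0))
tower = bpy.context.active_object
tower.name = "Tower"
tower.scale = (1.0, 1.0, 3.0)

# Later statements refer to earlier objects by names already present in the
# script's own text, which is the same property MDL relies on.
bpy.ops.object.modifier_add(type='SUBSURF')
```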
In every domain where pure tool-call orchestration has been pushed hard, a DSL or scripting layer has emerged. Tool calls become the *implementation* of the compiler — not what the LLM authors. mxcli + MDL is the application of this established pattern to Mendix.

**Why this directly explains the model tier and capability gaps**

The multi-layer sequential state management of MCP is exactly the task class where smaller models degrade systematically: errors compound across layers, state diverges mid-session, and the model needs Opus-tier capacity just to maintain coherence. Single-pass text generation against a learnable grammar is exactly the task class where 32B local models are competitive with frontier models.

The model tier gap (Section 7), the local model viability (Section 8), and the generation time difference (Section 9) are all downstream consequences of this architectural difference. MDL is not a productivity shortcut over MCP — it is a different computational model: LLM as code generator, compiler as executor.

---

+### 7. Model Tier Requirements and the Compound Cost Multiplier

The preceding analysis focuses on token count. There is a second cost dimension that changes the strategic picture significantly: which model tier is required to execute the task reliably.
@@ -237,7 +287,7 @@ MDL allows the full workflow to run on Sonnet. MCP forces Opus on the two most t

---

-### 7. Local Model Compatibility
+### 8. Local Model Compatibility

MDL's declarative structure opens a capability that is structurally out of reach for MCP: **reliable execution by local, on-device models**.
@@ -284,7 +334,7 @@ The validation gate is deterministic so the escalation decision is mechanical. A

---

-### 8. Generation Time — Speed Is a Capability, Not a Convenience
+### 9. Generation Time — Speed Is a Capability, Not a Convenience

The preceding sections focus on token and monetary cost. Time is a third cost dimension — and for the buyer who has waited a full working day for a PED-generated app, it is the most visceral one.
@@ -350,7 +400,7 @@ API token cost is visible on a bill. Developer time cost is invisible but larger

---

-### 9. A Three-Layer Architecture: Compiler, Starlark, Skills
+### 10. A Three-Layer Architecture: Compiler, Starlark, Skills

The linting system already embodies the right design principle: Starlark handles quantitative rules (naming conventions, complexity thresholds, structural checks); skill files handle qualitative guidance (architectural judgment, design heuristics). The same split applies to generation — and once applied, it clarifies exactly what belongs in the compiler itself.
@@ -533,7 +583,7 @@ Query cost is ~500 tokens in, ~200–2k out, regardless of project size. The age

# Summary for slides

-A focused six-slide deck. Slide 1 sets the axis — the dual-backend refactor means "live vs offline" is no longer what separates the tools. Slide 2 frames the territory. Slide 3 makes the efficiency case. Slide 4 is the architectural moat. Slide 5 is the safety upgrade. Slide 6 is the enterprise closer.
+An eleven-slide deck with a core six-slide arc (Slides 1–6) and five deep-dive slides (Slides 7–11) for technical audiences. Slide 1 sets the axis — the dual-backend refactor means "live vs offline" is no longer what separates the tools. Slide 2 frames the territory. Slide 3 makes the efficiency case. Slide 4 is the architectural moat. Slide 5 is the safety upgrade. Slide 6 is the enterprise closer. Slides 7–11 cover: why DSL generation is the industry-validated pattern for agentic work, the compound cost multiplier, local model viability, generation time, and the three-layer compiler/Starlark/skills architecture.

## Slide 1 — Interaction Protocol Is the Real Axis
@@ -676,7 +726,38 @@ Every stage except "agent writes MDL" runs outside the agent's context. Pass/fai

---

-## Slide 7 — The Compound Cost Multiplier: Tokens × Model Tier
+## Slide 7 — DSL Generation: The Industry-Validated Pattern

**Thesis:** The MDL approach is not a Mendix-specific optimisation. It is the standard solution to a fundamental limitation of multi-step tool orchestration — arrived at independently in every domain where LLMs drive complex stateful systems.

**Why tool orchestration is architecturally hard for LLMs:**

MCP requires the LLM to maintain coherence across 7+ dependent sequential layers per operation: intent → plan → tool selection → parameter construction → state tracking → dependency management → error recovery → repeat. Each layer is stochastic; errors compound upward. The LLM simulates a stateful interpreter — which is not what transformer next-token prediction is trained to do reliably.

**Why DSL generation is in the LLM's natural regime:**

MDL is a single coherent inference pass: intent → tokens. State is embedded in the generated text. The compiler handles what the LLM is bad at — UUID assignment, reference resolution, ordering. The LLM does what it is trained for: generating correct text against a known grammar.

**The "generate code to compute" analogy:**

LLMs don't compute the 1000th prime by iterating in context — they write Python. The same principle applies: generate MDL and execute it, rather than simulate the execution step by step.

**The conclusion:** In every domain where tool-call orchestration has been pushed hard, a DSL layer has emerged. Tool calls become the compiler's implementation — not what the LLM authors. The model tier gap, the local model viability, and the generation time difference in the following slides are all downstream consequences of this one architectural difference.

**Slide message:** *"The right answer for 'LLM drives a complex stateful system' is always a DSL layer. The industry validated this pattern before we needed it for Mendix."*

---

+## Slide 8 — The Compound Cost Multiplier: Tokens × Model Tier

**Thesis:** the token efficiency argument understates the real cost difference. MCP forces Opus; MDL runs on Sonnet. Multiply the token ratio by the model price ratio and the cost difference is 40–200×, not 10–50×.
@@ -700,7 +781,7 @@ User data: 500M+ tokens/month at Opus prices reported for PED-heavy workflows. T

---

-## Slide 8 — Local Models: The Cost Floor Reaches Zero
+## Slide 9 — Local Models: The Cost Floor Reaches Zero

**Thesis:** MDL's declarative structure enables effective use of local, on-device models for routine work. MCP cannot. This creates a three-tier cost structure with a zero-cost floor — and removes the cloud-data constraint that blocks enterprise adoption.
@@ -732,7 +813,7 @@ Index tracking, dynamic reference resolution, and partial state diagnosis under

---

-## Slide 9 — Generation Time: A Full Day vs. Ten Minutes
+## Slide 10 — Generation Time: A Full Day vs. Ten Minutes

**Thesis:** time is the cost dimension the API bill doesn't show. For the buyer who has waited a full working day for a PED-generated app, it is the most memorable argument in this deck.
@@ -766,7 +847,7 @@ MDL-generated applications will be better — not because the model is more capa

---

-## Slide 10 — Three Layers: Compiler, Starlark, Skills
+## Slide 11 — Three Layers: Compiler, Starlark, Skills

**Thesis:** the linting system already shows the right design. Quantitative rules go in Starlark; qualitative judgment stays in skills. Apply the same split to generation — and the result is a three-layer architecture where each layer does exactly what it is good at.