- Status: Accepted
- Implementation Status: Implemented
- Implemented In Repo Version: 0.116.0
- Implemented In Platform Version: not applicable (repo-only)
- Implemented On: 2026-03-24
- Date: 2026-03-24
Platform changes are approved based on a description of intent ("deploy netbox to the latest version") rather than a precise description of effect ("this will change 3 container labels, restart 1 container, and update 2 DNS records"). The gap between intent and effect is where surprises happen.
Several platform components already have dry-run capabilities in isolation:
- Ansible playbooks support
--checkmode and produce a changed-task list. - OpenTofu supports
terraform planand produces a resource diff. - Caddy and nginx config changes can be syntax-validated before reload.
But these are disconnected. The operator who wants to know "what will this deployment actually change?" must run at least three separate commands, mentally merge their outputs, and then decide whether to proceed. The agent observation loop (ADR 0071) and the risk scorer (ADR 0116) need this information automatically — they cannot run three ad hoc commands before every intent compilation.
The dry-run diff engine solves this by providing a unified, object-model diff for any execution intent, across all platform surfaces, in a single API call.
We will implement a dry-run semantic diff engine as a Python module platform/diff_engine/ that computes a SemanticDiff — a structured description of predicted changes — for any ExecutionIntent before execution.
@dataclass
class ChangedObject:
surface: str # 'ansible_task' | 'docker_container' | 'dns_record' | 'tls_cert' | ...
object_id: str # fully qualified identifier of the object being changed
change_kind: str # 'create' | 'update' | 'delete' | 'restart' | 'renew'
before: dict | None # current state (from world-state materializer)
after: dict | None # predicted state after change
confidence: str # 'exact' | 'estimated' | 'unknown'
reversible: bool # can this change be trivially reverted?
notes: str | None # human-readable explanation of the change
@dataclass
class SemanticDiff:
intent_id: str
computed_at: str # ISO8601
changed_objects: list[ChangedObject]
total_changes: int
irreversible_count: int
unknown_count: int # objects we could not assess
adapters_used: list[str]
elapsed_ms: intEach platform surface has a diff adapter. Adapters are registered in config/diff-adapters.yaml and invoked by the engine based on the allowed_tools in the compiled intent:
platform/diff_engine/adapters/
├── ansible_adapter.py # runs ansible-playbook --check; parses task output
├── opentofu_adapter.py # runs tofu plan -json; parses resource changes
├── docker_adapter.py # compares desired compose state against current Docker API state
├── dns_adapter.py # compares desired DNS records against current NetBox/resolver state
├── cert_adapter.py # compares desired cert config against current step-ca inventory
├── firewall_adapter.py # compares desired nftables rules against live ruleset
└── vm_adapter.py # compares desired Proxmox VM config against Proxmox API state
Each adapter implements:
class DiffAdapter(Protocol):
surface: str
def compute_diff(
self,
intent: ExecutionIntent,
world_state: WorldStateClient,
) -> list[ChangedObject]:
...The Ansible adapter is the most important because most platform mutations run through Ansible playbooks:
class AnsibleAdapter:
surface = "ansible_task"
def compute_diff(self, intent, world_state):
# Run ansible-playbook --check --diff against the target
result = subprocess.run(
["ansible-playbook", "--check", "--diff",
"-i", inventory_for(intent.target),
playbook_for(intent.action),
"--extra-vars", json.dumps(intent.scope.as_vars())],
capture_output=True, text=True, timeout=60
)
# Parse the JSON callback output
tasks = parse_ansible_check_output(result.stdout)
return [
ChangedObject(
surface="ansible_task",
object_id=f"{task.host}:{task.name}",
change_kind="update",
before=task.before,
after=task.after,
confidence="exact" if task.diff else "estimated",
reversible=True,
)
for task in tasks if task.changed
]The long-term architecture remains the same: the goal compiler (ADR 0112) calls the diff engine immediately after compilation, before presenting the intent to the operator. The compiled intent display includes the semantic diff:
$ lv3 run "deploy netbox"
Compiled intent: deploy service:netbox
Risk class: MEDIUM (score: 42/100)
Predicted changes (3 objects):
✎ docker_container:netbox update restart container with new image tag
✎ ansible_task:netbox:config_file update /opt/netbox/config/configuration.py (2 lines changed)
✎ dns_record:netbox.example.com update TTL 300→60 (temporary low TTL for deployment)
Irreversible: 0 Unknown: 0 Adapters used: docker, ansible, dns
Proceed? [y/N]
The diff engine result is passed to the risk scorer (ADR 0116) as the expected_change_count and irreversible_count signals. The risk score accounts for the actual predicted surface of the change, not a generic estimate.
Not all surfaces have complete adapters at launch. When an intent involves a surface with no adapter, the engine returns a ChangedObject with confidence: unknown and change_kind: unknown. These count toward the unknown_count in the diff summary and are flagged in the approval prompt.
The risk scorer treats unknown_count > 0 as a penalty: changes with unknown surfaces have their mutation_surface score contribution set to the maximum rather than zero.
- Repository implementation landed in
0.116.0with the new platform/diff_engine/ package, the adapter registry at config/diff-adapters.yaml, focused coverage in tests/unit/test_diff_engine.py, and CLI/risk integration through scripts/risk_scorer/context.py, scripts/risk_scorer/models.py, scripts/risk_scorer/dimensions.py, and scripts/lv3_cli.py. - The repository still does not ship the standalone
platform/goal_compiler/runtime proposed in ADR 0112. The implemented integration point is the existing compiled-intent path used bycompile_workflow_intent()andlv3 run, which now computes and rendersSemanticDiffbefore any Windmill submission. - Unsupported or unconfigured surfaces now emit
confidence: unknownobjects instead of silently defaulting the change count. The mutation-surface risk dimension treats those unknowns as maximum-risk evidence rather than zero-risk absence. - Firewall and Proxmox VM semantic adapters remain follow-up work. The initial implemented surfaces are Ansible, OpenTofu, Docker, DNS, and TLS certificate policy.
Positive
- Operators see the actual predicted diff, not an abstract description of intent. Change approvals are based on evidence, not trust.
- The risk scorer has a precise
expected_change_countderived from the real diff rather than a static estimate. - Irreversible change detection catches destructive operations (data deletion, permanent DNS changes) before they execute.
- The diff is stored in the ledger (ADR 0115) as the
before_statebaseline, so post-execution verification can compare actual vs predicted.
Negative / Trade-offs
- Running
ansible-playbook --checkis the most expensive part of the diff engine. For large playbooks it can take 30–60 seconds. This adds latency to every interactivelv3 runinvocation. - Adapter coverage is never complete. Novel infrastructure additions require new adapters before their changes are visible in the diff.
--checkmode in Ansible has known limitations: conditional tasks that depend on gathered facts may produce incorrect or absent diff output. These surface asconfidence: estimatedin the output.
- The diff engine computes predicted changes. It does not execute any mutations.
- Diff computation is read-only with one exception: the Ansible adapter runs
ansible-playbook --check, which may create temporary artefacts on the target host. These are explicitly safe per Ansible's check mode guarantee. - The diff engine does not make approval decisions; it provides evidence to the goal compiler and risk scorer.
- ADR 0048: Command catalog (playbook and tool mappings used by adapters)
- ADR 0085: OpenTofu VM lifecycle (OpenTofu adapter source)
- ADR 0090: Platform CLI (diff output displayed in the
lv3 runinteractive flow) - ADR 0112: Deterministic goal compiler (calls diff engine; includes diff in compiled intent display)
- ADR 0113: World-state materializer (before-state data source for all adapters)
- ADR 0115: Event-sourced mutation ledger (predicted diff stored as before_state baseline)
- ADR 0116: Change risk scoring (consumes expected_change_count and irreversible_count)
- ADR 0119: Budgeted workflow scheduler (uses max_touched_hosts from diff output)