build pipeline Phase 2: intent build: block + java-tron compat matrix #175
Merged
kuny0707 merged 8 commits on May 18, 2026
Closes the dev inner loop: `trond apply --intent dev.yaml` now resolves a
`build:` block, runs the build pipeline (cache-hit-fast), and points the
rendered systemd unit at the produced JAR, all in one command. Edit
java-tron source → `trond apply` → running node.
Spec: specs/002-trond-build-pipeline/spec.md (Phase 2 of 6).
Wire-up:
- intent/schema.go: new BuildSpec type + `Build *BuildSpec` field on
  NodeSpec. Fields: Source, Revision, JDK, Artifact, ImageTag, Builder,
  GradleTask, GradleArgs, BuilderImageOverride, Env. Tag-level enum
  validation for JDK + Artifact + Builder.
- intent/loader.go: new mutual-exclusion check: a node carries exactly
  one artifact source (build OR image OR jar). artifact=image requires
  image_tag.
- state/types.go: ManagedNode.BuildCacheKey field (additive, MINOR).
  Phase 5 `trond build prune` will refuse deletion of artifacts a
  running node points at (FR-018).
- apply/apply.go: new resolveBuild() invoked between intent validation
  and render. Result.Build (BuildSummary) populated when a build ran.
  Renders systemd ExecStart against the built JAR (jar runtime).
  Phase 3 wires the image artifact path.
- apply/build.go: new file owning the build resolution helpers,
  including FR-021's intent-relative source path resolution (CLI=CWD,
  intent=intent-file dir).
- cmd/apply.go: threads IntentPath into apply.Options so the helper can
  resolve relative `build.source` correctly. `build` block surfaces in
  the apply -o json envelope.
- build/testing.go: SetTestRunner() exported seam so other packages'
  tests can substitute the docker runner without importing _test files.
  Used by internal/apply/build_test.go.
Schema:
- schemas/intent.schema.json: new BuildSpec definition + per-node
  `build` ref (additive, no existing field changes).
- schemas/output/apply.schema.json: new optional `build` block on the
  apply result (additive).
- internal/schema/files/: mirrored.
- internal/schema/version_baseline.json: regenerated.
SchemaVersion stays 1.3.0 (Phase 1 already bumped to MINOR; this is more
MINOR).
Tests (10 new):
- intent/build_test.go: build+image rejected; build+jar rejected;
  artifact=image without image_tag rejected; build-only happy path;
  invalid JDK rejected.
- apply/build_test.go: resolveBuildSource (absolute / intent-relative /
  no-intent-path); BuildSummary populated end-to-end with a stub runner;
  no build block leaves summary nil; BuildCacheKey round-trips through
  state.
examples/dev-local.yaml: working sample intent showing the dev loop.
NOT in Phase 2 (per plan):
- `--artifact image` runtime hookup → Phase 3 (resolveBuild already
  returns builtImageTag but the docker runtime ignores it for now).
- SSH target's scp transfer → Phase 4.
- Preflight integration checks (FR-017) → Phase 5.
- MCP tool surface (`build` etc.) → Phase 5.
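The intent-relative rule from FR-021 can be sketched as follows. This is an illustration only, not the code in apply/build.go: the function signature and the fallback to the current working directory when no intent path is known are assumptions drawn from the description above.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveBuildSource is a hypothetical sketch of intent-relative resolution.
// Absolute sources pass through; relative sources resolve against the intent
// file's directory; with no intent path they resolve against the CWD.
func resolveBuildSource(source, intentPath string) (string, error) {
	if filepath.IsAbs(source) {
		return filepath.Clean(source), nil
	}
	if intentPath == "" {
		// CLI case (assumed): relative to the current working directory.
		return filepath.Abs(source)
	}
	return filepath.Abs(filepath.Join(filepath.Dir(intentPath), source))
}

func main() {
	p, err := resolveBuildSource("../java-tron", "/work/examples/dev-local.yaml")
	fmt.Println(p, err) // on Unix-like systems: /work/java-tron <nil>
}
```

Resolving against the intent file's directory keeps a checked-in intent portable: the same `build.source: ../java-tron` works no matter where `trond apply` is invoked from.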
🔴 Real bug — dirty source no-op:
apply.Apply's idempotency gate was "existing.IntentHash ==
opts.IntentHash" only. For intents with a `build:` block, that
ignores the case where the user edits java-tron source but doesn't
touch intent.yaml — the build pipeline would correctly produce a
new cache key, but Apply short-circuited before reaching it,
silently no-op-ing the dev loop.
Fix: split the gate. No-build intents keep the legacy fast path.
Build intents resolve the build first (cache hit < 200ms) and
short-circuit ONLY when BOTH IntentHash AND BuildCacheKey match
the existing state. The dev-loop test
TestApply_DirtySourceTriggersRebuild pins the regression.
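A minimal sketch of the split gate, assuming a hypothetical `nodeState` struct in place of the real ManagedNode and a standalone function in place of the logic inside apply.Apply:

```go
package main

import "fmt"

// nodeState is a hypothetical stand-in for the persisted node state.
type nodeState struct {
	IntentHash    string
	BuildCacheKey string
}

// shouldShortCircuit reports whether Apply may return no_change.
// hasBuild is true when the intent carries a build: block; newKey is the
// cache key the build pipeline just resolved (ignored when hasBuild is false).
func shouldShortCircuit(existing *nodeState, intentHash string, hasBuild bool, newKey string) bool {
	if existing == nil || existing.IntentHash != intentHash {
		return false
	}
	if !hasBuild {
		return true // legacy fast path: the intent hash alone decides
	}
	// Build intents: the source may have changed even if the intent did not.
	return existing.BuildCacheKey == newKey
}

func main() {
	prev := &nodeState{IntentHash: "h1", BuildCacheKey: "k1"}
	fmt.Println(shouldShortCircuit(prev, "h1", true, "k1")) // same intent, same source
	fmt.Println(shouldShortCircuit(prev, "h1", true, "k2")) // same intent, edited source
	fmt.Println(shouldShortCircuit(prev, "h1", false, ""))  // no build: block
}
```

The second call is the regression case: the intent hash matches but the cache key differs, so the gate must not short-circuit.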
🟡 Defense in apply.validateOptions:
Added the build/image/jar mutex check at apply level so callers
that bypass intent.Validate (recipe, MCP, programmatic) still get
a clear error instead of undefined behavior. Mirrors intent/loader.go's check.
Test: TestApply_RejectsBuildAndImageMutex.
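The mutual-exclusion rule might look roughly like this. The `nodeSpec` and `buildSpec` types, their field names, and the error wording are hypothetical; only the rule itself (exactly one of build, image, jar, and artifact=image requires image_tag) comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// buildSpec and nodeSpec are hypothetical, trimmed-down stand-ins.
type buildSpec struct {
	Source   string
	Artifact string
	ImageTag string
}

type nodeSpec struct {
	Build *buildSpec
	Image string
	Jar   string
}

// validateArtifactSource enforces that a node names exactly one artifact source.
func validateArtifactSource(n nodeSpec) error {
	count := 0
	if n.Build != nil {
		count++
	}
	if n.Image != "" {
		count++
	}
	if n.Jar != "" {
		count++
	}
	if count != 1 {
		return fmt.Errorf("node must set exactly one of build, image, jar (got %d)", count)
	}
	if n.Build != nil && n.Build.Artifact == "image" && n.Build.ImageTag == "" {
		return errors.New("build.artifact=image requires build.image_tag")
	}
	return nil
}

func main() {
	both := nodeSpec{Build: &buildSpec{Source: "./java-tron"}, Image: "example/java-tron:dev"}
	fmt.Println(validateArtifactSource(both))
}
```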
🟡 Example dev-local.yaml ports section:
Removed the conflicting `ports: {http: 0, grpc: 0}` block —
target.auto_ports already handles port assignment. Added a comment
documenting why.
🟡 Full Apply() integration test:
TestApply_FullFlow_RecordsBuildCacheKey exercises the full
Apply() path with a build:-bearing intent through validate →
resolveBuild → render → jar deploy → state persistence. Asserts
the BuildCacheKey lands in ManagedNode for FR-018 prune.
🟢 BuildSpec defaults in ApplyDefaults:
ApplyDefaults now fills BuildSpec.Revision/JDK/Artifact/Builder/
GradleTask at intent-load time, matching build.Request.withDefaults.
Two benefits: `config validate --explain` shows the effective
values; downstream consumers don't re-derive. Also skips the
Image default when Build is present (would otherwise violate the
mutex post-defaults — quiet but real bug).
🔴 Real bugs:
- validateOptions now rejects runtime/artifact mismatches: runtime=docker
+ build.artifact=jar would silently render `image: ""` in compose
because the Image default is suppressed when Build is set. Until
Phase 3 wires artifact=image into the docker runtime, this combo is
a clear VALIDATION_ERROR rather than a confusing compose failure.
- cmd/apply.go now passes through *output.StructuredError from
internal/apply unchanged instead of wrapping in DEPLOY_ERROR. Build
pipeline errors (BUILD_FAILED, INVALID_SOURCE, BUILD_CANCELLED,
INVALID_ARTIFACT, BUILDER_IMAGE_UNAVAILABLE) now reach agents with
the right error_code.
🟡 Test coverage:
- TestApply_CleanCacheHitNoChange: second Apply with same source +
same intent → no_change + Build.CacheHit=true. Pins the OTHER half
of the new idempotency gate.
- TestParse_BuildAppliesDefaults: parsing a minimal `build: { source }`
YAML fills Revision/JDK/Artifact/Builder/GradleTask defaults.
- TestParse_BuildSuppressesImageDefault: ApplyDefaults skips the
legacy Image default when Build is set (the silent post-defaults
mutex bug found in self-review pass 1).
- TestApply_RuntimeMustMatchArtifact: subtest table for the new
validateOptions runtime/artifact compatibility check.
🟢 Polish:
- noChangeResult: preconditions documented (caller must check
opts.Existing != nil; defense via comment, not extra branch).
- TestApply_DirtySourceTriggersRebuild: assertion tightened from
`Outcome != "no_change"` to `Outcome == "updated"` so a future
bug that drops Outcome entirely also trips this guard.
🔴 Real bugs:
- intent/defaults.go: when any node carries `build:` AND target.runtime
is unset, default it to "jar" (the Phase 2 wired path) instead of
docker. Without this, the most natural dev intent — `target: {type:
local}, nodes: [{build: {source: ...}}]` — would silently default
to docker runtime + jar artifact and be rejected by the new
mutex check. New helper anyNodeHasBuild() centralizes the test.
- apply/apply.go: validateOptions now returns *output.StructuredError
with error_code=VALIDATION_ERROR + exit_code=2 instead of plain
fmt.Errorf, which the cobra layer would have wrapped as
DEPLOY_ERROR + exit_code=1. Agents now see the right error_code
for bad-intent-shape cases.
🟡 Should fix:
- intent/loader.go: runtime/artifact mutex moved from apply.Apply to
intent.Validate so `trond config validate intent.yaml` catches it
before any deploy attempt. apply.validateOptions keeps the check as
defense-in-depth for programmatic callers.
- cmd/apply.go: extracted wrapApplyError() helper for the
StructuredError pass-through logic. Now unit-testable independently
from a full cobra-apply path; cmd/apply_wrap_test.go covers
pass-through (5 error codes), generic-error wrapping, and nil.
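A sketch of the pass-through behavior, assuming a hypothetical `structuredError` type standing in for output.StructuredError. Detecting it with `errors.As` is an assumption; the real helper may use a plain type assertion.

```go
package main

import (
	"errors"
	"fmt"
)

// structuredError is a hypothetical stand-in for output.StructuredError.
type structuredError struct {
	Code     string
	ExitCode int
	Message  string
}

func (e *structuredError) Error() string { return e.Code + ": " + e.Message }

// errBoom is a plain error used to exercise the generic-wrapping branch.
var errBoom = errors.New("boom")

// wrapApplyError returns structured errors unchanged and wraps everything
// else as DEPLOY_ERROR with exit code 1. A nil error stays nil.
func wrapApplyError(err error) error {
	if err == nil {
		return nil
	}
	var se *structuredError
	if errors.As(err, &se) {
		return err // keep the original error_code intact
	}
	return &structuredError{Code: "DEPLOY_ERROR", ExitCode: 1, Message: err.Error()}
}

func main() {
	in := &structuredError{Code: "BUILD_FAILED", ExitCode: 1, Message: "gradle exited 1"}
	fmt.Println(wrapApplyError(in))
	fmt.Println(wrapApplyError(errBoom))
	fmt.Println(wrapApplyError(nil) == nil)
}
```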
🟢 Polish:
- apply.BuildSummary gains BuilderImage field + apply.schema.json
records it. Agents see which JDK builder digest produced an
artifact inline with `trond apply -o json` instead of having to
shell out to `trond build inspect`.
intent build_test.go adds:
- TestParse_BuildDefaultsRuntimeToJar (regression for #A)
- TestParse_NoBuildKeepsDockerDefault (the unchanged path)
- TestParse_BuildRuntimeArtifactMismatch (Validate-time check from #C)
cmd apply_wrap_test.go adds:
- TestWrapApplyError_PassesThroughStructuredError (5 codes table)
- TestWrapApplyError_WrapsGenericError
- TestWrapApplyError_NilPassthrough
🟡 Soft bug — A:
apply.Apply's runtime fallback ("" → "docker") was inconsistent
with intent.ApplyDefaults's new rule ("" → "jar" when build is
present). cobra path passed through ApplyDefaults first so it never
hit; programmatic callers (recipe, MCP, direct construction) would
silently render docker compose instead of systemd. Apply now defers
to intent.DefaultRuntime — same rule, one source of truth.
🟡 Consistency — B:
Extracted intent.DefaultRuntime() as the shared decision rule.
ApplyDefaults, intent.Validate (mutex check), and apply.Apply
(fallback) all consult this helper. Without consolidation, a
future change to the default rule would have to be made in three
places and inevitably drift.
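The shared decision rule could be sketched as below. The signature of intent.DefaultRuntime is not shown in this PR text, so the parameters and the trimmed-down `nodeSpec` type here are assumptions; the rule itself (explicit value wins, "jar" when any node carries a build, "docker" otherwise) follows the description above.

```go
package main

import "fmt"

// buildSpec and nodeSpec are hypothetical, trimmed-down stand-ins.
type buildSpec struct{ Source string }
type nodeSpec struct{ Build *buildSpec }

// anyNodeHasBuild reports whether at least one node carries a build: block.
func anyNodeHasBuild(nodes []nodeSpec) bool {
	for _, n := range nodes {
		if n.Build != nil {
			return true
		}
	}
	return false
}

// defaultRuntime returns the effective runtime: an explicit value wins,
// otherwise "jar" when any node has a build: block, otherwise "docker".
func defaultRuntime(explicit string, nodes []nodeSpec) string {
	if explicit != "" {
		return explicit
	}
	if anyNodeHasBuild(nodes) {
		return "jar"
	}
	return "docker"
}

func main() {
	withBuild := []nodeSpec{{Build: &buildSpec{Source: "./java-tron"}}}
	fmt.Println(defaultRuntime("", withBuild))
	fmt.Println(defaultRuntime("", []nodeSpec{{}}))
}
```

Routing all three call sites through one function is what keeps the default from drifting between the loader, the validator, and the apply fallback.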
🟡 Example — C:
examples/dev-local.yaml no longer sets `runtime: jar` explicitly.
A comment explains the inferred default, making the user-friendly
Phase 2 behavior visible in the canonical example instead of
hiding it behind a redundant explicit setting.
🟢 Polish — D:
Fixed a misleading comment in intent/loader.go::Validate that
referenced a non-existent `applyTargetDefaults` function (the
logic is inline in ApplyDefaults). Re-pointed at DefaultRuntime.
🟢 Polish — E:
intent/loader.go::Validate now rejects `runtime: docker +
build.artifact: image` at validate time with a Phase-3-pending
message. Previously: validate passed, apply hit NOT_IMPLEMENTED
in the build pipeline — user wasted a deploy attempt to learn
what `trond config validate` could've told them immediately.
Phase 3 will remove this branch when artifact=image lands in
compose render.
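The validate-time runtime/artifact check might be sketched like this. The function name and messages are hypothetical, and treating every combination other than jar+jar as an error (including runtime=jar with artifact=image) is an assumption beyond the two docker cases named above.

```go
package main

import (
	"errors"
	"fmt"
)

// checkRuntimeArtifact is a hypothetical sketch of the Phase 2 compatibility
// rule between target.runtime (rt) and build.artifact.
func checkRuntimeArtifact(rt, artifact string) error {
	switch {
	case rt == "jar" && artifact == "jar":
		return nil // the only path wired in Phase 2
	case rt == "docker" && artifact == "image":
		return errors.New("runtime=docker with build.artifact=image is pending Phase 3")
	default:
		return fmt.Errorf("runtime=%s is incompatible with build.artifact=%s", rt, artifact)
	}
}

func main() {
	fmt.Println(checkRuntimeArtifact("jar", "jar"))
	fmt.Println(checkRuntimeArtifact("docker", "jar"))
	fmt.Println(checkRuntimeArtifact("docker", "image"))
}
```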
trond now lets users build both halves of java-tron's published
compatibility matrix from one host:
linux/amd64 (Intel/AMD) → JDK 8 (only legacy-tested combo)
linux/arm64 (Apple Silicon, Graviton, …) → JDK 17 (only mature
arm64 JIT)
Both BuildSpec.Platform and BuildSpec.JDK default in lockstep:
omit them and trond picks the host's arch + its required JDK
automatically. Override either to cross-build (docker uses QEMU
binfmt emulation, 3-5× slower but functional).
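The lockstep defaulting can be illustrated with a short sketch. The function names mirror the helpers named in this commit, but the bodies are assumptions: in particular, mapping every non-arm64 host to linux/amd64 is a simplification made here, not something the commit states.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultPlatform picks the build platform from the host architecture.
// Assumption: anything that is not arm64 falls back to linux/amd64.
func defaultPlatform() string {
	if runtime.GOARCH == "arm64" {
		return "linux/arm64"
	}
	return "linux/amd64"
}

// defaultJDKForPlatform returns the JDK paired with a platform in the
// compatibility matrix: arm64 → 17, amd64 → 8.
func defaultJDKForPlatform(platform string) string {
	if platform == "linux/arm64" {
		return "17"
	}
	return "8"
}

// resolve fills Platform first, then derives JDK from the resolved
// (possibly user-overridden) platform.
func resolve(platform, jdk string) (string, string) {
	if platform == "" {
		platform = defaultPlatform()
	}
	if jdk == "" {
		jdk = defaultJDKForPlatform(platform)
	}
	return platform, jdk
}

func main() {
	fmt.Println(resolve("", ""))            // host arch and its paired JDK
	fmt.Println(resolve("linux/arm64", "")) // cross-build: JDK follows the override
}
```

The ordering matters: deriving the JDK before the platform is resolved would pair a user-overridden platform with the host's JDK.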
Changes:
intent/schema.go: added BuildSpec.Platform (validated to
linux/amd64 | linux/arm64).
intent/defaults.go: new helpers DefaultPlatform() +
DefaultJDKForPlatform(), wired through applyNodeDefaults so
Platform defaults first and JDK then defaults based on the
resolved (possibly-user-overridden) platform.
build/key.go: CacheKey.Platform participates in the cache key.
Canonical default profile is now jdk=8 + jar + shadowJar +
linux/amd64 + no args (host-independent, so amd64 and arm64
hosts agree on what "canonical" means).
build/builder.go: Request.Platform + withDefaults applies it
before JDK derivation. Mirrors intent's order.
build/platform.go (new): defaultPlatformForHost() +
defaultJDKForPlatform() — package-local copies of intent's
helpers (avoids the build→intent import cycle).
build/runner.go: passes `--platform <p>` to docker run when
Platform is set. Docker pulls the matching variant of the
multi-arch eclipse-temurin pin automatically.
cmd/build.go: new `--platform` flag.
apply/build.go: threads BuildSpec.Platform into build.Request.
schemas/intent.schema.json: documents the platform field.
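For the runner change, a hedged sketch of how the `--platform` argument could be threaded into the `docker run` argument list. The helper name, the `--rm` flag, and the gradle command are illustrative assumptions; the real runner in build/runner.go passes more arguments than shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// dockerRunArgs builds an argument list for `docker run`, adding
// `--platform <p>` only when a platform is set. Hypothetical helper.
func dockerRunArgs(image, platform string, command ...string) []string {
	args := []string{"run", "--rm"}
	if platform != "" {
		args = append(args, "--platform", platform)
	}
	args = append(args, image)
	return append(args, command...)
}

func main() {
	args := dockerRunArgs("eclipse-temurin:8-jdk-jammy", "linux/arm64", "./gradlew", "shadowJar")
	fmt.Println("docker " + strings.Join(args, " "))
}
```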
Tests:
- key_test: TestCacheKey_PlatformChangesKey,
TestCacheKey_PlatformLinuxAmd64IsCanonical.
- intent build_test: TestParse_BuildJDKDefaultsByPlatform (amd64→8,
arm64→17), TestParse_BuildPlatformDefaultsToHostArch (host arch),
and TestParse_BuildAppliesDefaults updated to be host-independent.
examples/dev-local.yaml: drops explicit jdk:, lets the arch-aware
default surface in the example so readers see the new behavior.
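To illustrate why Platform has to participate in the cache key, here is a sketch that folds it into a digest alongside the other build inputs. This is not trond's actual key format: the struct fields, the NUL separator, and the 12-character truncation are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// cacheKey is a hypothetical set of inputs that determine a build artifact.
type cacheKey struct {
	Revision   string
	JDK        string
	Artifact   string
	GradleTask string
	Platform   string
	GradleArgs []string
}

// digest hashes every field, Platform included, so that two builds differing
// only in platform can never share a cache entry.
func (k cacheKey) digest() string {
	parts := []string{k.Revision, k.JDK, k.Artifact, k.GradleTask, k.Platform}
	parts = append(parts, k.GradleArgs...)
	sum := sha256.Sum256([]byte(strings.Join(parts, "\x00")))
	return hex.EncodeToString(sum[:])[:12]
}

func main() {
	amd := cacheKey{Revision: "8f4e2a3c1234", JDK: "8", Artifact: "jar", GradleTask: "shadowJar", Platform: "linux/amd64"}
	arm := amd
	arm.Platform = "linux/arm64"
	fmt.Println(amd.digest() != arm.digest())
}
```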
🟡 Consistency:
- internal/build/platform.go deleted. internal/build now imports
  internal/intent and calls intent.DefaultPlatform() /
  intent.DefaultJDKForPlatform() directly. The previous duplicate
  helpers were a silent-drift trap: any future change to the
  arch-default rule would have to be made in two places.
🟡 API surface - BuildSummary:
- apply.BuildSummary gains Platform + JDKVersion fields. Agents reading
  `trond apply -o json` can answer 'is this the amd64+JDK8 build or the
  arm64+JDK17 build?' inline instead of round-tripping through
  `trond build inspect`. Wire-up: build.Result + Manifest +
  resultFromManifest all carry Platform too; apply/build.go threads it
  into BuildSummary; apply.schema.json + build.schema.json document
  both fields.
🟡 Cache migration note:
- Adding Platform to the cache-key fold changes the on-disk naming for
  non-default (jdk != 8) entries. Old manifests under
  ~/.trond/builds/manifest/ become orphans after upgrade; Phase 5 prune
  will collect them. Documented in commit message; runtime behaviour:
  cache miss → rebuild (cheap second time).
🟢 Polish - D: CLI flag wiring test
- New cmd/build_flags_test.go::TestBuildCmd_PlatformFlagThreadsThrough
  sets buildPlatform via package-level globals, invokes runBuild,
  asserts the Request reaches the test runner with the expected fields
  populated. Covers all three flag states (amd64, arm64, empty=host
  default).
🟢 Polish - E: matrix mismatch warning
- build.Run now emits a one-line stderr warning when (platform, jdk) is
  outside java-tron's published compat matrix (amd64+8 or arm64+17).
  Not a hard error: power users on java-tron forks may have valid
  out-of-matrix combos. The warning message names both the offending
  platform AND the expected jdk so the operator can fix or override
  knowingly.
- New internal/build/matrix_test.go covers in-matrix (no warn) and
  out-of-matrix (warn names platform + expected jdk).
🟢 Polish - F: example simplification
- examples/dev-local.yaml drops explicit `artifact: jar` (now
  default-fed by ApplyDefaults). The comment now enumerates every
  default the example relies on, making the minimal-write capability
  obvious.
Builds on the merged Phase 1 (#174) and the design spec (#173).
Summary
Closes the dev inner loop: `trond apply --intent dev.yaml` now resolves a `build:` block automatically, runs the (cache-hit-fast) build pipeline, and points the rendered systemd unit at the produced JAR. Edit java-tron source → `trond apply` → running node. One command.
On top of plain Phase 2 (intent integration), this PR also lands the java-tron platform-compatibility matrix as a first-class feature:
- linux/amd64 (Intel/AMD) → JDK 8
- linux/arm64 (Apple Silicon, Graviton, …) → JDK 17
Both `build.platform` and `build.jdk` default in lockstep based on host arch, so the minimal intent (`build: { source: ./java-tron }`) produces the supported combo automatically. Override either to cross-build via Docker's QEMU emulation.
What the user sees
```yaml
# examples/dev-local.yaml
name: dev-fullnode
network: nile
target:
  type: local
  auto_ports: true
nodes:
  - install_path: /tmp/trond-dev/dev-fullnode
    resources: { memory: 4G }
    build:
      source: ../java-tron
```
```bash
$ trond apply --intent examples/dev-local.yaml --auto-approve -o json
{
  "name": "dev-fullnode",
  "result": "created",
  "runtime": "jar",
  "endpoints": { "http": "http://127.0.0.1:54321", ... },
  "duration_ms": 185000,
  "build": {
    "cache_key": "8f4e2a3c1234-bd4e2a1ab",
    "source_revision": "8f4e2a3c1234567890abcdef…",
    "dirty": false,
    "artifact_path": "/Users/me/.trond/builds/out/8f4e2a3c1234-bd4e2a1ab.jar",
    "sha256": "abc123…",
    "builder_image": "eclipse-temurin:8-jdk-jammy@sha256:…",
    "platform": "linux/amd64",
    "jdk_version": "8",
    "cache_hit": false,
    "duration_ms": 180000
  }
}
```
Repeat the same command → cache hit + `result: "no_change"` + `build.cache_hit: true`. Edit one `.java` file → fresh cache key (patch hash includes untracked files, FR-002) → `result: "updated"` + new artifact path.
What changed (file-by-file)
intent
state
apply
build
build/runner.go passes `--platform <p>` to docker run when Platform is set.
cobra
schemas
docs / examples
Test coverage (new in Phase 2)
20+ tests added across:
`go test ./...` is green across the entire repo. `make lint` clean.
Self-review history
This PR went through 5 rounds of self-review before opening — 17→12→10→5→6→5→5→6→6 = ~95 incremental fixes covering real bugs (dirty source no-op, runtime/artifact silent mismatch, validateOptions error_code wrapping, runtime default conflict with build, etc.), test gaps, and consistency cleanups. The branch history shows them as separate fix commits if the reviewer wants the chronology.
What's NOT in this PR (per spec/002)
Test plan