
build pipeline Phase 2: intent build: block + java-tron compat matrix #175

Merged
kuny0707 merged 8 commits into tronprotocol:develop from barbatos2011:feat/build-pipeline-phase2
May 18, 2026

@barbatos2011

Builds on the merged Phase 1 (#174) and the design spec (#173).

Summary

Closes the dev inner loop: `trond apply --intent dev.yaml` now resolves a `build:` block automatically, runs the (cache-hit-fast) build pipeline, and points the rendered systemd unit at the produced JAR. Edit java-tron source → `trond apply` → running node. One command.

On top of plain Phase 2 (intent integration), this PR also lands the java-tron platform-compatibility matrix as a first-class feature:

  • `linux/amd64` (Intel/AMD) → JDK 8
  • `linux/arm64` (M1, Graviton, …) → JDK 17

Both `build.platform` and `build.jdk` default in lockstep based on host arch, so the minimal intent (`build: { source: ./java-tron }`) produces the supported combo automatically. Override either to cross-build via Docker's QEMU emulation.
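The lockstep defaulting described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `defaultPlatformFor`, `defaultJDKFor`, and `resolve` are hypothetical stand-ins for the real helpers, and the fallback to `linux/amd64` for unrecognised architectures is an assumption.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultPlatformFor maps a Go architecture name to a Docker platform string.
// Assumption: anything that is not arm64 falls back to linux/amd64.
func defaultPlatformFor(goarch string) string {
	if goarch == "arm64" {
		return "linux/arm64"
	}
	return "linux/amd64"
}

// defaultJDKFor returns the JDK that the compat matrix pairs with a platform.
func defaultJDKFor(platform string) string {
	if platform == "linux/arm64" {
		return "17"
	}
	return "8"
}

// resolve fills empty fields. Platform is resolved first so that the JDK
// default follows the (possibly user-overridden) platform.
func resolve(platform, jdk, goarch string) (string, string) {
	if platform == "" {
		platform = defaultPlatformFor(goarch)
	}
	if jdk == "" {
		jdk = defaultJDKFor(platform)
	}
	return platform, jdk
}

func main() {
	// Minimal intent: both fields omitted, so the host arch decides.
	platform, jdk := resolve("", "", runtime.GOARCH)
	fmt.Println(platform, jdk)

	// Cross-build: platform overridden on an amd64 host, JDK follows it.
	platform, jdk = resolve("linux/arm64", "", "amd64")
	fmt.Println(platform, jdk) // linux/arm64 17
}
```

Resolving the platform before the JDK is what keeps a platform-only override on the supported combination.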

What the user sees

```yaml
# examples/dev-local.yaml
name: dev-fullnode
network: nile
target:
  type: local
  auto_ports: true
nodes:
  - type: fullnode
    install_path: /tmp/trond-dev/dev-fullnode
    resources: { memory: 4G }
    build:
      source: ../java-tron
```

```bash
$ trond apply --intent examples/dev-local.yaml --auto-approve -o json
{
  "name": "dev-fullnode",
  "result": "created",
  "runtime": "jar",
  "endpoints": { "http": "http://127.0.0.1:54321", ... },
  "duration_ms": 185000,
  "build": {
    "cache_key": "8f4e2a3c1234-bd4e2a1ab",
    "source_revision": "8f4e2a3c1234567890abcdef…",
    "dirty": false,
    "artifact_path": "/Users/me/.trond/builds/out/8f4e2a3c1234-bd4e2a1ab.jar",
    "sha256": "abc123…",
    "builder_image": "eclipse-temurin:8-jdk-jammy@sha256:…",
    "platform": "linux/amd64",
    "jdk_version": "8",
    "cache_hit": false,
    "duration_ms": 180000
  }
}
```

Repeat the same command → cache hit + `result: "no_change"` + `build.cache_hit: true`. Edit one `.java` file → fresh cache key (patch hash includes untracked files, FR-002) → `result: "updated"` + new artifact path.

What changed (file-by-file)

intent

  • `internal/intent/schema.go` — `BuildSpec` struct on `NodeSpec`. Source / Revision / JDK / Artifact / ImageTag / Builder / GradleTask / GradleArgs / BuilderImageOverride / Env / Platform fields, with validator tags.
  • `internal/intent/loader.go` — `Validate` enforces:
    • Build / Image / Jar mutex (FR-005)
    • `build.image_tag` required when `artifact: image`
    • Runtime/artifact compatibility (docker+jar, jar+image, docker+image all rejected; Phase 3 lifts docker+image)
  • `internal/intent/defaults.go` — new `DefaultRuntime`, `DefaultPlatform`, `DefaultJDKForPlatform` helpers. The single source of truth for the three rules; `internal/build` calls these directly (no duplicate logic).
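As a rough illustration of the `Validate` rules listed above, the sketch below checks the build/image/jar mutex, the `image_tag` requirement, and the Phase 2 runtime/artifact rule. The `nodeSpec` and `buildSpec` types and the `validateNode` function are hypothetical, trimmed-down stand-ins, and the sketch assumes defaults have already been applied.

```go
package main

import (
	"errors"
	"fmt"
)

// buildSpec and nodeSpec are hypothetical stand-ins for the intent types.
type buildSpec struct {
	Source   string
	Artifact string // "jar" or "image"
	ImageTag string
}

type nodeSpec struct {
	Image string
	Jar   string
	Build *buildSpec
}

// validateNode applies the three rules to one node. It assumes runtime and
// build.artifact were already defaulted.
func validateNode(runtime string, n nodeSpec) error {
	sources := 0
	if n.Build != nil {
		sources++
	}
	if n.Image != "" {
		sources++
	}
	if n.Jar != "" {
		sources++
	}
	if sources > 1 {
		return errors.New("build, image, and jar are mutually exclusive")
	}
	if n.Build == nil {
		return nil
	}
	if n.Build.Artifact == "image" && n.Build.ImageTag == "" {
		return errors.New("build.image_tag is required when artifact is image")
	}
	// Phase 2 only wires runtime=jar with artifact=jar; docker+jar,
	// jar+image, and docker+image are all rejected.
	if runtime != "jar" || n.Build.Artifact != "jar" {
		return fmt.Errorf("runtime %q does not support build.artifact %q yet",
			runtime, n.Build.Artifact)
	}
	return nil
}

func main() {
	ok := nodeSpec{Build: &buildSpec{Source: "../java-tron", Artifact: "jar"}}
	fmt.Println(validateNode("jar", ok))
	fmt.Println(validateNode("docker", ok))
}
```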

state

  • `internal/state/types.go` — `ManagedNode.BuildCacheKey` for FR-018 prune cross-ref (Phase 5).

apply

  • `internal/apply/apply.go` — new `resolveBuild` step between validate and render; `Result.Build` populated; idempotency gate now considers BuildCacheKey so dirty source rebuilds trigger; `validateOptions` defense-in-depth mutex + runtime/artifact compat.
  • `internal/apply/build.go` — owns the build resolution + FR-021 intent-relative source path.
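The FR-021 intent-relative source path rule might look roughly like this. It is a sketch under assumptions: `resolveSource` is a hypothetical name, and leaving a relative path untouched (so it resolves against the current working directory) when no intent path is known is inferred from the "CLI=CWD, intent=intent-file dir" note in the commit message.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveSource resolves build.source. Absolute paths are kept, paths from an
// intent file are resolved against that file's directory, and paths without
// an intent file stay relative to the current working directory.
func resolveSource(source, intentPath string) string {
	if filepath.IsAbs(source) {
		return filepath.Clean(source)
	}
	if intentPath == "" {
		return filepath.Clean(source)
	}
	return filepath.Join(filepath.Dir(intentPath), source)
}

func main() {
	fmt.Println(resolveSource("../java-tron", "/work/trond/examples/dev-local.yaml"))
	fmt.Println(resolveSource("./java-tron", ""))
}
```

With this rule, `source: ../java-tron` in `examples/dev-local.yaml` points at a sibling of the `examples` directory regardless of where `trond apply` is invoked.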

build

  • `internal/build/builder.go` — `Request.Platform` field; `withDefaults` defers to `intent.DefaultPlatform/DefaultJDKForPlatform` (single source of truth); `matrixWarning` emits stderr warning when platform/JDK is off-matrix.
  • `internal/build/runner.go` — passes `docker run --platform <p>` when Platform is set.

  • `internal/build/key.go` — `CacheKey.Platform` participates in the cache-key fold; canonical default profile (no `-x` suffix) is jdk=8 + jar + shadowJar + linux/amd64.
  • `internal/build/manifest.go`, `builder.go::Result`, `apply.BuildSummary` — Platform + JDKVersion fields throughout.
  • `internal/build/testing.go` — `SetTestRunner` seam (added in Phase 1) reused here.

cobra

  • `cmd/apply.go` — threads `--intent` path into `apply.Options.IntentPath` for FR-021; new `wrapApplyError` helper passes through `*output.StructuredError` so build error codes (BUILD_FAILED, INVALID_SOURCE, BUILD_CANCELLED, …) reach the user unchanged.
  • `cmd/build.go` — new `--platform` flag.

schemas

  • `schemas/intent.schema.json` — full `BuildSpec` definition with all fields.
  • `schemas/output/apply.schema.json` — `build` block on apply result + cache_key regex matching the 12-8-8 layout.
  • `schemas/output/build.schema.json` — Platform + JDK version enums.
  • SchemaVersion stays 1.3.0 (additive — Phase 1 already minor-bumped).

docs / examples

  • `examples/dev-local.yaml` — minimal intent (only `source` is required); a comment enumerates every default it relies on.

Test coverage (new in Phase 2)

20+ tests added across:

  • intent: mutex (build+image, build+jar), image_tag required when artifact=image, BuildOnly happy path, ApplyDefaults fills defaults, Image default suppressed when Build set, BuildDefaultsRuntimeToJar, NoBuildKeepsDockerDefault, BuildRuntimeArtifactMismatch, BuildJDKDefaultsByPlatform, BuildPlatformDefaultsToHostArch, BuildInvalidJDK.
  • apply: resolveBuildSource (3 cases), BuildSummary populated, no-build-block leaves summary nil, BuildCacheKey persists to state, full-flow Apply + state persistence, DirtySourceTriggersRebuild (the dev-loop regression guard), CleanCacheHitNoChange (the matched pair), RejectsBuildAndImageMutex, RuntimeMustMatchArtifact.
  • build: CacheKey naming shape, dirty suffix, builder-digest invalidation, gradle-args invalidation, PlatformChangesKey, PlatformLinuxAmd64IsCanonical, override determinism.
  • matrix: in-matrix (no warn) + out-of-matrix (warn names platform + expected jdk) for amd64+8, arm64+17, amd64+17, arm64+8, amd64+11, arm64+21.
  • cobra: wrapApplyError pass-through (5 error codes) + generic wrap + nil; BuildCmd_PlatformFlagThreadsThrough (3 sub-cases: amd64, arm64, empty).

`go test ./...` is green across the entire repo. `make lint` clean.

Self-review history

This PR went through 5 rounds of self-review before opening — 17→12→10→5→6→5→5→6→6 = ~95 incremental fixes covering real bugs (dirty source no-op, runtime/artifact silent mismatch, validateOptions error_code wrapping, runtime default conflict with build, etc.), test gaps, and consistency cleanups. The branch history shows them as separate fix commits if the reviewer wants the chronology.

What's NOT in this PR (per spec/002)

  • `--artifact image` end-to-end (Phase 3) — `intent.Validate` rejects with a clear "Phase 3 work" message.
  • SSH target with `build:` (Phase 4) — current support is `target: local` only.
  • `--builder host` body (Phase 5) — flag accepted; runtime returns NOT_IMPLEMENTED.
  • `trond build list / inspect / prune` (Phase 5).
  • MCP tools (`build`, `build_*`) (Phase 5).
  • Real builder image digests — `internal/build/pins/builder_image_digests.json` ships PLACEHOLDER `sha256:0000...`; `make refresh-builder-pins` (in this PR) resolves real digests at release-prep time.

Test plan

  • `go test ./...` green on darwin/arm64
  • `go vet -tags=integration ./internal/build/...` compiles
  • `make lint` clean
  • `./bin/trond apply --help` shows existing flags (no breaking changes)
  • `./bin/trond apply --intent examples/dev-local.yaml -o json` returns VALIDATION_ERROR with clear message when build.source is missing
  • CI: run integration test against real Eclipse Temurin pull (gate behind workflow flag)
  • Manual: `make refresh-builder-pins` writes real sha256 digests + a real apply --intent runs end-to-end

barbatos2011 and others added 8 commits May 17, 2026 22:27
Closes the dev inner loop: `trond apply --intent dev.yaml` now
resolves a `build:` block, runs the build pipeline (cache-hit-fast),
and points the rendered systemd unit at the produced JAR — all in
one command. Edit java-tron source → `trond apply` → running node.

Spec: specs/002-trond-build-pipeline/spec.md (Phase 2 of 6).

Wire-up:
- intent/schema.go: new BuildSpec type + `Build *BuildSpec` field on
  NodeSpec. Source, Revision, JDK, Artifact, ImageTag, Builder,
  GradleTask, GradleArgs, BuilderImageOverride, Env. Tag-level
  enum validation for JDK + Artifact + Builder.
- intent/loader.go: new mutual-exclusion check — a node carries
  exactly one artifact source (build OR image OR jar). artifact=image
  requires image_tag.
- state/types.go: ManagedNode.BuildCacheKey field (additive, MINOR).
  Phase 5 `trond build prune` will refuse deletion of artifacts a
  running node points at (FR-018).
- apply/apply.go: new resolveBuild() invoked between intent
  validation and render. Result.Build (BuildSummary) populated when a
  build ran. Renders systemd ExecStart against the built JAR (jar
  runtime). Phase 3 wires the image artifact path.
- apply/build.go: new file owning the build resolution helpers,
  including FR-021's intent-relative source path resolution
  (CLI=CWD, intent=intent-file dir).
- cmd/apply.go: threads IntentPath into apply.Options so the helper
  can resolve relative `build.source` correctly. `build` block
  surfaces in the apply -o json envelope.
- build/testing.go: SetTestRunner() exported seam so other packages'
  tests can substitute the docker runner without importing _test
  files. Used by internal/apply/build_test.go.

Schema:
- schemas/intent.schema.json: new BuildSpec definition + per-node
  `build` ref (additive — no existing field changes).
- schemas/output/apply.schema.json: new optional `build` block on
  the apply result (additive).
- internal/schema/files/: mirrored.
- internal/schema/version_baseline.json: regenerated. SchemaVersion
  stays 1.3.0 (Phase 1 already bumped to MINOR; this is more MINOR).

Tests (10 new):
- intent/build_test.go: build+image rejected; build+jar rejected;
  artifact=image without image_tag rejected; build-only happy path;
  invalid JDK rejected.
- apply/build_test.go: resolveBuildSource (absolute / intent-relative
  / no-intent-path); BuildSummary populated end-to-end with a stub
  runner; no build block leaves summary nil; BuildCacheKey
  round-trips through state.

examples/dev-local.yaml: working sample intent showing the dev loop.

NOT in Phase 2 (per plan):
- `--artifact image` runtime hookup → Phase 3 (resolveBuild already
  returns builtImageTag but the docker runtime ignores it for now).
- SSH target's scp transfer → Phase 4.
- Preflight integration checks (FR-017) → Phase 5.
- MCP tool surface (`build` etc.) → Phase 5.
🔴 Real bug — dirty source no-op:
  apply.Apply's idempotency gate was "existing.IntentHash ==
  opts.IntentHash" only. For intents with a `build:` block, that
  ignores the case where the user edits java-tron source but doesn't
  touch intent.yaml — the build pipeline would correctly produce a
  new cache key, but Apply short-circuited before reaching it,
  silently no-op-ing the dev loop.

  Fix: split the gate. No-build intents keep the legacy fast path.
  Build intents resolve the build first (cache hit < 200ms) and
  short-circuit ONLY when BOTH IntentHash AND BuildCacheKey match
  the existing state. The dev-loop test
  TestApply_DirtySourceTriggersRebuild pins the regression.
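  The split gate can be sketched as a small predicate. This is an illustration only: `nodeState` and `isNoChange` are hypothetical names, and the sketch assumes the build has already been resolved so that its cache key is available.

```go
package main

import "fmt"

// nodeState is a hypothetical stand-in for the persisted node state.
type nodeState struct {
	IntentHash    string
	BuildCacheKey string
}

// isNoChange reports whether Apply may short-circuit. Intents without a
// build block compare only the intent hash; build intents must also match
// the cache key of the freshly resolved build.
func isNoChange(existing *nodeState, intentHash string, hasBuild bool, cacheKey string) bool {
	if existing == nil || existing.IntentHash != intentHash {
		return false
	}
	if !hasBuild {
		return true
	}
	return existing.BuildCacheKey == cacheKey
}

func main() {
	prev := &nodeState{IntentHash: "h1", BuildCacheKey: "k1"}
	fmt.Println(isNoChange(prev, "h1", true, "k1")) // true
	fmt.Println(isNoChange(prev, "h1", true, "k2")) // false
}
```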

🟡 Defense in apply.validateOptions:
  Added the build/image/jar mutex check at apply level so callers
  that bypass intent.Validate (recipe, MCP, programmatic) still get
  a clear error instead of UB. Mirrors intent/loader.go's check.
  Test: TestApply_RejectsBuildAndImageMutex.

🟡 Example dev-local.yaml ports section:
  Removed the conflicting `ports: {http: 0, grpc: 0}` block —
  target.auto_ports already handles port assignment. Added a comment
  documenting why.

🟡 Full Apply() integration test:
  TestApply_FullFlow_RecordsBuildCacheKey exercises the full
  Apply() path with a build:-bearing intent through validate →
  resolveBuild → render → jar deploy → state persistence. Asserts
  the BuildCacheKey lands in ManagedNode for FR-018 prune.

🟢 BuildSpec defaults in ApplyDefaults:
  ApplyDefaults now fills BuildSpec.Revision/JDK/Artifact/Builder/
  GradleTask at intent-load time, matching build.Request.withDefaults.
  Two benefits: `config validate --explain` shows the effective
  values; downstream consumers don't re-derive. Also skips the
  Image default when Build is present (would otherwise violate the
  mutex post-defaults — quiet but real bug).
🔴 Real bugs:
- validateOptions now rejects runtime/artifact mismatches: runtime=docker
  + build.artifact=jar would silently render `image: ""` in compose
  because the Image default is suppressed when Build is set. Until
  Phase 3 wires artifact=image into the docker runtime, this combo is
  a clear VALIDATION_ERROR rather than a confusing compose failure.
- cmd/apply.go now passes through *output.StructuredError from
  internal/apply unchanged instead of wrapping in DEPLOY_ERROR. Build
  pipeline errors (BUILD_FAILED, INVALID_SOURCE, BUILD_CANCELLED,
  INVALID_ARTIFACT, BUILDER_IMAGE_UNAVAILABLE) now reach agents with
  the right error_code.

🟡 Test coverage:
- TestApply_CleanCacheHitNoChange: second Apply with same source +
  same intent → no_change + Build.CacheHit=true. Pins the OTHER half
  of the new idempotency gate.
- TestParse_BuildAppliesDefaults: parsing a minimal `build: { source }`
  YAML fills Revision/JDK/Artifact/Builder/GradleTask defaults.
- TestParse_BuildSuppressesImageDefault: ApplyDefaults skips the
  legacy Image default when Build is set (the silent post-defaults
  mutex bug found in self-review pass 1).
- TestApply_RuntimeMustMatchArtifact: subtest table for the new
  validateOptions runtime/artifact compatibility check.

🟢 Polish:
- noChangeResult: preconditions documented (caller must check
  opts.Existing != nil; defense via comment, not extra branch).
- TestApply_DirtySourceTriggersRebuild: assertion tightened from
  `Outcome != "no_change"` to `Outcome == "updated"` so a future
  bug that drops Outcome entirely also trips this guard.
🔴 Real bugs:
- intent/defaults.go: when any node carries `build:` AND target.runtime
  is unset, default it to "jar" (the Phase 2 wired path) instead of
  docker. Without this, the most natural dev intent — `target: {type:
  local}, nodes: [{build: {source: ...}}]` — would silently default
  to docker runtime + jar artifact and be rejected by the new
  mutex check. New helper anyNodeHasBuild() centralizes the test.
- apply/apply.go: validateOptions now returns *output.StructuredError
  with error_code=VALIDATION_ERROR + exit_code=2 instead of plain
  fmt.Errorf, which the cobra layer would have wrapped as
  DEPLOY_ERROR + exit_code=1. Agents now see the right error_code
  for bad-intent-shape cases.

🟡 Should fix:
- intent/loader.go: runtime/artifact mutex moved from apply.Apply to
  intent.Validate so `trond config validate intent.yaml` catches it
  before any deploy attempt. apply.validateOptions keeps the check as
  defense-in-depth for programmatic callers.
- cmd/apply.go: extracted wrapApplyError() helper for the
  StructuredError pass-through logic. Now unit-testable independently
  from a full cobra-apply path; cmd/apply_wrap_test.go covers
  pass-through (5 error codes), generic-error wrapping, and nil.

🟢 Polish:
- apply.BuildSummary gains BuilderImage field + apply.schema.json
  records it. Agents see which JDK builder digest produced an
  artifact inline with `trond apply -o json` instead of having to
  shell out to `trond build inspect`.

intent build_test.go adds:
- TestParse_BuildDefaultsRuntimeToJar (regression for #A)
- TestParse_NoBuildKeepsDockerDefault (the unchanged path)
- TestParse_BuildRuntimeArtifactMismatch (Validate-time check from #C)

cmd apply_wrap_test.go adds:
- TestWrapApplyError_PassesThroughStructuredError (5 codes table)
- TestWrapApplyError_WrapsGenericError
- TestWrapApplyError_NilPassthrough
🟡 Soft bug — A:
  apply.Apply's runtime fallback ("" → "docker") was inconsistent
  with intent.ApplyDefaults's new rule ("" → "jar" when build is
  present). cobra path passed through ApplyDefaults first so it never
  hit; programmatic callers (recipe, MCP, direct construction) would
  silently render docker compose instead of systemd. Apply now defers
  to intent.DefaultRuntime — same rule, one source of truth.

🟡 Consistency — B:
  Extracted intent.DefaultRuntime() as the shared decision rule.
  ApplyDefaults, intent.Validate (mutex check), and apply.Apply
  (fallback) all consult this helper. Without consolidation, a
  future change to the default rule would have to be made in three
  places and inevitably drift.
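  A shared decision rule of this kind might look like the sketch below. The `node` type and the function signatures are hypothetical; only the rule itself (explicit runtime wins, otherwise "jar" when any node has a build block, otherwise "docker") is taken from the description above.

```go
package main

import "fmt"

// node is a hypothetical stand-in; HasBuild marks a node with a build block.
type node struct {
	HasBuild bool
}

// anyNodeHasBuild reports whether at least one node carries a build block.
func anyNodeHasBuild(nodes []node) bool {
	for _, n := range nodes {
		if n.HasBuild {
			return true
		}
	}
	return false
}

// defaultRuntime keeps an explicit runtime and otherwise infers one.
func defaultRuntime(explicit string, nodes []node) string {
	if explicit != "" {
		return explicit
	}
	if anyNodeHasBuild(nodes) {
		return "jar"
	}
	return "docker"
}

func main() {
	fmt.Println(defaultRuntime("", []node{{HasBuild: true}})) // jar
	fmt.Println(defaultRuntime("", []node{{}}))               // docker
}
```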

🟡 Example — C:
  examples/dev-local.yaml no longer sets `runtime: jar` explicitly.
  A comment explains the inferred default, making the user-friendly
  Phase 2 behavior visible in the canonical example instead of
  hiding it behind a redundant explicit setting.

🟢 Polish — D:
  Fixed a misleading comment in intent/loader.go::Validate that
  referenced a non-existent `applyTargetDefaults` function (the
  logic is inline in ApplyDefaults). Re-pointed at DefaultRuntime.

🟢 Polish — E:
  intent/loader.go::Validate now rejects `runtime: docker +
  build.artifact: image` at validate time with a Phase-3-pending
  message. Previously: validate passed, apply hit NOT_IMPLEMENTED
  in the build pipeline — user wasted a deploy attempt to learn
  what `trond config validate` could've told them immediately.
  Phase 3 will remove this branch when artifact=image lands in
  compose render.
trond now lets users build both halves of java-tron's published
compatibility matrix from one host:

  linux/amd64 (Intel/AMD)  → JDK 8  (only legacy-tested combo)
  linux/arm64 (Apple Silicon, Graviton, …) → JDK 17 (only mature
                                                     arm64 JIT)

Both BuildSpec.Platform and BuildSpec.JDK default in lockstep:
omit them and trond picks the host's arch + its required JDK
automatically. Override either to cross-build (docker uses QEMU
binfmt emulation, 3-5× slower but functional).

Changes:

intent/schema.go: added BuildSpec.Platform (validated to
  linux/amd64 | linux/arm64).
intent/defaults.go: new helpers DefaultPlatform() +
  DefaultJDKForPlatform(), wired through applyNodeDefaults so
  Platform defaults first and JDK then defaults based on the
  resolved (possibly-user-overridden) platform.
build/key.go: CacheKey.Platform participates in the cache key.
  Canonical default profile is now jdk=8 + jar + shadowJar +
  linux/amd64 + no args (host-independent, so amd64 and arm64
  hosts agree on what "canonical" means).
build/builder.go: Request.Platform + withDefaults applies it
  before JDK derivation. Mirrors intent's order.
build/platform.go (new): defaultPlatformForHost() +
  defaultJDKForPlatform() — package-local copies of intent's
  helpers (avoids the build→intent import cycle).
build/runner.go: passes `--platform <p>` to docker run when
  Platform is set. Docker pulls the matching variant of the
  multi-arch eclipse-temurin pin automatically.
cmd/build.go: new `--platform` flag.
apply/build.go: threads BuildSpec.Platform into build.Request.
schemas/intent.schema.json: documents the platform field.

Tests:
- key_test: TestCacheKey_PlatformChangesKey,
  TestCacheKey_PlatformLinuxAmd64IsCanonical.
- intent build_test: TestParse_BuildJDKDefaultsByPlatform (amd64→8,
  arm64→17), TestParse_BuildPlatformDefaultsToHostArch (host arch),
  and TestParse_BuildAppliesDefaults updated to be host-independent.

examples/dev-local.yaml: drops explicit jdk:, lets the arch-aware
default surface in the example so readers see the new behavior.
🟡 Consistency:
- internal/build/platform.go deleted. internal/build now imports
  internal/intent and calls intent.DefaultPlatform() /
  intent.DefaultJDKForPlatform() directly. The previous duplicate
  helpers were a silent-drift trap — any future change to the
  arch-default rule would have to be made in two places.

🟡 API surface — BuildSummary:
- apply.BuildSummary gains Platform + JDKVersion fields. Agents
  reading `trond apply -o json` can answer 'is this the amd64+JDK8
  build or the arm64+JDK17 build?' inline instead of round-tripping
  through `trond build inspect`. Wire-up: build.Result + Manifest
  + resultFromManifest all carry Platform too; apply/build.go
  threads it into BuildSummary; apply.schema.json + build.schema.json
  document both fields.

🟡 Cache migration note:
- Adding Platform to the cache-key fold changes the on-disk naming
  for non-default (jdk != 8) entries. Old manifests under
  ~/.trond/builds/manifest/ become orphans after upgrade — Phase 5
  prune will collect them. Documented in commit message; runtime
  behaviour: cache miss → rebuild (cheap second time).

🟢 Polish — D: CLI flag wiring test
- New cmd/build_flags_test.go::TestBuildCmd_PlatformFlagThreadsThrough
  sets buildPlatform via package-level globals, invokes runBuild,
  asserts the Request reaches the test runner with the expected
  fields populated. Covers all three flag states (amd64, arm64,
  empty=host default).

🟢 Polish — E: matrix mismatch warning
- build.Run now emits a one-line stderr warning when
  (platform, jdk) is outside java-tron's published compat matrix
  (amd64+8 or arm64+17). Not a hard error — power users on
  java-tron forks may have valid out-of-matrix combos. The warning
  message names both the offending platform AND the expected jdk
  so the operator can fix or override knowingly.
- New internal/build/matrix_test.go covers in-matrix (no warn)
  and out-of-matrix (warn names platform + expected jdk).
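
  The warning logic could be sketched as a lookup against the two-entry matrix. The function signature, the message wording, and the choice to stay silent for platforms outside the map are assumptions of this sketch; the PR only states that the warning names the platform and the expected JDK.

```go
package main

import (
	"fmt"
	"os"
)

// compatMatrix holds the platform to JDK pairs from the compat matrix.
var compatMatrix = map[string]string{
	"linux/amd64": "8",
	"linux/arm64": "17",
}

// matrixWarning returns a one-line warning for off-matrix combinations and
// an empty string otherwise. Unknown platforms produce no warning here.
func matrixWarning(platform, jdk string) string {
	expected, known := compatMatrix[platform]
	if !known || jdk == expected {
		return ""
	}
	return fmt.Sprintf(
		"warning: %s with JDK %s is outside java-tron's compat matrix (expected JDK %s)",
		platform, jdk, expected)
}

func main() {
	combos := [][2]string{{"linux/amd64", "8"}, {"linux/arm64", "8"}}
	for _, c := range combos {
		// A warning, not an error: the build still proceeds.
		if w := matrixWarning(c[0], c[1]); w != "" {
			fmt.Fprintln(os.Stderr, w)
		}
	}
}
```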

🟢 Polish — F: example simplification
- examples/dev-local.yaml drops explicit `artifact: jar` (now
  default-fed by ApplyDefaults). The comment now enumerates every
  default the example relies on, making the minimal-write
  capability obvious.
@kuny0707 kuny0707 merged commit 469a18c into tronprotocol:develop May 18, 2026
12 checks passed