feat(cli): af install/run for agent nodes - encrypted secrets, env prompting, node-to-node deps #692
Open
AbirAbbas wants to merge 7 commits into
…n for agent nodes
Adds the foundation for making 'af install'/'af run' usable for real agent
nodes (which start via 'python -m pkg.app' and have no top-level main.py):
- internal/packages/secrets.go: encrypted at-rest secret store. KeyfileProvider
keeps a random 32-byte key at ~/.agentfield/keyring/master.key (0600);
SecretStore encrypts global.enc + <node>.enc via AES-256-GCM, with node scope
overriding global so shared keys (API tokens) are entered once.
- internal/packages/env_resolver.go: resolves declared env vars in order
process-env -> node store -> global store -> manifest default -> prompt
(hidden for type:secret), persisting prompted secrets encrypted. Injected only
into the child process; never written to disk in plaintext.
- installer.go: manifest gains entrypoint{start,healthcheck}, dependencies.nodes,
and per-var scope. Validation accepts entrypoint.start instead of requiring
main.py; package copy excludes .git/venv/.env/__pycache__.
- runner.go: launches via manifest entrypoint, exports AGENTFIELD_SERVER (the
var the SDK actually reads) alongside legacy AGENTFIELD_SERVER_URL, honors the
manifest healthcheck path, and resolves env via the secret store.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
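For reviewers less familiar with the primitive, here is a minimal sketch of AES-256-GCM sealing with a random 32-byte key, using only the Go standard library. The helper names (newMasterKey, seal, open) and the nonce-prefixed layout are assumptions for illustration; they are not taken from secrets.go, and the real on-disk format of global.enc may differ.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newMasterKey returns a random 32-byte key, the size AES-256 requires.
func newMasterKey() ([]byte, error) {
	key := make([]byte, 32)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	return key, nil
}

// newGCM builds an AES-GCM AEAD from the key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal encrypts plaintext and prepends the random nonce (hypothetical layout).
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open splits off the nonce, then decrypts and authenticates the rest.
func open(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, nil)
}

func main() {
	key, err := newMasterKey()
	if err != nil {
		panic(err)
	}
	sealed, err := seal(key, []byte("example-token"))
	if err != nil {
		panic(err)
	}
	plain, err := open(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain))
}
```

Because GCM is authenticated, a modified ciphertext fails in open rather than decrypting to garbage, which is a useful property for a store that is read back on every run.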
- af secrets set/ls/rm manages the encrypted store (hidden input for set, masked listing, global + --node scopes).
- install resolves dependencies.nodes recursively (af://registry/<name> -> github.com/Agent-Field/<name>, or git URLs), skipping already-installed nodes to break cycles.
- af run brings up a node's installed node-dependencies first, in dependency order, with cycle protection.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
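The reference mapping and the skip-already-installed rule can be sketched as follows. This is an illustration under assumptions, not the PR's code: resolveRef and installAll are hypothetical names, non-registry references are assumed to pass through unchanged as git URLs, and the dependency graph is an in-memory map standing in for parsed manifests.

```go
package main

import (
	"fmt"
	"strings"
)

const registryPrefix = "af://registry/"

// resolveRef maps af://registry/<name> to github.com/Agent-Field/<name>.
// Anything else is assumed to be a git URL and is returned unchanged.
func resolveRef(ref string) (string, error) {
	if strings.HasPrefix(ref, registryPrefix) {
		name := strings.TrimPrefix(ref, registryPrefix)
		if name == "" || strings.Contains(name, "/") {
			return "", fmt.Errorf("invalid registry reference %q", ref)
		}
		return "github.com/Agent-Field/" + name, nil
	}
	return ref, nil
}

// installAll visits dependencies depth-first. A node is marked installed
// before its dependencies are visited, so a cycle ends at the first repeat.
func installAll(name string, deps map[string][]string, installed map[string]bool, order *[]string) {
	if installed[name] {
		return // already installed: skip, which also breaks cycles
	}
	installed[name] = true
	for _, d := range deps[name] {
		installAll(d, deps, installed, order)
	}
	*order = append(*order, name)
}

func main() {
	src, err := resolveRef("af://registry/swe-planner")
	if err != nil {
		panic(err)
	}
	fmt.Println(src) // github.com/Agent-Field/swe-planner

	// Two nodes that depend on each other still terminate.
	deps := map[string][]string{"a": {"b"}, "b": {"a"}}
	var order []string
	installAll("a", deps, map[string]bool{}, &order)
	fmt.Println(order) // [b a]
}
```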
- docs/installing-agent-nodes.md: full guide to af install/run, the agentfield-package.yaml manifest (entrypoint, node deps, user_environment), the encrypted runtime-only secrets model, and af secrets.
- cli-toolkit.md reference: document af install, af run, af secrets (+ embedded skill_data copy synced).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Local end-to-end verification revealed the CLI's install/run path goes through internal/core/services (DefaultPackageService/DefaultAgentService), which duplicated (and so bypassed) the fixes previously made in internal/packages. 'af install' on an entrypoint-only node still failed with 'main.py not found', and 'af run' still exported only AGENTFIELD_SERVER_URL and loaded plaintext .env.
- package_service: validate/parse/copy now delegate to the shared packages.ValidatePackage / ParsePackageMetadata / ShouldSkipCopy (entrypoint accepted, junk excluded). Install guidance points at 'af secrets set'.
- agent_service: buildProcessConfig launches via the manifest entrypoint, exports AGENTFIELD_SERVER, resolves env via the encrypted secret store (prompting for missing required), honors the manifest healthcheck path, and drops the plaintext .env loader. RunAgent starts node deps first with a threaded cycle guard.
- packages: export ValidatePackage + ShouldSkipCopy as the single source of truth.
- tests updated to the new contract (entrypoint validation, store-based env injection instead of .env).
Verified end-to-end: install entrypoint-only node -> missing-secret errors cleanly -> af secrets set -> af run injects AGENTFIELD_SERVER + the stored secret + manifest default into the process (confirmed via the node's env dump).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Local multi-agent verification showed a port collision: dependencies were started after the parent allocated its port, so the parent's port (not yet bound) was handed out again to a dependency, which then failed to bind. Move dependency startup ahead of port allocation so each dependency fully binds its own port first.
Verified end-to-end against a live local control plane: 'af run greeter-node' auto-starts its dependency echo-node (distinct ports 8002/8003), both register, both reasoners execute through the control plane, and an already-running dependency is left untouched (same PID).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
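The ordering bug is easy to reproduce in miniature. The sketch below is a simulation under assumptions, not the runner's code: firstFree is a hypothetical allocator that returns the lowest port not yet bound, and "binding" is just a map entry. It shows why choosing the parent's port before the dependency has bound its own hands out the same port twice.

```go
package main

import "fmt"

// firstFree returns the lowest port >= start that is not currently bound.
func firstFree(start int, bound map[int]bool) int {
	for p := start; ; p++ {
		if !bound[p] {
			return p
		}
	}
}

// runParentFirst reproduces the bug: the parent picks a port but has not
// bound it yet when the dependency asks the allocator for one.
func runParentFirst(start int) (parent, dep int) {
	bound := map[int]bool{}
	parent = firstFree(start, bound) // chosen, not yet bound
	dep = firstFree(start, bound)    // same port handed out again
	bound[dep] = true
	return parent, dep
}

// runDepsFirst is the fixed order: the dependency allocates and binds
// before the parent asks for a port.
func runDepsFirst(start int) (parent, dep int) {
	bound := map[int]bool{}
	dep = firstFree(start, bound)
	bound[dep] = true
	parent = firstFree(start, bound)
	bound[parent] = true
	return parent, dep
}

func main() {
	p, d := runParentFirst(8002)
	fmt.Println("parent first:", p, d) // parent first: 8002 8002
	p, d = runDepsFirst(8002)
	fmt.Println("deps first:", p, d) // deps first: 8003 8002
}
```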
Unit tests for resolveNodeRef, installedNames, installNodeDependencies (skip-already-installed), and startNodeDependencies (not-installed warning + already-running skip) in both the service and packages layers, covering the new patch lines and pinning the behaviors verified end-to-end.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
📊 Coverage gate
Thresholds from
✅ Gate passed
No surface regressed past the allowed threshold and the aggregate stayed above the floor.
Contributor
📐 Patch coverage gate
Threshold: 80% on lines this PR touches vs
❌ Patch gate failed
This was referenced Jun 26, 2026
End-to-end install testing against the published node repos surfaced two gaps:
1. The git and GitHub install paths (git.go/github.go findPackageRoot) were a third and fourth copy of the 'main.py required' check, so 'af install <github-url>' failed for entrypoint-only nodes (no top-level main.py) such as SWE-AF and cloudsecurity-af. Both now delegate to the shared ValidatePackage (accepts a manifest entrypoint.start).
2. Dependency install only ran for requirements.txt projects, so pyproject-only nodes (pr-af, sec-af, cloudsecurity-af) installed with no venv and no deps. Dependency install is now a single shared InstallPythonDependencies that also runs 'pip install .' for pyproject.toml/setup.py projects.
Verified: all five published node repos now install from their GitHub URLs; a pyproject node (sec-af) builds its venv and 'pip install .' succeeds, with sec_af + agentfield importable from the node's venv. (Nodes that declare requires-python >=3.11 need a matching interpreter on PATH; pip reports this clearly.)
Tests updated for the new validation contract; new unit tests cover the pyproject branch and entrypoint-accepting findPackageRoot.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Member
@AbirAbbas before this goes in, can you test it with our swe/pr-af etc.?
Summary
The af install / af run agent-node scaffolding has existed since the very first commit but was effectively unusable for real nodes: it required a top-level main.py (real nodes start via 'python -m pkg.app'), hardcoded 'python main.py' at launch, exported AGENTFIELD_SERVER_URL while the SDK reads AGENTFIELD_SERVER, and stored secrets as plaintext .env. This PR makes the flow actually work end-to-end and adds the pieces needed for day-to-day use.
What's new
- agentfield-package.yaml manifest with an entrypoint.start (e.g. 'python -m pr_af.app'); no main.py required. The runner launches via the manifest entrypoint, honors the manifest healthcheck path, and exports AGENTFIELD_SERVER (+ legacy AGENTFIELD_SERVER_URL).
- Secrets are stored encrypted under ~/.agentfield/secrets/ with a random 32-byte key in ~/.agentfield/keyring/master.key (0600). They are decrypted only into the child process's environment at start time and are never written back to disk in plaintext. Global scope is shared across nodes; node scope overrides it.
- On af run, required variables resolve in order: process env → node store → global store → manifest default → prompt (hidden for type: secret), persisting prompted secrets encrypted. Missing required vars in a non-interactive session produce a clean error instead of hanging.
- af secrets command: set / ls (values masked) / rm, with --node scoping.
- Node-to-node dependencies via dependencies.nodes (e.g. af://registry/swe-planner → github.com/Agent-Field/<name>, or a git URL). af install pulls them in recursively (skipping already-installed, which breaks cycles); af run starts a node's dependencies first, in order, before allocating its own port, and leaves already-running dependencies untouched.
Notable fixes uncovered while verifying locally
- The CLI's install/run path goes through internal/core/services, a duplicate of the internal/packages logic; fixes are now applied there (the two layers share one ValidatePackage / ParsePackageMetadata / ShouldSkipCopy).
Verification
Verified end-to-end against a live local control plane with two no-LLM nodes built on the real SDK:
- af install an entrypoint-only node (no main.py) → registers → af call node.reasoner returns a real result.
- After af secrets set, af run injects AGENTFIELD_SERVER + the stored secret + manifest defaults into the process (confirmed via the node's own env dump; no plaintext on disk, files 0600).
- af run greeter-node auto-starts its dependency echo-node first (distinct ports), both register and execute; an already-running dependency is not restarted.
Docs
- docs/installing-agent-nodes.md: full guide to install/run, the manifest schema, the encrypted secrets model, and af secrets.
- cli-toolkit.md reference updated (+ embedded skill copy synced).
Test plan
- go build ./... clean
- go test ./... for control-plane green (47 packages)
Follow-ups (not in this PR)
- agentfield-package.yaml manifests in the public node repos (SWE-AF, pr-af, sec-af, cloudsecurity-af, af-template).
- af dev (a third copy of the launch logic) still hardcodes main.py; collapsing the duplicated install/run implementations into one is a good follow-up.
🤖 Generated with Claude Code