feat(eval): add --serve to auto-launch the EvalServer #20
Open
kunalkushwaha wants to merge 1 commit into
Removes the two-terminal dance from `agk eval`. With `--serve`, AGK builds and launches the project in EvalServer mode (AGK_EVAL_MODE=true), waits for it to become healthy, runs the tests, and tears it down, all in one command.

- startEvalServer runs `go run .` (or a custom --serve-cmd) in its own process group so the compiled child is reliably killed on teardown (SIGTERM→SIGKILL).
- waitForHealthy polls the test file's target.url /health until ready or timeout.
- Server stdout/stderr is captured and printed if startup fails (and streamed with a [server] prefix under --verbose).
- Lifecycle is signal-safe and torn down before the os.Exit on test failure.
- Flags: --serve, --serve-dir, --serve-cmd, --serve-timeout.
- Docs: EVAL.md "Run Tests" now leads with the one-command flow.

Tests: parseServeCmd + waitForHealthy (httptest, healthy and timeout paths). Verified end-to-end against a stub EvalServer (launch → run → clean teardown, no lingering process).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
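For readers unfamiliar with the process-group teardown mentioned in the commit message, here is a minimal Unix-only sketch of the general technique. It is not the code from this PR: `startInGroup` and `stopGroup` are hypothetical names, the grace period is an arbitrary choice, and `sleep 60` stands in for the `go run .` child.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// startInGroup (hypothetical helper) starts the command in a new process
// group, so a signal sent to the group also reaches children it spawns,
// such as the binary that `go run` compiles and executes.
func startInGroup(name string, args ...string) (*exec.Cmd, error) {
	cmd := exec.Command(name, args...)
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true} // Unix only
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// stopGroup (hypothetical helper) sends SIGTERM to the whole group, waits
// up to grace for the direct child to exit, then escalates to SIGKILL.
// It only waits on the direct child; a fuller version might also sweep
// the group once more after the leader has exited.
func stopGroup(cmd *exec.Cmd, grace time.Duration) error {
	// With Setpgid set and Pgid left at 0, the group ID equals the child's PID.
	pgid := cmd.Process.Pid
	// A negative PID addresses the process group rather than one process.
	if err := syscall.Kill(-pgid, syscall.SIGTERM); err != nil && !errors.Is(err, syscall.ESRCH) {
		return err
	}
	done := make(chan struct{})
	go func() {
		_ = cmd.Wait() // reaps the child; an exit-by-signal error is expected
		close(done)
	}()
	select {
	case <-done:
	case <-time.After(grace):
		_ = syscall.Kill(-pgid, syscall.SIGKILL)
		<-done
	}
	return nil
}

func main() {
	cmd, err := startInGroup("sleep", "60")
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	if err := stopGroup(cmd, 2*time.Second); err != nil {
		fmt.Println("stop failed:", err)
		return
	}
	fmt.Println("child reaped:", cmd.ProcessState != nil)
}
```

Signalling the group rather than the single PID matters here because `go run` is only a parent of the real server binary; killing the parent alone could leave the compiled child running.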
What

Removes the two-terminal dance from `agk eval`. Today you must start your project in EvalServer mode in one terminal, then run `agk eval` in another. With `--serve`, AGK builds and launches the project (setting `AGK_EVAL_MODE=true`), waits for it to become healthy, runs the tests, and tears it down, all in one command.

This is feature B2 (auto-serve) from the `FEATURES.md` roadmap, the biggest eval-DX win.

Before / after

How it works

- `startEvalServer` runs `go run .` (or a custom `--serve-cmd`) in its own process group, so `go run`'s compiled child is reliably killed on teardown (SIGTERM → SIGKILL after a grace period).
- `waitForHealthy` polls the test file's `target.url` `/health` until ready or `--serve-timeout`.
- Server stdout/stderr is captured and printed if startup fails (and streamed with a `[server]` prefix under `--verbose`).
- Lifecycle is signal-safe and torn down before the `os.Exit` on test failure.

Flags

| Flag | Default |
| --- | --- |
| `--serve` | |
| `--serve-dir` | `.` |
| `--serve-cmd` | `go run .` |
| `--serve-timeout` | `90` |

Testing

- `go build`, `go vet`, `go test ./...`, and `gofmt` all green.
- Tests cover `parseServeCmd`, and `waitForHealthy` against `httptest` (becomes-healthy and timeout paths).
- Verified end-to-end against a stub EvalServer: `agk eval --serve` launched it, waited for health, ran a `contains` test to a pass, and tore down cleanly. Confirmed no lingering process afterward (process-group kill works).
- `docs/EVAL.md` "Run Tests" now leads with the one-command flow.

🤖 Generated with Claude Code
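To illustrate the health-polling step described under "How it works", the sketch below shows one common way to poll a `/health` endpoint until it answers or a deadline passes, exercised against an `httptest` stub. It is an assumption-laden stand-in, not the PR's `waitForHealthy`: the name `pollHealth`, its signature, the polling interval, the helper `newStubServer`, and treating HTTP 200 as "healthy" are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// pollHealth (hypothetical) probes baseURL+"/health" every interval until it
// returns 200 OK (assumed to mean healthy) or timeout elapses.
func pollHealth(baseURL string, timeout, interval time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, baseURL+"/health", nil)
		if err != nil {
			return err
		}
		// Connection errors are expected while the server is still starting.
		if resp, err := http.DefaultClient.Do(req); err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("server at %s not healthy after %s", baseURL, timeout)
		case <-ticker.C:
		}
	}
}

// newStubServer (hypothetical) returns a test server that answers 503 until
// it has seen readyAfter probes, then answers 200.
func newStubServer(readyAfter int32) *httptest.Server {
	var probes int32
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(&probes, 1) < readyAfter {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := newStubServer(3) // healthy from the third probe on
	defer srv.Close()
	err := pollHealth(srv.URL, 2*time.Second, 50*time.Millisecond)
	fmt.Println("healthy:", err == nil)
}
```

Bounding the loop with a context deadline keeps a single slow probe from outliving the overall timeout, which is one reasonable way to honor a flag like `--serve-timeout`.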