Bound uplc-evaluator integration test teardown and run it nightly#7817
Open
Unisay wants to merge 4 commits into
Open
Bound uplc-evaluator integration test teardown and run it nightly#7817Unisay wants to merge 4 commits into
Unisay wants to merge 4 commits into
Conversation
Test teardown ran terminateProcess followed by an unbounded waitForProcess, so a spawned service that missed SIGTERM hung the run until the 2h CI cap (plutus-private#2257). Bound the wait and escalate to SIGKILL, add a per-test tasty timeout as a safety net, and a regression test that drives a SIGTERM-ignoring process.
The uplc-evaluator is a non-critical on-demand tool, so its integration tests need not gate every PR. The test executable is still built on every PR; the suite is run nightly instead. See plutus-private#2257.
See plutus-private#2257.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens the uplc-evaluator integration test suite against teardown hangs by bounding process shutdown, adds a per-test timeout safety net, and moves execution of the suite from the per-PR required gate to the nightly workflow (while still building the test executable on PRs).
Changes:
- Implement bounded service shutdown in the test harness (SIGTERM → bounded wait → SIGKILL escalation → bounded reap) and make cleanup more robust.
- Wrap the tasty tree with a default per-test timeout (only when the user hasn’t provided
--timeout) and add a regression test covering SIGTERM-ignoring processes. - Adjust Nix/Hydra outputs and the nightly GitHub Actions workflow so the integration tests run nightly instead of gating every PR.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
plutus-benchmark/uplc-evaluator/test/Spec.hs |
Adds default tasty timeout wrapper and a regression test validating bounded teardown behavior. |
plutus-benchmark/uplc-evaluator/test/Harness.hs |
Introduces stopProcessBounded and switches teardown cleanup to removePathForcibly. |
plutus-benchmark/plutus-benchmark.cabal |
Adds unix dependency for POSIX signaling used by bounded shutdown. |
nix/outputs.nix |
Removes the integration test check from required Hydra jobs while still exposing/building the test derivation. |
.github/workflows/nightly-testsuite.yml |
Runs uplc-evaluator integration tests in the nightly workflow. |
- plutus-benchmark/uplc-evaluator/test/Harness.hs:126 — re-check getProcessExitCode before SIGKILL escalation (#7817 (comment))
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
plutus-benchmark:test:uplc-evaluator-integration-testsintermittently consumed the full 2h Hydra wall clock and landed asTIMED_OUT, reddeningx86_64-linux.requiredon unrelated PRs (IntersectMBO/plutus-private#2257). The root cause is in test teardown:Harness.stopServicesent SIGTERM viaterminateProcessand then calledwaitForProcesswith no timeout, so any spawneduplc-evaluatorthat missed SIGTERM blocked teardown until the external CI cap. The suite starts 20+ services concurrently under tasty, so under CI load the odds of one wedging on shutdown are non-trivial.This PR applies the three measures from the issue:
stopProcessBoundedwaits a grace period after SIGTERM, then escalates to SIGKILL viagetPid+signalProcess, with the post-kill reap bounded too. A process that survives even SIGKILL is left for init to reap rather than blocking teardown. Cleanup now usesremovePathForcibly. This pulls inunix, which is safe because the suite is alreadybuildable: Falseon Windows viaos-support.adjustOption defaultTestTimeout(120s per test), applied only when no--timeoutis given so it stays overridable. On its own it would not have fixed the hang, because when tasty times out a test thebracketcleanup still runs the teardown wait. So it is a safety net rather than the actual fix.requiredgate. The uplc-evaluator is a non-critical on-demand tool, so its integration tests need not gate every PR. The test executable is still built on every PR (thepackagesjobs stay inrequired), so a compile break still reddens PRs; only the test run moves to the nightly testsuite workflow.A regression test drives a SIGTERM-ignoring process (
trap '' TERM; exec sleep 600) and asserts teardown returns in bounded time with the process actually dead.Verified locally: the regression test fails (times out) against the old unbounded teardown and passes in ~1.2s with the fix; the full suite is 29/29 both via
cabaland via the exact nightlynix shellinvocation;nix evalconfirms thechecksjobs are gone fromrequiredwhile thepackagesbuilds remain.Closes https://github.com/IntersectMBO/plutus-private/issues/2257