Pre-flight test run on main branch to detect broken test environments #64

@that-github-user

Problem

When a --test-cmd is provided, thinktank runs it on each agent's worktree to score results. But if the test suite is already failing on the main branch (broken environment, missing env var, DB not up, flaky tests), every agent is scored as "tests failed", including the one with the best change. The scores then carry no signal and the recommendation is unreliable.

There is currently no way to distinguish:

  • "Agent broke the tests" (agent is bad)
  • "Tests were already failing before any agent ran" (environment issue)

Proposed Solution

Before spinning up agents, run the test command once on the current branch (without any agent changes). If it fails, abort the run with a message like:

⚠  Pre-flight test check failed on main branch:
   Command: npm test
   Exit code: 1

   Your test suite is failing before any agent runs. This will make
   test-based scoring unreliable.

   Options:
     --skip-preflight   Skip this check and run anyway
     --no-test-cmd      Omit test scoring for this run

   Fix your test suite first, or use --skip-preflight to proceed anyway.

If pre-flight passes, add a line to the run output: "✓ Pre-flight: tests pass on main branch".
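
The check described above could be sketched roughly as follows. This is a minimal illustration in Python, not thinktank's actual code: `PreflightResult`, `run_preflight`, `format_preflight`, and the 300-second default timeout are all hypothetical names and values chosen for the example. Only the message wording comes from the proposal.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass
class PreflightResult:
    """Outcome of one pre-flight run (hypothetical structure)."""
    command: str
    exit_code: Optional[int]  # None if the command timed out

    @property
    def passed(self) -> bool:
        return self.exit_code == 0


def run_preflight(test_cmd: str, cwd: str = ".", timeout_s: float = 300.0) -> PreflightResult:
    """Run the test command once on the unmodified tree and capture its exit code."""
    try:
        proc = subprocess.run(
            test_cmd, shell=True, cwd=cwd, timeout=timeout_s,
            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return PreflightResult(test_cmd, None)
    return PreflightResult(test_cmd, proc.returncode)


def format_preflight(result: PreflightResult, branch: str = "main") -> str:
    """Render the pass line or the failure warning from the proposal."""
    if result.passed:
        return f"✓ Pre-flight: tests pass on {branch} branch"
    outcome = "Timed out" if result.exit_code is None else f"Exit code: {result.exit_code}"
    return "\n".join([
        f"⚠  Pre-flight test check failed on {branch} branch:",
        f"   Command: {result.command}",
        f"   {outcome}",
        "",
        "   Fix your test suite first, or use --skip-preflight to proceed anyway.",
    ])
```

A caller would run something like `print(format_preflight(run_preflight("npm test")))` before creating any agent worktrees, and stop if `passed` is false.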

Acceptance Criteria

  • When --test-cmd is provided, a pre-flight test run executes on the current working tree before agents start
  • If pre-flight fails, the run aborts with a clear error explaining the issue and suggesting --skip-preflight
  • --skip-preflight flag skips the check and proceeds (useful when tests are flaky or intentionally broken before the fix)
  • Pre-flight result is shown in run output
  • Pre-flight uses the same timeout and environment as agent test runs
  • Pre-flight failure is distinct from agent test failure in saved results JSON
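
One way the criteria above might fit together, sketched in Python under stated assumptions: the status strings, the `preflight`/`aborted`/`agents` keys, and both function names are invented for illustration and are not thinktank's real results schema. The point is only that a pre-flight failure is recorded at the run level, separately from any per-agent `tests_passed` value.

```python
import json
from typing import Optional


def preflight_status(test_cmd: Optional[str], skip_preflight: bool,
                     exit_code: Optional[int]) -> str:
    """Classify the pre-flight outcome (hypothetical status values)."""
    if test_cmd is None:
        return "not_applicable"  # no --test-cmd, so nothing to check
    if skip_preflight:
        return "skipped"         # --skip-preflight: proceed without checking
    return "passed" if exit_code == 0 else "failed"


def build_run_record(status: str, agents: list) -> dict:
    """Build the saved record; a failed pre-flight aborts before agents run."""
    aborted = status == "failed"
    return {
        "preflight": {"status": status},
        "aborted": aborted,
        # Agent-level test failures live here, never in "preflight".
        "agents": [] if aborted else agents,
    }


if __name__ == "__main__":
    status = preflight_status("npm test", skip_preflight=False, exit_code=1)
    print(json.dumps(build_run_record(status, agents=[]), indent=2))
```

With this shape, a reader of the JSON can tell "aborted because the baseline was red" apart from "agent 2 ran and its tests failed" without inspecting logs.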

Labels: enhancement
