test_bench_exits_nonzero invokes full bench CLI; hangs locally when Ollama is running #16

Description

@adris-misra

The test test_bench_exits_nonzero invokes the full industrial-agents
bench CLI to verify that it exits non-zero on failure. Since IA-4 and
IA-6 now make real LLM calls, this test runs the entire benchmark
pipeline whenever Ollama is reachable locally, which takes minutes and
can look like a hang.

In CI (no Ollama), the bench fails fast with a connection error → exit
code 1 → the test's assertion passes. So CI is unaffected, but local
pytest tests/unit -q is slow or hangs.

Surfaced during PR #15 (IA-4/IA-6 implementation).

Steps to reproduce

  1. Ensure Ollama is running locally (e.g. llama3.2:1b pulled and serving)
  2. From the repo root: pytest tests/unit -q
  3. Observe the run stall at test_bench_exits_nonzero while it executes
    the full IA-4 + IA-6 benchmark against the live model

Expected behaviour

A unit test should never invoke the real benchmark pipeline or make live
LLM calls. The test should mock the CLI's bench invocation so it verifies
the non-zero exit path deterministically, in milliseconds, regardless of
whether Ollama is running.
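
A minimal sketch of what the mocked test could look like. The names main and run_bench are hypothetical stand-ins defined inline so the example runs on its own; the real CLI entry point and module layout may differ, and the actual test may drive the CLI through a subprocess or a CLI test runner instead.

```python
from unittest import mock


# Hypothetical stand-in for the benchmark runner. In the real project this
# is the code path that ends up making LLM calls for IA-4 and IA-6.
def run_bench(tasks):
    raise RuntimeError("the real benchmark must not run inside a unit test")


# Hypothetical CLI entry point that maps the benchmark result to an exit code.
def main(argv):
    if argv and argv[0] == "bench":
        passed = run_bench(argv[1:])
        return 0 if passed else 1
    return 2  # unknown command


def test_bench_exits_nonzero():
    # Swap the runner for a stub that reports failure, so the exit path is
    # exercised without touching Ollama or any other provider.
    fake_runner = mock.Mock(return_value=False)
    with mock.patch.dict(globals(), {"run_bench": fake_runner}):
        exit_code = main(["bench", "IA-4", "IA-6"])

    assert exit_code == 1
    fake_runner.assert_called_once_with(["IA-4", "IA-6"])
```

In a real test module the patch target would be the imported CLI module rather than globals(), for example via mock.patch with the module's dotted path. The point is only that the stub decides the outcome, so the result no longer depends on whether a model is reachable.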

Actual behaviour

The test invokes the real industrial-agents bench command. With Ollama
running, this executes IA-4 and IA-6 (real LLM calls), taking minutes or
hanging. Without Ollama (CI), it fails fast with a connection error, so
the assertion coincidentally passes — masking the design flaw.

Suggested fix: mock the bench invocation in this test, OR mark it
@pytest.mark.integration and exclude it from the default unit run.
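
For the marker option, one possible conftest.py sketch is shown below. It registers the integration marker and deselects marked tests unless a flag is passed. The flag name --run-integration is an assumption made for illustration, not something the project defines today.

```python
# conftest.py (sketch): keep integration tests out of the default run.


def pytest_addoption(parser):
    # Hypothetical opt-in flag; the name is an assumption.
    parser.addoption(
        "--run-integration",
        action="store_true",
        default=False,
        help="also run tests marked with @pytest.mark.integration",
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line(
        "markers", "integration: test needs a live LLM provider such as Ollama"
    )


def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-integration"):
        return  # explicit opt-in: leave the collected tests untouched

    selected, deselected = [], []
    for item in items:
        if item.get_closest_marker("integration"):
            deselected.append(item)
        else:
            selected.append(item)

    if deselected:
        # Report the dropped tests to pytest and shrink the list in place.
        config.hook.pytest_deselected(items=deselected)
        items[:] = selected
```

With this in place, pytest tests/unit -q would skip over test_bench_exits_nonzero once it carries the marker, and the full run stays available behind the flag. A lighter alternative that needs no hook is to register the marker and run pytest -m "not integration" by default.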

Framework version

v0.1.0-pre (bench/iabench-synthesis-cost @ 66bc8fe)

LLM provider

ollama

Environment

Windows 11, Python 3.12

Labels

bug (Something isn't working)