Testing Guide

Audience: Contributors.

This is the canonical human-facing guide for how testing works in this repository. Guardrails still constrain the rules, but contributor workflow, test selection, live-test setup, and diagnostics should be learned here first.

Repository Tests-First Workflow

The repository delivery flow is tests-first, not just test-after:

Specification
Spec hardening
Test generation
Implementation
Review
Doc sync

Within implementation, every addition, bug fix, and behaviour change must follow RED → GREEN → REFACTOR:

RED: write or update the smallest behavioural test first and verify it fails for the intended reason.
GREEN: implement the minimum production change needed to make that test pass, then progressively broaden verification through the next wider relevant layers before the final full-suite run so the repository is restored to an all-green state.
REFACTOR: improve naming, structure, and duplication only after the relevant tests are green.

GREEN is not satisfied by a slice-only pass. The targeted failing test proves intent. Wider layers rebuild confidence. The final green gate is a fresh full-suite pass for the repository.

Adding production code before the intended failing test exists is not compliant with the repository workflow.

What Spec Hardening Means Here

After a spec is approved and before implementation proceeds, the repository expects the feature scope to be challenged for:

architecture risk
observability coverage
red-team blind spots

The agent-oriented enforcement lives in test-first-workflow.md. Contributors should treat that file as the exact contract and this guide as the human explanation.

Test Hierarchy

The repository uses a four-level hierarchy. Prefer the lowest level that can prove the behaviour.

Level	Typical marker	What it proves	Speed	External dependencies
Unit	`[TestClass]` / `[TestMethod]`	Isolated logic, branching, transforms, validation	Fastest	None
Feature	`.feature` + `[Binding]`	Behaviour scenarios with in-memory seams	Fast	None
Smoke system	`[TestCategory("SystemTest_Smoke")]`	Startup wiring and process boundary health before broader system coverage	Fast-medium	None
Simulated system	`[TestCategory("SystemTest_Simulated")]`	End-to-end behaviour with deterministic Simulated connectors	Medium	None
Live system	`[TestCategory("SystemTest")]` or `[TestCategory("SystemTest_Live")]`	Real Azure DevOps or TFS behaviour	Slowest	Real credentials and environment

Selection Rule

Choose the first layer that can falsify the behaviour without real infrastructure:

Unit when the logic does not need connector or process boundaries.
Feature when behaviour matters but real I/O still does not.
Smoke system when you need a fast startup/lifecycle guard on host-to-agent wiring without running a migration workload.
Simulated system when you need end-to-end wiring, package output, or process boundaries.
Live system only when lower layers cannot prove the connector or environment-specific behaviour.

If a proposed live test can be rewritten as a unit, feature, or simulated test, do that instead.

Running Tests

# All tests
dotnet test

# Fast local pass that excludes live system tests
dotnet test --filter "TestCategory!=SystemTest"

# Smoke system tests only
dotnet test --filter "TestCategory=SystemTest_Smoke"

# Simulated system tests only
dotnet test --filter "TestCategory=SystemTest_Simulated"

# Live system tests only
dotnet test --filter "TestCategory=SystemTest"

Some suites may also use SystemTest_Live as an additional category marker. Treat those as live tests as well.

Test Conventions

MSTest

All test classes use [TestClass].
All test methods use [TestMethod].
No Assert.Inconclusive() in committed tests.
No [Ignore] in committed tests.
No vacuous assertions such as Assert.IsTrue(true) or Assert.IsTrue(count >= 0).

Naming

{Subject}_{Scenario}_{ExpectedOutcome}

Examples:
WorkItemExportModule_WhenSourceHasRevisions_WritesRevisionFilesToPackage
SimulatedWorkItemSource_WhenSeeded_ReturnsAtLeastTwoItems

Reqnroll Feature Tests

Feature files live in features/. Step definitions follow Reqnroll.MSTest conventions.

Feature files must comply with acceptance-test-format.md.
Step definitions must be in a class annotated [Binding].
Use feature tests for behaviour scenarios with in-memory fakes or mocks, not for live environment coverage.

Simulated System Tests

Use Simulated connectors for deterministic end-to-end coverage.

They must:

return at least 2 items per EnumerateAsync call
be deterministic for a given seed
avoid live network dependencies

Module Expectations

Every export module SystemTest_Simulated should prove:

the expected artefact path exists in IArtefactStore
the artefact content is non-empty

Every import module SystemTest_Simulated should prove:

the target connector received data
assertions are about observable output, not only absence of exceptions

Live System Tests

Live system test setup, CI wiring, environment variables, and troubleshooting live in live-system-testing-guide.md.

The important repository rule is simple: do not make committed live tests self-skip with Assert.Inconclusive() or [Ignore]. If an environment is unavailable, exclude that category from the run instead of committing a self-skipping test body.

Diagnostics For Simulated And Live Tests

When a SystemTest_Simulated, SystemTest, or SystemTest_Live test runs through CliRunner, every spawned CLI and agent process writes OTel diagnostics under:

<repo-root>/.output/workingtests/{TestMethodName}/.otel-diagnostics/

Use this directory to inspect:

*-logs.log
*-traces.log
*-metrics.log

If the same run is reproduced with --diagnostics, the CLI also writes raw control-plane payloads under:

<repo-root>/.output/workingtests/{TestMethodName}/.otel-diagnostics/inbox/

Typical inbox files include bootstrap, telemetry, progress-{module}-{stage}, and diagnostics payloads. These folders are local debug artefacts and are ignored by Git.

Test Data

Use the Simulated source connector for deterministic test data.
Never rely on live data for unit, feature, or simulated tests.
The Seed property in Simulated config controls determinism.

Assertion Standards

What every test must prove:

For export: artefact exists at the expected path and has non-trivially non-empty content.
For import: the target connector shows evidence of received data.
For connectors: the SDK or backing service was actually exercised, not hard-coded.

Lifecycle-enabled connector tests

For project-lifecycle scenarios, mark tests with explicit lifecycle eligibility and ensure the run:

Creates a run-correlated ephemeral project name before the main assertions execute.
Binds execution to that created project identity.
Always attempts teardown in a finally path, including when the test body fails.
Emits a lifecycle record with create result, teardown result, blocking reason (if any), and teardown latency.

SC-004 measurement plan (cleanup intervention reduction)

Track manual cleanup interventions before and after rollout:

Source: lifecycle outcome logs (ProjectLifecycle outcome ...) and operator incident notes.
Metric: count of runs requiring human cleanup follow-up per week.
Target: reduce manual cleanup interventions by at least 50% over the first two reporting windows.
Collection: run scripts/project-lifecycle/collect-cleanup-metrics.ps1 against test logs and publish weekly trend snapshots.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Testing Guide

Repository Tests-First Workflow

What Spec Hardening Means Here

Test Hierarchy

Selection Rule

Running Tests

Test Conventions

MSTest

Naming

Reqnroll Feature Tests

Simulated System Tests

Module Expectations

Live System Tests

Diagnostics For Simulated And Live Tests

Test Data

Assertion Standards

Further Reading

Lifecycle-enabled connector tests

SC-004 measurement plan (cleanup intervention reduction)

Uh oh!

FilesExpand file tree

testing-guide.md

Latest commit

History

testing-guide.md

File metadata and controls

Testing Guide

Repository Tests-First Workflow

What Spec Hardening Means Here

Test Hierarchy

Selection Rule

Running Tests

Test Conventions

MSTest

Naming

Reqnroll Feature Tests

Simulated System Tests

Module Expectations

Live System Tests

Diagnostics For Simulated And Live Tests

Test Data

Assertion Standards

Further Reading

Lifecycle-enabled connector tests

SC-004 measurement plan (cleanup intervention reduction)