Summary
Introduce a Retry Test feature aimed at test failures caused by transient infrastructure or external dependency issues (e.g., network hiccups, temporary service unavailability, throttling), rather than application logic defects. The goal is to reduce flaky CI outcomes while keeping true failures visible and actionable.
Problem / Motivation
Some test cases rely on external systems such as databases, message brokers, HTTP services, or cloud resources. These dependencies can intermittently fail due to infrastructure conditions, causing:
- False-negative test failures in CI
- Manual re-runs of entire test suites
- Reduced confidence in test signal and slower feedback loops
We need a controlled retry mechanism that:
- Retries only when the failure is likely transient
- Does not mask genuine logic or assertion failures
- Clearly reports when a test required retrying to pass
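As a rough illustration of these three requirements (not the proposed implementation itself), here is a minimal Python sketch. The names `retry_on_transient`, `TRANSIENT_ERRORS`, and `RETRY_LOG` are hypothetical, and treating `ConnectionError` and `TimeoutError` as "likely transient" is an assumption made for the example; a real policy would likely need a configurable classification.

```python
import functools
import time

# Assumption: these built-in exception types stand in for "likely transient"
# failures. A real project would probably make this list configurable.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)

# Hypothetical in-memory record of retry activity, so retries stay visible.
RETRY_LOG = []


def retry_on_transient(max_attempts=3, delay_seconds=0.0):
    """Retry the wrapped test only when it raises a transient error."""

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    result = test_func(*args, **kwargs)
                except TRANSIENT_ERRORS as exc:
                    # Record every transient failure with its reason.
                    RETRY_LOG.append({
                        "test": test_func.__name__,
                        "attempt": attempt,
                        "reason": repr(exc),
                        "outcome": "transient_error",
                    })
                    if attempt == max_attempts:
                        raise  # Retries exhausted: surface the failure.
                    time.sleep(delay_seconds)
                else:
                    if attempt > 1:
                        # Flag tests that only passed after retrying.
                        RETRY_LOG.append({
                            "test": test_func.__name__,
                            "attempt": attempt,
                            "reason": None,
                            "outcome": "passed_after_retry",
                        })
                    return result

        return wrapper

    return decorator
```

Because `AssertionError` is not in `TRANSIENT_ERRORS`, a failing assertion propagates on the first attempt and is never retried, while a test that passes only after a retry leaves a `passed_after_retry` entry in the log.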
Goals
- Reduce CI flakiness caused by transient external dependency failures.
- Preserve correctness by avoiding retries for deterministic logic/assertion failures.
- Improve observability by recording retry attempts, reason, and final outcome.
Non-Goals
- Blindly retrying all test failures.
- Hiding flaky tests or dependency instability (retries must remain visible).
- Replacing proper test isolation, cleanup, and resilience improvements.
Proposed Solution
Implement a retry policy that applies to tests interacting with external dependencies. I would be happy to contribute my solution if it aligns with the project's direction.