Context
Sheaft should consume the Bering liveness/degradation contract from MB3R-Lab/Bering#43 and map it into orchestration decisions. Sheaft should not define a separate source-of-truth status model.
Failure detector motivation
This work is explicitly motivated by failure detector theory as applied at the orchestration boundary. Sheaft should preserve Bering's uncertainty model instead of turning every missing or delayed observation into an immediate workflow failure.
The useful downstream behavior is to distinguish suspicion from confirmed unreachability, stale data from current degradation, and unknown state from failure. This lets Sheaft make policy decisions such as wait, retry, pause, reroute, fail, or escalate while staying compatible with Bering's eventual convergence model.
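As a rough illustration of the distinction above, the mapping from observed state to workflow decision could be a small policy table. This is only a sketch: the state names `healthy`, `degraded`, `suspect`, `stale`, `unreachable`, and `unknown`, the `proceed` decision, and the default policy itself are assumptions made for illustration. The authoritative state set is whatever MB3R-Lab/Bering#43 defines.

```python
from enum import Enum


class BeringState(Enum):
    # Hypothetical state names; the real set comes from the Bering contract.
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    SUSPECT = "suspect"
    STALE = "stale"
    UNREACHABLE = "unreachable"
    UNKNOWN = "unknown"


class Decision(Enum):
    PROCEED = "proceed"  # assumption: not listed in the issue text
    WAIT = "wait"
    RETRY = "retry"
    PAUSE = "pause"
    REROUTE = "reroute"
    FAIL = "fail"
    ESCALATE = "escalate"


# Illustrative default policy: only confirmed unreachability maps to FAIL.
DEFAULT_POLICY = {
    BeringState.HEALTHY: Decision.PROCEED,
    BeringState.DEGRADED: Decision.REROUTE,
    BeringState.SUSPECT: Decision.WAIT,
    BeringState.STALE: Decision.RETRY,
    BeringState.UNREACHABLE: Decision.FAIL,
    BeringState.UNKNOWN: Decision.PAUSE,
}

# States that carry uncertainty and must not collapse into a generic failure.
UNCERTAIN_STATES = frozenset(
    {BeringState.SUSPECT, BeringState.STALE, BeringState.UNKNOWN}
)


def decide(state, policy=DEFAULT_POLICY):
    """Return the workflow decision for a state; escalate if unmapped."""
    return policy.get(state, Decision.ESCALATE)
```

Keeping the policy as data rather than branching logic would let a test assert directly that no uncertain state ever maps to `fail`.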
Scope
- Read and validate Bering health observations once the upstream contract is available.
- Map Bering states to workflow decisions such as wait, retry, pause, fail, reroute, or escalate.
- Preserve uncertainty: suspect, stale, and unknown must not collapse into a generic failure.
- Surface useful diagnostics in CLI/report output: reason, last seen time, observation time, missed heartbeat count, and suspicion score where present.
- Add tests for degraded, suspect, stale, unreachable, unknown, recovery, and out-of-order observation handling.
- Document the consumer behavior and compatibility expectations.
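The out-of-order handling and diagnostics items above could look roughly like the following. This is a sketch under stated assumptions: the `Observation` type, its field names, and the `latest` and `describe` helpers are hypothetical and do not reflect the actual Bering schema or any existing Sheaft API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Observation:
    # Hypothetical shape; field names are placeholders for the real contract.
    state: str
    observed_at: datetime
    last_seen: Optional[datetime] = None
    reason: Optional[str] = None
    missed_heartbeats: Optional[int] = None
    suspicion_score: Optional[float] = None


def latest(current: Optional[Observation], incoming: Observation) -> Observation:
    """Keep the observation with the newer observation time.

    An incoming observation that is older than (or as old as) the current
    one is ignored, so late arrivals cannot roll the state backwards.
    """
    if current is None or incoming.observed_at > current.observed_at:
        return incoming
    return current


def describe(obs: Observation) -> str:
    """Render diagnostics for CLI/report output, skipping absent fields."""
    parts = [f"state={obs.state}", f"observed_at={obs.observed_at.isoformat()}"]
    if obs.last_seen is not None:
        parts.append(f"last_seen={obs.last_seen.isoformat()}")
    if obs.reason is not None:
        parts.append(f"reason={obs.reason}")
    if obs.missed_heartbeats is not None:
        parts.append(f"missed_heartbeats={obs.missed_heartbeats}")
    if obs.suspicion_score is not None:
        parts.append(f"suspicion_score={obs.suspicion_score}")
    return " ".join(parts)
```

Comparing on observation time rather than arrival order means a delayed `suspect` report cannot overwrite a later recovery. Whether ties should favor the current or the incoming observation is a design choice this sketch resolves in favor of the current one.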
Definition of done
- Sheaft consumes the Bering contract without redefining incompatible statuses.
- Orchestration behavior is covered by tests and documented.
- Docs explicitly mention the failure detector motivation and describe how uncertainty maps to workflow policy.
- Compatibility metadata points at the Bering version/commit that introduced the contract.