Transient pod waiting state can fail restart OpsRequest #10300

## Problem

A restart OpsRequest can be marked Failed when a recreated Pod temporarily reports a kubelet waiting message but later becomes Ready.

In the preserved ES case, the Pod event was:

```text
Error: failed to sync configmap cache: timed out waiting for the condition
```

The Pod later completed init and main container startup and the ES cluster was healthy, but by then the OpsRequest progress had already been finalized as Failed.

## Suspected controller path

`IsPodFailedAndTimedOut()` treats any container `Waiting` state with a non-empty message as a failed container after `PodContainerFailedTimeout`.

That transient failure condition is propagated through InstanceSet `InstanceFailure` and then into Ops progress finalization, where it becomes permanent.
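
The suspected rule can be modeled with a minimal sketch. This is not the actual controller code: `waitingState`, `failedAndTimedOut`, and the value of `exampleTimeout` are hypothetical stand-ins, and the waiting reason shown is assumed, since the preserved event records only the message.

```go
package main

import (
	"fmt"
	"time"
)

// waitingState is a hypothetical stand-in for a container's Waiting state.
type waitingState struct {
	Reason  string
	Message string
}

// exampleTimeout is an assumed value standing in for PodContainerFailedTimeout.
const exampleTimeout = 10 * time.Second

// failedAndTimedOut models the suspected rule: any waiting state with a
// non-empty message counts as failed once it has lasted longer than the
// timeout. The waiting reason is never inspected.
func failedAndTimedOut(w *waitingState, waitingFor time.Duration) bool {
	if w == nil || w.Message == "" {
		return false
	}
	return waitingFor > exampleTimeout
}

func main() {
	w := &waitingState{
		Reason:  "CreateContainerConfigError", // assumed; the issue does not state the reason
		Message: "failed to sync configmap cache: timed out waiting for the condition",
	}
	// The kubelet may still recover, but the rule already reports a failure.
	fmt.Println(failedAndTimedOut(w, 2*exampleTimeout)) // prints "true"
}
```

Under this model, a recoverable kubelet wait and a permanent misconfiguration are indistinguishable once the timeout has elapsed.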

## Expected behavior

Recoverable kubelet startup wait states should not be treated as terminal instance failures. Ops progress should only fail for terminal waiting reasons or unrecovered pod failures.
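
One possible shape for that check is sketched below. It is an illustration only: `isTerminalWaiting`, `terminalWaitingReasons`, and `exampleTimeout` are hypothetical names, and the set of reasons treated as terminal is an assumption, because the issue does not say which reasons should qualify.

```go
package main

import (
	"fmt"
	"time"
)

// waitingState is a hypothetical stand-in for a container's Waiting state.
type waitingState struct {
	Reason  string
	Message string
}

// exampleTimeout is an assumed value standing in for PodContainerFailedTimeout.
const exampleTimeout = 10 * time.Second

// terminalWaitingReasons is an illustrative, assumed set of waiting reasons
// that are unlikely to clear without user intervention.
var terminalWaitingReasons = map[string]bool{
	"InvalidImageName":  true,
	"ErrImageNeverPull": true,
}

// isTerminalWaiting reports a failure only when the reason is in the terminal
// set and the state has outlasted the timeout. A non-empty message alone is
// not enough.
func isTerminalWaiting(w *waitingState, waitingFor time.Duration) bool {
	if w == nil {
		return false
	}
	return terminalWaitingReasons[w.Reason] && waitingFor > exampleTimeout
}

func main() {
	transient := &waitingState{
		Reason:  "CreateContainerConfigError", // assumed; the issue does not state the reason
		Message: "failed to sync configmap cache: timed out waiting for the condition",
	}
	terminal := &waitingState{Reason: "InvalidImageName", Message: "invalid image name"}

	fmt.Println(isTerminalWaiting(transient, 2*exampleTimeout)) // prints "false"
	fmt.Println(isTerminalWaiting(terminal, 2*exampleTimeout))  // prints "true"
}
```

This sketch covers only the reason check. A Pod that stays in a non-terminal waiting state indefinitely would still need a separate, longer deadline to count as an unrecovered failure, which is not modeled here.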
