Skip to content

fix: prevent stale ready status after stack update#3610

Open
zreigz wants to merge 4 commits into
masterfrom
lukasz/prod-4405-infrastructurestack-crd-doesnt-change-readiness-on-updates
Open

fix: prevent stale ready status after stack update#3610
zreigz wants to merge 4 commits into
masterfrom
lukasz/prod-4405-infrastructurestack-crd-doesnt-change-readiness-on-updates

Conversation

@zreigz
Copy link
Copy Markdown
Member

@zreigz zreigz commented May 25, 2026

When a stack was updated, the controller immediately queried the Console API for the status. This could incorrectly return Successful from the previous run.

This PR reduces the requeue interval to provide more responsive status updates.

It also fixes requeue behavior in read-only mode to ensure the actual status is retrieved.

The stack.Spec.Reconciliation.Requeue() replaced with ctrl.Result{RequeueAfter: requeueAfterInfrastructureStack}

Test Plan

Test environment: https://console.your-env.onplural.sh/

Checklist

  • I have added a meaningful title and summary to convey the impact of this PR to a user.
  • If required, I have updated the Plural documentation accordingly.
  • I have added tests to cover my changes.
  • I have deployed the agent to a test environment and verified that it works as expected (required only when changing agent code).

Plural Flow: console

@linear
Copy link
Copy Markdown

linear Bot commented May 25, 2026

PROD-4405

@zreigz zreigz added the bug-fix This pull request fixes a bug label May 25, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 25, 2026

Greptile Summary

Addresses a race condition where the controller would read a stale Successful stack status immediately after calling UpdateStack, causing the Ready condition to be prematurely set True for the previous run rather than the newly triggered one.

  • Introduces a changed flag in Process: when the computed SHA differs from the stored SHA, the controller skips the GetStackStatus call entirely on that reconcile and requeues after ~5 s (Jitter(5s)), giving the console time to clear the old status before the next check.
  • Changes setReadyCondition signature to return (*ctrl.Result, error) and adds a WaitForResources requeue when the stack does not yet exist, preventing a newly-created stack from being immediately marked Synchronized=True before any status is available.
  • As a side effect, handleExistingResource now returns requeueAfterInfrastructureStack (2 min) instead of stack.Spec.Reconciliation.Requeue(), which changes the polling frequency for read-only stacks where drift detection is disabled.

Confidence Score: 4/5

The core logic is safe to merge; the race-condition fix is correct and well-scoped.

The changed flag and early-return requeue correctly prevent the stale Successful read after an update. Two observations are worth a second look: the 5-second delay is an untested heuristic that could be too short under console load, and the silent switch from Reconciliation.Requeue() to a fixed 2-minute interval in handleExistingResource changes polling behavior for drift-detection-disabled stacks in a way not covered by the PR description.

go/controller/internal/controller/infrastructurestack_controller.go — specifically the requeue interval change in handleExistingResource and the hardcoded 5-second wait after update.

Important Files Changed

Filename Overview
go/controller/internal/controller/infrastructurestack_controller.go Adds a changed flag so the controller waits ~5 s after a stack update before calling GetStackStatus, preventing stale "Successful" reads. Also changes handleExistingResource to always requeue every 2 min instead of respecting Reconciliation.Requeue(), which silently alters behavior for drift-detection-disabled stacks.

Reviews (1): Last reviewed commit: "format" | Re-trigger Greptile

Comment thread go/controller/internal/controller/infrastructurestack_controller.go
Comment thread go/controller/internal/controller/infrastructurestack_controller.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix This pull request fixes a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant