Connector upstream-status annotation not re-mirrored after Ready condition flips True #209

Description

@drewr

Summary

When a tunnel connector's Ready condition transitions False → True (heartbeat renews the lease), the downstream connector's networking.datumapis.com/upstream-status annotation is not updated. The extension server reads this stale annotation and classifies the connector as offline, causing persistent HTTP 503 responses.

Sequence

  1. Connector created — replicator mirrors status into the downstream annotation. At this moment Ready: False (lease not yet renewed).
  2. Heartbeat connects — upstream connector becomes Ready: True, connectionDetails populated.
  3. Replicator does not re-reconcile — skipUpstreamStatusSync: true suppresses downstream→upstream sync, and there is no watch on upstream status changes to re-trigger the mirror path.
  4. Downstream connector's annotation still contains Ready: False with stale connectionDetails.
  5. Extension server reads the annotation, sees Ready: False, marks all routes for that connector as offline → HTTP 503.
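For reference, step 5 can be sketched as follows. This is a minimal illustration, not the extension server's actual code: the `mirroredStatus` and `condition` types and the `connectorOnline` helper are hypothetical names, and the JSON shape is taken from the annotation payload shown under Evidence. It shows why a stale annotation alone is enough to keep routes offline.

```go
package main

import "encoding/json"

// condition mirrors the fields visible in the annotation payload in this issue.
type condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message"`
}

// mirroredStatus is a hypothetical shape for the upstream-status annotation value.
type mirroredStatus struct {
	Conditions []condition `json:"conditions"`
}

// connectorOnline reports whether the annotation carries a Ready condition
// with status "True". A missing, unparsable, or Ready=False annotation is
// treated as offline (an assumption about the classification rule).
func connectorOnline(annotation string) bool {
	var s mirroredStatus
	if err := json.Unmarshal([]byte(annotation), &s); err != nil {
		return false
	}
	for _, c := range s.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}
```

With this rule, the classification depends only on the annotation text, so the upstream connector being Ready: True has no effect until the annotation is rewritten.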

Evidence

Downstream connector annotation captured while upstream connector was Ready: True:

{
  "conditions": [
    { "type": "Ready", "status": "False", "reason": "ConnectorNotReady",
      "message": "Connector lease has expired. Agent may be offline." }
  ],
  "connectionDetails": { "publicKey": { "id": "378843c806c8c93c5770abaa19bc47e04e9f56977c6e7cc28044a09ef5a1cd23", ... } }
}

Upstream connector at the same moment:

{ "type": "Ready", "status": "True", "reason": "ConnectorReady",
  "message": "The connector is ready to tunnel traffic." }

Extension server log confirms all vhosts for this connector remain offline:

connector_offline_routes:26  clusters_replaced:1

Root Cause

replicationResourceConfig for the Connector type sets:

mirrorStatusToAnnotation: true,
skipUpstreamStatusSync:   true,

The replicator reconciles on spec changes but has no watch/enqueue path for upstream status changes. When Ready flips after the initial replication, nothing re-queues the replicator to update the annotation.

touchDownstreamGatewayAnnotations in the connector controller does fire and triggers an Envoy Gateway (EG) re-translation, but because the annotation is still stale, the extension server classifies the connector as offline again on every re-translation.
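A rough model of the gap, assuming the replicator effectively filters update events on metadata.generation (which does not change on status-only writes when the status subresource is enabled). The types and function names below are hypothetical and stand in for whatever predicate the replicator actually uses:

```go
package main

import "reflect"

// condition and connectorStatus are trimmed-down, hypothetical stand-ins
// for the upstream Connector status.
type condition struct {
	Type   string
	Status string
}

type connectorStatus struct {
	Conditions []condition
}

// updateEvent models the old/new pair an update predicate would receive.
type updateEvent struct {
	OldGeneration, NewGeneration int64
	OldStatus, NewStatus         connectorStatus
}

// enqueueOnSpecOnly models the behavior described in this issue:
// only spec changes (generation bumps) re-queue the replicator.
func enqueueOnSpecOnly(e updateEvent) bool {
	return e.OldGeneration != e.NewGeneration
}

// enqueueOnSpecOrStatus additionally fires when the status differs,
// which is what a Ready False -> True flip needs.
func enqueueOnSpecOrStatus(e updateEvent) bool {
	return enqueueOnSpecOnly(e) || !reflect.DeepEqual(e.OldStatus, e.NewStatus)
}
```

For a Ready flip with an unchanged generation, the first predicate drops the event and the second one enqueues it.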

Fix

The replicator should watch upstream connector status and re-mirror the annotation whenever the status changes. Alternatively, the connector controller should enqueue a replicator reconcile after writing Ready: True.
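A possible shape for the re-mirror step, as a sketch only. `mirrorStatus` is a hypothetical helper, not an existing function in the replicator; the annotation key is the one named in this issue. It rewrites the annotation only when the serialized status differs, so a status watch would not cause redundant downstream writes:

```go
package main

import "encoding/json"

// Annotation key taken from this issue.
const upstreamStatusAnnotation = "networking.datumapis.com/upstream-status"

// mirrorStatus serializes the upstream status and stores it in the
// downstream annotations map. It returns true when the annotation value
// changed and the downstream object therefore needs an update.
// The annotations map must be non-nil.
func mirrorStatus(annotations map[string]string, status interface{}) (bool, error) {
	raw, err := json.Marshal(status)
	if err != nil {
		return false, err
	}
	if annotations[upstreamStatusAnnotation] == string(raw) {
		return false, nil // already in sync, skip the write
	}
	annotations[upstreamStatusAnnotation] = string(raw)
	return true, nil
}
```

Called from a reconcile that is triggered by upstream status changes, the returned flag tells the caller whether to issue a downstream update at all.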
