Slow Listener / Connector readiness at scale (many CRs stay non‑Ready long after creation) #2412

@AryanP123

Problem

When creating a large number of Skupper v2 Listener and Connector resources (paired by routingKey), all CRs appear in the API, but it takes a long time until every resource reports Ready=True.

Reproduce

  • Repo scenario: skupper/tests/e2e/scenarios/service-scale

  • Command (example):

    make test TEST="service-scale" \
      EXTRA_VARS="-e service_count=300 -e ready_retries=400 -e ready_delay=3"
  • Cluster: kind (single control plane), MetalLB, Skupper v2 controller (dev cluster setup).
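
As a rough sanity check on the example variables, the arithmetic below estimates the wait budget they imply. It assumes `ready_retries` and `ready_delay` map to the retry count and delay (in seconds) of an Ansible `until` loop, and it ignores the time each check itself takes; the helper names are hypothetical. The per-pair figure uses the 15–17 minute wall time reported under "Observed behavior", so it is an upper bound that also includes non-wait work.

```python
def wait_budget_seconds(retries: int, delay_s: float) -> float:
    """Approximate upper bound on one wait task: retries x delay."""
    return retries * delay_s


def seconds_per_pair(wall_minutes: float, pairs: int) -> float:
    """Average wall-clock seconds per Listener/Connector pair."""
    return wall_minutes * 60 / pairs


budget = wait_budget_seconds(400, 3)      # ready_retries=400, ready_delay=3
low = seconds_per_pair(15, 300)           # service_count=300
high = seconds_per_pair(17, 300)

print(f"wait budget per task: {budget:.0f} s ({budget / 60:.0f} min)")
print(f"wall time per pair: {low:.1f} to {high:.1f} s")
```

Under these assumptions each wait task may run for up to about 20 minutes, so a 15–17 minute run leaves limited headroom before the waits would time out.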

Observed behavior

  • Full playbook wall time is roughly 15–17 minutes for 300 pairs; a large fraction is spent in these Ansible wait tasks:

    • Wait for all Connectors Ready (east)
    • Wait for all Listeners Ready (west)
  • While investigating live with kubectl:

    • Connector objects often showed Pending / not Ready with the message "No matching listeners" until the paired Listener existed and had been picked up by Skupper's matching logic. In other words, "CR exists" does not imply Ready.
    • After all CRs existed, the count of Ready Listeners increased gradually and sometimes stalled before completing, which suggests a long reconciliation / status-propagation tail.
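
One way to quantify the tail described above is to tally resources by reported status and message at intervals. The sketch below parses the output of `kubectl get ... -o json` for a list of resources. It assumes each item exposes `status.status` and `status.message`; that field layout, the function name, and the sample values are assumptions made for illustration and should be checked against the actual CRs.

```python
import json
from collections import Counter


def status_tally(list_json: str) -> Counter:
    """Count items of a kubectl JSON List by (status, message)."""
    tally = Counter()
    for item in json.loads(list_json).get("items", []):
        status = item.get("status") or {}
        # Assumed layout: status.status and status.message on each CR.
        key = (status.get("status", "<none>"), status.get("message", ""))
        tally[key] += 1
    return tally


# Hypothetical sample standing in for real kubectl output.
sample = json.dumps({"items": [
    {"status": {"status": "Pending", "message": "No matching listeners"}},
    {"status": {"status": "Pending", "message": "No matching listeners"}},
    {"status": {"status": "Ready", "message": "OK"}},
    {},  # CR created but status not yet written
]})

for (state, message), count in status_tally(sample).most_common():
    print(f"{count:4d}  {state}  {message}")
```

To use it against a cluster, save the output of `kubectl get connectors.skupper.io -n <namespace> -o json` to a file and pass the file contents to `status_tally`. Repeating this every few seconds gives a readiness curve over time.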

From the API model, Ready for these resources depends on two things: per-resource Configured work, and being Matched to a peer (HasMatchingListener / HasMatchingConnector), which is derived from the linked network view. Even when all CRs already exist, completing the Configured step, updating matching, and writing status for hundreds of resources appears to take many minutes.
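
The dependency described above can be illustrated with a deliberately simplified model. This is not Skupper's controller code: it collapses the linked network view into two in-memory lists, and the `Endpoint` class, `update_matches` function, and `svc-N` routing keys are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Simplified stand-in for a Listener or Connector."""
    routing_key: str
    configured: bool = False
    matched: bool = False

    @property
    def ready(self) -> bool:
        # Ready requires both conditions, not just that the CR exists.
        return self.configured and self.matched


def update_matches(listeners: list, connectors: list) -> None:
    """Mark each endpoint matched if a peer shares its routing key."""
    listener_keys = {l.routing_key for l in listeners}
    connector_keys = {c.routing_key for c in connectors}
    for l in listeners:
        l.matched = l.routing_key in connector_keys
    for c in connectors:
        c.matched = c.routing_key in listener_keys


# Three configured Connectors, but only one paired Listener so far.
connectors = [Endpoint(f"svc-{i}", configured=True) for i in range(3)]
listeners = [Endpoint("svc-0", configured=True)]
update_matches(listeners, connectors)

for c in connectors:
    print(c.routing_key, "Ready" if c.ready else "not Ready")
```

In this model the Connectors for `svc-1` and `svc-2` stay not Ready until their Listeners appear and matching is recomputed, which mirrors the "No matching listeners" state seen during the investigation.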
