chore(ci): collect K8s diagnostics on E2E test failure #25114

Merged
pront merged 2 commits into master from pront/k8s-e2e-diagnostics on Apr 6, 2026

Conversation

@pront pront (Member) commented Apr 2, 2026

Summary

Add a diagnostic step to the K8s E2E workflow that runs on failure (if: failure()). This captures:

  • Cluster-wide pod status, events, and node info
  • Per vector-* namespace: pod descriptions, events, configmaps
  • Container logs (current + previous) with --all-containers=true
  • Node resource allocation
  • Minikube system logs
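
As a rough illustration of what such a step could look like, here is a minimal sketch. It is not the step merged in this PR: the step name, the `--tail=200` limits, and the way `vector-*` namespaces are discovered are assumptions made for the example. Only standard `kubectl` and `minikube` subcommands are used.

```yaml
# Hypothetical sketch; not the step merged in this PR.
# The step name, the --tail limits, and the namespace filter are assumptions.
- name: Collect K8s diagnostics on failure
  if: failure()   # true only when an earlier step in the job has failed
  shell: bash
  run: |
    # Keep going even if an individual command fails
    # (for example, a pod that has no previous container instance).
    set +e

    echo "=== Pods (all namespaces) ==="
    kubectl get pods --all-namespaces -o wide

    echo "=== Events (all namespaces) ==="
    kubectl get events --all-namespaces --sort-by=.lastTimestamp

    echo "=== Nodes ==="
    kubectl get nodes -o wide

    # Per-namespace details for every namespace whose name starts with "vector-".
    for ns in $(kubectl get namespaces -o name | sed -n 's|^namespace/\(vector-.*\)$|\1|p'); do
      echo "=== Namespace: ${ns} ==="
      kubectl describe pods -n "${ns}"
      kubectl get events -n "${ns}" --sort-by=.lastTimestamp
      kubectl get configmaps -n "${ns}" -o yaml

      # Current and previous logs for every container in every pod.
      for pod in $(kubectl get pods -n "${ns}" -o name); do
        echo "--- Logs: ${pod} (current) ---"
        kubectl logs -n "${ns}" "${pod}" --all-containers=true --tail=200
        echo "--- Logs: ${pod} (previous) ---"
        kubectl logs -n "${ns}" "${pod}" --all-containers=true --previous --tail=200
      done
    done

    # The describe output includes each node's allocated resources.
    echo "=== Node resource allocation ==="
    kubectl describe nodes

    echo "=== Minikube logs ==="
    minikube logs
```

The `set +e` line matters in a sketch like this because a `--previous` log request fails for containers that have never restarted, and one such failure should not cut the rest of the dump short.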

Motivated by #25111, where the root cause (a config field rejected by Vector on startup) would have been immediately visible in the pod logs.

How did you test this PR?

Validated the YAML syntax. Because the step uses if: failure(), it runs only when tests fail and adds no overhead to green runs.

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes.
  • No. A maintainer will apply the no-changelog label to this PR.

References

🤖 Generated with Claude Code

@github-actions github-actions bot added the domain: ci label Apr 2, 2026
@pront pront added the no-changelog and platform: kubernetes labels Apr 2, 2026
@pront pront marked this pull request as ready for review April 2, 2026 18:48
@pront pront requested a review from a team as a code owner April 2, 2026 18:48
@pront pront force-pushed the pront/k8s-e2e-diagnostics branch from f046f8f to 548d1d9 Compare April 2, 2026 18:54
@pront pront enabled auto-merge April 2, 2026 18:55
@pront pront added this pull request to the merge queue Apr 2, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 2, 2026
Comment thread .github/workflows/k8s_e2e.yml Outdated
@pront pront force-pushed the pront/k8s-e2e-diagnostics branch from 8d09d33 to 601665d Compare April 3, 2026 14:06
@pront pront enabled auto-merge April 3, 2026 14:07
@pront pront added this pull request to the merge queue Apr 3, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 3, 2026
Add a diagnostic step to the K8s E2E workflow that runs on failure.
Captures pod logs, events, configs, and node resource usage to avoid
deep manual investigation when tests fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pront pront force-pushed the pront/k8s-e2e-diagnostics branch from 0aefbe2 to f72ce22 Compare April 6, 2026 13:28
@pront pront enabled auto-merge April 6, 2026 13:31
@pront pront added this pull request to the merge queue Apr 6, 2026
Merged via the queue into master with commit 1152614 Apr 6, 2026
59 checks passed
@pront pront deleted the pront/k8s-e2e-diagnostics branch April 6, 2026 14:20
@github-actions github-actions bot locked and limited conversation to collaborators Apr 6, 2026

Labels

  • domain: ci (Anything related to Vector's CI environment)
  • no-changelog (Changes in this PR do not need user-facing explanations in the release changelog)
  • platform: kubernetes (Anything `kubernetes` platform related)

2 participants