fix: etcd fails due to unaccounted space usage (e.g. WAL) #115

Open
cwrau wants to merge 6 commits into main from fix/etcd-unaccounted-space-usage

@cwrau cwrau commented Mar 31, 2026

This way the real volume usage is looked at and used for resizing.

Copilot AI review requested due to automatic review settings March 31, 2026 13:17
@cwrau cwrau enabled auto-merge March 31, 2026 13:17

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new EtcdVolumeStatsProvider to monitor and reconcile etcd volume usage by querying kubelet stats, allowing for more accurate tracking of disk usage beyond just the etcd database size. It includes updates to the etcd client to support filtering by ready pods, improvements to the etcd cluster reconciler to incorporate filesystem usage metrics, and necessary dependency updates. I have no feedback to provide.
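For readers unfamiliar with the kubelet stats summary, the following is a minimal sketch of how a per-volume maximum could be derived from a `/stats/summary` response. It is not the PR's `EtcdVolumeStatsProvider`: the struct definitions mirror only a few fields of the summary JSON, and the function name `maxVolumeUsedBytes`, the pod names, and the volume name `etcd-data` are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summary mirrors only the fields of a kubelet /stats/summary response
// that this sketch needs; the real API type carries many more.
type summary struct {
	Pods []podStats `json:"pods"`
}

type podStats struct {
	PodRef struct {
		Name string `json:"name"`
	} `json:"podRef"`
	Volumes []volumeStats `json:"volume"`
}

type volumeStats struct {
	Name      string  `json:"name"`
	UsedBytes *uint64 `json:"usedBytes"`
}

// maxVolumeUsedBytes (hypothetical helper) returns the largest usedBytes
// value reported for volumeName across the pods listed in podNames.
// The boolean is false when no matching volume reported any usage.
func maxVolumeUsedBytes(raw []byte, podNames map[string]bool, volumeName string) (uint64, bool, error) {
	var s summary
	if err := json.Unmarshal(raw, &s); err != nil {
		return 0, false, fmt.Errorf("decoding stats summary: %w", err)
	}
	var maxUsed uint64
	found := false
	for _, pod := range s.Pods {
		if !podNames[pod.PodRef.Name] {
			continue // not one of the etcd pods we care about
		}
		for _, vol := range pod.Volumes {
			if vol.Name != volumeName || vol.UsedBytes == nil {
				continue
			}
			if !found || *vol.UsedBytes > maxUsed {
				maxUsed = *vol.UsedBytes
				found = true
			}
		}
	}
	return maxUsed, found, nil
}

func main() {
	// Hand-written sample payload; a real response comes from the kubelet.
	raw := []byte(`{"pods":[
		{"podRef":{"name":"etcd-0"},"volume":[{"name":"etcd-data","usedBytes":100}]},
		{"podRef":{"name":"etcd-1"},"volume":[{"name":"etcd-data","usedBytes":250}]},
		{"podRef":{"name":"other"},"volume":[{"name":"etcd-data","usedBytes":999}]}
	]}`)
	pods := map[string]bool{"etcd-0": true, "etcd-1": true}
	used, found, err := maxVolumeUsedBytes(raw, pods, "etcd-data")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(used, found)
}
```

Taking the maximum rather than an average keeps the sketch conservative: the fullest member is the one that would run out of space first.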

Copilot AI left a comment

Pull request overview

This PR fixes etcd volume auto-resize logic by using actual filesystem usage (via kubelet stats summary) instead of relying solely on etcd-reported DB size, accounting for extra space usage like WAL/snapshot files.

Changes:

  • Add an etcd volume stats provider that queries kubelet /stats/summary and computes max etcd volume usage across pods.
  • Update the etcd cluster reconciler to compute an “effective” volume usage as max(dbSize, filesystemUsage) and emit warnings when they diverge significantly.
  • Update etcd status collection to target ready pods and parallelize member calls; adjust wiring, stubs, and tests accordingly.
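The "effective" usage rule from the second bullet can be sketched as follows. This is an illustration under stated assumptions, not the reconciler's code: the name `effectiveUsage` and the 1.5x divergence threshold are hypothetical, since the overview only says that a warning is emitted when the two values "diverge significantly".

```go
package main

import "fmt"

// divergenceWarnRatio is a hypothetical threshold: warn when the filesystem
// usage exceeds the etcd-reported DB size by more than this factor.
const divergenceWarnRatio = 1.5

// effectiveUsage returns max(dbSize, fsUsage) in bytes and reports whether
// the filesystem usage has diverged from the DB size beyond the threshold.
func effectiveUsage(dbSize, fsUsage int64) (int64, bool) {
	effective := dbSize
	if fsUsage > effective {
		effective = fsUsage
	}
	diverged := float64(fsUsage) > divergenceWarnRatio*float64(dbSize)
	return effective, diverged
}

func main() {
	// DB reports 100 MiB, but WAL and snapshot files push the volume to 300 MiB.
	const mib = int64(1) << 20
	usage, warn := effectiveUsage(100*mib, 300*mib)
	fmt.Println(usage/mib, warn)
}
```

Using the maximum means a stale or missing filesystem reading can never lower the value below what etcd itself reports.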

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| test/etcd_stubs.go | Updates etcd client stub signature and adds a stub for the new volume stats provider. |
| pkg/reconcilers/etcd_cluster/volume_stats/volume_stats.go | Implements kubelet-based filesystem usage collection for etcd data volumes. |
| pkg/reconcilers/etcd_cluster/volume_stats/volume_stats_test.go | Adds unit tests for volume identification (isEtcdDataVolume). |
| pkg/reconcilers/etcd_cluster/reconciler.go | Integrates filesystem usage into resize decisions; adds pod listing and adjusts health check placement and args gating. |
| pkg/reconcilers/etcd_cluster/reconciler_test.go | Extends tests for new effective-usage and warning behaviors. |
| pkg/reconcilers/etcd_cluster/etcd_client/etcd_client.go | Changes GetStatuses API to accept ready pod names and parallelizes per-member calls. |
| pkg/hostedcontrolplane/controller.go | Wires the new volume stats provider into the reconciler; adjusts events RBAC verbs. |
| go.mod | Bumps several dependencies (k8s libs, controller-runtime, grpc, cilium). |
| go.sum | Updates checksums for dependency upgrades. |
| .golangci.yaml | Adds import aliasing for kubelet stats API package. |
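As a rough illustration of the `GetStatuses` change summarized for etcd_client.go, the sketch below fans out one status call per ready pod and waits for all of them. The types `memberStatus` and `statusFunc`, the function `getStatuses`, and the fake fetcher are all hypothetical stand-ins; the actual signature in the PR is not shown on this page.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// memberStatus is a hypothetical, trimmed-down status result.
type memberStatus struct {
	Pod    string
	DBSize int64
}

// statusFunc is a hypothetical per-member call, e.g. a wrapper around an
// etcd maintenance status request for one pod.
type statusFunc func(ctx context.Context, pod string) (memberStatus, error)

// getStatuses queries every ready pod concurrently. Each goroutine writes
// only to its own slice index, so no extra locking is needed.
func getStatuses(ctx context.Context, readyPods []string, fetch statusFunc) ([]memberStatus, error) {
	statuses := make([]memberStatus, len(readyPods))
	errs := make([]error, len(readyPods))
	var wg sync.WaitGroup
	for i, pod := range readyPods {
		wg.Add(1)
		go func(i int, pod string) {
			defer wg.Done()
			statuses[i], errs[i] = fetch(ctx, pod)
		}(i, pod)
	}
	wg.Wait()
	for i, err := range errs {
		if err != nil {
			return nil, fmt.Errorf("status of %s: %w", readyPods[i], err)
		}
	}
	return statuses, nil
}

// demo runs getStatuses against a fake fetcher so the sketch is runnable
// without an etcd cluster.
func demo() ([]memberStatus, error) {
	fake := func(_ context.Context, pod string) (memberStatus, error) {
		return memberStatus{Pod: pod, DBSize: int64(len(pod)) * 1024}, nil
	}
	return getStatuses(context.Background(), []string{"etcd-0", "etcd-1"}, fake)
}

func main() {
	statuses, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range statuses {
		fmt.Println(s.Pod, s.DBSize)
	}
}
```

Restricting the fan-out to ready pods avoids waiting on timeouts from members that are known to be unreachable.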

Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/etcd_client/etcd_client.go
Comment thread go.mod Outdated
Copilot AI review requested due to automatic review settings March 31, 2026 13:28
auto-merge was automatically disabled March 31, 2026 13:28

Head branch was pushed to by a user without write access

@cwrau cwrau review requested due to automatic review settings March 31, 2026 13:28
Copilot AI review requested due to automatic review settings March 31, 2026 13:30
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from ea81376 to 2e6eac4 Compare March 31, 2026 13:30

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 1 comment.

Comment thread go.mod
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from 2e6eac4 to 516cda8 Compare March 31, 2026 13:37
Copilot AI review requested due to automatic review settings March 31, 2026 13:39
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from 516cda8 to b37d3d4 Compare March 31, 2026 13:39

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 4 comments.

Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from b37d3d4 to 2a5b8ad Compare April 1, 2026 14:06
@cwrau cwrau enabled auto-merge April 1, 2026 14:07
Copilot AI review requested due to automatic review settings April 17, 2026 08:17
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from 2a5b8ad to 3c927cd Compare April 17, 2026 08:17

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.

Comment thread pkg/reconcilers/etcd_cluster/volume_stats/volume_stats.go
Comment thread go.mod
Comment thread pkg/reconcilers/etcd_cluster/volume_stats/volume_stats.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from 3c927cd to d38cd45 Compare April 17, 2026 09:09
Copilot AI review requested due to automatic review settings April 20, 2026 09:38
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from d38cd45 to 1dc1b75 Compare April 20, 2026 09:38

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.

Comment thread pkg/reconcilers/etcd_cluster/etcd_client/etcd_client.go
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread go.mod
Copilot AI review requested due to automatic review settings April 30, 2026 10:13
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from 382c399 to 76e651c Compare April 30, 2026 10:13

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 4 comments.

Comment thread pkg/reconcilers/etcd_cluster/volume_stats/volume_stats_test.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
Comment thread pkg/reconcilers/etcd_cluster/reconciler.go
Comment thread go.mod
Copilot AI review requested due to automatic review settings May 18, 2026 08:20
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from fe0dd96 to b4747de Compare May 18, 2026 08:20

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 35 changed files in this pull request and generated 1 comment.

Comment thread test/context.go Outdated
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from b4747de to daeb324 Compare May 18, 2026 09:02
marvinWolff previously approved these changes May 18, 2026
@cwrau cwrau added this pull request to the merge queue May 18, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to a conflict with the base branch May 18, 2026
cwrau added 3 commits May 18, 2026 17:07
This way the real volume usage is looked at and used for resizing.
chore: replace span/log/recorder with emit
Copilot AI review requested due to automatic review settings May 18, 2026 15:16
@cwrau cwrau force-pushed the fix/etcd-unaccounted-space-usage branch from daeb324 to 8e3d4cd Compare May 18, 2026 15:16
@cwrau cwrau enabled auto-merge May 18, 2026 15:16

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 35 changed files in this pull request and generated 2 comments.

Comment thread pkg/reconcilers/etcd_cluster/reconciler.go
Comment thread pkg/reconcilers/etcd_cluster/reconciler_test.go Outdated
Copilot AI review requested due to automatic review settings May 19, 2026 08:45

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 35 changed files in this pull request and generated 1 comment.

Comment thread pkg/reconcilers/etcd_cluster/reconciler.go Outdated
@cwrau cwrau added this pull request to the merge queue May 19, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to a conflict with the base branch May 19, 2026
4 participants