Batcher refactor and performance observability by sitole · Pull Request #2310 · e2b-dev/infra

sitole · 2026-04-07T09:03:33Z

Refactor batcher implementation to be easier to read.
Performance observability metrics for batcher.
Traces for batcher callback functions.

The batcher loop is now easier to navigate. Timer-based waits are replaced with a ticker. Metrics for performance monitoring are now included. Multiple batchers can run in parallel, distinguished by a name attribute.

cursor · 2026-04-07T09:03:40Z

PR Summary

Medium Risk
Medium risk because it changes the batcher’s concurrency/flush logic and Push API semantics, which could affect batching behavior and drop/backpressure handling under load.

Overview
This PR refactors the ClickHouse Batcher implementation to use a ticker-based flush loop with explicit start/stop state, changes Push to return an error (including ErrBatcherQueueFull) instead of a boolean, and removes the custom timer pool.

It adds OpenTelemetry instrumentation: batcher-level metrics (queue length, dropped items, flush sizes and timings) with a configurable batcher name attribute, and wraps ClickHouse batch insert callbacks for sandbox events and host stats in traced spans that record batch size and failures. The OTEL collector filter is updated to allow the new batcher.* metrics through.

Reviewed by Cursor Bugbot for commit 376a09a.

packages/clickhouse/pkg/batcher/batcher.go

Guard Start, Stop, and Push with a sync.RWMutex to prevent a send-on-closed-channel panic that could occur when Stop closed b.ch between Push's started check and the channel send. Also update the QueueSize doc comment to reflect that Push now returns ErrBatcherQueueFull instead of the old (false, nil).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6ce9a7a1b

packages/clickhouse/pkg/batcher/batcher.go

packages/clickhouse/pkg/events/delivery.go

packages/clickhouse/pkg/batcher/batcher.go

Without reset, a size-triggered flush near a tick boundary would give the next batch less than MaxDelay to accumulate. Now each batch always gets a full MaxDelay window after the previous flush.

sitole added 5 commits April 7, 2026 09:47

Remove incorrectly used sandbox context from feature flag resolve

abcd8e5

Batcher refactor for easier read, performance metrics

8ccb7b1

Batcher loop is now easier to navigate. Time waits replaced with ticker. Metrics for performance monitoring are now included. Different batchers can be run in parallel with name attribute used to distinguish.

Use batcher naming

0dbeccc

Trace batch callback for speed

72c5659

Allow export of batcher metrics

cfde811

e2b-request-same-site-reviewers bot assigned jakubno Apr 7, 2026

claude bot reviewed Apr 7, 2026

packages/clickhouse/pkg/batcher/batcher.go (resolved)

claude bot reviewed Apr 7, 2026

packages/clickhouse/pkg/batcher/batcher.go (resolved)

sitole marked this pull request as ready for review April 7, 2026 09:59

sitole requested review from ValentaTomas, dobrac and jakubno as code owners April 7, 2026 09:59

chatgpt-codex-connector bot reviewed Apr 7, 2026

packages/clickhouse/pkg/batcher/batcher.go (resolved)

claude bot reviewed Apr 7, 2026

packages/clickhouse/pkg/events/delivery.go (resolved)

packages/clickhouse/pkg/batcher/batcher.go (resolved)

jakubno requested changes Apr 7, 2026

packages/clickhouse/pkg/batcher/batcher.go (resolved)

Reset ticker after each flush for consistent MaxDelay window

376a09a

Without reset, a size-triggered flush near a tick boundary would give the next batch less than MaxDelay to accumulate. Now each batch always gets a full MaxDelay window after the previous flush.

sitole requested a review from jakubno April 7, 2026 13:10

jakubno approved these changes Apr 8, 2026

sitole merged commit babe558 into main Apr 8, 2026
44 checks passed

sitole deleted the chore/events-batcher-observability branch April 8, 2026 07:57

sitole merged 7 commits into main from chore/events-batcher-observability
