Skip to content

feat(api): add gRPC timeout middleware for API#2155

Closed
jakubno wants to merge 1 commit into
revert-2156-chore/disable-timeoutfrom
fix/add-request-timeout-to-autoresume
Closed

feat(api): add gRPC timeout middleware for API#2155
jakubno wants to merge 1 commit into
revert-2156-chore/disable-timeoutfrom
fix/add-request-timeout-to-autoresume

Conversation

@jakubno
Copy link
Copy Markdown
Member

@jakubno jakubno commented Mar 18, 2026

Note

Medium Risk
Applies a hard deadline to all API gRPC unary calls, which can change behavior for long-running requests and introduce new cancellations if timeouts are mis-sized.

Overview
Adds a reusable unary gRPC timeout interceptor and updates the shared NewGRPCServer helper to accept optional extra unary interceptors. The API now uses the same DefaultRequestTimeout value for both HTTP and gRPC by wiring the timeout interceptor into its gRPC server setup, causing unary gRPC handlers to run under a per-request context deadline with an explicit timeout cause.

Written by Cursor Bugbot for commit a8578c6. This will update automatically on new commits. Configure here.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8578c6da3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/api/main.go
}

grpcServer := e2bgrpc.NewGRPCServer(tel)
grpcServer := e2bgrpc.NewGRPCServer(tel, e2bgrpc.UnaryTimeoutInterceptor(requestTimeout))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Exempt long-running resume RPCs from global 60s timeout

Adding UnaryTimeoutInterceptor(requestTimeout) to the API gRPC server imposes the HTTP request timeout on every unary RPC, which conflicts with ResumeSandbox's explicit 5-minute budget (timeout := 300 * time.Second in packages/api/internal/handlers/proxy_grpc.go). In environments where resuming a sandbox can exceed ~60s (cold starts, node pressure), these calls will now be canceled by the interceptor with DeadlineExceeded before the handler’s intended timeout, causing resume failures that were previously allowed.

Useful? React with 👍 / 👎.

Comment thread packages/api/main.go
// server's write deadline kills the connection (WriteTimeout does NOT
// cancel r.Context(); see https://github.com/golang/go/issues/59602).
requestTimeout = 60 * time.Second
requestTimeout = customMiddleware.DefaultRequestTimeout
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

customMiddleware.DefaultRequestTimeout is not defined anywhere in the codebase at the PR head commit. It does not exist in packages/api/internal/middleware/timeout.go or any other file. This will fail to compile. You need to add const DefaultRequestTimeout = 60 * time.Second to the middleware package (or inline the literal here).

Comment thread packages/api/main.go
}

grpcServer := e2bgrpc.NewGRPCServer(tel)
grpcServer := e2bgrpc.NewGRPCServer(tel, e2bgrpc.UnaryTimeoutInterceptor(requestTimeout))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ResumeSandbox in proxy_grpc.go (lines 104-106) explicitly sets a 300-second timeout with the comment: "Fixed 5 minutes for client-proxy initiated resume. This intentionally does not allow callers to override timeouts via gRPC."\n\nThe new 60-second interceptor applies to ALL unary calls including ResumeSandbox, so the ctx passed to startSandboxInternal and then CreateSandbox will be cancelled after 60 seconds, well before the intended 5 minutes. Auto-resume operations that take longer than 60 seconds will fail with DeadlineExceeded.\n\nThe interceptor should either skip ResumeSandbox via a per-method check (using grpc.UnaryServerInfo.FullMethod), or ResumeSandbox should detach from the incoming context and create its own timeout context from context.Background().

@jakubno jakubno marked this pull request as draft March 18, 2026 11:09
@jakubno jakubno changed the base branch from main to revert-2156-chore/disable-timeout March 18, 2026 16:35
@jakubno jakubno deleted the branch revert-2156-chore/disable-timeout March 20, 2026 15:15
@jakubno jakubno closed this Mar 20, 2026
@ValentaTomas ValentaTomas deleted the fix/add-request-timeout-to-autoresume branch March 20, 2026 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants