fix: log underlying errors when canvas dashboard returns HTTP 500#4810
Closed
cursor[bot] wants to merge 1 commit into
Closed
fix: log underlying errors when canvas dashboard returns HTTP 500#4810cursor[bot] wants to merge 1 commit into
cursor[bot] wants to merge 1 commit into
Conversation
When the canvas dashboard endpoints (GET/PUT /api/v1/canvases/{id}/dashboard)
return codes.Internal, the underlying error was discarded by the handler and
overwritten with a sanitized message. The grpc error sanitizer then only logged
the sanitized message ("failed to load canvas dashboard"), making it
impossible to diagnose the root cause from the logs or Sentry.
Capture the actual error with the canvas_id and organization_id fields before
wrapping it in a status.Error so the next HTTP 500 from this endpoint has the
context needed to investigate (DB error, table missing, structpb conversion
failure, etc.).
Also switch FindCanvasDashboardInTransaction to use errors.Is for the
gorm.ErrRecordNotFound check, which is the safer pattern matching the rest of
the codebase.
Refs: Sentry issue 7483010621
Co-authored-by: Aleksandar Mitrovic <AleksandarCole@users.noreply.github.com>
|
👋 Commands for maintainers:
|
forestileao
added a commit
that referenced
this pull request
May 25, 2026
<!-- CURSOR_AGENT_PR_BODY_BEGIN --> ## Summary Sentry issue [7504868852](https://superplane.sentry.io/issues/7504868852/) reports an `HTTP 500 /account/limits` event captured at level `info` by the `captureHTTPError` middleware in `pkg/public/middleware/logging.go`. That middleware only attaches the URL and status (`CaptureMessage("HTTP %d %s", status, path)`), so the Sentry event by itself has no information about what actually failed. The `/account/limits` endpoint is served by `getOrganizationCreationStatus` in `pkg/public/server.go`, which delegates to `describeOrganizationCreationStatus`. That function can fail at several distinct stages: - `models.CountOrganizationsByBillingAccount` (DB query) - `usage.Service.CheckAccountLimits` (gRPC to the usage service) - `usage.Service.SetupAccount` (lazy provisioning, called on a `codes.NotFound`) - the second `CheckAccountLimits` call after lazy provisioning The handler previously collapsed all of these into a single `log.Errorf("Error loading organization creation status for account %s: %v", ...)`. Because of `%v` on a wrapped error the underlying cause was technically printed, but the entry was unstructured and didn't expose which stage failed or the gRPC status code — neither in the application logs nor (via correlation) in Sentry. ## Changes `pkg/public/server.go` - `describeOrganizationCreationStatus`: emit a structured `log.WithError(err).WithField("account_id", ...).WithField("stage", ...)` entry at each failure point, identifying the stage (`count_organizations`, `check_account_limits`) so future Sentry occurrences can be filtered, alerted on, and triaged from logs. - `checkAccountOrganizationCreationLimits`: log the `SetupAccount` failure and the retry-`CheckAccountLimits` failure separately, both with `account_id` and `grpc_code` fields, instead of silently returning the raw gRPC error. - `getOrganizationCreationStatus` / `createOrganization`: drop the redundant top-level `log.Errorf` (the cause is now logged from the call site with structured fields) and replace with a short tagged `Error` line, so we don't double-log the same chain. `pkg/public/server_test.go` - Extend `fakePublicUsageService` with a `checkAccountErr` field so tests can simulate gRPC failures. - Add `Test__GetOrganizationCreationStatus/returns 500 with diagnostic context when the usage service is unavailable`: when the usage service returns `codes.Unavailable`, the endpoint must still respond with `HTTP 500` (no panic, no nil dereference) — exercising the previously-untested gRPC failure path that drives the Sentry alert. ## Why diagnostic-only This mirrors the same pattern used to triage prior `HTTP 500 ...` Sentry issues: - PR #4810 — `fix: log underlying errors when canvas dashboard returns HTTP 500` (Sentry 7483010621) - PR #4929 — `fix: log underlying errors when agent chat endpoint returns HTTP 500` (Sentry 7495744767) The handler still returns the same HTTP 500 response when the upstream dependency really is broken, but the next occurrence of this Sentry issue will have the actual underlying error (DB failure / gRPC `Unavailable` / `DeadlineExceeded` / etc.) in the application logs with structured fields, so the root cause can be acted on directly. ## Validation The dev environment requires Docker (unavailable in this VM), so the full `make test` / `make lint` / `make check.build.app` chain was not run. As a substitute: - `go build ./pkg/public/... ./pkg/models/... ./pkg/usage/... ./pkg/public/middleware/...` — clean - `go vet ./pkg/public/...` — clean - `gofmt -s -w` / `goimports -w` on the touched files — no further diff ## Refs - Sentry issue [7504868852](https://superplane.sentry.io/issues/7504868852/) - Prior precedents: PR #4810, PR #4929 <!-- CURSOR_AGENT_PR_BODY_END --> <div><a href="https://cursor.com/agents/bc-486bf509-b07b-46bc-89f3-c5c39530be7d"><picture><source media="(prefers-color-scheme: dark)" srcset="https://cursor.com/assets/images/open-in-web-dark.png"><source media="(prefers-color-scheme: light)" srcset="https://cursor.com/assets/images/open-in-web-light.png"><img alt="Open in Web" width="114" height="28" src="https://cursor.com/assets/images/open-in-web-dark.png"></picture></a> <a href="https://cursor.com/background-agent?bcId=bc-486bf509-b07b-46bc-89f3-c5c39530be7d"><picture><source media="(prefers-color-scheme: dark)" srcset="https://cursor.com/assets/images/open-in-cursor-dark.png"><source media="(prefers-color-scheme: light)" srcset="https://cursor.com/assets/images/open-in-cursor-light.png"><img alt="Open in Cursor" width="131" height="28" src="https://cursor.com/assets/images/open-in-cursor-dark.png"></picture></a> </div> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Pedro Leão <60622592+forestileao@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Adds proper error logging to the canvas dashboard endpoints so that the underlying cause of HTTP 500 responses is captured in logs/Sentry.
Why
Sentry issue 7483010621 reports an
HTTP 500 /api/v1/canvases/{id}/dashboardevent. The dashboard endpoints were added in #4782. WhenGetCanvasDashboard/UpdateCanvasDashboardfail inmodels.FindCanvas,models.FindCanvasDashboard, orserializeCanvasDashboard, they wrap the error withstatus.Error(codes.Internal, "...")without first logging the original error.The grpc error sanitizer (
pkg/grpc/error_sanitizer.go) then only logs the sanitized status error (e.g.failed to load canvas dashboard), discarding the underlying cause. As a result, the Sentry event for the 500 only contains the URL — there is no message in our logs that tells us whether the failure was a DB error, a missing table, a structpb conversion failure on a stored panel, etc.Changes
pkg/grpc/actions/canvases/get_canvas_dashboard.go: log the original error withcanvas_idandorganization_idfields before eachcodes.Internalreturn.pkg/grpc/actions/canvases/update_canvas_dashboard.go: same treatment for theFindCanvasandserializeCanvasDashboardfailure paths (the transaction failure path already logs).pkg/models/canvas_dashboard.go: switch theErrRecordNotFoundcheck inFindCanvasDashboardInTransactiontoerrors.Is, matching the rest of the codebase and safe against wrapped errors.These are minimal, low-risk changes. They do not alter the observable API behavior — the same gRPC code / HTTP status / message is still returned — but the next occurrence of this Sentry issue will surface the actual root cause so it can be fixed properly.
Testing
make format.go,make lint,make check.build.appall pass.make test PKG_TEST_PACKAGES=./pkg/grpc/actions/canvases—200 testspass (existing coverage exercises every error path that now logs).make test PKG_TEST_PACKAGES=./pkg/models—49 testspass.