feat(admin): React + Vite SPA for the admin dashboard (P3)#649
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 17 minutes and 23 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (29)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@claude review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new React-based administrative dashboard for the elastickv service, which is embedded into the Go binary for distribution. The dashboard provides interfaces for cluster monitoring, DynamoDB table management, and S3 bucket administration. The review feedback highlights several opportunities to improve network efficiency and code maintainability, specifically by ensuring that AbortSignal is consistently passed through the API client to support request cancellation. Additionally, the review recommends removing the redundant safe helper function in the dashboard, as its logic is already handled by the useApiQuery hook.
| listTables: (next_token?: string) => | ||
| apiFetch<DynamoTableList>("/dynamo/tables", { query: { next_token } }), | ||
| describeTable: (name: string) => | ||
| apiFetch<DynamoTable>(`/dynamo/tables/${encodeURIComponent(name)}`), | ||
| createTable: (req: CreateTableRequest) => | ||
| apiFetch<DynamoTable>("/dynamo/tables", { method: "POST", body: req as unknown as Json }), | ||
| deleteTable: (name: string) => | ||
| apiFetch<void>(`/dynamo/tables/${encodeURIComponent(name)}`, { method: "DELETE" }), | ||
| listBuckets: (next_token?: string) => | ||
| apiFetch<S3BucketList>("/s3/buckets", { query: { next_token } }), | ||
| describeBucket: (name: string) => | ||
| apiFetch<S3Bucket>(`/s3/buckets/${encodeURIComponent(name)}`), | ||
| createBucket: (req: CreateBucketRequest) => | ||
| apiFetch<S3Bucket>("/s3/buckets", { method: "POST", body: req as unknown as Json }), | ||
| putBucketAcl: (name: string, acl: "private" | "public-read") => | ||
| apiFetch<void>(`/s3/buckets/${encodeURIComponent(name)}/acl`, { | ||
| method: "PUT", | ||
| body: { acl }, | ||
| }), | ||
| deleteBucket: (name: string) => | ||
| apiFetch<void>(`/s3/buckets/${encodeURIComponent(name)}`, { method: "DELETE" }), |
There was a problem hiding this comment.
| const tables = useApiQuery((signal) => safe(() => api.listTables(), signal), []); | ||
| const buckets = useApiQuery((signal) => safe(() => api.listBuckets(), signal), []); |
| // safe wraps a promise so a 404 from a not-yet-wired backend route | ||
| // does not cancel the dashboard render. Other errors propagate. | ||
| function safe<T>(fn: () => Promise<T>, signal?: AbortSignal): Promise<T> { | ||
| if (signal?.aborted) return Promise.reject(new ApiError(0, "aborted", "")); | ||
| return fn(); | ||
| } |
There was a problem hiding this comment.
The safe helper function is redundant and its comment is misleading. It doesn't actually catch 404 errors (which are already handled gracefully by the SummaryCard component via the error state from useApiQuery). Furthermore, the manual signal check is unnecessary as useApiQuery already handles abort logic. This helper should be removed.
| export function DynamoDetailPage() { | ||
| const { name = "" } = useParams<{ name: string }>(); | ||
| const { session } = useAuth(); | ||
| const detail = useApiQuery((_signal) => api.describeTable(name), [name]); |
|
|
||
| export function DynamoListPage() { | ||
| const { session } = useAuth(); | ||
| const tables = useApiQuery((_signal) => api.listTables(undefined), []); |
| export function S3DetailPage() { | ||
| const { name = "" } = useParams<{ name: string }>(); | ||
| const { session } = useAuth(); | ||
| const detail = useApiQuery((_signal) => api.describeBucket(name), [name]); |
|
|
||
| export function S3ListPage() { | ||
| const { session } = useAuth(); | ||
| const buckets = useApiQuery((_signal) => api.listBuckets(), []); |
|
Claude finished @bootjp's task in 2m 44s —— View job Code Review — feat(admin): React + Vite SPA for the admin dashboard (P3)
Overall this is a well-structured SPA. The Go side is clean, the auth model is sound, and the Bug —
|
|
Pushed Bugs (fixed):
Drive-by polish (also from the reviews):
Out of scope (low-priority follow-up):
@claude review |
|
Claude finished @bootjp's task in 1m 26s —— View job Code Review — follow-up on
|
|
Thanks for the thorough re-review on The harmless redundancy you noted in |
Phase P3 of docs/design/2026_04_24_proposed_admin_dashboard.md. The Go side (auth, JWT, cluster handler, dynamo handler, follower-> leader forwarding, embed.go skeleton) shipped in earlier PRs but StaticFS was wired as nil, leaving /admin/* answering 404 for any non-API path. This PR adds the React bundle + flips StaticFS on. Source layout (web/admin/): - React 18 + TypeScript + Vite per design doc Section 5.1 - Tailwind CSS for tokens; no UI framework dependency (kept the bundle at ~60 KB gzip per Section 7.2's ~150 KB target) - vite.config.ts writes the build output straight into internal/admin/dist so `npm run build && go build` produces a single binary with the latest SPA Pages (Section 5.2): - /login - SigV4 access-key + secret-key form, server-side JWT Cookie issued, status-specific error messages for 401/403/429 - / (dashboard) - cluster overview, raft groups, summary cards - /dynamo - table list + Create dialog (full role only) - /dynamo/:name - schema + GSIs + Delete confirmation modal - /s3 - bucket list + Create dialog (endpoint pending, renders graceful "endpoint pending" notice on 404) - /s3/:name - bucket meta + ACL toggle + Delete (same) Auth (Section 6): - Cookie-based session (admin_session, HttpOnly) handled by browser - CSRF double-submit: admin_csrf cookie value echoed in X-Admin-CSRF header on every POST/PUT/DELETE (apiFetch in src/api/client.ts) - Read-only sessions hide Create/Delete affordances and surface a RequireFullAccess notice; backend re-evaluates the role - Session expiry tracked in sessionStorage; expires_at trip clears in-memory state and redirects to /login Backend wiring: - main_admin.go: load admin.StaticFS() and pass to ServerDeps so /admin/assets/* and the SPA fallback start serving real files - internal/admin/embed.go: //go:embed all:dist on the build output - .gitignore: keep the placeholder index.html committed (so go:embed always has at least one file in a fresh clone), exclude hashed Vite assets and the npm/vite caches - committed bundles invariably drift from source Verification: - npm run build: 192 KB JS / 60 KB gzip + 14 KB CSS - npm run lint (tsc strict, noUnusedLocals, noUncheckedSideEffects) - go build ./... - go test -race ./internal/admin/... - golangci-lint run ./internal/admin/... S3 admin endpoints (CreateBucket / ListBuckets / PutBucketAcl / DeleteBucket) are not yet wired on the Go side; the SPA pages render a soft "endpoint pending" notice on the 404 they currently get and will populate transparently once the handlers ship.
Two converging review findings (Claude bot + Gemini Code Assist): 1. AbortSignal not plumbed through listTables / describeTable / listBuckets / describeBucket. useApiQuery created an AbortController and passed signal to the loader, but those four methods discarded it - so requests ran to completion after a component unmounted, wasting network and server connections. Now every read method mirrors cluster's signature (`signal?: AbortSignal`) and every page-level useApiQuery callsite passes it through. 2. The safe() wrapper in Dashboard.tsx was a no-op that received `signal` and never passed it on. The "404 doesn't cancel render" suppression it claimed to provide is already handled by SummaryCard testing `error?.status === 404 && pendingMessage`. Removed. Drive-by polish from the same reviews: - useApiQuery's effect deps no longer include `markUnauthorized`. Read it through markUnauthorizedRef so an AuthProvider remount (React Fast Refresh in dev, or any future wrapping change) does not invalidate every active query and trigger a fresh round-trip on every page. Mirrors the existing loaderRef pattern. - Login.tsx now trims `secretKey` in addition to `accessKey`. SigV4 secrets never contain whitespace, so trimming is safe and saves an operator from a cryptic 401 after pasting a key with a trailing newline. Modal accessibility (no focus trap, missing aria-modal / role=dialog) was flagged as low-priority polish for an internal operator tool. Tracked as a follow-up; not addressed here to keep this change focused on the functional bugs. Verified with `npm run build` and `npm run lint` (tsc --strict --noUnusedLocals --noUncheckedSideEffectImports), and `go build ./...`.
6349276 to
e3b9555
Compare
|
Claude finished @bootjp's task in 1m 45s —— View job Code Review — post-rebase check (onto main @ 76cb788)
The rebase is clean. No regressions, no unexpected changes. Conflict resolution in // lines 448–460 after rebase:
staticFS, err := admin.StaticFS()
if err != nil {
return nil, errors.Wrap(err, "open embedded admin SPA")
}
server, err := admin.NewServer(admin.ServerDeps{
...
Forwarder: forwarder, // ← from #648 on main
StaticFS: staticFS, // ← from this branch (was nil)
...
})
All round-2 fixes remain intact:
This is ready to merge. |
Two follow-ups from the review of #649: 1. useApi.ts: removed the `if (ctrl.signal.aborted) return;` check inside the catch handler. The cleanup function sets `cancelled = true` *before* calling `ctrl.abort()`, so any abort path is already covered by the preceding `cancelled` check. The second guard was harmless belt-and-suspenders dead code; the comment now records why one check is enough. 2. Modal.tsx: added the WAI-ARIA dialog contract (`role="dialog"`, `aria-modal="true"`, `aria-labelledby` pointing at the title) and a focus trap: - On open: store the previously-focused element, then focus the first focusable inside the dialog so keyboard users do not stay stranded on the page underneath. Empty dialogs fall back to focusing the dialog container itself with tabindex=-1. - Tab / Shift+Tab cycle within the dialog instead of escaping to the page underneath. - On close: restore focus to whoever opened the dialog (skipped when that element has been unmounted, e.g. the dialog deleted the row whose button opened it). - Escape and backdrop click stay wired through the existing `busy` gate so a half-applied operation cannot be dismissed. Implementation is vanilla DOM querying — focus-trap-react / react-focus-lock would have added 5–7 KB gzip for what is a ~40 line vanilla solution. Bundle delta from this PR is +0.4 KB gzip (60.23 -> 60.59 KB). Verified with `npm run build` and `npm run lint` (tsc --strict --noUnusedLocals --noUncheckedSideEffectImports), and `go build ./...`. Stacked on top of #649 (feat/admin-dashboard-spa); merge that one first, then this rebases cleanly onto main.
Bridges the in-process keyviz.MemSampler into the in-node admin HTTP listener (Layer C) so the SPA can render the heatmap without an extra hop through the gRPC Admin service. The SPA was merged in PR #649 but has no KeyViz page yet — this PR is the backend half; the SPA page is task B in the design's Phase 2 follow-up. - New KeyVizSource interface in internal/admin (Snapshot only) so *keyviz.MemSampler satisfies it without a wrapper. Tests pass an in-memory fake. - KeyVizHandler serves GET /admin/api/v1/keyviz/matrix. Query parameters: series (default writes per design §4.1), from_unix_ms, to_unix_ms, rows (capped at keyVizRowBudgetCap=1024 per design §4.1). - Returns 503 keyviz_disabled when the source is nil so the SPA renders an explicit "feature off" state instead of an empty matrix. - Pivots column-major MatrixColumn into row-major JSON, mirroring the adapter (gRPC) handler from PR #646. The duplication is intentional for now — extracting a shared pivot helper is a future cleanup. - ServerDeps.KeyViz field; buildAPIMux always registers the route (even when source is nil) so 503 is served instead of 404. - main.go threads *keyviz.MemSampler through startAdminFromFlags → startAdminServer → buildAdminHTTPServer; keyvizSourceFromSampler converts the typed-nil to interface-nil so the handler's "is keyviz disabled" check fires correctly. Tests: - TestKeyVizHandlerReturnsServiceUnavailableWhenNoSource - TestKeyVizHandlerRejectsNonGet - TestKeyVizHandlerPivotsMatrix (two-column, two-route fixture) - TestKeyVizHandlerSeriesParam (table-driven across all four enums) - TestKeyVizHandlerSeriesParamRejectsUnknown (400 invalid_query) - TestKeyVizHandlerHonorsRowsBudget - TestKeyVizHandlerEncodesAggregateBucket
## Summary Two follow-ups from the [#649](#649) review that were intentionally deferred to keep that PR focused on the functional bugs. **Stacked on `feat/admin-dashboard-spa`** (#649). Once #649 merges, this rebases cleanly onto `main`. ### 1. `useApi.ts` — drop the redundant abort guard The `catch` handler had two checks back-to-back: ```ts if (cancelled) return; if (ctrl.signal.aborted) return; // ← unreachable ``` The cleanup function sets `cancelled = true` *before* calling `ctrl.abort()`, so any abort path is already covered by the first check. Dropped the second guard and recorded *why* one check is enough in a comment. ### 2. `Modal.tsx` — full WAI-ARIA dialog contract + focus trap Added: - `role="dialog"`, `aria-modal="true"`, `aria-labelledby` on the dialog root (uses `useId()` for the title id). - Focus management: - On open: store the previously-focused element, focus the first focusable inside the dialog. - Tab / Shift+Tab cycle within the dialog instead of escaping to the page underneath. - On close: restore focus to whoever opened the dialog (skipped if that element was unmounted by the dialog action — e.g. the dialog deleted the row whose button opened it). - Empty-dialog fallback focuses the dialog container itself with `tabindex=-1`. Vanilla DOM implementation — `focus-trap-react` / `react-focus-lock` would have been 5–7 KB gzip for what is a ~40 line solution. **Bundle delta**: +0.4 KB gzip (60.23 → 60.59 KB JS, 14.55 → 14.57 KB CSS). ## Test plan - [x] `cd web/admin && npm install && npm run build` - [x] `cd web/admin && npm run lint` — `tsc --strict` - [x] `go build ./...` - [ ] Manual smoke (after #649 merges and the dashboard is visitable): - Open Create Table / Delete Table dialog → focus lands inside the dialog. - Tab past the last button → wraps back to the close button. - Shift+Tab from the close button → wraps to the last form field. - Escape closes (when not `busy`) and focus returns to the trigger button. - Screen reader announces the dialog name from the title text.
P2 slice 1 of [docs/design/2026_04_24_proposed_admin_dashboard.md](https://github.com/bootjp/elastickv/blob/main/docs/design/2026_04_24_proposed_admin_dashboard.md). Read-only S3 admin endpoints land first so the SPA's S3List and S3Detail pages stop hitting 404. Write paths (POST/PUT/DELETE/ACL) ship in slice 2 together with AdminForward integration; until then they reply 405. ## Summary - **`*adapter.S3Server.AdminListBuckets` / `AdminDescribeBucket`** — SigV4-bypass read methods. Share `loadBucketMetaAt` + the metadata-prefix scan with the SigV4 path, so a SigV4 `listBuckets` and the admin dashboard cannot drift. - **`internal/admin/buckets_source.go`** — `BucketsSource` interface + `BucketSummary` DTO + sentinel errors (`ErrBucketsForbidden` / `ErrBucketsNotLeader` / `ErrBucketsNotFound` / `ErrBucketsAlreadyExists`). `ErrBucketsForbidden` is wired in this slice; the others are reserved for slice 2 to keep the error vocabulary additive. - **`internal/admin/s3_handler.go`** — `S3Handler` with `handleList` (paginated) + `handleDescribe` (404 for missing). Sub-paths under `/buckets/{name}/` (the future `/acl`) return 404 here so a SPA bug pointed at this build cannot accidentally hit the describe path with a `"{name}/acl"` string. - **Shared `list_pagination.go`** — centralises base64url cursor + limit parsing/clamping so the Dynamo and S3 handlers cannot diverge on validation rules. Drops the now-redundant `parseDynamoListLimit` / `decodeDynamoNextToken` / `encodeDynamoNextToken` from `dynamo_handler.go` in favour of `parseListPaginationParams` / `decodeListNextToken` / `encodeListNextToken`. - **`apiRouteTable` in `server.go`** — bundles the precomposed middleware chains and dispatches by URL prefix. `buildAPIMux`'s body went from 13 cyclomatic branches to 6, leaving headroom for the next resource family (SQS queues, etc.) to land without another refactor. - **Production wiring** — `main_s3.go`'s `startS3Server` now returns `*adapter.S3Server`; `runtimeServerRunner` stores it; `startAdminFromFlags` accepts it and threads `bucketsBridge` → `admin.BucketsSource` → `ServerDeps.Buckets`. ## What is NOT in this PR - `AdminCreateBucket` / `AdminPutBucketAcl` / `AdminDeleteBucket` on the adapter and their HTTP/AdminForward counterparts. Slice 2. - `RoleStore` plumb-through for S3. Slice 2 (read-only is fine for any authenticated session today; the write paths will need it). - `forwarded_from` audit trail for S3 admin writes. Slice 2 — needs the `pb.AdminOperation` enum extended. - The merge-freeze-acceptable changes to `proto/admin_forward.proto`. Slice 2. ## Test plan - [x] `go build ./...` - [x] `go vet ./...` - [x] `golangci-lint run` (admin + main + adapter packages: 0 issues) - [x] `go test ./internal/admin/ -count=1 -race` — 13 new handler tests pass - [x] `go test . -count=1 -race` — main package passes - [x] `go test -run "TestS3Server_Admin" ./adapter/ -count=1 -race` — 4 new adapter tests pass - [ ] `go test ./adapter/` times out at 120s due to a pre-existing flake (verified earlier on PR #648 against `main` — unrelated to this branch) - [ ] End-to-end smoke against a 3-node cluster — needs the next slice + the SPA PR (#649) to merge before there's anything for the SPA to render ## Acceptance criteria coverage (Section 4.1) | Endpoint | This PR | |---|---| | `GET /admin/api/v1/s3/buckets` | ✓ | | `POST /admin/api/v1/s3/buckets` | ⏳ slice 2 | | `GET /admin/api/v1/s3/buckets/{name}` | ✓ | | `PUT /admin/api/v1/s3/buckets/{name}/acl` | ⏳ slice 2 | | `DELETE /admin/api/v1/s3/buckets/{name}` | ⏳ slice 2 | ## Self-review (5 lenses) 1. **Data loss**: No FSM / Raft / Pebble path changes. `AdminListBuckets` / `AdminDescribeBucket` are read-only and use the same `ScanAt` / `loadBucketMetaAt` as the SigV4 path. 2. **Concurrency**: No new shared state. The handler holds only the immutable `BucketsSource` interface; the bridge holds only the immutable `*S3Server`. `pinReadTS` is the same pattern the SigV4 path uses for snapshot stability. 3. **Performance**: One additional metadata scan per `GET /admin/api/v1/s3/buckets`, identical to the SigV4 listBuckets. Pagination caps at 1000 buckets per page (silent clamp on oversize). No hot-path changes. 4. **Data consistency**: Read-only; no commit-ts ordering risk. Admin reads use the same lease-read window as the SigV4 path. 5. **Test coverage**: 13 handler tests + 4 adapter tests + the existing dynamo/cluster test surfaces still pass after the shared-helper refactor. The pagination cursor round-trip test pins the wire format so a future cursor-encoding change cannot ship without breaking the test. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **New Features** * Added S3 bucket admin API endpoints to list all buckets and retrieve detailed bucket information, including creation timestamps. * Implemented pagination support with cursor-based next_token for browsing large bucket collections. * Added proper authorization checks and error handling for admin bucket operations (403 for forbidden, 404 for not found, 500 for errors). <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Bridges the in-process keyviz.MemSampler into the in-node admin HTTP listener (Layer C) so the SPA can render the heatmap without an extra hop through the gRPC Admin service. The SPA was merged in PR #649 but has no KeyViz page yet — this PR is the backend half; the SPA page is task B in the design's Phase 2 follow-up. - New KeyVizSource interface in internal/admin (Snapshot only) so *keyviz.MemSampler satisfies it without a wrapper. Tests pass an in-memory fake. - KeyVizHandler serves GET /admin/api/v1/keyviz/matrix. Query parameters: series (default writes per design §4.1), from_unix_ms, to_unix_ms, rows (capped at keyVizRowBudgetCap=1024 per design §4.1). - Returns 503 keyviz_disabled when the source is nil so the SPA renders an explicit "feature off" state instead of an empty matrix. - Pivots column-major MatrixColumn into row-major JSON, mirroring the adapter (gRPC) handler from PR #646. The duplication is intentional for now — extracting a shared pivot helper is a future cleanup. - ServerDeps.KeyViz field; buildAPIMux always registers the route (even when source is nil) so 503 is served instead of 404. - main.go threads *keyviz.MemSampler through startAdminFromFlags → startAdminServer → buildAdminHTTPServer; keyvizSourceFromSampler converts the typed-nil to interface-nil so the handler's "is keyviz disabled" check fires correctly. Tests: - TestKeyVizHandlerReturnsServiceUnavailableWhenNoSource - TestKeyVizHandlerRejectsNonGet - TestKeyVizHandlerPivotsMatrix (two-column, two-route fixture) - TestKeyVizHandlerSeriesParam (table-driven across all four enums) - TestKeyVizHandlerSeriesParamRejectsUnknown (400 invalid_query) - TestKeyVizHandlerHonorsRowsBudget - TestKeyVizHandlerEncodesAggregateBucket
…s [] Two medium findings from Gemini review on PR #670: 1. CountersTruncated field was stale on internal/admin.QueueSummary. The PR description called out that the field was dropped because main's sqsApproxCounters does not expose truncation, but I missed removing the JSON-side mirror in internal/admin/sqs_handler.go. The SPA was never reading it (SqsDetail.tsx already had the reference removed in this PR), so the field was load-bearing for nothing. Removed. 2. handleList serialised an empty queue catalog as {"queues": null} when AdminListQueues returned nil. The SPA iterates the array directly and would crash on null. Normalise nil to []string{} immediately before encoding so the response shape is always {"queues": []} even on the empty case. No new tests because the existing internal/admin race test suite already exercises the encoder path (would have caught a json.Marshal break) and the empty-catalog case is exactly what the SPA hits the moment a fresh node comes up — manual smoke during PR #649 already verified the SPA renders an empty list cleanly when the array is []. Verified with go build ./..., go test -race ./internal/admin/..., go test -race -run TestSQS ./adapter/, golangci-lint run ./adapter/... ./internal/admin/... — all clean, no //nolint.
) ## Summary Bridges the in-process `keyviz.MemSampler` into the in-node admin HTTP listener (Layer C, `internal/admin`) so the SPA can render the heatmap without an extra hop through the gRPC Admin service. The SPA was merged in #649 but has no KeyViz page yet — this PR is the backend half; the SPA page will follow as a separate slice. - New `KeyVizSource` interface in `internal/admin` (Snapshot only) so `*keyviz.MemSampler` satisfies it without a wrapper. Tests pass an in-memory fake. - `KeyVizHandler` serves `GET /admin/api/v1/keyviz/matrix`. Query parameters: `series` (default `writes` per design §4.1), `from_unix_ms`, `to_unix_ms`, `rows` (capped at `keyVizRowBudgetCap=1024` per design §4.1). - Returns `503 keyviz_disabled` when the source is nil so the SPA renders an explicit "feature off" state instead of an empty matrix. - Pivots column-major `MatrixColumn` into row-major JSON, mirroring the adapter (gRPC) handler from PR #646. The duplication is intentional for now — extracting a shared pivot helper is a future cleanup. - `ServerDeps.KeyViz` field; `buildAPIMux` always registers the route (even when source is nil) so `503` is served instead of `404`. - `main.go` threads `*keyviz.MemSampler` through `startAdminFromFlags` → `startAdminServer` → `buildAdminHTTPServer`; `keyvizSourceFromSampler` converts the typed-nil to interface-nil so the handler's "is keyviz disabled" check fires correctly. ## Test plan - [x] `TestKeyVizHandlerReturnsServiceUnavailableWhenNoSource` — 503 + `keyviz_disabled` code. - [x] `TestKeyVizHandlerRejectsNonGet` — 405. - [x] `TestKeyVizHandlerPivotsMatrix` — two-column, two-route fixture; verifies missing-row-becomes-zero contract. - [x] `TestKeyVizHandlerSeriesParam` — table-driven across all four enum values. - [x] `TestKeyVizHandlerSeriesParamRejectsUnknown` — 400 `invalid_query` for typo'd series. - [x] `TestKeyVizHandlerHonorsRowsBudget` — `?rows=2` returns top-2 by activity, sorted by Start. - [x] `TestKeyVizHandlerEncodesAggregateBucket` — virtual-bucket layout with `route_count` from `MemberRoutesTotal` and `route_ids_truncated` flag. - [x] `go test ./internal/admin/... .` clean. - [x] `golangci-lint run . ./internal/admin/...` clean. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added `/admin/api/v1/keyviz/matrix` API endpoint for retrieving key visualization matrix data * Supports filtering by metric type: reads, writes, read_bytes, and write_bytes * Supports time-range filtering with Unix millisecond precision * Supports row limiting and activity-based sorting * Returns 503 status when key visualization sampling is disabled <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Replaces PR #659, which conflicted heavily after main moved (PR #649 squashed; PR #658 added S3 admin endpoints; the Approximate counters implementation now lives directly in adapter/sqs_catalog.go). This PR: Backend (adapter/sqs_admin.go + internal/admin/sqs_handler.go): - SQSServer.AdminListQueues / AdminDescribeQueue / AdminDeleteQueue are SigV4-bypass entrypoints, mirroring the AdminListTables / AdminListBuckets pattern. - AdminDescribeQueue uses the existing scanApproxCounters from sqs_catalog.go (already on main) so the admin path returns the same Visible / NotVisible / Delayed numbers as GetQueueAttributes("All") would, taken at one snapshot read TS. - sqsQueuesBridge in main_admin.go re-shapes adapter.AdminQueueSummary into admin.QueueSummary, keeping internal/admin free of the heavy adapter dependency tree — same pattern as dynamoTablesBridge / s3BucketsBridge. - admin.QueuesSource is opt-in; deployments that don't run --sqsAddress leave /admin/api/v1/sqs/* off the wire and the SPA renders a soft "endpoint pending" notice on the 404. - Role re-evaluation against the live RoleStore on DELETE so a downgraded key cannot keep mutating with a still-valid JWT. - apiRouteTable.dispatch refactored: resourceHandlerFor extracted so the dispatcher stays under cyclop=10 as new resources land (Dynamo, S3, SQS, future). Frontend (web/admin/src/pages/SqsList.tsx, SqsDetail.tsx): - /sqs queue list with refresh + per-row link to detail. - /sqs/:name detail showing FIFO badge, counters card (Visible / In-flight / Delayed), raw attributes table, and a Delete confirmation Modal gated by RequireFullAccess. - api/client.ts gains listQueues / describeQueue / deleteQueue with the same AbortSignal pattern used for cluster / dynamo / s3 reads. - Layout nav adds an SQS tab between DynamoDB and S3. Out of scope (recorded in the SQS partial design doc §16.2): - PurgeQueue from the SPA. Underlying purgeQueueWithRetry is on main; the admin entrypoint is a trivial follow-up. - Send / Peek / CreateQueue from the SPA. Each needs its own adapter entrypoint and form UX; deferred to keep this PR focused. Verified with go build ./..., go test -race ./internal/admin/..., go test -race -run TestSQS ./adapter/, go test -run TestStartAdmin ., golangci-lint run ./adapter/... ./internal/admin/... ./... (0 issues, no //nolint), and cd web/admin && npm run build.
…s [] Two medium findings from Gemini review on PR #670: 1. CountersTruncated field was stale on internal/admin.QueueSummary. The PR description called out that the field was dropped because main's sqsApproxCounters does not expose truncation, but I missed removing the JSON-side mirror in internal/admin/sqs_handler.go. The SPA was never reading it (SqsDetail.tsx already had the reference removed in this PR), so the field was load-bearing for nothing. Removed. 2. handleList serialised an empty queue catalog as {"queues": null} when AdminListQueues returned nil. The SPA iterates the array directly and would crash on null. Normalise nil to []string{} immediately before encoding so the response shape is always {"queues": []} even on the empty case. No new tests because the existing internal/admin race test suite already exercises the encoder path (would have caught a json.Marshal break) and the empty-catalog case is exactly what the SPA hits the moment a fresh node comes up — manual smoke during PR #649 already verified the SPA renders an empty list cleanly when the array is []. Verified with go build ./..., go test -race ./internal/admin/..., go test -race -run TestSQS ./adapter/, golangci-lint run ./adapter/... ./internal/admin/... — all clean, no //nolint.
Per `docs/design/README.md`'s lifecycle convention, the admin dashboard design doc is now "partial": - **P1** (DynamoDB CRUD + AdminForward) — shipped via #634, #635, #644, #648 - **P2** (S3 buckets list/create/delete/ACL + DescribeTable) — shipped via #658, with #669 + #673 in flight - **P3** (React SPA + embed) — shipped via #649, #650 - **P4** (TLS / role / CSRF / operator docs) — TLS, role, CSRF are already live in P1; operator docs in #674 Independent of the in-flight slice 2 PRs (#669/#673) and the docs PR (#674) — this rename only reflects what is already on main today, plus an "Implementation status" table mapping each phase to the PR it landed in. ## What this PR changes - `git mv` the design doc from `2026_04_24_proposed_admin_dashboard.md` to `2026_04_24_partial_admin_dashboard.md` so its history follows - Add an "Implementation status" header table indexing each phase to the PRs that landed it - List the outstanding open items so future readers know what is still owed against the original proposal: - AdminForward acceptance criterion 5 (rolling-upgrade compat flag) — deferred - S3 object browser — explicitly out of scope per Section 2 Non-goals - TLS cert hot-reload — restart-to-rotate is the documented model When the rolling-upgrade flag lands, the doc gets renamed once more to `2026_04_24_implemented_admin_dashboard.md` per the README's lifecycle convention.
## Summary Phase **3.B** of [`docs/design/2026_04_24_partial_sqs_compatible_adapter.md`](https://github.com/bootjp/elastickv/blob/main/docs/design/2026_04_24_partial_sqs_compatible_adapter.md) §16.4. Adds the AWS SQS **query protocol** (form-encoded request, XML response) alongside the existing JSON protocol on the same listener — older `aws-sdk-java` v1, `boto < 1.34`, and AWS CLI clients can now talk to elastickv without modification. Detection happens per-request via `Content-Type` / `X-Amz-Target` / `Action` presence; no flag, no separate port. See the new proposal doc committed alongside. This is an **architectural proof PR**: dispatch + decoding + encoding + error envelope + the `*Core` refactor pattern are all in place, with **three verbs** wired end-to-end as concrete proof. Each follow-up verb is a parser + response struct + one switch arm — no further design work needed. ### Verbs in this PR | Verb | Why it's in the proof set | |---|---| | `CreateQueue` | Exercises the `Attribute.N.{Name,Value}` indexed-collection parser. | | `ListQueues` | Exercises the repeated-element XML response shape. | | `GetQueueUrl` | Exercises the `<ErrorResponse>` envelope path via `QueueDoesNotExist`. | Every other verb returns a structured **501 `NotImplementedYet`** XML envelope so operators see the gap explicitly. `SendMessage` / `ReceiveMessage` / `DeleteMessage` are the highest-priority follow-ups (they need the same `*Core` refactor on the FIFO send loop). ### Key design points - **No new listener / no flag.** `pickSqsProtocol(*http.Request)` decides per request. JSON and Query share the SQS port and the SigV4 path. - **Wire-format-free cores.** `createQueue` / `listQueues` / `getQueueUrl` are now `decode → core → encode` with `core(ctx, in) (out, error)`. The JSON wrappers are unchanged in behavior; existing JSON tests pass without modification. - **DoS protection inherited.** Body read is bounded by the same `sqsMaxRequestBodyBytes` the JSON path uses. - **SigV4 unchanged.** The signed canonical request includes the form-encoded body, so the existing SigV4 middleware verifies query requests without code changes. - **Error parity.** `<Code>` reuses the existing `sqsErr*` constants. HTTP status mirrors what the JSON path returns, so SDK retry classifiers work across protocols. - **Cyclomatic budget honoured.** `handle()` was refactored to extract `handleQueryProtocol` — `cyclop ≤ 10` per project rules, no `//nolint`. ### Known limitation (design §11.4) `proxyToLeader`'s error writer always emits the JSON envelope, so a query-protocol client hitting a follower during a leader flip sees one JSON error before retry lands on the new leader. Follow-up PR threads the detected protocol onto the request context so the proxy emits matching XML. ## Test plan - [x] `go build ./...` — clean - [x] `go test -count=1 -race -run "TestSQS|QueryProtocol|TestPickSqs|TestCollectIndexedKV|TestWriteSQSQueryError" ./adapter/` — passes - [x] `golangci-lint run ./adapter/...` — `0 issues.` - [x] `pickSqsProtocol` table tests cover documented edge cases (header precedence, charset suffix, GET with Action, missing Action, nil request). - [x] `collectIndexedKVPairs` tests cover happy path, orphan Name, empty input, unrelated prefix. - [x] End-to-end via the in-process listener: CreateQueue / ListQueues / GetQueueUrl round-trip on the query side. - [x] **Cross-protocol parity**: a queue created via Query is visible via JSON `GetQueueUrl` with the same URL. - [x] Error envelope: 4xx maps to `<Type>Sender</Type>`, 5xx to `<Type>Receiver</Type>`, namespace pinned, `x-amzn-ErrorType` header set. - [x] Unknown verb returns 501 with the `NotImplementedYet` XML envelope. - [x] Missing `Action` parameter returns 400 (per design §3). ## Self-review (5 lenses) 1. **Data loss** — Wire-format change only. Cores are byte-for-byte identical to the previous handler bodies; no Raft / OCC / MVCC code is touched. 2. **Concurrency** — No new shared state. Detection is request-local. Body parsing is bounded. 3. **Performance** — One additional `Content-Type` string compare per request on the dispatch hot path. Negligible. 4. **Data consistency** — `*Core` returns the same business-logic outputs as before; the JSON tests are the regression net for parity. Cross-protocol parity test pins behaviour. 5. **Test coverage** — 10 new test cases cover detection, parsing, envelope shape, and three end-to-end verbs. Existing `TestSQS*` race suite passes on the refactor. ## Stacking This PR is **independent** — branched from current `main` (which has #638 + #649 merged). It does not depend on PR #650 / PR #659. Merge whenever ready.
) ## Summary **Replaces #659**, which has unresolvable conflicts now that main has moved on (PR #649 squashed into main; PR #658 added the S3 admin endpoints; the Approximate counters implementation now lives directly in `adapter/sqs_catalog.go` via `scanApproxCounters`). Rather than a multi-day rebase, this PR re-applies the unique SQS admin code on a fresh branch off current main. What survived from #659: - `adapter/sqs_admin.go` — `SQSServer.AdminListQueues` / `AdminDescribeQueue` / `AdminDeleteQueue` (SigV4-bypass entrypoints, same shape as `AdminListTables` / `AdminListBuckets`). - `internal/admin/sqs_handler.go` — HTTP handler for `/admin/api/v1/sqs/queues{,/{name}}` with role re-evaluation on DELETE. - `web/admin/src/pages/SqsList.tsx` / `SqsDetail.tsx` — SPA pages for the queues view + delete confirmation. What changed during the re-apply: - `AdminQueueCounters` is now `int64` (matches `sqsApproxCounters` from main; bridge does no width conversion). - `AdminDescribeQueue` calls main's `scanApproxCounters` instead of the duplicate `computeApproxCounters` from the old branch — same numeric output, single implementation. - Dropped the `CountersTruncated` field; main's counter type doesn't expose truncation. SPA's "truncated" pill came out with it. - `apiRouteTable.dispatch` refactored to extract `resourceHandlerFor` so the dispatcher stays under cyclop=10 as new resources land. ## Backend - Re-evaluates the principal's role against the live `MapRoleStore` on every `DELETE` so a downgraded key cannot keep mutating with a still-valid JWT (Codex P1 pattern from earlier admin PRs). - `admin.QueuesSource` is **opt-in**: deployments without `--sqsAddress` leave `/admin/api/v1/sqs/*` off the wire; the SPA renders a soft "endpoint pending" notice on the 404, mirroring the Tables / Buckets `nil` contract. - The bridge in `main_admin.go` (`sqsQueuesBridge`, `convertAdminQueueSummary`, `translateAdminQueuesError`) keeps `internal/admin` free of the heavy adapter dependency tree, same architectural pattern as Dynamo and S3. ## Frontend - **/sqs** queue list with refresh + per-row link to detail. - **/sqs/:name** detail showing FIFO badge, counters card (Visible / In-flight / Delayed), raw attributes table, and a Delete confirmation `Modal` gated by `RequireFullAccess`. - `api/client.ts` gains `listQueues` / `describeQueue` / `deleteQueue` with the same `AbortSignal` pattern used for `cluster` / `dynamo` / `s3` reads. - Layout nav adds an SQS tab between DynamoDB and S3. ## Out of scope (recorded in `docs/design/2026_04_24_proposed_sqs_compatible_adapter.md` Section 14, deferred per the SQS partial doc §16.2) - **PurgeQueue from the SPA** — the underlying `purgeQueueWithRetry` adapter method is on main; the admin entrypoint is a trivial follow-up. - **Send / Peek / CreateQueue from the SPA** — each needs its own SigV4-bypass adapter entrypoint and form UX; deferred to keep this PR focused. ## Test plan - [x] `go build ./...` — clean - [x] `go test -race ./internal/admin/...` — passes - [x] `go test -race -run TestSQS ./adapter/` — passes - [x] `go test -run TestStartAdmin .` — passes - [x] `golangci-lint run ./adapter/... ./internal/admin/... ./...` — `0 issues.`, no `//nolint` - [x] `cd web/admin && npm run build` — 49 modules, 199 KB JS / 61 KB gzip + 14.7 KB CSS - [ ] Manual smoke (after PR lands): start a node with `--sqsAddress :4566 --adminEnabled --adminAllowInsecureDevCookie`, create a queue, send a few messages, hit `/admin/sqs/<name>` → counters match `GetQueueAttributes("All")`, Delete dialog returns to list. ## Self-review (5 lenses) 1. **Data loss** — Delete reuses the existing `deleteQueueWithRetry` OCC path; counters are read-only. No new write paths. 2. **Concurrency** — Per-request leader check on Delete; counters scan uses one snapshot read TS. 3. **Performance** — Counters bounded by main's existing `sqsApproxCounterScanLimit`; admin reads are cheap point lookups + one bounded scan. 4. **Data consistency** — `AdminDescribeQueue` and SigV4 `GetQueueAttributes` both call `scanApproxCounters` at a fresh `nextTxnReadTS`, so a single point in time produces the same counters via either surface. 5. **Test coverage** — Existing admin / SQS race suites stay green via the new `nil` Queues argument added to `startAdminServer` call sites; the new bridge is exercised by the cross-package build itself. ## Stacking This PR is **independent** — branched from current `main` (which has the merged versions of #649 / #658 / #650 / counter implementation). Closing #659 in favour of this clean rewrite.
Per docs/design/README.md's lifecycle convention. The original P1–P4 plan has fully shipped: - P1 (admin skeleton + Dynamo + AdminForward) — #634/#635/#644/#648 - P2 (S3 endpoints incl. write paths and AdminForward integration) — #658 / #669 / #673 / #695 (TOCTOU safety net) - P3 (React SPA + embed) — #649 / #650 - P4 (TLS / role / CSRF / operator doc / deployment runbook / scripts/rolling-update.sh admin support) — #674 / #669 / #678 The AdminDeleteBucket TOCTOU caught during PR #669 review (the last "in-flight" item that kept the doc at _partial_) is fully resolved by the safety-net design landed in #695. What changed: - git mv 2026_04_24_partial_admin_dashboard.md → 2026_04_24_implemented_admin_dashboard.md (history follows the rename) - Header Status line: "Partial" → "Implemented", explanation updated to reflect the post-fix state and the rationale for promotion. - "Last updated" bumped to 2026-04-28 with the rename trigger. - Section heading "Outstanding open items" → "Out-of-scope follow-ups" — the remaining three entries (criterion 5, object browser, TLS hot-reload) are not in-flight work; they are deferred-at-design or Non-goal items. The TOCTOU bullet is removed (resolved) and replaced with a one-line cross-link to the safety-net design + admin_deployment.md §4.6 contract. - Removed the closing "rename trigger" sentence — we just did the rename. - Status table: P2 row now lists #695 alongside #658/#669/#673 so a future reader can find the TOCTOU fix from the index. - Cross-references updated everywhere the old filename appeared: docs/admin.md (header link + Cross-references) docs/admin_deployment.md (header link + final cross-ref) docs/design/2026_04_28_proposed_admin_delete_bucket_safety_net.md (Background section pointer) internal/admin/config.go (Section 7.1 reference comment) No code changes other than the comment-only filename refresh in config.go.
## Summary Promote the admin dashboard design doc from `_partial_` → `_implemented_` per `docs/design/README.md`'s lifecycle convention. PR #695 landed the TOCTOU safety-net fix (the last in-flight item that kept the doc at `_partial_`), so the original P1–P4 plan is now fully shipped: | Phase | Landed via | |---|---| | P1 (admin skeleton + Dynamo + AdminForward) | #634 / #635 / #644 / #648 | | P2 (S3 endpoints + writes + AdminForward S3 + TOCTOU fix) | #658 / #669 / #673 / **#695** | | P3 (React SPA + embed) | #649 / #650 | | P4 (TLS / role / CSRF / operator doc / runbook / script wiring) | #674 / #669 / #678 | The remaining three items in the doc move from "Outstanding open items" (in-flight) to **"Out-of-scope follow-ups"** (deferred-at-design or Non-goal): - AdminForward criterion 5 — rolling-upgrade flag, deferred behind a cluster-version bump that doesn't exist yet - S3 object browser — Non-goal per §2.2 - TLS cert hot-reload — out of scope per `docs/admin.md` ## Changes - `git mv` partial → implemented (history follows the rename) - Status line / Last-updated / status-table / Out-of-scope section content reflects the promotion - Cross-references updated in all 4 referencing files: `docs/admin.md`, `docs/admin_deployment.md`, `docs/design/2026_04_28_proposed_admin_delete_bucket_safety_net.md`, `internal/admin/config.go` (comment-only) ## Test plan - [x] No code changes other than a comment-only filename refresh in `config.go` - [x] `go build ./...` passes - [x] `golangci-lint run ./internal/admin/...` — 0 issues - [x] `grep -rn "2026_04_24_partial\|2026_04_24_proposed_admin"` returns nothing — no stale references
Summary
Phase P3 of docs/design/2026_04_24_proposed_admin_dashboard.md. The Go side (auth / JWT / cluster handler / dynamo handler / follower→leader forwarding /
embed.goskeleton) shipped in earlier PRs (#644 et al.) butStaticFSwas wired asnil, so/admin/*answered 404 for any non-API path. This PR ships the React bundle and flipsStaticFSon.vite.config.tswrites its build output straight intointernal/admin/dist/sonpm run build && go buildproduces a single-binary release.admin_session(HttpOnly) handled by the browser;admin_csrfvalue echoed inX-Admin-CSRFfor every POST/PUT/DELETE per the double-submit pattern. Read-only sessions hide write affordances; backend re-evaluates the role..gitignorekeeps the placeholderinternal/admin/dist/index.htmlcommitted (so//go:embed all:distalways has a match in a fresh clone) and excludes the hashed Vite asset outputs and npm/vite caches — committed bundles drift from source.Test plan
cd web/admin && npm install && npm run build— 192 KB JS / 60 KB gzip + 14 KB CSScd web/admin && npm run lint—tsc --strict --noUnusedLocals --noUncheckedSideEffectImportsgo build ./...— clean (placeholder + built bundle both compile)go test -race -count=1 ./internal/admin/...golangci-lint run ./internal/admin/...--adminEnabled --adminAllowInsecureDevCookie --adminListen 127.0.0.1:8080, visithttp://127.0.0.1:8080/admin/, log in with a SigV4 access key, confirm dashboard / dynamo list / dynamo detail / dynamo create / dynamo delete + see graceful "endpoint pending" on the S3 pages.Notes
StaticFSactivation). No replication, MVCC, OCC, or HLC code paths touched. Lenses 1 (data loss) / 2 (concurrency) / 4 (consistency) → no impact. Lens 3 (perf) → +1 cold read ofembed.FSper startup, ~1 MB binary growth (under the doc's 1–2 MB target). Lens 5 (test coverage) → SPA-side Vitest is documented as a P4 follow-up in the design doc; this PR does not regress any existing Go test.proposed_→partial_) are out of scope here and are tracked elsewhere.