Skip to content

feat(admin): React + Vite SPA for the admin dashboard (P3)#649

Merged
bootjp merged 2 commits intomainfrom
feat/admin-dashboard-spa
Apr 26, 2026
Merged

feat(admin): React + Vite SPA for the admin dashboard (P3)#649
bootjp merged 2 commits intomainfrom
feat/admin-dashboard-spa

Conversation

@bootjp
Copy link
Copy Markdown
Owner

@bootjp bootjp commented Apr 26, 2026

Summary

Phase P3 of docs/design/2026_04_24_proposed_admin_dashboard.md. The Go side (auth / JWT / cluster handler / dynamo handler / follower→leader forwarding / embed.go skeleton) shipped in earlier PRs (#644 et al.) but StaticFS was wired as nil, so /admin/* answered 404 for any non-API path. This PR ships the React bundle and flips StaticFS on.

  • React 18 + TypeScript + Vite, Tailwind CSS for tokens (Section 5.1). No UI framework dependency — bundle stays at ~60 KB gzip vs the doc's 150 KB ceiling (Section 7.2).
  • vite.config.ts writes its build output straight into internal/admin/dist/ so npm run build && go build produces a single-binary release.
  • Five pages per Section 5.2: Login, Dashboard, DynamoDB list/detail (with Create + Delete dialogs), S3 list/detail (placeholder — endpoints not yet wired, the SPA renders a soft "endpoint pending" notice on the 404).
  • Cookie-based session (Section 6): admin_session (HttpOnly) handled by the browser; admin_csrf value echoed in X-Admin-CSRF for every POST/PUT/DELETE per the double-submit pattern. Read-only sessions hide write affordances; backend re-evaluates the role.
  • .gitignore keeps the placeholder internal/admin/dist/index.html committed (so //go:embed all:dist always has a match in a fresh clone) and excludes the hashed Vite asset outputs and npm/vite caches — committed bundles drift from source.

Test plan

  • cd web/admin && npm install && npm run build — 192 KB JS / 60 KB gzip + 14 KB CSS
  • cd web/admin && npm run linttsc --strict --noUnusedLocals --noUncheckedSideEffectImports
  • go build ./... — clean (placeholder + built bundle both compile)
  • go test -race -count=1 ./internal/admin/...
  • golangci-lint run ./internal/admin/...
  • Manual smoke: start admin listener with --adminEnabled --adminAllowInsecureDevCookie --adminListen 127.0.0.1:8080, visit http://127.0.0.1:8080/admin/, log in with a SigV4 access key, confirm dashboard / dynamo list / dynamo detail / dynamo create / dynamo delete + see graceful "endpoint pending" on the S3 pages.

Notes

  • Self-review (5 lenses) — these changes only ship UI + 1 line of Go wiring (StaticFS activation). No replication, MVCC, OCC, or HLC code paths touched. Lenses 1 (data loss) / 2 (concurrency) / 4 (consistency) → no impact. Lens 3 (perf) → +1 cold read of embed.FS per startup, ~1 MB binary growth (under the doc's 1–2 MB target). Lens 5 (test coverage) → SPA-side Vitest is documented as a P4 follow-up in the design doc; this PR does not regress any existing Go test.
  • The S3 admin endpoints, follower→leader forwarding for write paths, and the design-doc lifecycle rename (proposed_partial_) are out of scope here and are tracked elsewhere.

@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 26, 2026

Warning

Rate limit exceeded

@bootjp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 17 minutes and 23 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 17 minutes and 23 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ab981fdf-d75e-4974-80f8-3208e9b4dbfc

📥 Commits

Reviewing files that changed from the base of the PR and between 76cb788 and e3b9555.

⛔ Files ignored due to path filters (2)
  • internal/admin/dist/index.html is excluded by !**/dist/**
  • web/admin/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (29)
  • .gitignore
  • internal/admin/embed.go
  • main_admin.go
  • web/admin/.gitignore
  • web/admin/index.html
  • web/admin/package.json
  • web/admin/postcss.config.js
  • web/admin/src/App.tsx
  • web/admin/src/api/client.ts
  • web/admin/src/auth.tsx
  • web/admin/src/components/Layout.tsx
  • web/admin/src/components/Modal.tsx
  • web/admin/src/components/RequireAuth.tsx
  • web/admin/src/lib/useApi.ts
  • web/admin/src/main.tsx
  • web/admin/src/pages/Dashboard.tsx
  • web/admin/src/pages/DynamoDetail.tsx
  • web/admin/src/pages/DynamoList.tsx
  • web/admin/src/pages/Login.tsx
  • web/admin/src/pages/NotFound.tsx
  • web/admin/src/pages/S3Detail.tsx
  • web/admin/src/pages/S3List.tsx
  • web/admin/src/styles.css
  • web/admin/src/vite-env.d.ts
  • web/admin/tailwind.config.js
  • web/admin/tsconfig.app.json
  • web/admin/tsconfig.json
  • web/admin/tsconfig.node.json
  • web/admin/vite.config.ts
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/admin-dashboard-spa

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 26, 2026

@claude review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new React-based administrative dashboard for the elastickv service, which is embedded into the Go binary for distribution. The dashboard provides interfaces for cluster monitoring, DynamoDB table management, and S3 bucket administration. The review feedback highlights several opportunities to improve network efficiency and code maintainability, specifically by ensuring that AbortSignal is consistently passed through the API client to support request cancellation. Additionally, the review recommends removing the redundant safe helper function in the dashboard, as its logic is already handled by the useApiQuery hook.

Comment thread web/admin/src/api/client.ts Outdated
Comment on lines +205 to +225
listTables: (next_token?: string) =>
apiFetch<DynamoTableList>("/dynamo/tables", { query: { next_token } }),
describeTable: (name: string) =>
apiFetch<DynamoTable>(`/dynamo/tables/${encodeURIComponent(name)}`),
createTable: (req: CreateTableRequest) =>
apiFetch<DynamoTable>("/dynamo/tables", { method: "POST", body: req as unknown as Json }),
deleteTable: (name: string) =>
apiFetch<void>(`/dynamo/tables/${encodeURIComponent(name)}`, { method: "DELETE" }),
listBuckets: (next_token?: string) =>
apiFetch<S3BucketList>("/s3/buckets", { query: { next_token } }),
describeBucket: (name: string) =>
apiFetch<S3Bucket>(`/s3/buckets/${encodeURIComponent(name)}`),
createBucket: (req: CreateBucketRequest) =>
apiFetch<S3Bucket>("/s3/buckets", { method: "POST", body: req as unknown as Json }),
putBucketAcl: (name: string, acl: "private" | "public-read") =>
apiFetch<void>(`/s3/buckets/${encodeURIComponent(name)}/acl`, {
method: "PUT",
body: { acl },
}),
deleteBucket: (name: string) =>
apiFetch<void>(`/s3/buckets/${encodeURIComponent(name)}`, { method: "DELETE" }),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The API methods in the api object should accept an optional AbortSignal and pass it to apiFetch. This allows hooks like useApiQuery to cancel pending network requests when a component unmounts or dependencies change, improving efficiency and preventing state updates on unmounted components.

Comment thread web/admin/src/pages/Dashboard.tsx Outdated
Comment on lines +6 to +7
const tables = useApiQuery((signal) => safe(() => api.listTables(), signal), []);
const buckets = useApiQuery((signal) => safe(() => api.listBuckets(), signal), []);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Pass the signal directly to the API calls instead of using the safe wrapper. This ensures that the network requests are actually aborted when the component unmounts.

Comment thread web/admin/src/pages/Dashboard.tsx Outdated
Comment on lines +128 to +133
// safe wraps a promise so a 404 from a not-yet-wired backend route
// does not cancel the dashboard render. Other errors propagate.
function safe<T>(fn: () => Promise<T>, signal?: AbortSignal): Promise<T> {
if (signal?.aborted) return Promise.reject(new ApiError(0, "aborted", ""));
return fn();
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The safe helper function is redundant and its comment is misleading. It doesn't actually catch 404 errors (which are already handled gracefully by the SummaryCard component via the error state from useApiQuery). Furthermore, the manual signal check is unnecessary as useApiQuery already handles abort logic. This helper should be removed.

Comment thread web/admin/src/pages/DynamoDetail.tsx Outdated
export function DynamoDetailPage() {
const { name = "" } = useParams<{ name: string }>();
const { session } = useAuth();
const detail = useApiQuery((_signal) => api.describeTable(name), [name]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Pass the signal to api.describeTable to allow request cancellation.

Comment thread web/admin/src/pages/DynamoList.tsx Outdated

export function DynamoListPage() {
const { session } = useAuth();
const tables = useApiQuery((_signal) => api.listTables(undefined), []);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Pass the signal to api.listTables to allow request cancellation.

Comment thread web/admin/src/pages/S3Detail.tsx Outdated
export function S3DetailPage() {
const { name = "" } = useParams<{ name: string }>();
const { session } = useAuth();
const detail = useApiQuery((_signal) => api.describeBucket(name), [name]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Pass the signal to api.describeBucket to allow request cancellation.

Comment thread web/admin/src/pages/S3List.tsx Outdated

export function S3ListPage() {
const { session } = useAuth();
const buckets = useApiQuery((_signal) => api.listBuckets(), []);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Pass the signal to api.listBuckets to allow request cancellation.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Claude finished @bootjp's task in 2m 44s —— View job


Code Review — feat(admin): React + Vite SPA for the admin dashboard (P3)

  • Gather context and read changed files
  • Review API client (web/admin/src/api/client.ts)
  • Review auth and hooks (auth.tsx, useApi.ts)
  • Review pages (Dashboard.tsx, DynamoList.tsx, DynamoDetail.tsx, S3List.tsx, S3Detail.tsx)
  • Review Go embedding (internal/admin/embed.go, main_admin.go)
  • Review security posture (CSRF, cookies, session)
  • Compile findings and post review

Overall this is a well-structured SPA. The Go side is clean, the auth model is sound, and the useApiQuery hook has good lifecycle semantics. Two concrete bugs in the signal-propagation path and a few smaller notes below.

Bug — safe() in Dashboard.tsx is a no-op that silently drops AbortSignal

Dashboard.tsx lines 6–7:

const tables = useApiQuery((signal) => safe(() => api.listTables(), signal), []);
const buckets = useApiQuery((signal) => safe(() => api.listBuckets(), signal), []);

safe() receives signal but never passes it to fn() (line 130–133):

function safe<T>(fn: () => Promise<T>, signal?: AbortSignal): Promise<T> {
  if (signal?.aborted) return Promise.reject(new ApiError(0, "aborted", ""));
  return fn();  // signal is discarded here
}

The early-abort check is also redundant — useApiQuery already handles that via its cancelled flag. More importantly, the comment says safe prevents 404 from "cancelling the dashboard render", but that suppression is already handled by SummaryCard testing error?.status === 404 && pendingMessage. The function has no real effect.

Recommendation: Remove safe entirely. The two loaders become:

const tables = useApiQuery((signal) => api.listTables(undefined, signal), []);
const buckets = useApiQuery((signal) => api.listBuckets(undefined, signal), []);

…which also requires the fix below.

Fix this →


Bug — AbortSignal not wired for listTables, describeTable, listBuckets, describeBucket

client.ts lines 205–215: unlike cluster (which correctly accepts signal?: AbortSignal), the four read methods don't:

cluster: (signal?: AbortSignal) =>
  apiFetch<ClusterInfo>("/cluster", { signal }),          // ✓ correct

listTables: (next_token?: string) =>
  apiFetch<DynamoTableList>("/dynamo/tables", {}),     // ✗ no signal

describeTable: (name: string) =>
  apiFetch<DynamoTable>(),                               // ✗ no signal

listBuckets: (next_token?: string) =>
  apiFetch<S3BucketList>("/s3/buckets", {}),           // ✗ no signal

describeBucket: (name: string) =>
  apiFetch<S3Bucket>(),                                  // ✗ no signal

All four page-level hooks receive the signal from useApiQuery but silently discard it ((_signal) => api.xxx(…)). The cancelled flag in useApiQuery prevents stale state updates after unmount, but the requests still run to completion, wasting network bandwidth and server connections.

Recommendation: Mirror the cluster signature on all four methods:

listTables: (next_token?: string, signal?: AbortSignal) =>
  apiFetch<DynamoTableList>("/dynamo/tables", { query: { next_token }, signal }),
describeTable: (name: string, signal?: AbortSignal) =>
  apiFetch<DynamoTable>(`/dynamo/tables/${encodeURIComponent(name)}`, { signal }),
listBuckets: (next_token?: string, signal?: AbortSignal) =>
  apiFetch<S3BucketList>("/s3/buckets", { query: { next_token }, signal }),
describeBucket: (name: string, signal?: AbortSignal) =>
  apiFetch<S3Bucket>(`/s3/buckets/${encodeURIComponent(name)}`, { signal }),

Then update the four call sites in DynamoList.tsx:10, DynamoDetail.tsx:11, S3List.tsx:10, S3Detail.tsx:11 to pass signal through.

Fix this →


Minor — markUnauthorized in useApiQuery dep array could cause spurious re-fetches

useApi.ts line 58:

}, [...deps, tick, markUnauthorized]);

markUnauthorized is a useCallback from AuthProvider and is currently stable across renders. However, if AuthProvider ever remounts (e.g. React Fast Refresh during development, or a future wrapping change), its identity changes and every active useApiQuery on the page fires a fresh network request. The loader itself is already exempted from the dep array via loaderRef for exactly this reason. Consider the same treatment:

const markUnauthorizedRef = useRef(markUnauthorized);
markUnauthorizedRef.current = markUnauthorized;
// then inside the effect, call markUnauthorizedRef.current(...)
// and remove markUnauthorized from the deps array

Minor — Modal has no focus trap and is missing aria-modal

Modal.tsx uses Escape and backdrop-click but doesn't trap focus or declare aria-modal="true" / role="dialog". Screen-reader users can navigate to background content while the modal is open. For an internal operator tool this is low urgency, but calling it out so it's tracked.


Minor — Secret key not trimmed on login submit

Login.tsx line 28:

await login(accessKey.trim(), secretKey);  // secretKey not trimmed

accessKey is trimmed but secretKey is not. A developer who accidentally pastes a key with trailing whitespace will get a cryptic 401. SigV4 secret keys never contain whitespace, so .trim() on secretKey is safe and would improve UX.


Go side — looks clean

internal/admin/embed.go: the //go:embed all:dist is correct — the all: prefix is needed because Vite may emit dot-prefixed files (.vite/manifest.json). The graceful fallback comment (placeholder vs full bundle) is accurate. The fs.Sub error wrapping uses errors.Join which is idiomatic.

main_admin.go: the one-line StaticFS activation (staticFS wired into ServerDeps) is the minimal change it should be. The dynamoTablesBridge separation that keeps internal/admin free of adapter imports is excellent architectural hygiene. The isLeaderChurnError refactor to HasSuffix (instead of Contains) correctly handles the false-positive scenario documented in the comment.

.gitignore / web/admin/.gitignore: committing only the placeholder index.html and excluding the hashed Vite outputs is the right call. The convention prevents stale bundles drifting in git while ensuring a fresh clone can still go build.


Summary

The two AbortSignal bugs (the safe dead-code wrapper and the four methods missing signal) are the only functional issues; everything else is polish. The auth, CSRF, session-expiry, and Go wiring are all solid. Addressing the signal propagation would make this ready to merge.

@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 26, 2026

Pushed fb8085cf addressing the converging findings from the Claude review and Gemini Code Assist:

Bugs (fixed):

  • api.listTables / describeTable / listBuckets / describeBucket now accept and pass through signal?: AbortSignal (mirroring api.cluster). All 4 page-level useApiQuery callsites updated to pass the signal.
  • Removed the safe() wrapper in Dashboard.tsx — it was a no-op that received the signal and discarded it; the 404 "endpoint pending" handling it claimed is already done by SummaryCard checking error?.status === 404.

Drive-by polish (also from the reviews):

  • useApiQuery no longer puts markUnauthorized in the effect deps; it's read via markUnauthorizedRef so an AuthProvider remount (Fast Refresh in dev) doesn't invalidate every active query. Mirrors the existing loaderRef pattern.
  • Login.tsx now .trim()s secretKey in addition to accessKey (SigV4 secrets never contain whitespace).

Out of scope (low-priority follow-up):

  • Modal a11y (focus trap, aria-modal, role="dialog") — flagged as polish for an internal operator tool; will land separately so this PR stays focused.

npm run build + npm run lint (strict TS) clean; go build ./... clean.

@claude review

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Claude finished @bootjp's task in 1m 26s —— View job


Code Review — follow-up on fb8085cf

  • Gather context
  • Verify AbortSignal wiring in client.ts + all 4 call sites
  • Verify safe() removed in Dashboard.tsx
  • Verify markUnauthorized dep removal in useApi.ts
  • Verify secretKey.trim() in Login.tsx
  • Check for any new issues introduced
  • Post findings

All four issues from the previous review are cleanly resolved. Verified against the current branch state:

AbortSignal wiring (client.ts:205–216) — listTables, describeTable, listBuckets, describeBucket now all accept signal?: AbortSignal and forward it to apiFetch, matching the existing cluster signature. Mutations (createTable, deleteTable, etc.) correctly don't take a signal since they're user-initiated, not query-style calls.

All 4 call sites pass signal through correctly:

  • DynamoList.tsx:10api.listTables(undefined, signal)
  • DynamoDetail.tsx:11api.describeTable(name, signal)
  • S3List.tsx:10api.listBuckets(undefined, signal)
  • S3Detail.tsx:11api.describeBucket(name, signal)

safe() wrapper removed (Dashboard.tsx) — Gone entirely. The three useApiQuery calls on lines 6–8 all now pass signal directly. The 404 "endpoint pending" path is correctly handled by SummaryCard's pending check.

markUnauthorizedRef pattern (useApi.ts:31–32, 67) — The ref is assigned and kept current, markUnauthorized no longer appears in the useEffect dep array (line 67 reads [...deps, tick]). The comment on lines 27–30 explains the rationale clearly.

secretKey.trim() (Login.tsx:31) — Both sides now trimmed; the comment explains why it's safe for SigV4 secrets.

One small observation — not blocking, just noting for completeness: in useApi.ts the catch block checks both cancelled (line 47) and ctrl.signal.aborted (line 48). Since the cleanup always sets cancelled = true before calling ctrl.abort(), the ctrl.signal.aborted check is redundant. It's a harmless belt-and-suspenders pattern and the logic is sound either way.

Deferred item (Modal a11y: focus trap, aria-modal, role="dialog") is a reasonable call to leave for a follow-up PR — it has zero impact on functional correctness for this operator tool.

This is ready to merge. The signal propagation is now complete across all read paths, and the rest of the auth/CSRF/session wiring from the first review remains solid.

@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 26, 2026

Thanks for the thorough re-review on fb8085cf. With your "This is ready to merge" verdict, all four findings (AbortSignal plumbing across 4 read methods, safe() removal, markUnauthorized ref pattern, secretKey.trim()) verified resolved, and CI green on every shipped check, the loop has reached its exit condition. Stopping the periodic check now; please merge at your convenience.

The harmless redundancy you noted in useApi.ts (cancelled + ctrl.signal.aborted belt-and-suspenders) and the deferred Modal a11y item (focus trap / aria-modal / role=dialog) remain as low-priority follow-ups.

bootjp added 2 commits April 26, 2026 18:32
Phase P3 of docs/design/2026_04_24_proposed_admin_dashboard.md.
The Go side (auth, JWT, cluster handler, dynamo handler, follower->
leader forwarding, embed.go skeleton) shipped in earlier PRs but
StaticFS was wired as nil, leaving /admin/* answering 404 for any
non-API path. This PR adds the React bundle + flips StaticFS on.

Source layout (web/admin/):
- React 18 + TypeScript + Vite per design doc Section 5.1
- Tailwind CSS for tokens; no UI framework dependency (kept the
  bundle at ~60 KB gzip per Section 7.2's ~150 KB target)
- vite.config.ts writes the build output straight into
  internal/admin/dist so `npm run build && go build` produces a
  single binary with the latest SPA

Pages (Section 5.2):
- /login - SigV4 access-key + secret-key form, server-side JWT
  Cookie issued, status-specific error messages for 401/403/429
- / (dashboard) - cluster overview, raft groups, summary cards
- /dynamo - table list + Create dialog (full role only)
- /dynamo/:name - schema + GSIs + Delete confirmation modal
- /s3 - bucket list + Create dialog (endpoint pending, renders
  graceful "endpoint pending" notice on 404)
- /s3/:name - bucket meta + ACL toggle + Delete (same)

Auth (Section 6):
- Cookie-based session (admin_session, HttpOnly) handled by browser
- CSRF double-submit: admin_csrf cookie value echoed in
  X-Admin-CSRF header on every POST/PUT/DELETE (apiFetch in
  src/api/client.ts)
- Read-only sessions hide Create/Delete affordances and surface a
  RequireFullAccess notice; backend re-evaluates the role
- Session expiry tracked in sessionStorage; expires_at trip clears
  in-memory state and redirects to /login

Backend wiring:
- main_admin.go: load admin.StaticFS() and pass to ServerDeps so
  /admin/assets/* and the SPA fallback start serving real files
- internal/admin/embed.go: //go:embed all:dist on the build output
- .gitignore: keep the placeholder index.html committed (so
  go:embed always has at least one file in a fresh clone), exclude
  hashed Vite assets and the npm/vite caches - committed bundles
  invariably drift from source

Verification:
- npm run build: 192 KB JS / 60 KB gzip + 14 KB CSS
- npm run lint (tsc strict, noUnusedLocals, noUncheckedSideEffects)
- go build ./...
- go test -race ./internal/admin/...
- golangci-lint run ./internal/admin/...

S3 admin endpoints (CreateBucket / ListBuckets / PutBucketAcl /
DeleteBucket) are not yet wired on the Go side; the SPA pages render
a soft "endpoint pending" notice on the 404 they currently get and
will populate transparently once the handlers ship.
Two converging review findings (Claude bot + Gemini Code Assist):

1. AbortSignal not plumbed through listTables / describeTable /
   listBuckets / describeBucket. useApiQuery created an
   AbortController and passed signal to the loader, but those four
   methods discarded it - so requests ran to completion after a
   component unmounted, wasting network and server connections. Now
   every read method mirrors cluster's signature (`signal?: AbortSignal`)
   and every page-level useApiQuery callsite passes it through.

2. The safe() wrapper in Dashboard.tsx was a no-op that received
   `signal` and never passed it on. The "404 doesn't cancel render"
   suppression it claimed to provide is already handled by SummaryCard
   testing `error?.status === 404 && pendingMessage`. Removed.

Drive-by polish from the same reviews:

- useApiQuery's effect deps no longer include `markUnauthorized`.
  Read it through markUnauthorizedRef so an AuthProvider remount
  (React Fast Refresh in dev, or any future wrapping change) does
  not invalidate every active query and trigger a fresh round-trip
  on every page. Mirrors the existing loaderRef pattern.

- Login.tsx now trims `secretKey` in addition to `accessKey`. SigV4
  secrets never contain whitespace, so trimming is safe and saves
  an operator from a cryptic 401 after pasting a key with a
  trailing newline.

Modal accessibility (no focus trap, missing aria-modal / role=dialog)
was flagged as low-priority polish for an internal operator tool.
Tracked as a follow-up; not addressed here to keep this change
focused on the functional bugs.

Verified with `npm run build` and `npm run lint` (tsc --strict
--noUnusedLocals --noUncheckedSideEffectImports), and `go build ./...`.
@bootjp bootjp force-pushed the feat/admin-dashboard-spa branch from 6349276 to e3b9555 Compare April 26, 2026 09:33
@bootjp
Copy link
Copy Markdown
Owner Author

bootjp commented Apr 26, 2026

@claude review

Rebased onto main (e3b9555). The conflict was in main_admin.go's admin.NewServer(admin.ServerDeps{...}) block — main now has Forwarder: forwarder from #648, and this branch had StaticFS: staticFS. Resolved by keeping both. Build / vet / lint / admin + main tests all pass.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Claude finished @bootjp's task in 1m 45s —— View job


Code Review — post-rebase check (onto main @ 76cb788)

  • Gather context on rebase
  • Read main_admin.go to verify conflict resolution
  • Diff origin/main...HEAD for unexpected changes
  • Verify previously-fixed items are intact
  • Post findings

The rebase is clean. No regressions, no unexpected changes.

Conflict resolution in main_admin.go — verified correct. The resolution kept both fields that came from separate branches:

// lines 448–460 after rebase:
staticFS, err := admin.StaticFS()
if err != nil {
    return nil, errors.Wrap(err, "open embedded admin SPA")
}
server, err := admin.NewServer(admin.ServerDeps{
    ...
    Forwarder:   forwarder,   // ← from #648 on main
    StaticFS:    staticFS,    // ← from this branch (was nil)
    ...
})

origin/main had Forwarder: forwarder + StaticFS: nil; this branch introduced StaticFS: staticFS. The resolution correctly keeps both — the staticFS initialization block is inserted immediately before admin.NewServer, which is the right placement.

All round-2 fixes remain intact:

  • client.tslistTables, describeTable, listBuckets, describeBucket all accept signal?: AbortSignal and forward it to apiFetch
  • All 4 page call sites pass signal through (DynamoList.tsx:10, DynamoDetail.tsx:11, S3List.tsx:10, S3Detail.tsx:11) ✓
  • Dashboard.tsxsafe() gone; useApiQuery calls pass signal directly ✓
  • useApi.tsmarkUnauthorizedRef pattern in place, markUnauthorized absent from deps array ✓
  • Login.tsxsecretKey.trim() present ✓

embed.go — unchanged and correct. StaticFS() does fs.Sub(distFS, "dist") with proper error wrapping; the all:dist embed directive handles dot-prefixed Vite outputs.

This is ready to merge.

@bootjp bootjp merged commit ecd7f66 into main Apr 26, 2026
8 checks passed
@bootjp bootjp deleted the feat/admin-dashboard-spa branch April 26, 2026 09:41
bootjp added a commit that referenced this pull request Apr 26, 2026
Two follow-ups from the review of #649:

1. useApi.ts: removed the `if (ctrl.signal.aborted) return;` check
   inside the catch handler. The cleanup function sets
   `cancelled = true` *before* calling `ctrl.abort()`, so any abort
   path is already covered by the preceding `cancelled` check. The
   second guard was harmless belt-and-suspenders dead code; the
   comment now records why one check is enough.

2. Modal.tsx: added the WAI-ARIA dialog contract (`role="dialog"`,
   `aria-modal="true"`, `aria-labelledby` pointing at the title)
   and a focus trap:
   - On open: store the previously-focused element, then focus the
     first focusable inside the dialog so keyboard users do not stay
     stranded on the page underneath. Empty dialogs fall back to
     focusing the dialog container itself with tabindex=-1.
   - Tab / Shift+Tab cycle within the dialog instead of escaping to
     the page underneath.
   - On close: restore focus to whoever opened the dialog (skipped
     when that element has been unmounted, e.g. the dialog deleted
     the row whose button opened it).
   - Escape and backdrop click stay wired through the existing
     `busy` gate so a half-applied operation cannot be dismissed.

   Implementation is vanilla DOM querying — focus-trap-react /
   react-focus-lock would have added 5–7 KB gzip for what is a
   ~40 line vanilla solution. Bundle delta from this PR is +0.4 KB
   gzip (60.23 -> 60.59 KB).

Verified with `npm run build` and `npm run lint` (tsc --strict
--noUnusedLocals --noUncheckedSideEffectImports), and `go build ./...`.

Stacked on top of #649 (feat/admin-dashboard-spa); merge that one
first, then this rebases cleanly onto main.
bootjp added a commit that referenced this pull request Apr 26, 2026
Bridges the in-process keyviz.MemSampler into the in-node admin
HTTP listener (Layer C) so the SPA can render the heatmap without
an extra hop through the gRPC Admin service. The SPA was merged in
PR #649 but has no KeyViz page yet — this PR is the backend half;
the SPA page is task B in the design's Phase 2 follow-up.

- New KeyVizSource interface in internal/admin (Snapshot only) so
  *keyviz.MemSampler satisfies it without a wrapper. Tests pass an
  in-memory fake.
- KeyVizHandler serves GET /admin/api/v1/keyviz/matrix. Query
  parameters: series (default writes per design §4.1),
  from_unix_ms, to_unix_ms, rows (capped at keyVizRowBudgetCap=1024
  per design §4.1).
- Returns 503 keyviz_disabled when the source is nil so the SPA
  renders an explicit "feature off" state instead of an empty matrix.
- Pivots column-major MatrixColumn into row-major JSON, mirroring
  the adapter (gRPC) handler from PR #646. The duplication is
  intentional for now — extracting a shared pivot helper is a
  future cleanup.
- ServerDeps.KeyViz field; buildAPIMux always registers the route
  (even when source is nil) so 503 is served instead of 404.
- main.go threads *keyviz.MemSampler through startAdminFromFlags →
  startAdminServer → buildAdminHTTPServer; keyvizSourceFromSampler
  converts the typed-nil to interface-nil so the handler's "is
  keyviz disabled" check fires correctly.

Tests:
- TestKeyVizHandlerReturnsServiceUnavailableWhenNoSource
- TestKeyVizHandlerRejectsNonGet
- TestKeyVizHandlerPivotsMatrix (two-column, two-route fixture)
- TestKeyVizHandlerSeriesParam (table-driven across all four enums)
- TestKeyVizHandlerSeriesParamRejectsUnknown (400 invalid_query)
- TestKeyVizHandlerHonorsRowsBudget
- TestKeyVizHandlerEncodesAggregateBucket
bootjp added a commit that referenced this pull request Apr 26, 2026
## Summary

Two follow-ups from the
[#649](#649) review that were
intentionally deferred to keep that PR focused on the functional bugs.

**Stacked on `feat/admin-dashboard-spa`** (#649). Once #649 merges, this
rebases cleanly onto `main`.

### 1. `useApi.ts` — drop the redundant abort guard

The `catch` handler had two checks back-to-back:

```ts
if (cancelled) return;
if (ctrl.signal.aborted) return;  // ← unreachable
```

The cleanup function sets `cancelled = true` *before* calling
`ctrl.abort()`, so any abort path is already covered by the first check.
Dropped the second guard and recorded *why* one check is enough in a
comment.

### 2. `Modal.tsx` — full WAI-ARIA dialog contract + focus trap

Added:

- `role="dialog"`, `aria-modal="true"`, `aria-labelledby` on the dialog
root (uses `useId()` for the title id).
- Focus management:
- On open: store the previously-focused element, focus the first
focusable inside the dialog.
- Tab / Shift+Tab cycle within the dialog instead of escaping to the
page underneath.
- On close: restore focus to whoever opened the dialog (skipped if that
element was unmounted by the dialog action — e.g. the dialog deleted the
row whose button opened it).
- Empty-dialog fallback focuses the dialog container itself with
`tabindex=-1`.

Vanilla DOM implementation — `focus-trap-react` / `react-focus-lock`
would have been 5–7 KB gzip for what is a ~40 line solution.

**Bundle delta**: +0.4 KB gzip (60.23 → 60.59 KB JS, 14.55 → 14.57 KB
CSS).

## Test plan

- [x] `cd web/admin && npm install && npm run build`
- [x] `cd web/admin && npm run lint` — `tsc --strict`
- [x] `go build ./...`
- [ ] Manual smoke (after #649 merges and the dashboard is visitable):
- Open Create Table / Delete Table dialog → focus lands inside the
dialog.
  - Tab past the last button → wraps back to the close button.
  - Shift+Tab from the close button → wraps to the last form field.
- Escape closes (when not `busy`) and focus returns to the trigger
button.
  - Screen reader announces the dialog name from the title text.
bootjp added a commit that referenced this pull request Apr 26, 2026
P2 slice 1 of
[docs/design/2026_04_24_proposed_admin_dashboard.md](https://github.com/bootjp/elastickv/blob/main/docs/design/2026_04_24_proposed_admin_dashboard.md).
Read-only S3 admin endpoints land first so the SPA's S3List and S3Detail
pages stop hitting 404. Write paths (POST/PUT/DELETE/ACL) ship in slice
2 together with AdminForward integration; until then they reply 405.

## Summary

- **`*adapter.S3Server.AdminListBuckets` / `AdminDescribeBucket`** —
SigV4-bypass read methods. Share `loadBucketMetaAt` + the
metadata-prefix scan with the SigV4 path, so a SigV4 `listBuckets` and
the admin dashboard cannot drift.
- **`internal/admin/buckets_source.go`** — `BucketsSource` interface +
`BucketSummary` DTO + sentinel errors (`ErrBucketsForbidden` /
`ErrBucketsNotLeader` / `ErrBucketsNotFound` /
`ErrBucketsAlreadyExists`). `ErrBucketsForbidden` is wired in this
slice; the others are reserved for slice 2 to keep the error vocabulary
additive.
- **`internal/admin/s3_handler.go`** — `S3Handler` with `handleList`
(paginated) + `handleDescribe` (404 for missing). Sub-paths under
`/buckets/{name}/` (the future `/acl`) return 404 here so a SPA bug
pointed at this build cannot accidentally hit the describe path with a
`"{name}/acl"` string.
- **Shared `list_pagination.go`** — centralises base64url cursor + limit
parsing/clamping so the Dynamo and S3 handlers cannot diverge on
validation rules. Drops the now-redundant `parseDynamoListLimit` /
`decodeDynamoNextToken` / `encodeDynamoNextToken` from
`dynamo_handler.go` in favour of `parseListPaginationParams` /
`decodeListNextToken` / `encodeListNextToken`.
- **`apiRouteTable` in `server.go`** — bundles the precomposed
middleware chains and dispatches by URL prefix. `buildAPIMux`'s body
went from 13 cyclomatic branches to 6, leaving headroom for the next
resource family (SQS queues, etc.) to land without another refactor.
- **Production wiring** — `main_s3.go`'s `startS3Server` now returns
`*adapter.S3Server`; `runtimeServerRunner` stores it;
`startAdminFromFlags` accepts it and threads `bucketsBridge` →
`admin.BucketsSource` → `ServerDeps.Buckets`.

## What is NOT in this PR

- `AdminCreateBucket` / `AdminPutBucketAcl` / `AdminDeleteBucket` on the
adapter and their HTTP/AdminForward counterparts. Slice 2.
- `RoleStore` plumb-through for S3. Slice 2 (read-only is fine for any
authenticated session today; the write paths will need it).
- `forwarded_from` audit trail for S3 admin writes. Slice 2 — needs the
`pb.AdminOperation` enum extended.
- The merge-freeze-acceptable changes to `proto/admin_forward.proto`.
Slice 2.

## Test plan

- [x] `go build ./...`
- [x] `go vet ./...`
- [x] `golangci-lint run` (admin + main + adapter packages: 0 issues)
- [x] `go test ./internal/admin/ -count=1 -race` — 13 new handler tests
pass
- [x] `go test . -count=1 -race` — main package passes
- [x] `go test -run "TestS3Server_Admin" ./adapter/ -count=1 -race` — 4
new adapter tests pass
- [ ] `go test ./adapter/` times out at 120s due to a pre-existing flake
(verified earlier on PR #648 against `main` — unrelated to this branch)
- [ ] End-to-end smoke against a 3-node cluster — needs the next slice +
the SPA PR (#649) to merge before there's anything for the SPA to render

## Acceptance criteria coverage (Section 4.1)

| Endpoint | This PR |
|---|---|
| `GET /admin/api/v1/s3/buckets` | ✓ |
| `POST /admin/api/v1/s3/buckets` | ⏳ slice 2 |
| `GET /admin/api/v1/s3/buckets/{name}` | ✓ |
| `PUT /admin/api/v1/s3/buckets/{name}/acl` | ⏳ slice 2 |
| `DELETE /admin/api/v1/s3/buckets/{name}` | ⏳ slice 2 |

## Self-review (5 lenses)

1. **Data loss**: No FSM / Raft / Pebble path changes.
`AdminListBuckets` / `AdminDescribeBucket` are read-only and use the
same `ScanAt` / `loadBucketMetaAt` as the SigV4 path.
2. **Concurrency**: No new shared state. The handler holds only the
immutable `BucketsSource` interface; the bridge holds only the immutable
`*S3Server`. `pinReadTS` is the same pattern the SigV4 path uses for
snapshot stability.
3. **Performance**: One additional metadata scan per `GET
/admin/api/v1/s3/buckets`, identical to the SigV4 listBuckets.
Pagination caps at 1000 buckets per page (silent clamp on oversize). No
hot-path changes.
4. **Data consistency**: Read-only; no commit-ts ordering risk. Admin
reads use the same lease-read window as the SigV4 path.
5. **Test coverage**: 13 handler tests + 4 adapter tests + the existing
dynamo/cluster test surfaces still pass after the shared-helper
refactor. The pagination cursor round-trip test pins the wire format so
a future cursor-encoding change cannot ship without breaking the test.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added S3 bucket admin API endpoints to list all buckets and retrieve
detailed bucket information, including creation timestamps.
* Implemented pagination support with cursor-based next_token for
browsing large bucket collections.
* Added proper authorization checks and error handling for admin bucket
operations (403 for forbidden, 404 for not found, 500 for errors).

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
bootjp added a commit that referenced this pull request Apr 26, 2026
Bridges the in-process keyviz.MemSampler into the in-node admin
HTTP listener (Layer C) so the SPA can render the heatmap without
an extra hop through the gRPC Admin service. The SPA was merged in
PR #649 but has no KeyViz page yet — this PR is the backend half;
the SPA page is task B in the design's Phase 2 follow-up.

- New KeyVizSource interface in internal/admin (Snapshot only) so
  *keyviz.MemSampler satisfies it without a wrapper. Tests pass an
  in-memory fake.
- KeyVizHandler serves GET /admin/api/v1/keyviz/matrix. Query
  parameters: series (default writes per design §4.1),
  from_unix_ms, to_unix_ms, rows (capped at keyVizRowBudgetCap=1024
  per design §4.1).
- Returns 503 keyviz_disabled when the source is nil so the SPA
  renders an explicit "feature off" state instead of an empty matrix.
- Pivots column-major MatrixColumn into row-major JSON, mirroring
  the adapter (gRPC) handler from PR #646. The duplication is
  intentional for now — extracting a shared pivot helper is a
  future cleanup.
- ServerDeps.KeyViz field; buildAPIMux always registers the route
  (even when source is nil) so 503 is served instead of 404.
- main.go threads *keyviz.MemSampler through startAdminFromFlags →
  startAdminServer → buildAdminHTTPServer; keyvizSourceFromSampler
  converts the typed-nil to interface-nil so the handler's "is
  keyviz disabled" check fires correctly.

Tests:
- TestKeyVizHandlerReturnsServiceUnavailableWhenNoSource
- TestKeyVizHandlerRejectsNonGet
- TestKeyVizHandlerPivotsMatrix (two-column, two-route fixture)
- TestKeyVizHandlerSeriesParam (table-driven across all four enums)
- TestKeyVizHandlerSeriesParamRejectsUnknown (400 invalid_query)
- TestKeyVizHandlerHonorsRowsBudget
- TestKeyVizHandlerEncodesAggregateBucket
bootjp added a commit that referenced this pull request Apr 26, 2026
…s []

Two medium findings from Gemini review on PR #670:

1. CountersTruncated field was stale on internal/admin.QueueSummary.
   The PR description called out that the field was dropped because
   main's sqsApproxCounters does not expose truncation, but I missed
   removing the JSON-side mirror in internal/admin/sqs_handler.go.
   The SPA was never reading it (SqsDetail.tsx already had the
   reference removed in this PR), so the field was load-bearing for
   nothing. Removed.

2. handleList serialised an empty queue catalog as
   {"queues": null} when AdminListQueues returned nil. The SPA
   iterates the array directly and would crash on null. Normalise
   nil to []string{} immediately before encoding so the response
   shape is always {"queues": []} even on the empty case.

No new tests because the existing internal/admin race test suite
already exercises the encoder path (would have caught a json.Marshal
break) and the empty-catalog case is exactly what the SPA hits the
moment a fresh node comes up — manual smoke during PR #649 already
verified the SPA renders an empty list cleanly when the array is [].

Verified with go build ./..., go test -race ./internal/admin/...,
go test -race -run TestSQS ./adapter/, golangci-lint run
./adapter/... ./internal/admin/... — all clean, no //nolint.
bootjp added a commit that referenced this pull request Apr 26, 2026
)

## Summary

Bridges the in-process `keyviz.MemSampler` into the in-node admin HTTP
listener (Layer C, `internal/admin`) so the SPA can render the heatmap
without an extra hop through the gRPC Admin service. The SPA was merged
in #649 but has no KeyViz page yet — this PR is the backend half; the
SPA page will follow as a separate slice.

- New `KeyVizSource` interface in `internal/admin` (Snapshot only) so
`*keyviz.MemSampler` satisfies it without a wrapper. Tests pass an
in-memory fake.
- `KeyVizHandler` serves `GET /admin/api/v1/keyviz/matrix`. Query
parameters: `series` (default `writes` per design §4.1), `from_unix_ms`,
`to_unix_ms`, `rows` (capped at `keyVizRowBudgetCap=1024` per design
§4.1).
- Returns `503 keyviz_disabled` when the source is nil so the SPA
renders an explicit "feature off" state instead of an empty matrix.
- Pivots column-major `MatrixColumn` into row-major JSON, mirroring the
adapter (gRPC) handler from PR #646. The duplication is intentional for
now — extracting a shared pivot helper is a future cleanup.
- `ServerDeps.KeyViz` field; `buildAPIMux` always registers the route
(even when source is nil) so `503` is served instead of `404`.
- `main.go` threads `*keyviz.MemSampler` through `startAdminFromFlags` →
`startAdminServer` → `buildAdminHTTPServer`; `keyvizSourceFromSampler`
converts the typed-nil to interface-nil so the handler's "is keyviz
disabled" check fires correctly.

## Test plan

- [x] `TestKeyVizHandlerReturnsServiceUnavailableWhenNoSource` — 503 +
`keyviz_disabled` code.
- [x] `TestKeyVizHandlerRejectsNonGet` — 405.
- [x] `TestKeyVizHandlerPivotsMatrix` — two-column, two-route fixture;
verifies missing-row-becomes-zero contract.
- [x] `TestKeyVizHandlerSeriesParam` — table-driven across all four enum
values.
- [x] `TestKeyVizHandlerSeriesParamRejectsUnknown` — 400 `invalid_query`
for typo'd series.
- [x] `TestKeyVizHandlerHonorsRowsBudget` — `?rows=2` returns top-2 by
activity, sorted by Start.
- [x] `TestKeyVizHandlerEncodesAggregateBucket` — virtual-bucket layout
with `route_count` from `MemberRoutesTotal` and `route_ids_truncated`
flag.
- [x] `go test ./internal/admin/... .` clean.
- [x] `golangci-lint run . ./internal/admin/...` clean.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added `/admin/api/v1/keyviz/matrix` API endpoint for retrieving key
visualization matrix data
* Supports filtering by metric type: reads, writes, read_bytes, and
write_bytes
  * Supports time-range filtering with Unix millisecond precision
  * Supports row limiting and activity-based sorting
  * Returns 503 status when key visualization sampling is disabled

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
bootjp added a commit that referenced this pull request Apr 26, 2026
Replaces PR #659, which conflicted heavily after main moved (PR #649
squashed; PR #658 added S3 admin endpoints; the Approximate counters
implementation now lives directly in adapter/sqs_catalog.go).

This PR:

Backend (adapter/sqs_admin.go + internal/admin/sqs_handler.go):
- SQSServer.AdminListQueues / AdminDescribeQueue / AdminDeleteQueue
  are SigV4-bypass entrypoints, mirroring the AdminListTables /
  AdminListBuckets pattern.
- AdminDescribeQueue uses the existing scanApproxCounters from
  sqs_catalog.go (already on main) so the admin path returns the
  same Visible / NotVisible / Delayed numbers as
  GetQueueAttributes("All") would, taken at one snapshot read TS.
- sqsQueuesBridge in main_admin.go re-shapes
  adapter.AdminQueueSummary into admin.QueueSummary, keeping
  internal/admin free of the heavy adapter dependency tree —
  same pattern as dynamoTablesBridge / s3BucketsBridge.
- admin.QueuesSource is opt-in; deployments that don't run
  --sqsAddress leave /admin/api/v1/sqs/* off the wire and the
  SPA renders a soft "endpoint pending" notice on the 404.
- Role re-evaluation against the live RoleStore on DELETE so a
  downgraded key cannot keep mutating with a still-valid JWT.
- apiRouteTable.dispatch refactored: resourceHandlerFor extracted
  so the dispatcher stays under cyclop=10 as new resources land
  (Dynamo, S3, SQS, future).

Frontend (web/admin/src/pages/SqsList.tsx, SqsDetail.tsx):
- /sqs queue list with refresh + per-row link to detail.
- /sqs/:name detail showing FIFO badge, counters card (Visible /
  In-flight / Delayed), raw attributes table, and a Delete
  confirmation Modal gated by RequireFullAccess.
- api/client.ts gains listQueues / describeQueue / deleteQueue
  with the same AbortSignal pattern used for cluster / dynamo /
  s3 reads.
- Layout nav adds an SQS tab between DynamoDB and S3.

Out of scope (recorded in the SQS partial design doc §16.2):
- PurgeQueue from the SPA. Underlying purgeQueueWithRetry is on
  main; the admin entrypoint is a trivial follow-up.
- Send / Peek / CreateQueue from the SPA. Each needs its own
  adapter entrypoint and form UX; deferred to keep this PR
  focused.

Verified with go build ./..., go test -race ./internal/admin/...,
go test -race -run TestSQS ./adapter/, go test -run TestStartAdmin .,
golangci-lint run ./adapter/... ./internal/admin/... ./...
(0 issues, no //nolint), and cd web/admin && npm run build.
bootjp added a commit that referenced this pull request Apr 26, 2026
…s []

Two medium findings from Gemini review on PR #670:

1. CountersTruncated field was stale on internal/admin.QueueSummary.
   The PR description called out that the field was dropped because
   main's sqsApproxCounters does not expose truncation, but I missed
   removing the JSON-side mirror in internal/admin/sqs_handler.go.
   The SPA was never reading it (SqsDetail.tsx already had the
   reference removed in this PR), so the field was load-bearing for
   nothing. Removed.

2. handleList serialised an empty queue catalog as
   {"queues": null} when AdminListQueues returned nil. The SPA
   iterates the array directly and would crash on null. Normalise
   nil to []string{} immediately before encoding so the response
   shape is always {"queues": []} even on the empty case.

No new tests because the existing internal/admin race test suite
already exercises the encoder path (would have caught a json.Marshal
break) and the empty-catalog case is exactly what the SPA hits the
moment a fresh node comes up — manual smoke during PR #649 already
verified the SPA renders an empty list cleanly when the array is [].

Verified with go build ./..., go test -race ./internal/admin/...,
go test -race -run TestSQS ./adapter/, golangci-lint run
./adapter/... ./internal/admin/... — all clean, no //nolint.
bootjp added a commit that referenced this pull request Apr 26, 2026
Per `docs/design/README.md`'s lifecycle convention, the admin dashboard
design doc is now "partial":

- **P1** (DynamoDB CRUD + AdminForward) — shipped via #634, #635, #644,
#648
- **P2** (S3 buckets list/create/delete/ACL + DescribeTable) — shipped
via #658, with #669 + #673 in flight
- **P3** (React SPA + embed) — shipped via #649, #650
- **P4** (TLS / role / CSRF / operator docs) — TLS, role, CSRF are
already live in P1; operator docs in #674

Independent of the in-flight slice 2 PRs (#669/#673) and the docs PR
(#674) — this rename only reflects what is already on main today, plus
an "Implementation status" table mapping each phase to the PR it landed
in.

## What this PR changes

- `git mv` the design doc from `2026_04_24_proposed_admin_dashboard.md`
to `2026_04_24_partial_admin_dashboard.md` so its history follows
- Add an "Implementation status" header table indexing each phase to the
PRs that landed it
- List the outstanding open items so future readers know what is still
owed against the original proposal:
- AdminForward acceptance criterion 5 (rolling-upgrade compat flag) —
deferred
  - S3 object browser — explicitly out of scope per Section 2 Non-goals
  - TLS cert hot-reload — restart-to-rotate is the documented model

When the rolling-upgrade flag lands, the doc gets renamed once more to
`2026_04_24_implemented_admin_dashboard.md` per the README's lifecycle
convention.
bootjp added a commit that referenced this pull request Apr 26, 2026
## Summary

Phase **3.B** of
[`docs/design/2026_04_24_partial_sqs_compatible_adapter.md`](https://github.com/bootjp/elastickv/blob/main/docs/design/2026_04_24_partial_sqs_compatible_adapter.md)
§16.4. Adds the AWS SQS **query protocol** (form-encoded request, XML
response) alongside the existing JSON protocol on the same listener —
older `aws-sdk-java` v1, `boto < 1.34`, and AWS CLI clients can now talk
to elastickv without modification. Detection happens per-request via
`Content-Type` / `X-Amz-Target` / `Action` presence; no flag, no
separate port. See the new proposal doc committed alongside.

This is an **architectural proof PR**: dispatch + decoding + encoding +
error envelope + the `*Core` refactor pattern are all in place, with
**three verbs** wired end-to-end as concrete proof. Each follow-up verb
is a parser + response struct + one switch arm — no further design work
needed.

### Verbs in this PR

| Verb | Why it's in the proof set |
|---|---|
| `CreateQueue` | Exercises the `Attribute.N.{Name,Value}`
indexed-collection parser. |
| `ListQueues` | Exercises the repeated-element XML response shape. |
| `GetQueueUrl` | Exercises the `<ErrorResponse>` envelope path via
`QueueDoesNotExist`. |

Every other verb returns a structured **501 `NotImplementedYet`** XML
envelope so operators see the gap explicitly. `SendMessage` /
`ReceiveMessage` / `DeleteMessage` are the highest-priority follow-ups
(they need the same `*Core` refactor on the FIFO send loop).

### Key design points

- **No new listener / no flag.** `pickSqsProtocol(*http.Request)`
decides per request. JSON and Query share the SQS port and the SigV4
path.
- **Wire-format-free cores.** `createQueue` / `listQueues` /
`getQueueUrl` are now `decode → core → encode` with `core(ctx, in) (out,
error)`. The JSON wrappers are unchanged in behavior; existing JSON
tests pass without modification.
- **DoS protection inherited.** Body read is bounded by the same
`sqsMaxRequestBodyBytes` the JSON path uses.
- **SigV4 unchanged.** The signed canonical request includes the
form-encoded body, so the existing SigV4 middleware verifies query
requests without code changes.
- **Error parity.** `<Code>` reuses the existing `sqsErr*` constants.
HTTP status mirrors what the JSON path returns, so SDK retry classifiers
work across protocols.
- **Cyclomatic budget honoured.** `handle()` was refactored to extract
`handleQueryProtocol` — `cyclop ≤ 10` per project rules, no `//nolint`.

### Known limitation (design §11.4)

`proxyToLeader`'s error writer always emits the JSON envelope, so a
query-protocol client hitting a follower during a leader flip sees one
JSON error before retry lands on the new leader. Follow-up PR threads
the detected protocol onto the request context so the proxy emits
matching XML.

## Test plan

- [x] `go build ./...` — clean
- [x] `go test -count=1 -race -run
"TestSQS|QueryProtocol|TestPickSqs|TestCollectIndexedKV|TestWriteSQSQueryError"
./adapter/` — passes
- [x] `golangci-lint run ./adapter/...` — `0 issues.`
- [x] `pickSqsProtocol` table tests cover documented edge cases (header
precedence, charset suffix, GET with Action, missing Action, nil
request).
- [x] `collectIndexedKVPairs` tests cover happy path, orphan Name, empty
input, unrelated prefix.
- [x] End-to-end via the in-process listener: CreateQueue / ListQueues /
GetQueueUrl round-trip on the query side.
- [x] **Cross-protocol parity**: a queue created via Query is visible
via JSON `GetQueueUrl` with the same URL.
- [x] Error envelope: 4xx maps to `<Type>Sender</Type>`, 5xx to
`<Type>Receiver</Type>`, namespace pinned, `x-amzn-ErrorType` header
set.
- [x] Unknown verb returns 501 with the `NotImplementedYet` XML
envelope.
- [x] Missing `Action` parameter returns 400 (per design §3).

## Self-review (5 lenses)

1. **Data loss** — Wire-format change only. Cores are byte-for-byte
identical to the previous handler bodies; no Raft / OCC / MVCC code is
touched.
2. **Concurrency** — No new shared state. Detection is request-local.
Body parsing is bounded.
3. **Performance** — One additional `Content-Type` string compare per
request on the dispatch hot path. Negligible.
4. **Data consistency** — `*Core` returns the same business-logic
outputs as before; the JSON tests are the regression net for parity.
Cross-protocol parity test pins behaviour.
5. **Test coverage** — 10 new test cases cover detection, parsing,
envelope shape, and three end-to-end verbs. Existing `TestSQS*` race
suite passes on the refactor.

## Stacking

This PR is **independent** — branched from current `main` (which has
#638 + #649 merged). It does not depend on PR #650 / PR #659. Merge
whenever ready.
bootjp added a commit that referenced this pull request Apr 26, 2026
)

## Summary

**Replaces #659**, which has unresolvable conflicts now that main has
moved on (PR #649 squashed into main; PR #658 added the S3 admin
endpoints; the Approximate counters implementation now lives directly in
`adapter/sqs_catalog.go` via `scanApproxCounters`). Rather than a
multi-day rebase, this PR re-applies the unique SQS admin code on a
fresh branch off current main.

What survived from #659:
- `adapter/sqs_admin.go` — `SQSServer.AdminListQueues` /
`AdminDescribeQueue` / `AdminDeleteQueue` (SigV4-bypass entrypoints,
same shape as `AdminListTables` / `AdminListBuckets`).
- `internal/admin/sqs_handler.go` — HTTP handler for
`/admin/api/v1/sqs/queues{,/{name}}` with role re-evaluation on DELETE.
- `web/admin/src/pages/SqsList.tsx` / `SqsDetail.tsx` — SPA pages for
the queues view + delete confirmation.

What changed during the re-apply:
- `AdminQueueCounters` is now `int64` (matches `sqsApproxCounters` from
main; bridge does no width conversion).
- `AdminDescribeQueue` calls main's `scanApproxCounters` instead of the
duplicate `computeApproxCounters` from the old branch — same numeric
output, single implementation.
- Dropped the `CountersTruncated` field; main's counter type doesn't
expose truncation. SPA's "truncated" pill came out with it.
- `apiRouteTable.dispatch` refactored to extract `resourceHandlerFor` so
the dispatcher stays under cyclop=10 as new resources land.

## Backend

- Re-evaluates the principal's role against the live `MapRoleStore` on
every `DELETE` so a downgraded key cannot keep mutating with a
still-valid JWT (Codex P1 pattern from earlier admin PRs).
- `admin.QueuesSource` is **opt-in**: deployments without `--sqsAddress`
leave `/admin/api/v1/sqs/*` off the wire; the SPA renders a soft
"endpoint pending" notice on the 404, mirroring the Tables / Buckets
`nil` contract.
- The bridge in `main_admin.go` (`sqsQueuesBridge`,
`convertAdminQueueSummary`, `translateAdminQueuesError`) keeps
`internal/admin` free of the heavy adapter dependency tree, same
architectural pattern as Dynamo and S3.

## Frontend

- **/sqs** queue list with refresh + per-row link to detail.
- **/sqs/:name** detail showing FIFO badge, counters card (Visible /
In-flight / Delayed), raw attributes table, and a Delete confirmation
`Modal` gated by `RequireFullAccess`.
- `api/client.ts` gains `listQueues` / `describeQueue` / `deleteQueue`
with the same `AbortSignal` pattern used for `cluster` / `dynamo` / `s3`
reads.
- Layout nav adds an SQS tab between DynamoDB and S3.

## Out of scope (recorded in
`docs/design/2026_04_24_proposed_sqs_compatible_adapter.md` Section 14,
deferred per the SQS partial doc §16.2)

- **PurgeQueue from the SPA** — the underlying `purgeQueueWithRetry`
adapter method is on main; the admin entrypoint is a trivial follow-up.
- **Send / Peek / CreateQueue from the SPA** — each needs its own
SigV4-bypass adapter entrypoint and form UX; deferred to keep this PR
focused.

## Test plan

- [x] `go build ./...` — clean
- [x] `go test -race ./internal/admin/...` — passes
- [x] `go test -race -run TestSQS ./adapter/` — passes
- [x] `go test -run TestStartAdmin .` — passes
- [x] `golangci-lint run ./adapter/... ./internal/admin/... ./...` — `0
issues.`, no `//nolint`
- [x] `cd web/admin && npm run build` — 49 modules, 199 KB JS / 61 KB
gzip + 14.7 KB CSS
- [ ] Manual smoke (after PR lands): start a node with `--sqsAddress
:4566 --adminEnabled --adminAllowInsecureDevCookie`, create a queue,
send a few messages, hit `/admin/sqs/<name>` → counters match
`GetQueueAttributes("All")`, Delete dialog returns to list.

## Self-review (5 lenses)

1. **Data loss** — Delete reuses the existing `deleteQueueWithRetry` OCC
path; counters are read-only. No new write paths.
2. **Concurrency** — Per-request leader check on Delete; counters scan
uses one snapshot read TS.
3. **Performance** — Counters bounded by main's existing
`sqsApproxCounterScanLimit`; admin reads are cheap point lookups + one
bounded scan.
4. **Data consistency** — `AdminDescribeQueue` and SigV4
`GetQueueAttributes` both call `scanApproxCounters` at a fresh
`nextTxnReadTS`, so a single point in time produces the same counters
via either surface.
5. **Test coverage** — Existing admin / SQS race suites stay green via
the new `nil` Queues argument added to `startAdminServer` call sites;
the new bridge is exercised by the cross-package build itself.

## Stacking

This PR is **independent** — branched from current `main` (which has the
merged versions of #649 / #658 / #650 / counter implementation). Closing
#659 in favour of this clean rewrite.
bootjp added a commit that referenced this pull request Apr 28, 2026
Per docs/design/README.md's lifecycle convention. The original
P1–P4 plan has fully shipped:

  - P1 (admin skeleton + Dynamo + AdminForward) — #634/#635/#644/#648
  - P2 (S3 endpoints incl. write paths and AdminForward integration)
        — #658 / #669 / #673 / #695 (TOCTOU safety net)
  - P3 (React SPA + embed) — #649 / #650
  - P4 (TLS / role / CSRF / operator doc / deployment runbook
        / scripts/rolling-update.sh admin support) — #674 / #669
        / #678

The AdminDeleteBucket TOCTOU caught during PR #669 review (the
last "in-flight" item that kept the doc at _partial_) is fully
resolved by the safety-net design landed in #695.

What changed:

  - git mv 2026_04_24_partial_admin_dashboard.md →
            2026_04_24_implemented_admin_dashboard.md (history
            follows the rename)
  - Header Status line: "Partial" → "Implemented", explanation
    updated to reflect the post-fix state and the rationale for
    promotion.
  - "Last updated" bumped to 2026-04-28 with the rename trigger.
  - Section heading "Outstanding open items" → "Out-of-scope
    follow-ups" — the remaining three entries (criterion 5,
    object browser, TLS hot-reload) are not in-flight work; they
    are deferred-at-design or Non-goal items. The TOCTOU bullet
    is removed (resolved) and replaced with a one-line
    cross-link to the safety-net design + admin_deployment.md
    §4.6 contract.
  - Removed the closing "rename trigger" sentence — we just did
    the rename.
  - Status table: P2 row now lists #695 alongside #658/#669/#673
    so a future reader can find the TOCTOU fix from the index.
  - Cross-references updated everywhere the old filename
    appeared:
      docs/admin.md (header link + Cross-references)
      docs/admin_deployment.md (header link + final cross-ref)
      docs/design/2026_04_28_proposed_admin_delete_bucket_safety_net.md
        (Background section pointer)
      internal/admin/config.go (Section 7.1 reference comment)

No code changes other than the comment-only filename refresh in
config.go.
bootjp added a commit that referenced this pull request Apr 28, 2026
## Summary

Promote the admin dashboard design doc from `_partial_` →
`_implemented_` per `docs/design/README.md`'s lifecycle convention. PR
#695 landed the TOCTOU safety-net fix (the last in-flight item that kept
the doc at `_partial_`), so the original P1–P4 plan is now fully
shipped:

| Phase | Landed via |
|---|---|
| P1 (admin skeleton + Dynamo + AdminForward) | #634 / #635 / #644 /
#648 |
| P2 (S3 endpoints + writes + AdminForward S3 + TOCTOU fix) | #658 /
#669 / #673 / **#695** |
| P3 (React SPA + embed) | #649 / #650 |
| P4 (TLS / role / CSRF / operator doc / runbook / script wiring) | #674
/ #669 / #678 |

The remaining three items in the doc move from "Outstanding open items"
(in-flight) to **"Out-of-scope follow-ups"** (deferred-at-design or
Non-goal):

- AdminForward criterion 5 — rolling-upgrade flag, deferred behind a
cluster-version bump that doesn't exist yet
- S3 object browser — Non-goal per §2.2
- TLS cert hot-reload — out of scope per `docs/admin.md`

## Changes

- `git mv` partial → implemented (history follows the rename)
- Status line / Last-updated / status-table / Out-of-scope section
content reflects the promotion
- Cross-references updated in all 4 referencing files: `docs/admin.md`,
`docs/admin_deployment.md`,
`docs/design/2026_04_28_proposed_admin_delete_bucket_safety_net.md`,
`internal/admin/config.go` (comment-only)

## Test plan

- [x] No code changes other than a comment-only filename refresh in
`config.go`
- [x] `go build ./...` passes
- [x] `golangci-lint run ./internal/admin/...` — 0 issues
- [x] `grep -rn "2026_04_24_partial\|2026_04_24_proposed_admin"` returns
nothing — no stale references
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant