Commit b218dd3

web/admin: SPA Messages tab + Purge button + DLQ chips (Phase 5) (#798)
## Summary

Phase 5 of `docs/design/2026_05_16_proposed_admin_purge_queue.md`: operator-facing SPA UI for the `AdminPeekQueue` / `AdminPurgeQueue` endpoints PR #797 wired. The queue detail page now ships:

- **DLQ chip + "DLQ for" sources list**: when another queue's `RedrivePolicy` points at this queue, a pill in the header marks it as a DLQ and a section below lists the source queues (each linked to its detail page so the operator can navigate the topology).
- **Messages section**: paginated table of currently-visible messages: message-id prefix, sent timestamp, FIFO `MessageGroupId` (column shown only on FIFO queues), receive count, body preview, original size. Row-click opens a detail modal showing the full body, every attribute (typed `String` / `Binary` / `Number` etc.), and a "Copy as JSON" button writing a schema-versioned payload to the clipboard.
- **Purge button**: labelled `Purge messages` or `Purge DLQ` depending on `is_dlq`. Confirmation modal requires typing the exact queue name; on 429 `PurgeQueueInProgress` the modal stays open and surfaces the rate-limit message.

> ⚠️ **Stacked on PR #797.** This branch builds on `feat/sqs-admin-peek-purge-http` because the SPA needs the Phase 4 HTTP endpoints. Once #797 merges to main, please rebase (or merge #797 first then this auto-fast-forwards). I can rebase on request.

## Eager full-body fetch

The list call sets `body_max_bytes = 262144` (= 256 KiB, SQS's stored-message hard cap) so the detail modal renders directly from the row already in memory, with no re-peek round-trip on modal open. This eliminates the entire "row disappeared between list and modal" class of failures: concurrent purge, ReceiveMessage from another client, visibility timer started, leader step-down (design doc §3.5).

Page size stays at the documented Phase 5 default of 20 rows. Worst case: 20 × 256 KiB = 5 MiB response, well under typical network and JSON-parse budgets.
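The worst-case figure above can be checked with a few lines of Go. This is only a sketch of the arithmetic: `worstCaseBodyBytes` is a hypothetical helper, not code from this PR, and the bound covers message bodies only (JSON escaping, attributes, and the other row fields add to it).

```go
package main

import "fmt"

const (
	kib = 1024
	mib = 1024 * kib

	// Values quoted in the commit message.
	defaultPageSize = 20
	bodyMaxBytes    = 256 * kib // 262144
	serverMaxLimit  = 100       // "server clamps to <= 100"
)

// worstCaseBodyBytes (hypothetical helper) returns the upper bound on body
// bytes in one peek response: every row present, every body at the cap.
func worstCaseBodyBytes(rows, capBytes int) int {
	return rows * capBytes
}

func main() {
	def := worstCaseBodyBytes(defaultPageSize, bodyMaxBytes)
	max := worstCaseBodyBytes(serverMaxLimit, bodyMaxBytes)
	fmt.Printf("default page: %d bytes (%d MiB)\n", def, def/mib)
	fmt.Printf("largest page: %d bytes (%d MiB)\n", max, max/mib)
}
```

At the server-side maximum of 100 rows the same bound is 25 MiB, which is presumably why the page-size selector is listed as a follow-up together with a cumulative-response-size warning.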
## API client changes (`web/admin/src/api/client.ts`)

- `SqsQueueSummary` gains `is_dlq` + optional `dlq_sources`.
- New types: `SqsPeekedAttribute`, `SqsPeekedMessage`, `SqsPeekResult`, `SqsPeekOptions`.
- New methods: `api.peekQueue(name, opts, signal)` (signal-aware so navigation aborts the in-flight peek) and `api.purgeQueue(name)`.

## Handler 429 envelope (`internal/admin/sqs_handler.go`)

The Phase 4 `writePurgeInProgress` was emitting the JSON body with a `code` key, which diverged from `writeJSONError`'s canonical `error` key, so `apiFetch.ts` could not extract the AWS-style sentinel into `ApiError.code` consistently with every other 4xx. Switched to `error` (with `retry_after_seconds` still in the body alongside the canonical `Retry-After` header). The matching unit test assertion was updated.

**Caller audit (semantic-change rule):** the wire-shape adjustment has no other consumer yet. Phase 4 (#797) introduced the endpoint and the test was the sole reader. Updated.

## Risk

Low. New SPA UI only; the handler change is a 1-key rename matching the existing admin error envelope.

## Self-review (5 passes)

1. **Data loss**: read endpoint (peek) plus the existing `AdminPurgeQueue` wrapper. No new write paths.
2. **Concurrency**: eager full-body fetch eliminates the modal re-peek race. AbortController cancels in-flight peek on navigation / unmount; `DOMException name=AbortError` is filtered before setError to avoid stale-component warnings.
3. **Performance**: page size capped at 20 (default; server clamps to ≤ 100). Body cap 256 KiB. No hot-path allocations.
4. **Consistency**: Purge button reuses the queue's name as the confirmation token, matching the existing Delete confirmation. The 429 response shape now matches every other admin 4xx so the SPA error handling is uniform.
5. **Test coverage**: `npm run lint` (tsc -b --noEmit) passes. `npm run build` (vite) passes. Go-side `TestSqsHandler_PurgeQueue_RateLimited429` updated for the `error`-key envelope. No frontend test infra in the project (intentional; the `lint` script is the contract).

## Test plan

- [x] `cd web/admin && npm run lint` (tsc -b --noEmit) passes
- [x] `cd web/admin && npm run build` (vite build) passes
- [x] `go test -race -count=1 ./internal/admin/...` passes (12 existing + the 429 envelope update)
- [x] `golangci-lint run ./internal/admin/...` 0 issues
- [ ] CI on this PR
- [ ] Manual smoke: navigate to a queue detail page, peek messages, open the modal, copy-as-JSON, attempt purge twice (second attempt should hit 429)

## Out of scope (follow-ups)

- **Throttle integration**: dedicated per-queue admin-peek bucket (`bucketActionAdminPeek` + `resolveActionConfig` explicit case + typed `adminPeekThrottledError` / `PeekThrottledError`). Leader-only + `Limit ≤ 100` already bound the steady-state cost; the dedicated bucket adds a per-queue cap.
- **Audit logging & Prometheus counters** (design doc §3.6): `admin.sqs.purge_queue` audit line + `elastickv_sqs_admin_purge_queue_total{queue, outcome}` / `elastickv_sqs_admin_peek_queue_total{queue, outcome}`.
- **Page-size selector** (20 / 50 / 100) + cumulative-response-size warning. The MVP defaults to 20 to keep the worst-case at 5 MiB; larger sizes can land as a follow-up if operators request them.
- **`principalForReadSensitive` live `RoleStore` re-check** (design doc Goal 8): blocked on the wider live-role plumbing.
- **Design doc rename** `_proposed_` → `_implemented_` once Phase 6 lands (just the rename; Phases 2/3/4/5 are functionally complete).
## Summary by CodeRabbit

* **New Features**
  * Added message peeking with pagination for SQS queues
  * Introduced Dead Letter Queue (DLQ) indicators and source queue references
  * Added queue purge workflow with confirmation modal
  * Added "Copy as JSON" functionality for message details
* **Updates**
  * Enhanced SQS error response formatting for improved API consistency
2 parents ab1d300 + 0370c0c commit b218dd3

4 files changed

Lines changed: 500 additions & 25 deletions

internal/admin/sqs_handler.go

Lines changed: 7 additions & 14 deletions
@@ -40,16 +40,9 @@ type QueueSummary struct {
 	CreatedAt *time.Time `json:"created_at,omitempty"`
 	Attributes map[string]string `json:"attributes,omitempty"`
 	Counters QueueCounters `json:"counters"`
-	// IsDLQ is true when at least one other queue's RedrivePolicy
-	// resolves its deadLetterTargetArn to this queue. The SPA uses
-	// the flag to switch the Messages-tab framing and the Purge
-	// button label between "Purge messages" and "Purge DLQ".
+	// True when another queue's RedrivePolicy points at this one.
 	IsDLQ bool `json:"is_dlq"`
-	// DLQSources lists the names of queues whose RedrivePolicy
-	// points at this queue, sorted lexicographically. Empty when
-	// IsDLQ is false; the SPA renders these as chips on the queue
-	// detail page so the operator confirms what feeds the DLQ
-	// before purging.
+	// Source queue names that point at this DLQ, sorted lex.
 	DLQSources []string `json:"dlq_sources,omitempty"`
 }
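How `IsDLQ` and `DLQSources` relate to `RedrivePolicy` can be illustrated with a small sketch. It assumes the policy is stored as the AWS-style JSON attribute with a `deadLetterTargetArn` field and that the queue name is the last colon-separated ARN segment; `dlqSources` is a hypothetical helper and not how this repository necessarily computes the fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// dlqSources (hypothetical helper) maps each DLQ name to the sorted names of
// the queues whose RedrivePolicy targets it. policies is keyed by source
// queue name; values are raw RedrivePolicy JSON strings ("" when unset).
func dlqSources(policies map[string]string) map[string][]string {
	out := make(map[string][]string)
	for source, raw := range policies {
		var p struct {
			DeadLetterTargetArn string `json:"deadLetterTargetArn"`
		}
		if err := json.Unmarshal([]byte(raw), &p); err != nil || p.DeadLetterTargetArn == "" {
			continue // missing or unparsable policy: this queue feeds no DLQ
		}
		// An SQS ARN ends in ":<queue-name>".
		arn := p.DeadLetterTargetArn
		target := arn[strings.LastIndex(arn, ":")+1:]
		out[target] = append(out[target], source)
	}
	for _, sources := range out {
		sort.Strings(sources) // "sorted lex", as the field comment says
	}
	return out
}

func main() {
	const dlqArn = `{"deadLetterTargetArn":"arn:aws:sqs:us-east-1:000000000000:orders-dlq","maxReceiveCount":5}`
	sources := dlqSources(map[string]string{
		"orders":  dlqArn,
		"billing": dlqArn,
		"audit":   "",
	})
	isDLQ := len(sources["orders-dlq"]) > 0
	fmt.Println(sources["orders-dlq"], isDLQ) // prints "[billing orders] true"
}
```

Under this reading `IsDLQ` is simply "the source list is non-empty", which matches the shortened field comments in the hunk above.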

@@ -698,10 +691,9 @@ func writeQueuesError(w http.ResponseWriter, err error, logger *slog.Logger, r *
 	}
 }
 
-// writePurgeInProgress emits the 429 response shape the design doc
-// §3.4 specifies: Retry-After header (rounded up to whole seconds so
-// a client retrying exactly at the deadline is guaranteed to clear)
-// + JSON body { code, message, retry_after_seconds }.
+// writePurgeInProgress emits the 429 wire shape (Retry-After header
+// + JSON body { error, message, retry_after_seconds }). Whole-second
+// rounding-up so a deadline-exact retry is guaranteed to clear.
 func writePurgeInProgress(w http.ResponseWriter, err *PurgeInProgressError) {
 	secs := int64(err.RetryAfter / time.Second)
 	if err.RetryAfter%time.Second != 0 {
@@ -712,10 +704,11 @@ func writePurgeInProgress(w http.ResponseWriter, err *PurgeInProgressError) {
 	}
 	w.Header().Set("Retry-After", strconv.FormatInt(secs, 10))
 	w.Header().Set("Content-Type", "application/json; charset=utf-8")
+	w.Header().Set("X-Content-Type-Options", "nosniff")
 	w.Header().Set("Cache-Control", "no-store")
 	w.WriteHeader(http.StatusTooManyRequests)
 	_ = json.NewEncoder(w).Encode(map[string]any{
-		"code": "PurgeQueueInProgress",
+		"error": "PurgeQueueInProgress",
 		"message": "only one PurgeQueue operation on each queue is allowed every 60 seconds",
 		"retry_after_seconds": secs,
 	})
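The round-up rule in `writePurgeInProgress` can be exercised in isolation. The sketch below is a standalone approximation, not the handler itself: `ceilSeconds` and `writeTooManyRequests` are hypothetical names, the lines hidden between the two hunks are not reproduced, and only positive durations are considered.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"time"
)

// ceilSeconds rounds a positive duration up to whole seconds, so a client
// that waits exactly Retry-After seconds never retries before the deadline.
func ceilSeconds(d time.Duration) int64 {
	secs := int64(d / time.Second)
	if d%time.Second != 0 {
		secs++
	}
	return secs
}

// writeTooManyRequests (hypothetical) writes a 429 with the Retry-After
// header and the { error, message, retry_after_seconds } body.
func writeTooManyRequests(w http.ResponseWriter, code, message string, retryAfter time.Duration) {
	secs := ceilSeconds(retryAfter)
	w.Header().Set("Retry-After", strconv.FormatInt(secs, 10))
	w.Header().Set("Content-Type", "application/json; charset=utf-8")
	w.WriteHeader(http.StatusTooManyRequests)
	_ = json.NewEncoder(w).Encode(map[string]any{
		"error":               code,
		"message":             message,
		"retry_after_seconds": secs,
	})
}

func main() {
	rec := httptest.NewRecorder()
	writeTooManyRequests(rec, "PurgeQueueInProgress",
		"only one PurgeQueue operation on each queue is allowed every 60 seconds",
		42500*time.Millisecond)
	fmt.Println(rec.Code, rec.Header().Get("Retry-After")) // prints "429 43"
}
```

Truncating instead of rounding up would advertise 42 seconds for a 42.5-second wait, so a client retrying at the advertised deadline would receive a second 429.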

internal/admin/sqs_handler_test.go

Lines changed: 4 additions & 1 deletion
@@ -347,7 +347,10 @@ func TestSqsHandler_PurgeQueue_RateLimited429(t *testing.T) {
 	require.Equal(t, "43", rec.Header().Get("Retry-After"))
 	var body map[string]any
 	require.NoError(t, json.Unmarshal(rec.Body.Bytes(), &body))
-	require.Equal(t, "PurgeQueueInProgress", body["code"])
+	// The "error" key (not "code") matches writeJSONError's envelope
+	// so apiFetch.ts can extract the AWS-style sentinel consistently
+	// with every other 4xx error.
+	require.Equal(t, "PurgeQueueInProgress", body["error"])
 	require.EqualValues(t, 43, body["retry_after_seconds"])
 }
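Why the key name matters to a consumer can be shown from the decoding side. The following sketch is written in Go rather than the SPA's TypeScript and uses a hypothetical `adminError` type; it assumes the consumer reads the sentinel only from the `error` key, as the comment in the test says `apiFetch.ts` does.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// adminError (hypothetical type) models the admin error envelope as
// described in this commit: { error, message, retry_after_seconds }.
type adminError struct {
	Code              string `json:"error"`
	Message           string `json:"message"`
	RetryAfterSeconds int64  `json:"retry_after_seconds"`
}

// decodeAdminError parses a 4xx response body into the envelope.
func decodeAdminError(body []byte) (adminError, error) {
	var e adminError
	err := json.Unmarshal(body, &e)
	return e, err
}

func main() {
	newShape := []byte(`{"error":"PurgeQueueInProgress","message":"...","retry_after_seconds":43}`)
	oldShape := []byte(`{"code":"PurgeQueueInProgress","message":"...","retry_after_seconds":43}`)

	n, _ := decodeAdminError(newShape)
	o, _ := decodeAdminError(oldShape)

	// With the old "code" key the sentinel is silently lost.
	fmt.Printf("new=%q old=%q retry=%d\n", n.Code, o.Code, n.RetryAfterSeconds)
}
```

A decoder keyed on `error` sees an empty sentinel for the Phase 4 body, so a caller would have to special-case the 429 path; after the rename one code path handles every admin 4xx.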

web/admin/src/api/client.ts

Lines changed: 51 additions & 0 deletions
@@ -210,12 +210,48 @@ export interface SqsQueueSummary {
   created_at?: string;
   attributes?: Record<string, string>;
   counters: SqsQueueCounters;
+  // True when another queue's RedrivePolicy points at this one.
+  is_dlq: boolean;
+  // Source queue names that point at this DLQ, sorted lex.
+  dlq_sources?: string[];
 }
 
 export interface SqsQueueList {
   queues: string[];
 }
 
+// SqsPeekedAttribute mirrors AWS's typed MessageAttribute shape;
+// binary_value arrives base64-encoded.
+export interface SqsPeekedAttribute {
+  data_type: string;
+  string_value?: string;
+  binary_value?: string;
+}
+
+export interface SqsPeekedMessage {
+  message_id: string;
+  body: string;
+  body_truncated: boolean;
+  body_original_size: number;
+  sent_timestamp: string;
+  receive_count: number;
+  group_id?: string;
+  deduplication_id?: string;
+  attributes?: Record<string, SqsPeekedAttribute>;
+}
+
+export interface SqsPeekResult {
+  messages: SqsPeekedMessage[];
+  // Omitted when the walk has fully completed for this MVCC snapshot.
+  next_cursor?: string;
+}
+
+export interface SqsPeekOptions {
+  limit?: number;
+  cursor?: string;
+  body_max_bytes?: number;
+}
+
 // KeyViz wire shapes mirror internal/admin/keyviz_handler.go
 // (KeyVizMatrix / KeyVizRow). Go []byte fields arrive as
 // base64-encoded strings via encoding/json — keep them as `string` on
@@ -311,6 +347,21 @@ export const api = {
     apiFetch<SqsQueueSummary>(`/sqs/queues/${encodeURIComponent(name)}`, { signal }),
   deleteQueue: (name: string) =>
     apiFetch<void>(`/sqs/queues/${encodeURIComponent(name)}`, { method: "DELETE" }),
+  // Non-destructive peek of currently-visible messages. Server clamps
+  // limit to [1, 100] and body_max_bytes to [256, 262144].
+  peekQueue: (name: string, opts?: SqsPeekOptions, signal?: AbortSignal) =>
+    apiFetch<SqsPeekResult>(`/sqs/queues/${encodeURIComponent(name)}/messages`, {
+      query: {
+        limit: opts?.limit,
+        cursor: opts?.cursor,
+        body_max_bytes: opts?.body_max_bytes,
+      },
+      signal,
+    }),
+  // Drains the queue's messages while leaving meta/ARN/RedrivePolicy intact.
+  // 60-second rate limit per queue: second purge inside the window → 429.
+  purgeQueue: (name: string) =>
+    apiFetch<void>(`/sqs/queues/${encodeURIComponent(name)}/messages`, { method: "DELETE" }),
   keyVizMatrix: (params: KeyVizParams, signal?: AbortSignal) =>
     apiFetch<KeyVizMatrix>("/keyviz/matrix", {
       query: {
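The clamp ranges quoted in the `peekQueue` comment, and the request path it builds, can be mirrored in a short Go sketch. `clamp` and `peekPath` are hypothetical helpers; the real clamping happens on the server and is not part of this diff, and treating zero as "omit the parameter" is an assumption made for the example.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// clamp restricts v to the inclusive range [lo, hi].
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// peekPath (hypothetical) builds the peek request path and query. A zero
// limit or bodyMaxBytes, or an empty cursor, is left out so the server
// default applies; other values are clamped to the documented ranges.
func peekPath(name string, limit int, cursor string, bodyMaxBytes int) string {
	q := url.Values{}
	if limit != 0 {
		q.Set("limit", strconv.Itoa(clamp(limit, 1, 100)))
	}
	if cursor != "" {
		q.Set("cursor", cursor)
	}
	if bodyMaxBytes != 0 {
		q.Set("body_max_bytes", strconv.Itoa(clamp(bodyMaxBytes, 256, 262144)))
	}
	p := "/sqs/queues/" + url.PathEscape(name) + "/messages"
	if enc := q.Encode(); enc != "" {
		p += "?" + enc // Encode sorts parameters by key
	}
	return p
}

func main() {
	fmt.Println(peekPath("orders.fifo", 500, "", 262144))
	// prints "/sqs/queues/orders.fifo/messages?body_max_bytes=262144&limit=100"
}
```

`url.PathEscape` and JavaScript's `encodeURIComponent` differ on a handful of punctuation characters, but they agree on the letters, digits, hyphens, underscores, and `.fifo` suffix that SQS queue names use. Purge targets the same path with `DELETE`, so one path builder could serve both calls.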
