Skip to content

Commit 84a764a

Browse files
committed
admin: enforce JWT role gate in S3Handler.principalForWrite
Codex P2 + coderabbitai Minor on PR #669 caught that the S3 admin handler diverged from the Dynamo path in production wiring. The previous shape was: if h.roles == nil { /* JWT check */ ... return principal } /* h.roles != nil — production: skip JWT, only live role */ role, _ := h.roles.LookupRole(...) if !role.AllowsWrite() { 403 } So a session whose JWT was minted as `read-only` could **escalate** to write capability the moment the live role index promoted that access key to `full` — exactly the kind of between-login-and-revocation window the JWT was supposed to freeze. DynamoHandler.principalForWrite (dynamo_handler.go:413) fires the JWT gate unconditionally and then layers the live role check on top, which is the safer ordering: the JWT can only narrow capability, never widen it. Aligned the S3 handler with that contract. The fix also picks up a second case the Dynamo path already covers: a JWT minted as `full` whose live role has since been demoted to read-only or removed entirely is now rejected with 403 instead of silently sliding through. Per CLAUDE.md "test the bug first" — `newS3HandlerForTest` never called `WithRoleStore`, so every existing test ran the `h.roles == nil` branch and the production path was unproven. Added two new tests: TestS3Handler_PrincipalForWrite_JWTGateFiresWithRoleStore — table sweep over all three S3 admin write endpoints (create, put_acl, delete) with WithRoleStore wired and JWT=read-only / live=full. All three must 403. TestS3Handler_PrincipalForWrite_LiveRoleNarrowsJWT — JWT=full + live=read-only or removed-from-index. The full live-role-narrowing branch was unproven before; this pins both demotion and revocation cases. Also documented the AdminDeleteBucket TOCTOU that coderabbitai flagged as 🔴 / 🟠 in the same review: AdminDeleteBucket scans ObjectManifestPrefixForBucket at readTS (limit=1, an "is bucket empty?" probe) and dispatches the BucketMetaKey delete with that point key only in ReadKeys. A concurrent PutObject inserting a manifest key in the scanned prefix between readTS and commit will not conflict — the OCC validator only inspects keys appearing in ReadKeys, and there is no ReadRanges mechanism. This is a pre-existing race that adapter/s3.go:deleteBucket (SigV4 path) inherits as well — closing the gap requires either bumping BucketGenerationKey on every PutObject so it serves as an OCC token here, or extending OperationGroup with ReadRanges and teaching the FSM to validate range emptiness atomically with commit. Both are larger schema changes outside this PR's scope. Recorded the limitation as a code comment on AdminDeleteBucket so the next reader of the function knows the contract; tracking the fix as a follow-up rather than expanding the PR's surface. go test -race ./internal/admin/ — passes. golangci-lint run ./internal/admin/... ./adapter/... — 0 issues.
1 parent e911458 commit 84a764a

3 files changed

Lines changed: 139 additions & 18 deletions

File tree

adapter/s3_admin.go

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -358,6 +358,30 @@ func (s *S3Server) AdminPutBucketAcl(ctx context.Context, principal AdminPrincip
358358
// authorisation contract as the other admin write methods. The
359359
// bucket-must-be-empty rule mirrors the SigV4 deleteBucket path —
360360
// the dashboard cannot force a recursive delete, by design.
361+
//
362+
// Known orphan-race limitation (coderabbitai 🔴 / 🟠 on PR #669):
363+
// the empty-bucket probe (ScanAt with limit=1 on
364+
// ObjectManifestPrefixForBucket) reads at readTS but the
365+
// subsequent BucketMetaKey delete only carries that single point
366+
// key in its ReadKeys set. A concurrent PutObject that inserts a
367+
// manifest key in the scanned prefix between readTS and the
368+
// delete's commitTS will not conflict — the OCC validator only
369+
// inspects keys that appear in ReadKeys, and there is no
370+
// ReadRanges mechanism today. The object's manifest key survives
371+
// under a now-deleted bucket meta and becomes orphaned.
372+
//
373+
// This race exists pre-existing in the SigV4 path
374+
// (adapter/s3.go:deleteBucket — same shape, same limitation), so
375+
// AdminDeleteBucket inherits the contract; closing the gap
376+
// requires either (a) bumping BucketGenerationKey on every
377+
// PutObject so it can serve as an OCC token in this read set, or
378+
// (b) extending OperationGroup with ReadRanges and teaching the
379+
// FSM to validate range emptiness atomically with commit. Both
380+
// are larger changes outside this PR's scope; tracked in
381+
// docs/design/2026_04_24_partial_admin_dashboard.md under the
382+
// Outstanding open items section. Operators concerned about the
383+
// orphan window today should pause writes against the target
384+
// bucket before issuing the admin delete.
361385
func (s *S3Server) AdminDeleteBucket(ctx context.Context, principal AdminPrincipal, name string) error {
362386
if !principal.Role.canWrite() {
363387
return ErrAdminForbidden

internal/admin/s3_handler.go

Lines changed: 35 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -362,36 +362,53 @@ func (h *S3Handler) handleDelete(w http.ResponseWriter, r *http.Request, name st
362362
}
363363

364364
// principalForWrite mirrors DynamoHandler.principalForWrite: pull
365-
// the JWT principal out of the request context, re-resolve the role
366-
// against the live RoleStore, and reject anything below Full. Even
367-
// a still-valid JWT with role=full does not get a free pass —
368-
// operators who revoke an access key get the change picked up on
369-
// the next admin write rather than waiting for the JWT to expire.
365+
// the JWT principal out of the request context, enforce the
366+
// JWT-embedded role first (defence-in-depth — a token minted as
367+
// read-only never escalates, even if the live role index says
368+
// `full` after a recent promotion), and — when a RoleStore is
369+
// configured — re-validate the access key against the live role
370+
// index so a key downgraded or revoked since login cannot keep
371+
// mutating with a still-valid JWT. Even a still-valid JWT with
372+
// role=full does not get a free pass — operators who revoke an
373+
// access key get the change picked up on the next admin write
374+
// rather than waiting for the JWT to expire.
370375
func (h *S3Handler) principalForWrite(w http.ResponseWriter, r *http.Request) (AuthPrincipal, bool) {
371376
principal, ok := PrincipalFromContext(r.Context())
372377
if !ok {
373378
writeJSONError(w, http.StatusUnauthorized, "unauthenticated",
374379
"no session principal")
375380
return AuthPrincipal{}, false
376381
}
377-
if h.roles == nil {
378-
// Production wiring always sets a RoleStore; nil is a test
379-
// fallthrough we accept so single-handler unit tests can
380-
// reach the source without standing up the auth chain.
381-
if !principal.Role.AllowsWrite() {
382+
// JWT role gate fires unconditionally so a session whose token
383+
// was minted as read-only cannot perform writes even if the
384+
// live role index later promotes the key to full. The previous
385+
// shape skipped this check whenever h.roles != nil, which made
386+
// it the only authorisation gate in production wiring — a
387+
// security regression vs DynamoHandler that Codex P2 +
388+
// coderabbitai flagged on PR #669 (the JWT freezes the role
389+
// at login, the live role index can only constrain further).
390+
if !principal.Role.AllowsWrite() {
391+
writeJSONError(w, http.StatusForbidden, "forbidden",
392+
"this endpoint requires a full-access role")
393+
return AuthPrincipal{}, false
394+
}
395+
// Live re-validation. Skipped when no RoleStore is configured
396+
// (single-handler unit tests where wiring up the role index
397+
// would obscure the source-level behaviour the test is
398+
// exercising); production wiring always sets one.
399+
if h.roles != nil {
400+
liveRole, found := h.roles.LookupRole(principal.AccessKey)
401+
if !found || !liveRole.AllowsWrite() {
382402
writeJSONError(w, http.StatusForbidden, "forbidden",
383403
"this endpoint requires a full-access role")
384404
return AuthPrincipal{}, false
385405
}
386-
return principal, true
387-
}
388-
role, found := h.roles.LookupRole(principal.AccessKey)
389-
if !found || !role.AllowsWrite() {
390-
writeJSONError(w, http.StatusForbidden, "forbidden",
391-
"this endpoint requires a full-access role")
392-
return AuthPrincipal{}, false
406+
// Use the live role downstream; the JWT may carry a stale
407+
// value but the live one is authoritative once both gates
408+
// agree the request is allowed.
409+
principal.Role = liveRole
393410
}
394-
return AuthPrincipal{AccessKey: principal.AccessKey, Role: role}, true
411+
return principal, true
395412
}
396413

397414
// writeBucketsError translates a BucketsSource error into the

internal/admin/s3_handler_test.go

Lines changed: 80 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -440,6 +440,86 @@ func TestS3Handler_CreateBucket_ReadOnlyRoleRejected(t *testing.T) {
440440
"role gate must short-circuit before the source is reached")
441441
}
442442

443+
// TestS3Handler_PrincipalForWrite_JWTGateFiresWithRoleStore pins the
444+
// fix for Codex P2 + coderabbitai Minor on PR #669: the previous
445+
// shape skipped the JWT role check whenever h.roles != nil
446+
// (production wiring), so a session whose token was minted as
447+
// read-only could mutate buckets if the live role index had since
448+
// promoted the key to full. Aligning with DynamoHandler, the JWT
449+
// gate now fires unconditionally — the live store can only narrow
450+
// the JWT's role, never widen it.
451+
//
452+
// All three S3 admin write endpoints exercise the same
453+
// principalForWrite, so the table sweeps each one to lock down
454+
// the contract per-endpoint (a future op that forgets to call
455+
// principalForWrite would fail the relevant row immediately).
456+
func TestS3Handler_PrincipalForWrite_JWTGateFiresWithRoleStore(t *testing.T) {
457+
cases := []struct {
458+
name string
459+
method string
460+
path string
461+
body string
462+
}{
463+
{"create_bucket", http.MethodPost, pathS3Buckets, validCreateBucketBody()},
464+
{"put_bucket_acl", http.MethodPut, pathS3Buckets + "/x/acl", `{"acl":"private"}`},
465+
{"delete_bucket", http.MethodDelete, pathS3Buckets + "/x", ""},
466+
}
467+
for _, tc := range cases {
468+
t.Run(tc.name, func(t *testing.T) {
469+
src := &stubBucketsSource{}
470+
// Live RoleStore says AKIA_RO is full — exactly the
471+
// pre-fix escalation window. JWT carries read-only
472+
// (the "freeze at login" contract). Handler must reject.
473+
roles := MapRoleStore{"AKIA_RO": RoleFull}
474+
h := NewS3Handler(src).WithRoleStore(roles)
475+
var body io.Reader
476+
if tc.body != "" {
477+
body = strings.NewReader(tc.body)
478+
}
479+
req := httptest.NewRequest(tc.method, tc.path, body)
480+
req = withReadOnlyPrincipalForS3(req)
481+
rec := httptest.NewRecorder()
482+
h.ServeHTTP(rec, req)
483+
require.Equal(t, http.StatusForbidden, rec.Code,
484+
"JWT read-only must reject even when live role index says full")
485+
require.Empty(t, src.lastCreateInput.BucketName,
486+
"role gate must short-circuit before the source is reached")
487+
require.Empty(t, src.lastDeleteName,
488+
"role gate must short-circuit before the source is reached")
489+
})
490+
}
491+
}
492+
493+
// TestS3Handler_PrincipalForWrite_LiveRoleNarrowsJWT covers the
494+
// other half of the contract: a JWT minted as full is rejected
495+
// when the live role index has demoted the key to read-only or
496+
// removed it entirely. Already covered conceptually by the
497+
// existing "ReadOnlyRoleRejected" tests for the h.roles == nil
498+
// path, but those leave the live-role-narrowing branch unproven.
499+
func TestS3Handler_PrincipalForWrite_LiveRoleNarrowsJWT(t *testing.T) {
500+
cases := []struct {
501+
name string
502+
roles MapRoleStore
503+
}{
504+
{"live role demoted to read-only", MapRoleStore{"AKIA_FULL": RoleReadOnly}},
505+
{"live role removed from index", MapRoleStore{}},
506+
}
507+
for _, tc := range cases {
508+
t.Run(tc.name, func(t *testing.T) {
509+
src := &stubBucketsSource{}
510+
h := NewS3Handler(src).WithRoleStore(tc.roles)
511+
req := httptest.NewRequest(http.MethodPost, pathS3Buckets,
512+
strings.NewReader(validCreateBucketBody()))
513+
req = withFullPrincipal(req)
514+
rec := httptest.NewRecorder()
515+
h.ServeHTTP(rec, req)
516+
require.Equal(t, http.StatusForbidden, rec.Code)
517+
require.Empty(t, src.lastCreateInput.BucketName,
518+
"live role gate must short-circuit before the source is reached")
519+
})
520+
}
521+
}
522+
443523
func TestS3Handler_CreateBucket_AlreadyExistsReturns409(t *testing.T) {
444524
src := &stubBucketsSource{
445525
buckets: map[string]BucketSummary{"public-assets": {Name: "public-assets"}},

0 commit comments

Comments
 (0)