Skip to content

HYPERFLEET-1186 - fix: align documentation with codebase and fix pageSize default#211

Open
sherine-k wants to merge 4 commits into
openshift-hyperfleet:mainfrom
sherine-k:HYPERFLEET-1168
Open

HYPERFLEET-1186 - fix: align documentation with codebase and fix pageSize default#211
sherine-k wants to merge 4 commits into
openshift-hyperfleet:mainfrom
sherine-k:HYPERFLEET-1168

Conversation

@sherine-k

@sherine-k sherine-k commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR reviews and re-aligns the docs in anticipation of v1.0.0 release.
It also fixes the pageSize default to 20, aligning it with the default values in api specs.

  • HYPERFLEET-1186

Test Plan

  • Unit tests added/updated
  • make test-all passes
  • make lint passes
  • Helm chart changes validated with make test-helm (if applicable)
  • Deployed to a development cluster and verified (if Helm/config changes)
  • E2E tests passed (if cross-component or major changes)

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Copy link
Copy Markdown

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 91ebbcec-c95e-442d-b2ca-6eba235c54c7

📥 Commits

Reviewing files that changed from the base of the PR and between bdb9bed and 83ff866.

📒 Files selected for processing (7)
  • CLAUDE.md
  • README.md
  • docs/api-resources.md
  • docs/config.md
  • docs/database.md
  • docs/deployment.md
  • docs/development.md
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift-hyperfleet/architecture (manual)
  • openshift-hyperfleet/hyperfleet-api (manual)
  • openshift-hyperfleet/hyperfleet-sentinel (manual)
  • openshift-hyperfleet/hyperfleet-adapter (manual)
  • openshift-hyperfleet/hyperfleet-broker (manual)
✅ Files skipped from review due to trivial changes (4)
  • CLAUDE.md
  • docs/development.md
  • docs/database.md
  • docs/deployment.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/api-resources.md

📝 Walkthrough

Summary by CodeRabbit

  • Documentation
    • Expanded API documentation with authentication, validation, and force-delete operations
    • Helm-centric deployment guide with production checklist and monitoring setup
    • Updated configuration and development guides for enhanced setup clarity
    • Default pagination page size reduced to 20 items; Go 1.25+ now required

Walkthrough

This PR updates API and operator documentation (endpoints, statuses, spec validation, error/problem-details, pagination default in docs), deployment/config/developer guides (Helm flow, caller-identity resolution, config precedence), database schema and delete lifecycle docs, and reduces service list pagination defaults from 100 to 20 with a matching unit test update.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
No Injection Vectors ⚠️ Warning pkg/services/resource.go uses fmt.Sprintf("kind = '%s' AND owner_id = '%s'", ...) and concatenates untrusted Search into the WHERE expression passed to generic.List (CWE-89). Don’t embed kind/ownerID into SQL/search strings. Use parameterized filters/placeholders in the DAO, or strictly validate/escape inputs (e.g., UUID) before building the query.
✅ Passed checks (10 passed)
Check name Status Explanation
Title check ✅ Passed Title clearly summarizes the main changes: documentation alignment for v1.0.0 release and pageSize default fix from 100 to 20.
Description check ✅ Passed Description accurately reflects the PR objectives: documentation realignment for v1.0.0 and pageSize default correction to match API specs.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Sec-02: Secrets In Log Output ✅ Passed Repo-wide scan of non-_test.go and non-.example Go logging calls found no slog/log/logr/zap or fmt.Print* statements exposing token/password/credential/secret (CWE-532).
No Hardcoded Secrets ✅ Passed CWE-798 check: searched PR docs/code for apiKey/secret/token/password string literals, user:pass URLs, and long base64 secrets; only placeholder values like config-password/secret-password found.
No Weak Cryptography ✅ Passed Repo-wide scan of .go files found no imports of crypto/md5, crypto/des, crypto/rc4, or crypto/sha1, and no ECB identifiers; targeted changed service files also had none.
No Privileged Containers ✅ Passed PR #211 changes only README/docs and pkg/services Go files; PR file list contains no Kubernetes/OpenShift manifests, Helm templates, or Dockerfiles to review for privileged settings.
No Pii Or Sensitive Data In Logs ✅ Passed PASS: request_logging_middleware logs headers as maskedHeaders; masking defaults enabled and redacts Authorization/Cookie/etc (CWE-532). No request/response bodies logged; no fmt.Print* exposing PII.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@rafabene

Copy link
Copy Markdown
Contributor

Hello @sherine-k I think this is a wrong ticket number in the title. Can you confirm, please?

@sherine-k sherine-k marked this pull request as ready for review June 11, 2026 08:33
@openshift-ci openshift-ci Bot requested review from jsell-rh and kuudori June 11, 2026 08:33
@sherine-k sherine-k changed the title HYPERFLEET-1168 - docs: align documentation with codebase and fix pageSize default HYPERFLEET-1186 - docs: align documentation with codebase and fix pageSize default Jun 11, 2026
@hyperfleet-ci-bot

hyperfleet-ci-bot Bot commented Jun 11, 2026

Copy link
Copy Markdown

Risk Score: 2 — risk/medium

Signal Detail Points
PR size 1335 lines (>500) +2
Sensitive paths none +0
Test coverage Tests cover changed packages +0

Computed by hyperfleet-risk-scorer

sherine-k and others added 2 commits June 11, 2026 10:37
Update README.md, api-resources.md, database.md, and deployment.md to
match the current state of the codebase.

- Update Go version from 1.24 to 1.25 and PostgreSQL from 13 to 14
- Add generic resource types (WifConfigs, Channels, Versions) and plugins/ to project structure
- Remove Ready condition from all JSON response examples (only Reconciled and LastKnownReconciled remain)
- Fix adapter_statuses polymorphic columns: owner_type/owner_id to resource_type/resource_id
- Fix soft delete column: deleted_at to deleted_time with deleted_by audit field
- Replace labels table documentation with resources table documentation
- Add Spec Validation section with opt-in qualifier linking to deployment.md
- Add Statuses Endpoint vs. Resource Endpoint section
- Restructure deployment.md into two chapters: Kubernetes/Helm and Local Execution
- Fix pageSize default from 100 to 20 to match TypeSpec contract
- Change migration tool reference from GORM to gormigrate
- Update license to Apache License 2.0

Co-Authored-By: Claude Opus 4.6
The runtime default was 100 but the TypeSpec API contract defines
pageSize as 20. Update the code to match the spec.

- Change default Size from 100 to 20 in NewListArguments (types.go)
- Update nil-guard defaults in ListByKind and ListByOwner (resource.go)
- Update unit test assertion to expect 20 (types_test.go)

Co-Authored-By: Claude <noreply@anthropic.com>
Document all features implemented between v0.2.0 and v1.0.0 that were
missing or partially covered in existing documentation.

- Add PATCH, DELETE, and force-delete endpoint sections to api-resources.md
- Add RFC 9457 Problem Details error response section with examples
- Add delete lifecycle (Active → Finalizing → Hard-Deleted) to database.md
- Add descriptor-driven delete policies (restrict/cascade) to database.md
- Add caller identity documentation (header vs JWT precedence) to config.md
- Add caller identity Helm config examples to deployment.md
- Update endpoint listing tables for clusters and nodepools
- Update README.md feature list and endpoint summaries for v1.0.0
- Fix incorrect "fails to load configuration" claims in deployment.md
- Fix "startup remains non-blocking" claim in openapi/README.md
- Fix trailing whitespace and double blank lines in database.md, deployment.md

Co-Authored-By: Claude <noreply@anthropic.com>
@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign pnguyen44 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (1)
pkg/services/resource.go (1)

275-275: ⚡ Quick win

Extract a shared default page-size constant to avoid contract drift.

Line 275 and Line 303 hardcode 20, while pkg/services/types.go also defines the same default in NewListArguments. Keep one source of truth (e.g., DefaultListSize) and reuse it in all fallback initializers to prevent future mismatch across handlers/services/tests.

Proposed diff
--- a/pkg/services/types.go
+++ b/pkg/services/types.go
@@
 const MaxListSize = 65500
+const DefaultListSize int64 = 20
@@
 	listArgs := &ListArguments{
 		Page:   1,
-		Size:   20,
+		Size:   DefaultListSize,
 		Search: "",
 	}
--- a/pkg/services/resource.go
+++ b/pkg/services/resource.go
@@
 	if args == nil {
-		args = &ListArguments{Page: 1, Size: 20}
+		args = &ListArguments{Page: 1, Size: DefaultListSize}
 	}
@@
 	if args == nil {
-		args = &ListArguments{Page: 1, Size: 20}
+		args = &ListArguments{Page: 1, Size: DefaultListSize}
 	}

Also applies to: 303-303

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/services/resource.go` at line 275, There are two places in resource
initialization that hardcode Size: 20 (the ListArguments literal at the resource
handlers); extract and reuse a single constant (e.g., DefaultListSize) declared
alongside NewListArguments in pkg/services/types.go, then replace the literal 20
in the ListArguments initializers (and any other fallback initializers) with
DefaultListSize so ListArguments, NewListArguments and the handlers share the
same source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/api-resources.md`:
- Around line 331-332: Update the docs text describing cluster deletion to
reflect descriptor-driven, policy-dependent behavior: instead of saying the
soft-delete "cascades the soft-delete to all child nodepools", state that
children are handled conditionally per the deletion policy (see the service
logic in resource deletion): nodepools with required adapters are soft-deleted
(set deleted_time/deleted_by and increment generation and enter Finalizing),
while children without required adapters are hard-deleted immediately; keep the
Finalizing explanation and that hard-delete occurs once adapters report
Finalized=True.

In `@docs/database.md`:
- Around line 62-73: Update the delete-lifecycle docs to remove the unsupported
"adapter quorum race" wording and instead state that adapter finalization is
concurrency-guarded: adapter status processing serializes per-resource via
GetForUpdate and hard-delete only proceeds when adapters report Finalized=True
at the matching generation (ObservedGeneration == resource.Generation) as
checked in tryHardDeleteNodePool and tryHardDeleteCluster; clarify that
child-creation-under-finalizing-parent is already prevented by
ResourceService.Create which acquires the parent row lock (GetForUpdate) and
rejects when parent.DeletedTime != nil (the cluster→nodepools
force-delete/create path likewise rejects deleted parents); add a robustness
note to recursive delete policy that registry.Validate currently does not detect
ParentKind cycles so deleteResourceTree / ChildrenOf must be protected (either
add cycle detection in registry.Validate or guard recursion with a visited set
to avoid unbounded recursion); and explicitly state that restrict enforcement
(ExistsByOwner used in checkCanDelete) runs inside TransactionMiddleware and
relies on the parent row lock for correct isolation.
- Line 72: Update the force-delete documentation to correctly reflect
implemented semantics: state that POST .../force-delete requires the resource to
already be in Finalizing (i.e., deleted_time set) and only bypasses the
“required adapters must report Finalized=True” gating (do not say it bypasses
finalization entirely), document idempotency that repeated POSTs after a
hard-delete return 404 (service maps gorm.ErrRecordNotFound to NotFound), and
clarify cascade behavior that Cluster force-delete hard-deletes all child
NodePools and their adapter statuses while NodePool force-delete only
hard-deletes the NodePool and its adapter statuses; reference
Finalizing/deleted_time, POST .../force-delete, Cluster, and NodePool in the
updated wording.

In `@docs/deployment.md`:
- Around line 496-503: The documentation incorrectly states the command fails
when no config file is found; update docs (including docs/deployment.md) to
describe the actual loader.Load() behavior: when no config file is present the
loader returns nil and logs "No config file found, using environment variables
and flags only", so the application falls back to environment variables and
flags and continues; only treat failures where loader.Load() returns an error as
fatal (as done in the serve command). Reference loader.Load(), the log message
"No config file found, using environment variables and flags only", and
pkg/config/loader.go when making this correction.

In `@README.md`:
- Line 122: The force-delete endpoints lack caller authorization checks; update
the handlers and service layer to enforce RBAC by validating the calling
principal before performing destructive operations: in pkg/handlers/cluster.go
add an authorization check that extracts caller identity/roles from the request
and denies access if not permitted, then pass an authorized context/principal
into pkg/services/cluster.go; in pkg/services/cluster.go verify the principal's
permission prior to calling clusterDao.GetForUpdate and before deleting adapter
statuses/child nodepools; also tighten NodePool logic where GetByIDAndOwner is
used to ensure it validates caller identity/permissions (not just path
ownership) and finally update README.md to document the required RBAC/permission
prerequisites for invoking the force-delete endpoints.
- Around line 92-93: Update README.md to document the trust boundary for config
file paths used by the loader (the --config flag and HYPERFLEET_CONFIG env var)
and explicitly state that these paths must be operator-trusted; note that the
loader only performs an existence check (os.Stat) for explicit sources and then
reads via viper.ReadInConfig, so callers must ensure the file path, permissions
and ownership are secure (suggested mode/owner and disallow world-writable
access). Also change the wording to clarify that schema/content validation is
performed in code (viper.UnmarshalExact + validator.Struct plus custom
TLS/JWT/timeouts/identity checks), and that the documented gap is
path/permission safety rather than configuration content validation.

---

Nitpick comments:
In `@pkg/services/resource.go`:
- Line 275: There are two places in resource initialization that hardcode Size:
20 (the ListArguments literal at the resource handlers); extract and reuse a
single constant (e.g., DefaultListSize) declared alongside NewListArguments in
pkg/services/types.go, then replace the literal 20 in the ListArguments
initializers (and any other fallback initializers) with DefaultListSize so
ListArguments, NewListArguments and the handlers share the same source of truth.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8dcd780c-beb6-4d3d-b1da-1195c70532ec

📥 Commits

Reviewing files that changed from the base of the PR and between e8fcb1c and bdb9bed.

📒 Files selected for processing (9)
  • README.md
  • docs/api-resources.md
  • docs/config.md
  • docs/database.md
  • docs/deployment.md
  • openapi/README.md
  • pkg/services/resource.go
  • pkg/services/types.go
  • pkg/services/types_test.go
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift-hyperfleet/architecture (manual)
  • openshift-hyperfleet/hyperfleet-api (manual)
  • openshift-hyperfleet/hyperfleet-sentinel (manual)
  • openshift-hyperfleet/hyperfleet-adapter (manual)
  • openshift-hyperfleet/hyperfleet-broker (manual)

Comment thread docs/api-resources.md Outdated
Comment thread docs/database.md
Comment on lines +62 to +73
Resources follow a three-phase delete lifecycle:

```text
Active ──(DELETE)──▶ Finalizing ──(adapters report Finalized=True)──▶ Hard-Deleted
└──(POST /force-delete)──▶ Hard-Deleted
```

1. **Active** — Normal state. Resource is visible in list queries and can be updated.
2. **Finalizing** (soft-deleted) — `DELETE` sets `deleted_time` and `deleted_by`, increments `generation`. The resource stays in the database so adapters can observe the deletion and clean up external state. Soft-deleted records are excluded from list queries by default. New child resources cannot be created under a finalizing parent (returns `409 Conflict`).
3. **Hard-Deleted** — Permanently removed from the database. This happens automatically when all required adapters report `Finalized=True` at the current generation. If adapters are stuck, `POST .../force-delete` bypasses finalization and hard-deletes immediately.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify delete lifecycle uses proper transaction isolation

rg -n -C5 'deleted_time|Finalizing' pkg/dao/ pkg/services/ --type=go | rg -C3 'Begin|Transaction|Lock'

# Check for generation increment atomicity
ast-grep --pattern $'func $_(ctx context.Context, id string) $_ {
  $$$
  deleted_time
  $$$
  generation
  $$$
}'

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 61


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show docs snippet around the referenced lines
echo "---- docs/database.md (around lines 50-90) ----"
nl -ba docs/database.md | sed -n '50,95p'

# Search for referenced lifecycle terms in Go
echo "---- rg lifecycle terms (case-sensitive) ----"
rg -n --hidden --no-ignore-vcs "force-delete|Finalizing|Finalized|deleted_time|deleted_by|generation" . -S -g'*.go' || true

echo "---- rg lifecycle terms in SQL/migrations/docs ----"
rg -n --hidden --no-ignore-vcs "force-delete|Finalizing|Finalized|deleted_time|deleted_by|generation" . -S -g'*.sql' -g'*.md' -g'*.go' -g'*.yaml' -g'*.yml' || true

echo "---- rg delete policies text (restrict/cascade) ----"
rg -n --hidden --no-ignore-vcs "restrict|cascade|force-delete|POST /force-delete|/force-delete" . -S -g'*.go' -g'*.md' || true

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 188


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate where DELETE lifecycle is implemented by searching for status transitions / queries
echo "---- rg for 'Finalize' / 'hard delete' ----"
rg -n --hidden --no-ignore-vcs "hard-?deleted|HardDeleted|Finalized|Finalize|force delete|force-delete" . -S -g'*.go' || true

echo "---- rg for 'deleted_time' exact column usage ----"
rg -n --hidden --no-ignore-vcs "deleted_time" . -S -g'*.go' -g'*.sql' -g'*.yaml' -g'*.yml' || true

echo "---- rg for 'generation' updates related to deletes ----"
rg -n --hidden --no-ignore-vcs "generation.*(deleted|delete|finaliz)" . -S -g'*.go' || true

echo "---- list likely DAO/service directories ----"
ls -la
find . -maxdepth 3 -type d \( -name 'dao' -o -name 'service*' -o -name 'pkg' \) 2>/dev/null | head

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 24932


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- docs/database.md (around lines 50-95) ----"
nl -ba docs/database.md | sed -n '50,95p'

echo "---- rg lifecycle terms in Go (case-sensitive) ----"
rg -n --hidden --no-ignore-vcs "force-delete|Finalizing|Finalized|deleted_time|deleted_by|generation" . -S -g'*.go' || true

echo "---- rg lifecycle terms in SQL/migrations/docs ----"
rg -n --hidden --no-ignore-vcs "force-delete|Finalizing|Finalized|deleted_time|deleted_by|generation" . -S -g'*.sql' -g'*.md' -g'*.yaml' -g'*.yml' -g'*.go' || true

echo "---- rg delete policy text ----"
rg -n --hidden --no-ignore-vcs "restrict|cascade|force-delete|POST /force-delete|/force-delete" . -S -g'*.go' -g'*.md' || true

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 188


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- docs/database.md lines 55-90 (with pseudo line numbers) ----"
awk 'NR>=55 && NR<=90 {printf "%d\t%s\n", NR, $0}' docs/database.md

echo "---- rg restrict/cascade/force-delete implementation text in Go/handlers ----"
rg -n --hidden --no-ignore-vcs "(restrict|cascade|force-delete|force delete|enforced recursively|cycle)" . -S -g'*.go' -g'*.md'

echo "---- locate force-delete handlers ----"
rg -n --hidden --no-ignore-vcs "ForceDelete|/force-delete" pkg plugins -S -g'*.go'

echo "---- inspect likely delete policy code locations ----"
rg -n --hidden --no-ignore-vcs "delete policy|policy.*delete|DeletePolicy|cascade|restrict" pkg -S -g'*.go'

echo "---- transaction/locking primitives in relevant services/daos ----"
rg -n --hidden --no-ignore-vcs "(BeginTx|Begin\\(|Commit\\(|Rollback\\(|SELECT FOR UPDATE|FOR UPDATE|Lock\\(|Mutex|lock\\.)" pkg -S -g'*.go'

echo "---- show relevant sections in node_pool.go/cluster.go/aggregation.go/resource.go ----"
# Use sed to avoid huge output; extract around the comment blocks found earlier by rg.
sed -n '520,640p' pkg/services/node_pool.go
sed -n '420,560p' pkg/services/cluster.go
sed -n '220,420p' pkg/services/aggregation.go
sed -n '720,820p' pkg/services/aggregation.go
sed -n '120,200p' pkg/services/resource.go

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 50079


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show remainder of deleteResourceTree and checkCanDelete implementation
echo "---- pkg/services/resource.go (Delete + deleteResourceTree continuation) ----"
sed -n '120,260p' pkg/services/resource.go

echo "---- pkg/services/resource.go (checkCanDelete + helpers) ----"
rg -n "func \\(s \\*sqlResourceService\\) checkCanDelete|checkCanDelete\\(" -S pkg/services/resource.go
sed -n '260,430p' pkg/services/resource.go

# Show node_pool.go soft-delete (Delete) and where tryHardDeleteNodePool is called
echo "---- pkg/services/node_pool.go relevant sections ----"
rg -n "func \\(s \\*sqlNodePoolService\\) Delete|MarkDeleted|ForceDelete|tryHardDeleteNodePool|final" pkg/services/node_pool.go -S
sed -n '180,340p' pkg/services/node_pool.go
sed -n '340,620p' pkg/services/node_pool.go

# Show cluster.go soft-delete (Delete) and where tryHardDeleteCluster is called
echo "---- pkg/services/cluster.go relevant sections ----"
rg -n "func \\(s \\*sqlClusterService\\) Delete|MarkDeleted|ForceDelete|tryHardDeleteCluster|tryHardDelete" pkg/services/cluster.go -S
sed -n '90,240p' pkg/services/cluster.go
sed -n '240,560p' pkg/services/cluster.go

# Show registry.ChildrenOf to see if cycle detection exists
echo "---- pkg/registry - ChildrenOf implementation ----"
rg -n "func (ChildrenOf|ChildrenOf\\()" pkg/registry -S
rg -n "ChildrenOf\\(" pkg/registry -S
# Print descriptor registry parts
sed -n '1,220p' pkg/registry/registry.go 2>/dev/null || true
sed -n '1,220p' pkg/registry/descriptor.go
ls -ლა pkg/registry

# Show ForceDelete tests for idempotency / repeated calls
echo "---- ForceDelete tests (look for 'again' or repeated) ----"
rg -n "force-delete.*again|ForceDelete.*again|force delete.*twice|twice" test pkg/services/*_test.go -S || true
rg -n "force-delete" pkg/services/*_test.go test/integration/*_test.go -S | head -n 80

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 44036


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n --hidden --no-ignore-vcs "(Finalizing state|deleted_time.*nil|DeletedTime == nil|deleted_time|DeletedTime != nil|in Finalizing)" pkg/handlers pkg/services -S -g'*.go'

rg -n --hidden --no-ignore-vcs "force-delete" test pkg/handlers pkg/services -S -g'*.go' | head -n 200
rg -n --hidden --no-ignore-vcs "ForceDelete.*(twice|again|already)" test -S -g'*.go'
rg -n --hidden --no-ignore-vcs "not in Finalizing state" pkg -S -g'*.go'

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 5396


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- pkg/services/resource.go lines 1-120 (find parent.DeletedTime != nil) ----"
sed -n '1,140p' pkg/services/resource.go

echo "---- pkg/handlers create/patch blocks for finalizing (search 409 / Finalizing / deleted_time) ----"
rg -n --hidden --no-ignore-vcs "Finalizing|deleted_time|DeletedTime != nil|not in Finalizing state" pkg/handlers -S -g'*.go'

echo "---- find transaction middleware usage in handlers/services (BeginTx/TransactionMiddleware) ----"
rg -n --hidden --no-ignore-vcs "(transaction|TransactionMiddleware|WithTx|BeginTx|db\\.With|gorm\\.Begin|Begin\\()" pkg/handlers pkg/services pkg/db -S -g'*.go'

echo "---- pkg/services/resource.go checkCanDelete caller context ----"
rg -n --hidden --no-ignore-vcs "checkCanDelete\\(" -S pkg/services/resource.go

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 13265


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Where TransactionMiddleware is applied ----"
rg -n --hidden --no-ignore-vcs "TransactionMiddleware\\(" pkg/handlers cmd plugins -S -g'*.go'

echo "---- cluster_nodepools create/patch checks for cluster.DeletedTime ----"
rg -n --hidden --no-ignore-vcs "cluster\\.DeletedTime != nil|DeletedTime != nil.*cluster|marked for deletion" pkg/handlers -S -g'*.go' || true

echo "---- find nodepool create handler and its conflict condition ----"
rg -n --hidden --no-ignore-vcs "NodePool.*Create|CreateNodePool|nodepools.*post" pkg/handlers -S -g'*.go' || true

echo "---- check generic child creation in resource.go already shown; now confirm it maps to 409 in handlers ----"
rg -n --hidden --no-ignore-vcs "marked for deletion" pkg/handlers -S -g'*.go' || true

echo "---- find force-delete idempotency tests (called twice) ----"
rg -n --hidden --no-ignore-vcs "force-delete.*returns|force-deleted.*again|ForceDelete.*again|already.*force-delete" test pkg -S -g'*.go' || true

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 3563


Fix delete-lifecycle review: remove unsupported adapter-quorum race claims; add real gaps (policy recursion cycles, clarify locking/isolation)

  • The Active → Finalizing → Hard-Deleted quorum is concurrency-guarded: adapter status processing serializes per resource via GetForUpdate, and hard-delete only runs when adapters report Finalized=True at the same generation (adapterStatus.ObservedGeneration == <resource>.Generation) in tryHardDeleteNodePool/tryHardDeleteCluster (CWE-362).
  • Child creation under a finalizing parent is already enforced with 409s: ResourceService.Create locks the parent (GetForUpdate) and rejects when parent.DeletedTime != nil, and the cluster→nodepools force-delete/create path similarly rejects when cluster.DeletedTime != nil (CWE-367). The docs don’t need extra TOCTOU language here.
  • The recursive delete policy section still needs a robustness note: registry.Validate does not detect cyclic ParentKind graphs, so recursive deletion (deleteResourceTree / ChildrenOf) can recurse indefinitely if descriptors are mis-registered (CWE-674). Add cycle detection in registry.Validate or guard recursion with a visited set.
  • restrict enforcement (ExistsByOwner in checkCanDelete) is not explicitly documented in terms of transaction/isolation guarantees; the docs should state that the parent delete path runs under a transaction (TransactionMiddleware) and relies on the parent row lock to avoid inconsistent decisions (CWE-367).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/database.md` around lines 62 - 73, Update the delete-lifecycle docs to
remove the unsupported "adapter quorum race" wording and instead state that
adapter finalization is concurrency-guarded: adapter status processing
serializes per-resource via GetForUpdate and hard-delete only proceeds when
adapters report Finalized=True at the matching generation (ObservedGeneration ==
resource.Generation) as checked in tryHardDeleteNodePool and
tryHardDeleteCluster; clarify that child-creation-under-finalizing-parent is
already prevented by ResourceService.Create which acquires the parent row lock
(GetForUpdate) and rejects when parent.DeletedTime != nil (the cluster→nodepools
force-delete/create path likewise rejects deleted parents); add a robustness
note to recursive delete policy that registry.Validate currently does not detect
ParentKind cycles so deleteResourceTree / ChildrenOf must be protected (either
add cycle detection in registry.Validate or guard recursion with a visited set
to avoid unbounded recursion); and explicitly state that restrict enforcement
(ExistsByOwner used in checkCanDelete) runs inside TransactionMiddleware and
relies on the parent row lock for correct isolation.

Comment thread docs/database.md Outdated
Comment thread docs/deployment.md Outdated
Comment thread README.md Outdated
Comment on lines +92 to +93
The `migrate` and `serve` commands require a configuration file. The loader checks `--config` flag, then `HYPERFLEET_CONFIG` env var, then `/etc/hyperfleet/config.yaml`, then `./configs/config.yaml`.
If none are found, the command fails with `failed to load configuration`. Copy the example config or point to your own.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify config loader validates file permissions and paths

rg -n -C3 'HYPERFLEET_CONFIG|--config' --type=go --glob '!vendor/**' | head -50

# Check for path validation or permission checks
ast-grep --pattern $'func $_(path string) $_ {
  $$$
  os.Stat($$$)
  $$$
}'

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 3029


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect config loader path resolution + any permission/ownership checks
echo "=== pkg/config/loader.go: resolveAndReadConfigFile ==="
rg -n "resolveAndReadConfigFile|HYPERFLEET_CONFIG|/etc/hyperfleet/config.yaml|configs/config.yaml|ReadInConfig|os\.Stat|UnmarshalExact" pkg/config/loader.go
echo
sed -n '1,220p' pkg/config/loader.go

# 2) Inspect config validation/unmarshal in loader.go (schema enforcement)
echo
echo "=== pkg/config/loader.go: validation calls ==="
rg -n "UnmarshalExact|validator\.Struct|Validate|identity|JWT|TLS|timeout" pkg/config/loader.go

# 3) Force-delete wiring: handler -> service -> auth/RBAC/owner scoping
echo
echo "=== ForceDelete handlers/services (ownership/RBAC) ==="
rg -n "ForceDelete|force-delete" cmd pkg/handlers pkg/services cmd/hyperfleet-api/server/routes.go
echo
# show key snippets if present
for f in cmd/hyperfleet-api/server/routes.go \
         pkg/handlers/cluster.go \
         pkg/handlers/cluster_nodepools.go \
         pkg/services/cluster.go \
         pkg/services/node_pool.go \
         pkg/services/resource.go
do
  if [ -f "$f" ]; then
    echo
    echo "=== $f (relevant lines) ==="
    rg -n "ForceDelete|force-delete|JWTHandler|PublicPaths|CallerIdentityMiddleware|deleteResourceTree|restrict|cascade" "$f" || true
  fi
done

# 4) deleteResourceTree recursion guards
echo
echo "=== pkg/services/resource.go: deleteResourceTree ==="
rg -n "func .*deleteResourceTree|deleteResourceTree|visited|depth|cycle|restrict|cascade" pkg/services/resource.go || true
if [ -f pkg/services/resource.go ]; then
  sed -n '1,260p' pkg/services/resource.go
fi

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 30155


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Force-delete authN/authZ + middleware wiring
rg -n "PublicPaths|JWTHandler|CallerIdentityMiddleware|force-delete|force delete|ForceDelete" cmd/hyperfleet-api/server/routes.go
echo "----"
sed -n '1,220p' cmd/hyperfleet-api/server/routes.go
echo "----"

rg -n "handleForceDelete|ForceDeleteRequest|jwt|RBAC|authorize|permission|role" pkg/handlers/framework.go pkg/handlers/cluster.go pkg/handlers/cluster_nodepools.go
echo "----"
sed -n '120,230p' pkg/handlers/framework.go
echo "----"
sed -n '140,230p' pkg/handlers/cluster.go
echo "----"
sed -n '120,210p' pkg/handlers/cluster_nodepools.go
echo "----"

# Service-level checks for ForceDelete
rg -n "func \\(s \\*sqlClusterService\\) ForceDelete|func \\(s \\*sqlNodePoolService\\) ForceDelete|DeletedTime|owner|GetBy.*Owner|auth|RBAC|permission" pkg/services/cluster.go pkg/services/node_pool.go
echo "----"
sed -n '450,620p' pkg/services/cluster.go
echo "----"
sed -n '220,360p' pkg/services/node_pool.go
echo "----"

# deleteResourceTree recursion/cycle guards
rg -n "deleteResourceTree\\(|visited|depth|cycle|max depth" pkg/services/resource.go
echo "----"
sed -n '130,260p' pkg/services/resource.go
echo "----"

# Registry descriptor graph validation (cycles, depth limits)
rg -n "cycle|acyclic|topolog|visited|graph|ChildrenOf\\(" pkg/registry
fd . pkg/registry -t f | head -50
for f in $(fd -t f pkg/registry | tr '\n' ' '); do
  # keep output small by only showing likely validation locations
  rg -n "cycle|acyclic|topolog|graph|ChildrenOf\\(" "$f" && echo "---- in $f ----" && break
done

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 28168


README.md: Document the file-path trust boundary for --config / HYPERFLEET_CONFIG

README.md lines 92-93: --config and HYPERFLEET_CONFIG are used as the config file path (only an existence check via os.Stat for explicit sources, then viper.ReadInConfig), with no allowlist/permission/ownership validation—external path control is a CWE-73 risk (path-controlled file read). Document that the path must be operator-trusted and specify expected file permissions/ownership.

Also adjust the wording to reflect that config content is validated in code (viper.UnmarshalExact + validator.Struct + custom TLS/JWT/timeouts/identity checks); the missing gap is path/permission safety, not schema/content enforcement.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 92 - 93, Update README.md to document the trust
boundary for config file paths used by the loader (the --config flag and
HYPERFLEET_CONFIG env var) and explicitly state that these paths must be
operator-trusted; note that the loader only performs an existence check
(os.Stat) for explicit sources and then reads via viper.ReadInConfig, so callers
must ensure the file path, permissions and ownership are secure (suggested
mode/owner and disallow world-writable access). Also change the wording to
clarify that schema/content validation is performed in code
(viper.UnmarshalExact + validator.Struct plus custom TLS/JWT/timeouts/identity
checks), and that the documented gap is path/permission safety rather than
configuration content validation.

Comment thread README.md
- `GET/POST /api/hyperfleet/v1/clusters`
- `GET /api/hyperfleet/v1/clusters/{id}`
- `GET/PATCH/DELETE /api/hyperfleet/v1/clusters/{id}`
- `POST /api/hyperfleet/v1/clusters/{id}/force-delete`

@coderabbitai coderabbitai Bot Jun 11, 2026

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if force-delete handler enforces stricter authz than regular DELETE

rg -n -C5 'force-delete|ForceDelete' --type=go pkg/handlers/ | head -80

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 4993


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Locate routes/endpoints mentioning force-delete and where authz/middleware is wired.
rg -n --hidden --no-ignore-vcs "force-delete" . || true
rg -n --hidden --no-ignore-vcs "ForceDelete" . || true

# 2) Inspect surrounding handler code for ForceDelete and any normal DELETE methods.
echo "---- cluster.go around ForceDelete and any Delete handlers ----"
sed -n '130,240p' pkg/handlers/cluster.go || true

echo "---- cluster_nodepools.go around ForceDelete and any Delete handlers ----"
sed -n '1,260p' pkg/handlers/cluster_nodepools.go || true

# 3) Inspect framework handler for handleForceDelete / authz-related hooks.
sed -n '1,240p' pkg/handlers/framework.go || true

# 4) README snippet around line ~122 (as referenced by review comment)
ls -la || true
sed -n '90,150p' README.md || true

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 30129


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect plugin/router wiring around force-delete handlers.
sed -n '1,140p' plugins/clusters/plugin.go

# 2) Find router/middleware/auth wiring keywords.
rg -n --hidden --no-ignore-vcs "middleware|auth|RBAC|Role|scope|Authorize|authorization|Permission|entitle|claims|user|principal" pkg cmd plugins charts configs serve . || true

# 3) Check how handlers get context: ownership/user extracted somewhere.
rg -n --hidden --no-ignore-vcs "GetByIDAndOwner|owner|tenant|project|account|userID|principal|subject" pkg | head -200 || true

# 4) Inspect cluster/node_pool service ForceDelete implementations for any explicit authz checks.
sed -n '450,660p' pkg/services/cluster.go
sed -n '220,380p' pkg/services/node_pool.go

# 5) Inspect regular DELETE handlers to compare logic surface.
# cluster DELETE handler
sed -n '120,220p' pkg/handlers/cluster.go
# nodepool DELETE handler
sed -n '80,200p' pkg/handlers/cluster_nodepools.go

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 50392


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- pkg/services/cluster.go: ForceDelete + any SoftDelete/Delete around it ----"
rg -n "func \\(s \\*sqlClusterService\\) (ForceDelete|SoftDelete|Delete)" pkg/services/cluster.go
sed -n '430,620p' pkg/services/cluster.go

echo "---- pkg/services/node_pool.go: ForceDelete + any SoftDelete/Delete around it ----"
rg -n "func \\(s \\*sqlNodePoolService\\) (ForceDelete|SoftDelete|Delete)" pkg/services/node_pool.go
sed -n '180,360p' pkg/services/node_pool.go

echo "---- pkg/dao: ForceDelete-related queries in cluster/node_pool ----"
rg -n "ForceDelete" pkg/dao || true
rg -n "SoftDelete|Soft delete|deleted_time" pkg/dao/cluster*.go pkg/dao/*node*pool*.go || true

echo "---- handler delete endpoints (for comparison): pkg/handlers/cluster.go SoftDelete/DELETE semantics ----"
rg -n "func \\(h ClusterHandler\\) (SoftDelete|Delete|ForceDelete)" pkg/handlers/cluster.go
sed -n '120,220p' pkg/handlers/cluster.go

echo "---- handler delete endpoints (for comparison): pkg/handlers/cluster_nodepools.go SoftDelete/ForceDelete semantics ----"
rg -n "func \\(h ClusterNodePoolsHandler\\) (SoftDelete|ForceDelete)" pkg/handlers/cluster_nodepools.go
sed -n '80,200p' pkg/handlers/cluster_nodepools.go

echo "---- cmd/hyperfleet-api/server: middleware/router auth wiring (targeted) ----"
rg -n "Use\\(|middleware|auth\\.|authz|RBAC|authorize|permission|JWT|Identity" cmd/hyperfleet-api/server cmd/hyperfleet-api/environments cmd/hyperfleet-api/*.go || true

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 22615


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Cluster service: SoftDelete implementation details
echo "---- pkg/services/cluster.go: SoftDelete (and related delete methods) ----"
sed -n '120,260p' pkg/services/cluster.go
sed -n '260,420p' pkg/services/cluster.go
sed -n '420,560p' pkg/services/cluster.go

# Node pool service: see SoftDelete/ Delete / ForceDelete call patterns re owner checks
echo "---- pkg/services/node_pool.go: SoftDelete and Delete ----"
sed -n '150,280p' pkg/services/node_pool.go
sed -n '280,360p' pkg/services/node_pool.go

# DAO cluster: see whether GetForUpdate or SoftDelete uses caller/owner scoping
echo "---- pkg/dao/cluster*.go: search for GetForUpdate / SoftDelete ownership ----"
rg -n "type .*ClusterDao|GetForUpdate|SoftDelete|DeletedTime|ByOwner|Owner" pkg/dao --glob '*cluster*' || true
rg -n "GetForUpdate\\(|SoftDelete\\(|Delete\\(" pkg/dao --glob '*cluster*' || true

# Show cluster dao implementation for GetForUpdate and any owner-filtered queries
# (we don't know filename; use fd-like search with rg results)
echo "---- showing matching cluster dao files with GetForUpdate ----"
rg -n "GetForUpdate\\(" pkg/dao --glob '*cluster*' --files-with-matches || true

# Print relevant sections from all files that define GetForUpdate in cluster dao (limited)
for f in $(rg -n "func \\(.*\\) GetForUpdate\\(" pkg/dao --glob '*cluster*' --files-with-matches); do
  echo "==== $f ===="
  sed -n '1,220p' "$f"
  sed -n '220,420p' "$f"
done || true

# Node pool DAO: GetByIDAndOwner scoping
echo "---- pkg/dao/node*pool*.go: GetByIDAndOwner ----"
rg -n "GetByIDAndOwner\\(" pkg/dao --glob '*node*pool*' || true
for f in $(rg -n "GetByIDAndOwner\\(" pkg/dao --glob '*node*pool*' --files-with-matches || true); do
  echo "==== $f ===="
  sed -n '1,260p' "$f"
done

# Handlers/tests: look for authz expectations for force-delete
echo "---- pkg/handlers/cluster_test.go: search for Unauthorized/Forbidden around ForceDelete ----"
rg -n "ForceDelete|force-delete|401|403|Unauthorized|Forbidden" pkg/handlers/cluster_test.go pkg/handlers/cluster_nodepools_test.go | head -200 || true

echo "---- pkg/handlers/cluster_nodepools_test.go: search for authz around ForceDelete ----"
rg -n "Unauthorized|Forbidden|401|403" pkg/handlers/cluster_nodepools_test.go pkg/handlers/cluster_test.go | head -200 || true

echo "---- integration tests: search for force-delete authz expectations ----"
rg -n "force-delete.*(401|403|Unauthorized|Forbidden)" test/integration -S --glob '*force*' || true
rg -n "Unauthorized|Forbidden|401|403" test/integration/clusters_test.go test/integration/node_pools_test.go | head -200 || true

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 38188


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- README.md: search for force-delete auth/RBAC/runbook text ----"
rg -n "force-delete|ForceDelete|RBAC|role|permission|authorize|authorization|runbook|safety|admin" README.md docs/api-resources.md docs/runbook.md docs/database.md || true
echo "---- README.md around line ~100-150 (includes force-delete endpoint list) ----"
sed -n '90,160p' README.md || true

echo "---- docs/api-resources.md: force-delete cluster section ----"
rg -n "POST /api/hyperfleet/v1/clusters/\\{.*\\}/force-delete|force-delete cluster|Force Delete Cluster|Force Delete" docs/api-resources.md || true
# show likely relevant sections by line ranges
sed -n '340,420p' docs/api-resources.md || true

echo "---- docs/api-resources.md: force-delete nodepool section ----"
sed -n '520,600p' docs/api-resources.md || true

echo "---- cmd/hyperfleet-api/server/routes.go: full to see if authz is wired ----"
sed -n '1,200p' cmd/hyperfleet-api/server/routes.go || true
sed -n '200,420p' cmd/hyperfleet-api/server/routes.go || true

echo "---- search cmd/hyperfleet-api/server for authorize/authz/RBAC names ----"
rg -n --no-ignore-vcs "authorize|authorization|authz|RBAC|role|permission|Scope" cmd/hyperfleet-api/server pkg/middleware pkg/auth || true

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 16418


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Find Cluster model struct in pkg/api ----"
rg -n "type Cluster struct" pkg/api || true
# show likely file
f=$(rg -n "type Cluster struct" pkg/api --files-with-matches | head -n 1 || true)
echo "cluster struct file: $f"
if [ -n "${f:-}" ]; then
  sed -n '1,220p' "$f"
fi

echo "---- Search for owner_id / OwnerID in Cluster struct or fields ----"
rg -n "OwnerID|owner_id|owner" "$f" || true

echo "---- Confirm cluster handler SoftDelete (regular DELETE) calls service without ownership scoping ----"
sed -n '130,220p' pkg/handlers/cluster.go

echo "---- docs/api-resources.md force-delete cluster section for any RBAC/access control text ----"
# narrow by lines around "Force Delete Cluster"
rg -n "### Force Delete Cluster|Force Delete Cluster|force-delete cluster" docs/api-resources.md
# print a chunk around those lines
sed -n '350,410p' docs/api-resources.md
echo "---- docs/api-resources.md force-delete nodepool section for RBAC/access control text ----"
sed -n '540,590p' docs/api-resources.md

Repository: openshift-hyperfleet/hyperfleet-api

Length of output: 7568


Enforce authorization for force-delete endpoints (CWE-285).

  • Cluster POST /api/hyperfleet/v1/clusters/{id}/force-delete has no caller authorization scoping: pkg/handlers/cluster.go forwards directly to pkg/services/cluster.go where clusterDao.GetForUpdate selects by id = ? and then hard-deletes cluster + adapter statuses + child nodepools.
  • NodePool POST /api/hyperfleet/v1/clusters/{cluster_id}/nodepools/{nodepool_id}/force-delete only scopes by path relationship (GetByIDAndOwner filters id = ? AND owner_id = ? with owner_id = cluster_id), not by caller identity/permissions.
  • Docs describe the operational safety conditions for force-delete, but do not document any RBAC/permission requirements for invoking it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` at line 122, The force-delete endpoints lack caller authorization
checks; update the handlers and service layer to enforce RBAC by validating the
calling principal before performing destructive operations: in
pkg/handlers/cluster.go add an authorization check that extracts caller
identity/roles from the request and denies access if not permitted, then pass an
authorized context/principal into pkg/services/cluster.go; in
pkg/services/cluster.go verify the principal's permission prior to calling
clusterDao.GetForUpdate and before deleting adapter statuses/child nodepools;
also tighten NodePool logic where GetByIDAndOwner is used to ensure it validates
caller identity/permissions (not just path ownership) and finally update
README.md to document the required RBAC/permission prerequisites for invoking
the force-delete endpoints.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're just listing the endpoints in the README.md. I think we can talk about authorization verification in a later section, with a reminder in api-resources for example, pointing to the authentication.md ?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherine-k — Agreed, the endpoint listing section isn't the place for auth docs. A note in docs/api-resources.md under the Force Delete sections (cluster + nodepool) pointing to docs/authentication.md for access-control requirements is the right structure.

That said, the documentation concern is separate from the underlying code issue (CWE-285): pkg/handlers/cluster.go and pkg/handlers/cluster_nodepools.go currently forward ForceDelete calls straight to the service/DAO layer with no caller-identity authorization check — only path-parameter ownership (id / owner_id) is validated. That's a live security gap independent of what the docs say.

Would you like me to open a follow-up GitHub issue to track the missing authorization enforcement in the force-delete handlers?

@sherine-k

Copy link
Copy Markdown
Contributor Author

/test lint

Comment thread pkg/services/types.go
listArgs := &ListArguments{
Page: 1,
Size: 100,
Size: 20,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

Blocking

Category: Standards

The PR title uses docs: but this file (and resource.go, types_test.go) contains a functional code change — reducing the default pageSize from 100 to 20. Per the HyperFleet commit standard, all changes must be consistent with the declared type.

This is also a breaking behavioral change for API consumers who rely on the default page size. Suggest:

  1. Split into a separate commit with fix: type (e.g., HYPERFLEET-1186 - fix: reduce default pageSize from 100 to 20)
  2. Call out the behavior change explicitly in the PR description so reviewers and consumers are aware

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rafabene
Thanks for the review!:+1:
This is already in a separate commit : d2d1e75

I will update the PR title too. This is to align what the api spec sets, which takes precedence over the defaults defined, which sounded a bit high (100) to @rh-amarin and I.

Comment thread docs/database.md Outdated

### node_pools
Child resources owned by clusters, representing groups of compute nodes. Uses foreign key relationship with cascade delete.
Child resources owned by clusters, representing groups of compute nodes. References clusters via `owner_id` with a `RESTRICT` foreign key. Same column layout as clusters (including `labels`, `status_conditions`, `deleted_time`, `deleted_by`).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

nit — non-blocking suggestion

Category: Inconsistency

"Same column layout as clusters" is slightly misleading — node_pools has owner_id (FK to clusters) which clusters doesn't. Consider rewording to clarify which columns are shared vs. unique:

Suggested change
Child resources owned by clusters, representing groups of compute nodes. References clusters via `owner_id` with a `RESTRICT` foreign key. Same column layout as clusters (including `labels`, `status_conditions`, `deleted_time`, `deleted_by`).
Child resources owned by clusters, representing groups of compute nodes. References clusters via `owner_id` with a `RESTRICT` foreign key. Shares the same core columns as clusters (`labels`, `status_conditions`, `deleted_time`, `deleted_by`) plus `owner_id` for the parent relationship.

@sherine-k sherine-k changed the title HYPERFLEET-1186 - docs: align documentation with codebase and fix pageSize default HYPERFLEET-1186 - fix: align documentation with codebase and fix pageSize default Jun 11, 2026
Comment thread pkg/services/types.go
listArgs := &ListArguments{
Page: 1,
Size: 100,
Size: 20,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tip

nit — non-blocking suggestion

Category: Pattern

The literal 20 is repeated in three places (types.go:28, resource.go:275, resource.go:303). Since there's already a MaxListSize constant at line 22, consider adding a companion:

Suggested change
Size: 20,
Size: DefaultPageSize,

with const DefaultPageSize = 20 defined alongside MaxListSize on line 22. This keeps future page-size changes atomic — the PR itself had to update 3 sites to change this value.

Comment thread docs/api-resources.md Outdated

```json
{
"type": "about:blank",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

Blocking

Category: Standards

All three error examples use "type": "about:blank", but the code (pkg/errors/errors.go) defines specific type URIs:

ErrorTypeBase       = "https://api.hyperfleet.io/errors/"
ErrorTypeValidation = ErrorTypeBase + "validation-error"
ErrorTypeNotFound   = ErrorTypeBase + "not-found"
ErrorTypeConflict   = ErrorTypeBase + "conflict"

The Error Model Standard also mandates these URIs. Please update:

  • Line 780: "about:blank""https://api.hyperfleet.io/errors/validation-error"
  • Line 808: "about:blank""https://api.hyperfleet.io/errors/not-found"
  • Line 821: "about:blank""https://api.hyperfleet.io/errors/conflict"

Comment thread docs/api-resources.md Outdated
| `type` | string | Yes | URI reference identifying the problem type |
| `title` | string | Yes | Short human-readable summary |
| `status` | integer | Yes | HTTP status code |
| `detail` | string | No | Human-readable explanation specific to this occurrence |

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

Blocking

Category: Standards

detail is marked "No" in the "Always present" column, but:

  1. The code (pkg/errors/errors.go AsProblemDetails()) always populates Detail: &e.Reason
  2. The Error Model Standard marks detail as Required
  3. All three examples in this very section include detail
Suggested change
| `detail` | string | No | Human-readable explanation specific to this occurrence |
| `detail` | string | Yes | Human-readable explanation specific to this occurrence |

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants