feat(aws): add cleanup tool to AI#2157
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds an orphan-resource cleanup capability for AWS BYOC deployments: a new ChangesOrphan Cleanup Tool and AWS Implementation
Sequence Diagram(s)sequenceDiagram
participant User
participant CleanupTool as HandleCleanupTool
participant ByocAws
participant AWS as AWS APIs
participant EC as ElicitationController
User->>CleanupTool: invoke cleanup_resources
CleanupTool->>ByocAws: DiscoverOrphans(projectName)
ByocAws->>AWS: scan ALB/RDS/ECR/Route53 by prefix
AWS-->>ByocAws: matching resources
ByocAws-->>CleanupTool: []OrphanResource
alt no orphans found
CleanupTool-->>User: "no blocking leftovers"
else orphans found, non-interactive
CleanupTool-->>User: report + rerun instructions
else orphans found, interactive
loop each orphan
CleanupTool->>EC: RequestEnum(yes/no)
EC-->>User: prompt
User-->>EC: answer
EC-->>CleanupTool: choice
alt yes
CleanupTool->>ByocAws: CleanupOrphan(resource)
ByocAws->>AWS: unblock resource (disable protection / delete images / delete record)
AWS-->>ByocAws: result
end
end
CleanupTool-->>User: summary + "run defang down" guidance
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=warning msg="The linter 'gomodguard' is deprecated (since v2.12.0) due to: new major version. Replaced by gomodguard_v2." Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (2)
src/pkg/agent/tools/cleanup_test.go (1)
76-121: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd a
CleanupOrphanfailure case to the table.This suite never exercises
cleanupErr, so the new failed-cleanup reporting path inHandleCleanupToolcan regress without a test catching it. Add a case that returnscleanupErrand assert both the failed item output and the finalfailedcounter.Example table entry
{ name: "discovery error is surfaced", provider: &mockCleanerProvider{discoverErr: errors.New("boom")}, expectErr: "failed to discover leftover resources: boom", }, + { + name: "cleanup error is reported", + provider: &mockCleanerProvider{orphans: twoOrphans[:1], cleanupErr: errors.New("cannot clean")}, + confirm: "yes", + expectContains: []string{"failed: cannot clean", "0 cleaned, 0 skipped, 1 failed"}, + expectCleaned: nil, + }, }As per coding guidelines, "Use table-driven tests for multiple scenarios" and "Add tests for new behavior and important failure modes."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pkg/agent/tools/cleanup_test.go` around lines 76 - 121, Add a table-driven failure case in cleanup_test.go for the CleanupOrphan path that returns cleanupErr from the mock cleaner provider. Update the test table around HandleCleanupTool to include a scenario with orphans plus a failing cleanup, and assert both the failed item is reported in the output and the final summary shows the expected failed count. Use the existing mockCleanerProvider, cleanupErr, and expectContains/expectCleaned patterns so the new case covers the regression path without changing other scenarios.Source: Coding guidelines
src/pkg/clouds/aws/elbv2.go (1)
25-27: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winWrap AWS errors with operation context.
These helpers currently return bare SDK errors, so downstream cleanup output loses whether listing load balancers or toggling deletion protection failed. Please wrap the original error with the operation and target identifier while preserving the provider detail. As per coding guidelines, "Preserve cloud-provider error detail while wrapping errors with operation context in cloud SDK wrappers."
Also applies to: 49-55
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pkg/clouds/aws/elbv2.go` around lines 25 - 27, The AWS ELBv2 helper calls are returning bare SDK errors, so add operation context while preserving the original provider error detail in the error path. Update the relevant wrapper functions in elbv2.go, including the DescribeLoadBalancers call and the deletion-protection-related helper mentioned in the comment, to wrap returned errors with the operation name and target identifier using their existing function names and parameters. Keep the underlying AWS error attached so downstream cleanup can still surface the provider-specific failure.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cmd/cli/command/compose.go`:
- Around line 485-500: Skip initializing the debugger in the non-interactive
failure path. In the compose command’s error handling block, guard the call to
debug.NewDebugger so it only runs when interactive prompting is possible, since
handleTailAndMonitorErr already exits early for global.NonInteractive. Keep the
existing deploymentErr return flow, but avoid the agent.New/paid-tier lookup
work and the “Failed to initialize debugger” warning in CI/non-interactive runs.
In `@src/pkg/agent/tools/cleanup.go`:
- Around line 32-35: The cleanup flow is ignoring the error from
loader.ProjectWorkingDir and falling back to an empty working directory, which
can send stacks.NewManager down the wrong project context. Update the code in
cleanup.go so the ProjectWorkingDir call is checked and any failure is returned
immediately as a wrapped error before creating the stack manager. Keep the fix
localized around ProjectWorkingDir and stacks.NewManager so failures are
deterministic and not masked by the empty string fallback.
In `@src/pkg/cli/client/byoc/aws/cleanup.go`:
- Around line 78-80: `DiscoverOrphans` is swallowing category lookup failures
and can return an empty orphan list with no error, which makes
`HandleCleanupTool` report a false “No leftover resources found”. Update
`DiscoverOrphans` to track errors from the AWS lookup helpers (for example the
`aws.FindLoadBalancersByPrefix` / other category lookups at the referenced
blocks) and, if nothing was discovered successfully, return a wrapped aggregated
error instead of nil; only return nil when at least one discovery path
succeeded. Keep the existing `term.Warnf` logging, but make sure the final
return reflects total discovery failure so `HandleCleanupTool` can surface it.
In `@src/pkg/cli/client/byoc/aws/domain_test.go`:
- Around line 40-43: The r53Mock.ChangeResourceRecordSets stub currently returns
nil, nil, which can hide unexpected DNS mutation calls during tests. Update this
mock to fail fast in the test path by panicking or otherwise recording and
asserting the request when ChangeResourceRecordSets is invoked, so accidental
cleanup mutations are surfaced immediately. Use the ChangeResourceRecordSets
method on r53Mock as the place to make this behavior explicit.
In `@src/pkg/clouds/aws/ecr.go`:
- Around line 61-65: The BatchDeleteImage call in the ECR helper currently
treats any HTTP 200 as success, which can hide per-image failures. Update the
deletion flow in the BatchDeleteImage-related helper to inspect the returned
BatchDeleteImageOutput.Failures after the svc.BatchDeleteImage call, and return
an error when that slice is non-empty. Keep the check alongside the existing
error handling so the helper only reports success when all requested images were
actually deleted.
In `@src/pkg/debug/debug.go`:
- Around line 90-104: The prompt-default lookup in isPaidAccount is performing a
live WhoAmI network call and can block NewDebugger on the full command context;
make that remote side effect explicit and bounded. Rename or wrap the helper so
the call site in NewDebugger clearly signals network I/O, and use a short-lived
context with timeout/deadline for the Fabric lookup instead of the long command
context. Keep the fallback behavior to false in the error path and preserve the
existing term.Debug logging around the whoami failure.
---
Nitpick comments:
In `@src/pkg/agent/tools/cleanup_test.go`:
- Around line 76-121: Add a table-driven failure case in cleanup_test.go for the
CleanupOrphan path that returns cleanupErr from the mock cleaner provider.
Update the test table around HandleCleanupTool to include a scenario with
orphans plus a failing cleanup, and assert both the failed item is reported in
the output and the final summary shows the expected failed count. Use the
existing mockCleanerProvider, cleanupErr, and expectContains/expectCleaned
patterns so the new case covers the regression path without changing other
scenarios.
In `@src/pkg/clouds/aws/elbv2.go`:
- Around line 25-27: The AWS ELBv2 helper calls are returning bare SDK errors,
so add operation context while preserving the original provider error detail in
the error path. Update the relevant wrapper functions in elbv2.go, including the
DescribeLoadBalancers call and the deletion-protection-related helper mentioned
in the comment, to wrap returned errors with the operation name and target
identifier using their existing function names and parameters. Keep the
underlying AWS error attached so downstream cleanup can still surface the
provider-specific failure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 133488ad-b0e1-4d3e-b56c-950cf8fbc213
⛔ Files ignored due to path filters (1)
src/go.sumis excluded by!**/*.sum
📒 Files selected for processing (21)
src/cmd/cli/command/compose.gosrc/go.modsrc/pkg/agent/tools/cleanup.gosrc/pkg/agent/tools/cleanup_test.gosrc/pkg/agent/tools/tools.gosrc/pkg/cli/client/byoc/aws/byoc.gosrc/pkg/cli/client/byoc/aws/cleanup.gosrc/pkg/cli/client/byoc/aws/domain_test.gosrc/pkg/cli/client/cleanup.gosrc/pkg/cli/stacks.gosrc/pkg/cli/stacks_test.gosrc/pkg/clouds/aws/ecr.gosrc/pkg/clouds/aws/ecr_test.gosrc/pkg/clouds/aws/elbv2.gosrc/pkg/clouds/aws/elbv2_test.gosrc/pkg/clouds/aws/rds.gosrc/pkg/clouds/aws/rds_test.gosrc/pkg/clouds/aws/route53.gosrc/pkg/clouds/aws/route53_cleanup_test.gosrc/pkg/debug/debug.gosrc/pkg/utils.go
| func (r r53Mock) ChangeResourceRecordSets(ctx context.Context, params *route53.ChangeResourceRecordSetsInput, optFns ...func(*route53.Options)) (*route53.ChangeResourceRecordSetsOutput, error) { | ||
| // TODO: implement if needed | ||
| return nil, nil | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Fail fast on unexpected DNS mutation calls in this mock.
Returning nil, nil here makes any accidental ChangeResourceRecordSets call look successful, so tests can miss a broken cleanup path. Panic or record/assert the request instead. Based on learnings, in Go test files under this repo it's acceptable for mocks to panic to surface issues quickly during tests.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/pkg/cli/client/byoc/aws/domain_test.go` around lines 40 - 43, The
r53Mock.ChangeResourceRecordSets stub currently returns nil, nil, which can hide
unexpected DNS mutation calls during tests. Update this mock to fail fast in the
test path by panicking or otherwise recording and asserting the request when
ChangeResourceRecordSets is invoked, so accidental cleanup mutations are
surfaced immediately. Use the ChangeResourceRecordSets method on r53Mock as the
place to make this behavior explicit.
Source: Learnings
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/pkg/debug/debug_test.go (1)
27-36: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick winMock now silently no-ops instead of failing fast, and the new feedback flow has no test coverage.
AskOnepreviously presumably failed loudly on an unexpected response type; theok-guarded check now silently skips setting*stringresponses (used by the newpromptForFeedback), so a wiring bug there would go unnoticed. Based on learnings, mocks in this repo should panic on type mismatches rather than add defensive handling, since panics surface issues fast.Additionally, no test exercises
promptForFeedback's new string-based flow or the"Feedback Prompt Answered"tracking property added indebug.go.♻️ Suggested mock + test addition
type mockSurveyor struct { - response bool + response bool + feedbackResponse string } func (s *mockSurveyor) AskOne(q survey.Prompt, response interface{}, opts ...survey.AskOpt) error { - if boolptr, ok := response.(*bool); ok { - *boolptr = s.response + switch v := response.(type) { + case *bool: + *v = s.response + case *string: + *v = s.feedbackResponse + default: + panic(fmt.Sprintf("unexpected response type %T", response)) } return nil }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pkg/debug/debug_test.go` around lines 27 - 36, Update mockSurveyor.AskOne in debug_test.go to fail fast on unexpected response types instead of silently ignoring them, so the mock panics or otherwise clearly errors when the passed response is not the expected type. Then add test coverage for promptForFeedback in debug.go using the mockSurveyor to exercise the new string-based prompt flow and verify the "Feedback Prompt Answered" tracking property is recorded.Sources: Path instructions, Learnings
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pkg/debug/debug.go`:
- Around line 132-139: The feedback tracking in promptAndTrackDebugSession is
now sending raw free-text from promptForFeedback to analytics, which should be
avoided. Update the event emitted via track.Evt for “Feedback Prompt Answered”
to stop forwarding the full feedback string, and instead either track only that
feedback was provided or pass a truncated/redacted value before calling
track.Track. Keep the failure path and the promptForFeedback flow intact, and
make the change wherever this feedback-answer tracking is duplicated.
---
Outside diff comments:
In `@src/pkg/debug/debug_test.go`:
- Around line 27-36: Update mockSurveyor.AskOne in debug_test.go to fail fast on
unexpected response types instead of silently ignoring them, so the mock panics
or otherwise clearly errors when the passed response is not the expected type.
Then add test coverage for promptForFeedback in debug.go using the mockSurveyor
to exercise the new string-based prompt flow and verify the "Feedback Prompt
Answered" tracking property is recorded.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a0bde84f-f4da-488c-97c1-24e68df7467a
📒 Files selected for processing (4)
src/cmd/cli/command/compose.gosrc/pkg/cli/client/byoc/aws/cleanup.gosrc/pkg/debug/debug.gosrc/pkg/debug/debug_test.go
💤 Files with no reviewable changes (1)
- src/pkg/cli/client/byoc/aws/cleanup.go
🚧 Files skipped from review as they are similar to previous changes (1)
- src/cmd/cli/command/compose.go
| feedback, err := d.promptForFeedback() | ||
| if err != nil { | ||
| track.Evt(eventName+" Feedback Prompt Failed", append([]track.Property{P("reason", err)}, eventProperty...)...) | ||
| return err | ||
| } | ||
| track.Evt(eventName+" Feedback Prompt Answered", append([]track.Property{P("feedback", good)}, eventProperty...)...) | ||
| track.Evt(eventName+" Feedback Prompt Answered", append([]track.Property{P("feedback", feedback)}, eventProperty...)...) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Raw free-text feedback is sent to analytics.
promptForFeedback now collects arbitrary free-form text via survey.Input, and promptAndTrackDebugSession forwards it verbatim as a tracked property: track.Evt(eventName+" Feedback Prompt Answered", append([]track.Property{P("feedback", feedback)}, eventProperty...)...). Previously this was a bounded boolean confirm; now any text the user types (which could include PII or sensitive deployment details) is sent to the analytics backend via track.Track.
Consider truncating/redacting the feedback before tracking, or only tracking that feedback was given (not its content), unless capturing raw text is an explicit product requirement.
Also applies to: 157-167
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/pkg/debug/debug.go` around lines 132 - 139, The feedback tracking in
promptAndTrackDebugSession is now sending raw free-text from promptForFeedback
to analytics, which should be avoided. Update the event emitted via track.Evt
for “Feedback Prompt Answered” to stop forwarding the full feedback string, and
instead either track only that feedback was provided or pass a
truncated/redacted value before calling track.Track. Keep the failure path and
the promptForFeedback flow intact, and make the change wherever this
feedback-answer tracking is duplicated.
Description
Add cleanup tool to AI, for cleaning up protected resources.
Linked Issues
Fixes #1041
Checklist
Summary by CodeRabbit
New Features
Bug Fixes