Skip to content

feat: implement lifecycle supervisor and mock backend driver#10

Merged
yanurag-dev merged 2 commits into
mainfrom
feat/phase2-orchestration
May 10, 2026
Merged

feat: implement lifecycle supervisor and mock backend driver#10
yanurag-dev merged 2 commits into
mainfrom
feat/phase2-orchestration

Conversation

@yanurag-dev
Copy link
Copy Markdown
Owner

@yanurag-dev yanurag-dev commented May 10, 2026

This PR implements the core orchestration layer for Sandforge.

Changes:

  • Supervisor: Implemented a thread-safe state machine for sandbox lifecycles (Start, RunCommand, Stop).
  • Mock Backend: Added an in-memory driver for testing the supervisor without VM dependencies.
  • Unit Tests: Added comprehensive tests for lifecycle transitions and policy enforcement.

Ref: #9

Summary by CodeRabbit

  • Tests

    • Added comprehensive tests for sandbox lifecycle management, including provisioning, execution, and cleanup operations.
  • Chores

    • Added sandbox management infrastructure with lifecycle state tracking and policy evaluation for controlled sandbox operations.

Review Change Stack

- Add Supervisor state machine and lifecycle management (Start, RunCommand, Stop).
- Implement MockBackend for testing the orchestration layer without a real VM.
- Add comprehensive unit tests covering state transitions, concurrency safety (sync.RWMutex), and policy enforcement.

Resolves: #9
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 10, 2026

Warning

Rate limit exceeded

@yanurag-dev has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 48 minutes and 41 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9e26b63e-a977-43ff-b6e3-f49f750caeca

📥 Commits

Reviewing files that changed from the base of the PR and between 5eed12e and 722b6f1.

📒 Files selected for processing (3)
  • internal/backend/mock.go
  • internal/supervisor/supervisor.go
  • internal/supervisor/supervisor_test.go
📝 Walkthrough

Walkthrough

This PR introduces a mock sandbox backend and a supervisor orchestrator that manages sandbox lifecycle with state transitions. The MockBackend provides in-memory storage of sandboxes with thread-safe operations, while the Supervisor coordinates policy enforcement and state management across startup, command execution, and shutdown phases.

Changes

Sandbox Backend and Supervisor

Layer / File(s) Summary
Lifecycle State and Data Contracts
internal/supervisor/supervisor.go
Defines State as string-valued constants (StateRequested, StateProvisioning, StateReady, StateExecuting, StateCopyingArtifacts, StateDestroying, StateDestroyed, StateError). Adds SandboxInstance struct to track id, spec, state, backend handle, and error. Defines Supervisor with mutex-protected instance map, backend, and policy engine; adds NewSupervisor initializer.
Mock Backend Implementation
internal/backend/mock.go
Introduces MockBackend with mutex-protected in-memory sandboxes map. NewMockBackend initializes the map. CreateSandbox generates mock-%d handles based on map size. DestroySandbox removes entries. MountWorkspace and CopyOut are success-only no-ops. Exec returns zero ExitCode and mock output in Stdout.
Supervisor Lifecycle Operations
internal/supervisor/supervisor.go
Start evaluates sandbox policy, rejects duplicates, transitions instance through requestedprovisioningready, and stores backend handle. RunCommand requires instance in ready, evaluates exec policy, transitions to executing, calls backend, and reverts to ready on success. Stop transitions to destroying, calls backend destroy, marks destroyed, and removes instance from map.
Supervisor Lifecycle Tests
internal/supervisor/supervisor_test.go
TestSupervisorLifecycle verifies sandbox startup reaches StateReady, allowed command execution succeeds with zero exit code, forbidden command triggers policy error, and Stop removes the instance.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Poem

🐰 A sandbox takes form, step by careful step,
Mock backends store, supervisors keep,
State transitions dance through ready and run,
Policy checks guard what must be done,
Tests affirm the journey, from start until done! 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title clearly summarizes the main changes: implementing a lifecycle supervisor and mock backend driver, which are the core components added across the three files in this changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/phase2-orchestration

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@yanurag-dev yanurag-dev changed the title Phase 2: Orchestration & Mocking feat: implement lifecycle supervisor and mock backend driver May 10, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
internal/backend/mock.go (1)

33-38: ⚡ Quick win

Mock backend should fail on unknown handles.

Exec and DestroySandbox currently succeed for nonexistent handles. That can let supervisor lifecycle bugs pass tests undetected.

Suggested fix
 func (m *MockBackend) Exec(handle string, req api.ExecRequest) (api.ExecResult, error) {
+	m.mu.RLock()
+	_, exists := m.sandboxes[handle]
+	m.mu.RUnlock()
+	if !exists {
+		return api.ExecResult{}, fmt.Errorf("sandbox handle not found: %s", handle)
+	}
+
 	return api.ExecResult{
 		ExitCode: 0,
 		Stdout:   fmt.Sprintf("mock output for %v", req.Command),
 	}, nil
 }
@@
 func (m *MockBackend) DestroySandbox(handle string) error {
 	m.mu.Lock()
 	defer m.mu.Unlock()
+	if _, exists := m.sandboxes[handle]; !exists {
+		return fmt.Errorf("sandbox handle not found: %s", handle)
+	}
 	delete(m.sandboxes, handle)
 	return nil
 }

Also applies to: 44-48

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/backend/mock.go` around lines 33 - 38, The MockBackend methods Exec
and DestroySandbox currently always succeed; update both implementations on
MockBackend to first check whether the provided handle exists in the mock's
internal registry/state (e.g., the map or slice that tracks created sandboxes)
and, if not present, return a non-nil error (for example fmt.Errorf("unknown
handle: %s", handle)) instead of a successful api.ExecResult or nil error; keep
the existing successful return behavior only when the handle is found. Ensure
you modify the Exec and DestroySandbox methods on MockBackend accordingly.
internal/supervisor/supervisor_test.go (1)

11-86: ⚡ Quick win

Add a contention test for concurrent lifecycle calls.

Please add a test that runs RunCommand concurrently (and/or RunCommand racing with Stop) to lock in expected state-machine behavior and catch regressions in synchronization.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/supervisor/supervisor_test.go` around lines 11 - 86, Add a new
subtest inside TestSupervisorLifecycle that stresses concurrent lifecycle
operations: spawn multiple goroutines that call sup.RunCommand(id, req)
concurrently (and another set that calls sup.Stop(id) racing with RunCommand)
using a sync.WaitGroup and channels to coordinate start, then assert that no
data race occurs, RunCommand returns either a valid ExecResult or a well-typed
error, and that supervisor.instances map and instance.State remain consistent
(use sup.instances, StateReady, sup.RunCommand, sup.Stop to locate code). Ensure
the test waits for all goroutines to finish, checks for expected outcomes
(allowed exit codes or expected error when stopped), and runs with -race in CI
to catch synchronization regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/backend/mock.go`:
- Around line 24-25: The handle generation in the mock backend uses
len(m.sandboxes) which can produce duplicate handles after deletions; add an
integer counter field (e.g., nextID) to the mock backend struct and use it to
build handles (fmt.Sprintf("mock-%d", m.nextID)) instead of len(m.sandboxes),
incrementing nextID after each creation; also ensure the counter is initialized
where the mock backend is constructed (e.g., NewMockBackend or the mock struct
initializer) so handles are always unique regardless of map deletions and no
existing sandbox entries are overwritten.

In `@internal/supervisor/supervisor.go`:
- Around line 74-85: The SandboxInstance field writes/reads (State, Handle,
Error) are done without synchronization after s.mu is released, causing races in
Start, RunCommand and inconsistent behavior with Stop; fix by adding
synchronization to SandboxInstance (e.g., add an embedded sync.Mutex or
sync.RWMutex on SandboxInstance and provide methods like SetState, GetState,
SetError, SetHandle that lock/unlock appropriately) and update Supervisor
methods (Start, RunCommand, Stop) to use those instance-level methods so the
ready-check-to-transition in RunCommand is performed atomically under the
instance lock (or alternatively hold s.mu for the entire create/update sequence)
and remove unsynchronized direct field assignments/read accesses.
- Around line 43-48: NewSupervisor currently accepts backend and engine without
validation which can cause panics later; change its signature to return
(*Supervisor, error), validate that backend != nil and engine != nil inside
NewSupervisor, and return a descriptive error (e.g. fmt.Errorf("NewSupervisor:
backend is nil") / fmt.Errorf("NewSupervisor: policy engine is nil")) if either
is nil; on success return the constructed &Supervisor{instances: make(...),
backend: backend, policy: engine}, nil and update callers to handle the error.
- Around line 132-148: In Stop, avoid holding s.mu across the
backend.DestroySandbox call: acquire s.mu, fetch instance from s.instances[id],
set instance.State = StateDestroying, then unlock s.mu before calling
s.backend.DestroySandbox(instance.Handle); if DestroySandbox returns an error,
re-lock s.mu to set instance.State = StateError and instance.Error = err (and
return), otherwise re-lock s.mu to remove the instance from the instances map
and perform any final cleanup; ensure all reads/writes to s.instances and
instance fields occur while s.mu is held.

---

Nitpick comments:
In `@internal/backend/mock.go`:
- Around line 33-38: The MockBackend methods Exec and DestroySandbox currently
always succeed; update both implementations on MockBackend to first check
whether the provided handle exists in the mock's internal registry/state (e.g.,
the map or slice that tracks created sandboxes) and, if not present, return a
non-nil error (for example fmt.Errorf("unknown handle: %s", handle)) instead of
a successful api.ExecResult or nil error; keep the existing successful return
behavior only when the handle is found. Ensure you modify the Exec and
DestroySandbox methods on MockBackend accordingly.

In `@internal/supervisor/supervisor_test.go`:
- Around line 11-86: Add a new subtest inside TestSupervisorLifecycle that
stresses concurrent lifecycle operations: spawn multiple goroutines that call
sup.RunCommand(id, req) concurrently (and another set that calls sup.Stop(id)
racing with RunCommand) using a sync.WaitGroup and channels to coordinate start,
then assert that no data race occurs, RunCommand returns either a valid
ExecResult or a well-typed error, and that supervisor.instances map and
instance.State remain consistent (use sup.instances, StateReady, sup.RunCommand,
sup.Stop to locate code). Ensure the test waits for all goroutines to finish,
checks for expected outcomes (allowed exit codes or expected error when
stopped), and runs with -race in CI to catch synchronization regressions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8b0d90e3-f992-4126-9ecc-6bfb3becca50

📥 Commits

Reviewing files that changed from the base of the PR and between bda95f2 and 5eed12e.

📒 Files selected for processing (3)
  • internal/backend/mock.go
  • internal/supervisor/supervisor.go
  • internal/supervisor/supervisor_test.go

Comment thread internal/backend/mock.go Outdated
Comment thread internal/supervisor/supervisor.go Outdated
Comment thread internal/supervisor/supervisor.go Outdated
Comment thread internal/supervisor/supervisor.go Outdated
- Add per-instance RWMutex to SandboxInstance to prevent data races.
- Implement atomic state transitions in Supervisor.
- Update NewSupervisor to validate inputs and return errors.
- Improve MockBackend with unique handle generation and existence checks.
- Add concurrent stress tests to supervisor_test.go.
@yanurag-dev yanurag-dev merged commit 149c5e5 into main May 10, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant