
Commit 6f6d83c

Merge pull request opensandbox-group#839 from alibaba/codex/update-agent-guidance
Docs: Update agent guidance docs
2 parents 4d70daf + 470f0c5 commit 6f6d83c

6 files changed

Lines changed: 171 additions & 15 deletions


AGENTS.md

Lines changed: 23 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -4,20 +4,37 @@ Use this file as the root router for the monorepo. Prefer the nearest `AGENTS.md
44

55
## Repository Map
66

7-
- `server/`: lifecycle control plane and server tests
7+
- `server/`: FastAPI lifecycle control plane, Docker/Kubernetes runtime integration, snapshot metadata, and server tests
88
- `components/execd/`: in-sandbox execution daemon
9-
- `sdks/`: language SDKs and generated clients
10-
- `specs/`: public API contracts
11-
- `components/`, `cli/`, `docs/`, `kubernetes/`, `sandboxes/`: runtime, tooling, and deployment surfaces
9+
- `components/egress/`: per-sandbox network egress policy sidecar
10+
- `components/ingress/`: ingress gateway and endpoint routing
11+
- `components/internal/`: shared Go helpers used by runtime components
12+
- `sdks/`: sandbox, code-interpreter, and MCP SDKs plus generated clients
13+
- `specs/`: public OpenAPI contracts and examples
14+
- `kubernetes/`: Kubernetes operator, CRDs, task-executor, Helm charts, and Kind e2e tests
15+
- `cli/`: `osb` command-line client and bundled CLI skills
16+
- `tests/`: cross-language end-to-end SDK tests
17+
- `docs/`, `examples/`, `sandboxes/`, `oseps/`: documentation, samples, images/environments, and proposals
1218

1319
## Routing
1420

1521
- For `server/**`, or lifecycle server behavior, sandbox creation flow, or user-visible server config, read `server/AGENTS.md`.
1622
- For `sdks/**`, or SDK generation, handwritten adapters, or cross-language SDK alignment, read `sdks/AGENTS.md`.
1723
- For `specs/**`, or API contract, schema, or example changes, read `specs/AGENTS.md`.
24+
- For `kubernetes/**`, or CRDs, controller behavior, task execution, Helm/Kustomize deployment, pool scheduling, pause/resume snapshots, or Kind e2e tests, read `kubernetes/AGENTS.md`.
1825
- For cross-cutting changes spanning spec, server, and SDKs, start with `specs/AGENTS.md` and then read affected consumer guides.
26+
- For runtime component changes under `components/**`, read the nearest `README.md` or `DEVELOPMENT.md`; keep component APIs aligned with `specs/` and SDK consumers.
27+
- For CLI changes under `cli/**`, read `cli/README.md` and verify command help/output behavior alongside unit tests.
28+
- For cross-language e2e tests under `tests/**`, read the language-local README and keep test assumptions aligned with current server and SDK behavior.
1929
- For areas without a local `AGENTS.md`, use the nearest `README.md`, `DEVELOPMENT.md`, and CI workflow as the next source of truth.
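The routing rules above amount to a nearest-guide lookup. A minimal sketch of that lookup follows, assuming the fallback order `README.md` then `DEVELOPMENT.md`; the function name `nearest_guide` and the constant `FALLBACK_NAMES` are hypothetical and not part of the repository.

```python
from pathlib import Path

# Fallback guide names for areas without a local AGENTS.md.
# The relative order of these two names is an assumption.
FALLBACK_NAMES = ("README.md", "DEVELOPMENT.md")


def nearest_guide(changed_path: str, repo_root: str) -> Path:
    """Return the guide closest to changed_path (hypothetical helper).

    Preference: nearest non-root AGENTS.md, then nearest README.md or
    DEVELOPMENT.md, then the root AGENTS.md as the router of last resort.
    """
    root = Path(repo_root).resolve()
    start = (root / changed_path).resolve().parent
    # Directories from the changed file up to, but excluding, the repo root.
    local_dirs = [d for d in (start, *start.parents) if root in d.parents]

    for directory in local_dirs:
        if (directory / "AGENTS.md").is_file():
            return directory / "AGENTS.md"

    for directory in local_dirs:
        for name in FALLBACK_NAMES:
            if (directory / name).is_file():
                return directory / name

    return root / "AGENTS.md"
```

The CI workflow mentioned in the last routing rule is not modeled here, since it does not live next to the changed file.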
2030

31+
## Working Principles
32+
33+
- Think before coding: state assumptions, surface ambiguity, and ask or push back when the request has conflicting interpretations.
34+
- Simplicity first: implement the smallest solution that satisfies the request; avoid speculative features, one-off abstractions, and unnecessary configurability.
35+
- Surgical changes: touch only files and lines needed for the task, match local style, and do not refactor or delete unrelated pre-existing code.
36+
- Goal-driven execution: translate non-trivial work into verifiable success criteria, add or update focused tests when behavior changes, and loop until checks pass or blockers are clear.
37+
2138
## Guardrails
2239

2340
Always:
@@ -26,6 +43,7 @@ Always:
2643
- Treat `specs/*` as public contract sources.
2744
- Keep spec, implementation, SDKs, docs, examples, config, and CLI behavior aligned when user-visible behavior changes.
2845
- When changing `specs/*`, also update or verify affected server, SDK, docs, and release outputs when practical.
46+
- When changing CRDs or Kubernetes public behavior, update or verify generated manifests, Helm/Kustomize deployment output, server Kubernetes integration, and docs when practical.
2947
- Prefer additive, backward-compatible changes for public interfaces.
3048
- Regenerate derived outputs when the source-of-truth file changes.
3149
- Update tests when behavior changes or bugs are fixed.
@@ -35,6 +53,7 @@ Always:
3553
Ask first:
3654

3755
- Breaking public API, SDK, config, protocol, or CLI changes
56+
- Breaking CRD, annotation, label, Helm values, or Kubernetes deployment changes
3857
- Intentional drift between a public contract and its implementation
3958
- User-visible config or behavior changes without a clear migration story
4059

CLAUDE.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
1+
# OpenSandbox Claude Guide
2+
3+
Use this file as the Claude Code entry point for the OpenSandbox monorepo. Treat `AGENTS.md` as the canonical router and prefer the nearest local `AGENTS.md` for task-specific rules.
4+
5+
## Read First
6+
7+
- Root rules: `AGENTS.md`
8+
- Server changes: `server/AGENTS.md`
9+
- SDK changes: `sdks/AGENTS.md`
10+
- Spec changes: `specs/AGENTS.md`
11+
- Kubernetes changes: `kubernetes/AGENTS.md`
12+
- Areas without a local `AGENTS.md`: read the nearest `README.md`, `DEVELOPMENT.md`, and relevant CI workflow.
13+
14+
## Repository Map
15+
16+
- `server/`: FastAPI lifecycle control plane, Docker/Kubernetes runtime integration, snapshot metadata, and server tests
17+
- `components/execd/`: in-sandbox execution daemon
18+
- `components/egress/`: per-sandbox network egress policy sidecar
19+
- `components/ingress/`: ingress gateway and endpoint routing
20+
- `sdks/`: sandbox, code-interpreter, and MCP SDKs plus generated clients
21+
- `specs/`: public OpenAPI contracts and examples
22+
- `kubernetes/`: Kubernetes operator, CRDs, task-executor, Helm charts, and Kind e2e tests
23+
- `cli/`: `osb` command-line client and bundled CLI skills
24+
- `tests/`: cross-language end-to-end SDK tests
25+
- `docs/`, `examples/`, `sandboxes/`, `oseps/`: documentation, samples, images/environments, and proposals
26+
27+
## Working Principles
28+
29+
- Think before coding: state assumptions, surface ambiguity, and ask or push back when the request has conflicting interpretations.
30+
- Simplicity first: implement the smallest solution that satisfies the request; avoid speculative features, one-off abstractions, and unnecessary configurability.
31+
- Surgical changes: touch only files and lines needed for the task, match local style, and do not refactor or delete unrelated pre-existing code.
32+
- Goal-driven execution: translate non-trivial work into verifiable success criteria, add or update focused tests when behavior changes, and loop until checks pass or blockers are clear.
33+
34+
## Guardrails
35+
36+
Always:
37+
38+
- Keep changes focused on the user request.
39+
- Keep spec, implementation, SDKs, docs, examples, config, CLI, and Kubernetes behavior aligned when user-visible behavior changes.
40+
- Prefer additive, backward-compatible changes for public interfaces.
41+
- Regenerate derived outputs when source-of-truth files change.
42+
- Update tests when behavior changes or bugs are fixed.
43+
- Prefer focused package/file checks before full-suite validation.
44+
- Mention unrun or blocked verification in the final handoff.
45+
46+
Ask first:
47+
48+
- Breaking public API, SDK, config, protocol, CLI, CRD, annotation, label, Helm value, or deployment changes
49+
- Intentional drift between a public contract and its implementation
50+
- User-visible config or behavior changes without a clear migration story
51+
52+
Never:
53+
54+
- Edit generated output as the only fix.
55+
- Mix unrelated component work into the same change.
56+
- Refactor adjacent code just because it is nearby.
57+
58+
## Review Focus
59+
60+
- Prioritize breaking changes in specs, SDK interfaces, config, CLI behavior, CRDs, annotations, labels, and protocols.
61+
- Flag protocol changes that are unnecessary, inconsistent, or hard to implement.
62+
- Flag source-of-truth boundary violations and missing downstream updates.
63+
- Call out missing tests and compatibility risks explicitly.

kubernetes/AGENTS.md

Lines changed: 22 additions & 5 deletions
@@ -1,22 +1,25 @@
11
# Kubernetes AGENTS
22

3-
You are working on the OpenSandbox Kubernetes operator and task-executor. Treat CRD types and annotation contracts as public interfaces, and prefer additive, backward-compatible changes.
3+
You are working on the OpenSandbox Kubernetes operator, snapshot controller flow, and task-executor. Treat CRD types and annotation/label contracts as public interfaces, and prefer additive, backward-compatible changes.
44

55
For detailed development setup, architecture deep-dive, coding standards, testing guide, and deployment workflows, see [DEVELOPMENT.md](./DEVELOPMENT.md).
66

77
## Scope
88

9-
- `apis/`: CRD type definitions (BatchSandbox, Pool)
9+
- `apis/`: CRD type definitions (BatchSandbox, Pool, SandboxSnapshot)
1010
- `cmd/controller/`: controller manager entry point
1111
- `cmd/task-executor/`: task-executor entry point
1212
- `internal/controller/`: BatchSandbox and Pool reconcilers, allocator, eviction, update, and strategy logic
13+
- `internal/controller/*pause*`, `internal/controller/*snapshot*`: pause/resume and rootfs snapshot reconciliation
1314
- `internal/scheduler/`: in-process task scheduler (assigns tasks to sandbox pods)
1415
- `internal/task-executor/`: task execution runtime (process/container), manager, and HTTP server
1516
- `internal/utils/`: shared helpers (pod, finalizer, field index, expectations, logging)
1617
- `pkg/client/`: generated clientset, informer, and lister
1718
- `pkg/task-executor/`: task-executor public types and config
19+
- `pkg/utils/`: public-ish helper contracts used by server-side Kubernetes integration
1820
- `config/`: Kustomize overlays, RBAC, CRD bases, samples
1921
- `charts/opensandbox-controller/`: Helm chart for deployment
22+
- `cmd/image-committer/` and `Dockerfile.image-committer`: image used by pause/resume rootfs commit jobs
2023
- `test/e2e/`: end-to-end tests (Kind-based)
2124
- `test/e2e_task/`: task-executor e2e tests
2225
- `test/e2e_runtime/`: runtime-class e2e tests (gVisor)
@@ -45,14 +48,17 @@ The controller communicates allocation state through annotations on BatchSandbox
4548

4649
- `sandbox.opensandbox.io/alloc-status`: JSON `{"pods":["pod-1","pod-2"]}` — current pod allocation
4750
- `sandbox.opensandbox.io/alloc-release`: JSON `{"pods":["pod-3"]}` — pods released back to pool
51+
- `sandbox.opensandbox.io/endpoints`: JSON endpoint list consumed by server-side endpoint resolution
4852

49-
Do not change annotation keys or JSON shapes without updating both the writer (`allocator.go`, `apis.go`) and all readers (`batchsandbox_controller.go`, `allocation_store_test.go`).
53+
Do not change annotation keys or JSON shapes without updating both writers and all readers, including controller tests and any server-side Kubernetes integration that parses them.
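A reader of these annotations might look like the following sketch. It uses only the annotation keys and the `{"pods": [...]}` shape quoted above; the helper name `read_pod_list` is hypothetical, and the decision to ignore unknown JSON fields is an assumption about how readers stay compatible with additive changes.

```python
import json
from typing import Dict, List

ALLOC_STATUS_KEY = "sandbox.opensandbox.io/alloc-status"
ALLOC_RELEASE_KEY = "sandbox.opensandbox.io/alloc-release"


def read_pod_list(annotations: Dict[str, str], key: str) -> List[str]:
    """Parse a {"pods": [...]} annotation value (hypothetical helper).

    A missing or empty annotation yields an empty list. Unknown fields in
    the JSON object are ignored so that writers can add fields later.
    """
    raw = annotations.get(key)
    if not raw:
        return []
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError(f"{key} must hold a JSON object")
    pods = payload.get("pods") or []
    return [str(pod) for pod in pods]
```

For example, `read_pod_list(meta, ALLOC_STATUS_KEY)` would return the currently allocated pod names for a BatchSandbox whose annotations are in `meta`.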
5054

5155
## Label Contracts
5256

5357
- `sandbox.opensandbox.io/pool-name`: labels pool-owned pods
5458
- `sandbox.opensandbox.io/pool-revision`: revision hash for rolling updates
5559
- `batch-sandbox.sandbox.opensandbox.io/pod-index`: pod index within a BatchSandbox
60+
- `pool.opensandbox.io/evict`: marks idle pool pods for eviction
61+
- `pool.opensandbox.io/eviction-handler`: selects pool eviction handler implementation
5662

5763
## Commands
5864

@@ -129,12 +135,21 @@ cd kubernetes
129135
make manifests generate
130136
```
131137

138+
Pause/resume focused checks:
139+
140+
```bash
141+
cd kubernetes
142+
go test ./internal/controller/ -run 'Test(DispatchPauseResume|HandlePause|HandleResume|ContinueResume|CompletePause|SyncPauseOrClear|SandboxSnapshot)' -v
143+
make test-e2e-pause-resume
144+
```
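The `-run` flag takes an unanchored regular expression, so any test whose name contains one of the listed prefixes is selected. The sketch below mimics that selection in Python to show which names the pattern would pick up; the sample test names are hypothetical and are not taken from the repository.

```python
import re
from typing import Iterable, List

# Pattern copied from the `go test -run` command above.
RUN_PATTERN = (
    r"Test(DispatchPauseResume|HandlePause|HandleResume|ContinueResume|"
    r"CompletePause|SyncPauseOrClear|SandboxSnapshot)"
)


def selected_tests(test_names: Iterable[str]) -> List[str]:
    """Return the names an unanchored match of RUN_PATTERN would select."""
    return [name for name in test_names if re.search(RUN_PATTERN, name)]


# Hypothetical test names, for illustration only.
sample = [
    "TestHandlePause_AlreadyPaused",
    "TestSandboxSnapshotCommitJob",
    "TestPoolScaleUp",
]
print(selected_tests(sample))
```

Because the match is unanchored, suffixes after the listed prefixes still match, which keeps the filter stable as new pause/resume test cases are added.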
145+
132146
## Architecture Overview
133147

134-
Two controllers run inside the controller manager:
148+
Core reconciliation flows:
135149

136-
1. **BatchSandboxReconciler**: Owns Pod objects. Handles pod scaling (non-pooled mode), pool allocation parsing, task scheduling, status updates, and expiry cleanup.
150+
1. **BatchSandboxReconciler**: Owns Pod objects. Handles pod scaling (non-pooled mode), pool allocation parsing, task scheduling, status updates, expiry cleanup, and pause/resume handoff.
137151
2. **PoolReconciler**: Owns Pod objects and watches BatchSandbox objects. Handles pod allocation to sandboxes, pool scaling (buffer/pool min/max), rolling updates, eviction, and status.
152+
3. **SandboxSnapshot flow**: Internal CR and commit Job orchestration used by pause/resume to persist and restore root filesystems.
138153

139154
Allocation flow: PoolReconciler.Schedule → Allocator.Schedule → allocate/deallocate → PersistPoolAllocation → SyncSandboxAllocation (writes annotation to BatchSandbox).
140155

@@ -146,6 +161,7 @@ Always:
146161

147162
- Run `make manifests generate` after changing `apis/` types.
148163
- Run `make test` after controller or allocator changes.
164+
- Update CRD YAML, Helm values/templates, Kustomize manifests, and docs together when controller flags or CRD behavior changes.
149165
- Add focused regression tests for bug fixes in controller or allocator logic.
150166
- Keep reconciler logic idempotent — controllers may reconcile the same object concurrently.
151167
- Preserve annotation backward compatibility; add new fields rather than renaming existing ones.
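The idempotency guardrail can be illustrated with a language-neutral sketch: a reconcile step that derives its actions from the difference between desired and observed state produces no further changes when repeated. The controllers themselves are written in Go; the function names `plan` and `reconcile` below are hypothetical and model pods as plain name sets.

```python
from typing import Set, Tuple


def plan(desired: Set[str], actual: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Compute (to_create, to_delete) from the current difference only."""
    return desired - actual, actual - desired


def reconcile(desired: Set[str], actual: Set[str]) -> Set[str]:
    """Apply the plan and return the resulting pod set.

    Running this again on its own output is a no-op, which is the
    property the guardrail asks reconcilers to keep.
    """
    to_create, to_delete = plan(desired, actual)
    return (actual | to_create) - to_delete
```

A second call such as `reconcile(desired, reconcile(desired, actual))` returns the same set as the first, and `plan` on that result is empty.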
@@ -156,6 +172,7 @@ Ask first:
156172
- Changing CRD spec fields (additive changes are fine; removal or renaming is breaking)
157173
- Changing annotation keys or JSON shapes
158174
- Changing pool allocation or scheduling semantics
175+
- Changing pause/resume snapshot semantics, controller snapshot flags, or image-committer trust assumptions
159176
- Large reorganizations across `controller/`, `scheduler/`, and `task-executor/`
160177

161178
Never:

sdks/AGENTS.md

Lines changed: 33 additions & 2 deletions
@@ -7,14 +7,15 @@ You are working on OpenSandbox SDKs. Keep generated and handwritten code separat
77
- `sandbox/**`
88
- `code-interpreter/**`
99
- `mcp/**`
10+
- workspace-level SDK build and release metadata
1011

1112
If the task is driven by spec changes, also read `../specs/AGENTS.md`.
1213

1314
## Key Areas
1415

15-
- `sandbox/python`, `sandbox/javascript`, `sandbox/kotlin`, `sandbox/csharp`
16+
- `sandbox/python`, `sandbox/javascript`, `sandbox/kotlin`, `sandbox/csharp`, `sandbox/go`
1617
- `code-interpreter/python`, `code-interpreter/javascript`, `code-interpreter/kotlin`, `code-interpreter/csharp`
17-
- `mcp/`
18+
- `mcp/sandbox/python`
1819
- Workspace config in `package.json`, `pnpm-workspace.yaml`, and shared build files
1920

2021
## Generated Code
@@ -26,6 +27,7 @@ Generator-owned paths include:
2627
- `sandbox/python/src/opensandbox/api/**`
2728
- `sandbox/javascript/src/api/*.ts`
2829
- `sandbox/kotlin/sandbox-api/build/generated/**`
30+
- language-specific OpenAPI outputs produced by local generator scripts or Gradle tasks
2931

3032
Handwritten logic belongs in adapters, services, facades, converters, and stable SDK models.
3133

@@ -60,6 +62,17 @@ uv run pytest tests/ -v
6062
uv build
6163
```
6264

65+
Python code-interpreter SDK:
66+
67+
```bash
68+
cd sdks/code-interpreter/python
69+
uv sync
70+
uv run ruff check
71+
uv run pyright
72+
uv run pytest
73+
uv build
74+
```
75+
6376
JavaScript sandbox SDK:
6477

6578
```bash
@@ -71,6 +84,16 @@ pnpm run build
7184
pnpm run test
7285
```
7386

87+
JavaScript code-interpreter SDK:
88+
89+
```bash
90+
cd sdks/code-interpreter/javascript
91+
pnpm run lint
92+
pnpm run typecheck
93+
pnpm run build
94+
pnpm run test
95+
```
96+
7497
Kotlin sandbox SDK:
7598

7699
```bash
@@ -79,11 +102,19 @@ cd sdks/sandbox/kotlin
79102
./gradlew spotlessApply :sandbox:test
80103
```
81104

105+
Go sandbox SDK:
106+
107+
```bash
108+
cd sdks/sandbox/go
109+
go test ./...
110+
```
111+
82112
## Guardrails
83113

84114
Always:
85115

86116
- For spec-driven changes, regenerate affected SDK code, update handwritten layers, then run affected language checks.
117+
- For MCP changes, keep tool schemas, client setup docs, and sandbox SDK dependency behavior aligned.
87118
- Add a regression test for every bug fix.
88119
- Prefer tests for request mapping, response conversion, error mapping, streaming behavior, and resource cleanup.
89120
- Keep package-local validation fast before widening to multi-language verification.

server/AGENTS.md

Lines changed: 24 additions & 3 deletions
@@ -1,20 +1,29 @@
11
# Server AGENTS
22

3-
You are working on the OpenSandbox lifecycle server. Keep the route layer thin and put behavior in services, validators, or runtime helpers.
3+
You are working on the OpenSandbox lifecycle server. Keep the route layer thin and put behavior in services, validators, repositories, or runtime helpers.
44

55
## Scope
66

77
- `opensandbox_server/**`
88
- `tests/**`
9+
- `configuration.md`, `docker-compose.example.yaml`, and other server-facing docs/examples
910

1011
If the task changes lifecycle API contracts in `../specs/sandbox-lifecycle.yml`, also read `../specs/AGENTS.md`.
12+
If the task changes Kubernetes runtime behavior, also read `../kubernetes/AGENTS.md`.
1113

1214
## Key Paths
1315

14-
- `opensandbox_server/main.py`: app entry point and startup wiring
16+
- `opensandbox_server/cli.py`: `opensandbox-server` CLI entry point and config initialization
17+
- `opensandbox_server/main.py`: FastAPI app entry point and startup wiring
1518
- `opensandbox_server/api/`: FastAPI routes and request/response schemas
1619
- `opensandbox_server/services/`: business logic and runtime integration
20+
- `opensandbox_server/services/docker/`: Docker runtime, endpoints, port allocation, diagnostics, and snapshot runtime
21+
- `opensandbox_server/services/k8s/`: Kubernetes providers, templates, informer, egress, pool, diagnostics, and pause/resume runtime integration
22+
- `opensandbox_server/repositories/`: persistence backends, including snapshot metadata
1723
- `opensandbox_server/integrations/`: optional external integrations
24+
- `opensandbox_server/extensions/`: extension loading and optional behavior hooks
25+
- `opensandbox_server/middleware/`: authentication and request middleware
26+
- `opensandbox_server/config.py`: TOML config model, defaults, validation, and environment integration
1827
- `tests/`: unit, integration, smoke, and Kubernetes-focused tests
1928

2029
## Commands
@@ -29,6 +38,14 @@ uv run pytest tests/test_docker_service.py
2938
uv run pytest tests/test_schema.py
3039
```
3140

41+
Kubernetes-focused checks:
42+
43+
```bash
44+
cd server
45+
uv run pytest tests/k8s
46+
uv run pytest tests/test_routes_pause_resume.py tests/test_routes_snapshots.py tests/test_snapshot_service.py
47+
```
48+
3249
Typed or broader validation:
3350

3451
```bash
@@ -40,8 +57,8 @@ uv run pytest
4057
Local startup:
4158

4259
```bash
43-
cp server/opensandbox_server/examples/example.config.toml ~/.sandbox.toml
4460
cd server
61+
uv run opensandbox-server init-config ~/.sandbox.toml --example docker
4562
uv run python -m opensandbox_server.main
4663
```
4764

@@ -58,6 +75,9 @@ chmod +x tests/smoke.sh
5875
Always:
5976

6077
- Keep FastAPI routes thin and delegate behavior to services, validators, or runtime helpers.
78+
- Keep runtime-specific behavior in Docker/Kubernetes service modules; shared API behavior belongs in common services or validators.
79+
- Keep snapshot state changes coordinated across route handlers, services, repositories, and runtime-specific snapshot implementations.
80+
- Keep TOML config defaults, config examples, README/configuration docs, and CLI `init-config` output aligned.
6181
- Extend existing fixtures and helpers before adding parallel abstractions.
6282
- Add focused regression tests with every bug fix or behavior change.
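The thin-route guardrail can be sketched without any framework: the handler only translates between transport and service, while state changes and error conditions live in the service. Everything named here (`PauseService`, `SandboxNotFound`, `pause_route`, the `"Paused"` state string) is hypothetical and does not reflect the actual server modules.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


class SandboxNotFound(Exception):
    """Raised by the service when the sandbox id is unknown."""


@dataclass
class PauseService:
    """Hypothetical service that owns the behavior and the state."""

    states: Dict[str, str] = field(default_factory=dict)

    def pause(self, sandbox_id: str) -> str:
        if sandbox_id not in self.states:
            raise SandboxNotFound(sandbox_id)
        self.states[sandbox_id] = "Paused"
        return self.states[sandbox_id]


def pause_route(service: PauseService, sandbox_id: str) -> Tuple[int, dict]:
    """Thin handler: call the service, map its outcome to a response."""
    try:
        state = service.pause(sandbox_id)
    except SandboxNotFound:
        return 404, {"error": f"sandbox {sandbox_id} not found"}
    return 200, {"id": sandbox_id, "state": state}
```

Keeping the handler this small means a regression test can exercise `PauseService` directly, with a separate, narrower test for the status-code mapping.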
6383

@@ -66,6 +86,7 @@ Ask first:
6686
- Removing or renaming public endpoints
6787
- Changing config shape or defaults in a user-visible way
6888
- Introducing new external service dependencies
89+
- Changing snapshot, pause/resume, renew-intent, ingress, egress, or pool semantics
6990
- Large reorganizations across `api/`, `services/`, and `tests/`
7091

7192
Never:

specs/AGENTS.md

Lines changed: 6 additions & 1 deletion
@@ -5,6 +5,7 @@ You are maintaining OpenSandbox public API contracts. Treat the spec files in th
55
## Scope
66

77
- `sandbox-lifecycle.yml`
8+
- `diagnostic-api.yml`
89
- `execd-api.yaml`
910
- `egress-api.yaml`
1011
- `README*.md`
@@ -13,10 +14,13 @@ When a contract change affects downstream code, also read the nearest consumer g
1314

1415
- `../server/AGENTS.md` for lifecycle server impact
1516
- `../sdks/AGENTS.md` for SDK-facing contract changes
17+
- component README/DEVELOPMENT files under `../components/` for execd or egress impact
18+
- `../cli/README.md` for CLI-visible diagnostics, lifecycle, or egress changes
1619

1720
## Contract Map
1821

19-
- `sandbox-lifecycle.yml`: lifecycle API used by `server/` and sandbox SDKs
22+
- `sandbox-lifecycle.yml`: lifecycle API used by `server/`, `cli/`, and sandbox SDKs
23+
- `diagnostic-api.yml`: diagnostics API used by server diagnostics, CLI diagnostics, and troubleshooting flows
2024
- `execd-api.yaml`: execution API used by `components/execd/` and code-interpreter SDKs
2125
- `egress-api.yaml`: egress sidecar API and related docs
2226

@@ -53,6 +57,7 @@ Always:
5357
- Keep operation IDs, schema names, examples, and descriptions consistent with existing naming.
5458
- Regenerate derived outputs after spec edits.
5559
- Update affected consumers in the same change when practical.
60+
- Keep spec examples aligned with server schemas and generated SDK models.
5661
- Call out downstream areas you did not verify.
5762

5863
Ask first:
