Commit a94f443

feat: pool CEL validation, tenant DNS cap, deploy tooling, skill (#101)
- Align Pool CRD CEL with console API (min volumes + 3-server rule); validate_pool_total_volumes + tests; regen CRDs
- Cap Tenant metadata.name at 55 chars for derived Service names (-console)
- Deploy scripts: docker_build_cached, RUSTFS_DOCKER_NO_CACHE; 4-node help/sudo cleanup
- Dockerfiles: cargo-chef pin / frontend image tweaks
- Add .cursor/skills/rustfs-operator-contribute for commit/PR workflow; adjust .gitignore for skills
- CHANGELOG, README, examples, scripts docs

Made-with: Cursor
1 parent 42f2102 commit a94f443

20 files changed

Lines changed: 752 additions & 124 deletions

.cursor/skills/rustfs-operator-contribute/SKILL.md

Lines changed: 70 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,70 @@
1+
---
2+
name: rustfs-operator-contribute
3+
description: Commits, pushes, and opens pull requests for the RustFS Operator repo per CONTRIBUTING.md and AGENTS.md. Use when the user asks to commit, push to remote my, submit a PR upstream, or follow project contribution workflow.
4+
---
5+
6+
# RustFS Operator — commit, push, PR
7+
8+
## Preconditions
9+
10+
- Run from repository root: `/home/jhw/my/operator` (or clone path).
11+
- Source of truth: [`CONTRIBUTING.md`](../../../CONTRIBUTING.md), [`Makefile`](../../../Makefile), [`.github/pull_request_template.md`](../../../.github/pull_request_template.md).
12+
13+
## Before commit
14+
15+
1. Run **`make pre-commit`** (fmt-check → clippy → test → console-lint → console-fmt-check). Fix failures before committing.
16+
2. User-visible changes: update **[`CHANGELOG.md`](../../../CHANGELOG.md)** under `[Unreleased]` (Keep a Changelog).
17+
3. **Commit message**: [Conventional Commits](https://www.conventionalcommits.org/), **English**, subject **≤ 72 characters** (e.g. `fix(pool): align CEL with console validation`).
18+
19+
## Commit
20+
21+
```bash
22+
git add -A
23+
git status
24+
git commit -m "type(scope): short description"
25+
```
26+
27+
## Push to fork (`my`)
28+
29+
Remote is typically `my` → `git@github.com:GatewayJ/operator.git` (verify with `git remote -v`).
30+
31+
```bash
32+
git push my main
33+
```
34+
35+
If `main` is non-fast-forward on `my`, integrate or use `git push my main --force-with-lease` only when intentionally replacing fork history (dangerous).
36+
37+
## Open PR upstream (`rustfs/operator`)
38+
39+
- **Target**: `rustfs/operator` branch **`main`**.
40+
- **Head**: fork branch (e.g. `GatewayJ:main`).
41+
- **PR title and body**: **English**.
42+
- **Body**: Must follow **every section** in [`.github/pull_request_template.md`](../../../.github/pull_request_template.md); use **`N/A`** where not applicable; keep all headings.
43+
44+
**Do not** pass multiline `--body` to `gh` inline. Write a file and use `--body-file`:
45+
46+
```bash
47+
cat > /tmp/pr_body.md <<'EOF'
48+
## Type of Change
49+
- [x] Bug Fix
50+
...
51+
EOF
52+
53+
gh pr create --repo rustfs/operator --head GatewayJ:main --base main \
54+
--title "fix: concise English title" \
55+
--body-file /tmp/pr_body.md
56+
```
57+
58+
Adjust checkboxes and sections to match the change. Include **`make pre-commit`** under Verification.
59+
60+
## Quick checklist
61+
62+
- [ ] `make pre-commit` passed
63+
- [ ] CHANGELOG updated if user-visible
64+
- [ ] Commit message conventional, English
65+
- [ ] PR template complete, English, `--body-file` used
66+
67+
## References
68+
69+
- [AGENTS.md](../../../AGENTS.md) — language, security, architecture notes
70+
- [`.cursor/rules/pr.mdc`](../../../.cursor/rules/pr.mdc) — PR / path conventions (if present)
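
The commit-message rule in the skill file above (Conventional Commits, subject ≤ 72 characters) can be illustrated with a small Rust check. This is a minimal sketch and not part of the commit: `check_commit_subject` and `MAX_SUBJECT_CHARS` are hypothetical names, and requiring a lowercase ASCII type is an assumption of this sketch that is stricter than the Conventional Commits specification.

```rust
/// Subject length limit taken from the skill file (72 characters).
const MAX_SUBJECT_CHARS: usize = 72;

/// Hypothetical helper: checks that a subject looks like
/// `type(scope): description` or `type: description`.
pub fn check_commit_subject(subject: &str) -> Result<(), String> {
    // Count characters, not bytes, so non-ASCII subjects are measured fairly.
    let len = subject.chars().count();
    if len > MAX_SUBJECT_CHARS {
        return Err(format!(
            "subject is {} characters, limit is {}",
            len, MAX_SUBJECT_CHARS
        ));
    }

    // The first ": " separates the prefix from the description.
    let (prefix, description) = subject
        .split_once(": ")
        .ok_or_else(|| "expected `type(scope): description`".to_string())?;
    if description.trim().is_empty() {
        return Err("description is empty".to_string());
    }

    // An optional `!` before the colon marks a breaking change.
    let prefix = prefix.strip_suffix('!').unwrap_or(prefix);

    // Split off the optional `(scope)` part.
    let (ty, scope) = match prefix.split_once('(') {
        Some((ty, rest)) => {
            let scope = rest
                .strip_suffix(')')
                .ok_or_else(|| "scope is missing `)`".to_string())?;
            (ty, Some(scope))
        }
        None => (prefix, None),
    };

    // Assumption of this sketch: the type is lowercase ASCII letters only.
    if ty.is_empty() || !ty.chars().all(|c| c.is_ascii_lowercase()) {
        return Err(format!("invalid type `{}`", ty));
    }
    if scope.map_or(false, |s| s.is_empty()) {
        return Err("scope is empty".to_string());
    }
    Ok(())
}
```

The example subject from the skill file, `fix(pool): align CEL with console validation`, passes this check, while a subject without a `type:` prefix is rejected.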

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -18,7 +18,10 @@ console-web/.next/
1818
console-web/docs/
1919
console-web/out/
2020
console-web/node_modules/
21-
.cursor/
21+
# Cursor IDE: ignore contents except versioned Agent skills
22+
.cursor/*
23+
!.cursor/skills/
24+
!.cursor/skills/**
2225

2326
# Docs / summaries (local or generated)
2427
CONSOLE-INTEGRATION-SUMMARY.md

CHANGELOG.md

Lines changed: 14 additions & 0 deletions
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
99

1010
### Documentation
1111

12+
- Expanded root [`README.md`](README.md) with overview, quick start, development commands, CI vs `make pre-commit`, and documentation index.
1213
- Aligned [`CLAUDE.md`](CLAUDE.md) and [`ROADMAP.md`](ROADMAP.md) with current code: Tenant status conditions and StatefulSet updates on the successful reconcile path are documented as implemented; remaining work (status on early errors, integration tests, rollout extras) is listed explicitly.
1314
- Clarified the documentation map: [`CONTRIBUTING.md`](CONTRIBUTING.md) (quality gates and CI alignment), [`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) (environment setup), [`docs/DEVELOPMENT-NOTES.md`](docs/DEVELOPMENT-NOTES.md) (historical notes, not normative).
1415
- Updated [`examples/README.md`](examples/README.md): Tenant Services document S3 **9000** and RustFS Console **9001**; distinguished the Operator HTTP Console (default **9090**, `cargo run -- console`) from the Tenant `{tenant}-console` Service.
@@ -19,8 +20,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1920

2021
- **`console-web` / `make pre-commit`**: `npm run lint` now runs `eslint .` (bare `eslint` only printed CLI help). Added `format` / `format:check` scripts; [`Makefile`](Makefile) `console-fmt` and `console-fmt-check` call them so Prettier resolves from `node_modules` after `npm install` in `console-web/`.
2122

23+
- **Tenant `Pool` CRD validation (CEL)**: Match the operator console API — require `servers × volumesPerServer >= 4` for every pool, and `>= 6` total volumes when `servers == 3` (fixes the previous 3-server rule using `< 4` in CEL). Regenerated [`deploy/rustfs-operator/crds/tenant-crd.yaml`](deploy/rustfs-operator/crds/tenant-crd.yaml) and [`tenant.yaml`](deploy/rustfs-operator/crds/tenant.yaml). Added [`validate_pool_total_volumes`](src/types/v1alpha1/pool.rs) as the shared Rust implementation used by [`src/console/handlers/pools.rs`](src/console/handlers/pools.rs).
24+
25+
- **Tenant name length**: [`validate_dns1035_label`](src/types/v1alpha1/tenant.rs) now caps `metadata.name` at **55** characters so derived names like `{name}-console` remain valid Kubernetes DNS labels (≤ 63).
26+
27+
### Changed
28+
29+
- **Deploy scripts** ([`scripts/deploy/deploy-rustfs.sh`](scripts/deploy/deploy-rustfs.sh), [`deploy-rustfs-4node.sh`](scripts/deploy/deploy-rustfs-4node.sh)): Docker builds use **layer cache by default** (`docker_build_cached`); set `RUSTFS_DOCKER_NO_CACHE=true` for a full rebuild. Documented in [`scripts/README.md`](scripts/README.md).
30+
- **4-node deploy**: Help text moved to an early heredoc (avoids trailing `case`/parse issues); see script header.
31+
- **4-node cleanup** ([`cleanup-rustfs-4node.sh`](scripts/cleanup/cleanup-rustfs-4node.sh)): Host storage dirs under `/tmp/rustfs-storage-*` may require `sudo rm -rf` after Kind (root-owned bind mounts).
32+
- **Dockerfile** (operator and [`console-web/Dockerfile`](console-web/Dockerfile)): Build caching and reproducibility tweaks (cargo-chef pin, pnpm in frontend image as applicable).
33+
2234
### Added
2335

36+
- Cursor Agent skill [`.cursor/skills/rustfs-operator-contribute/SKILL.md`](.cursor/skills/rustfs-operator-contribute/SKILL.md) for `make pre-commit`, commit, push to fork `my`, and opening PRs to `rustfs/operator` with the project template.
37+
2438
#### **StatefulSet Reconciliation Improvements** (2025-12-03, Issue #43)
2539

2640
Implemented intelligent StatefulSet update detection and validation to improve reconciliation efficiency and safety:
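
The 55-character cap in the "Tenant name length" entry of the CHANGELOG diff above follows from simple arithmetic: a Kubernetes DNS label may be at most 63 characters, and the derived Service name appends `-console` (8 characters), leaving 63 - 8 = 55 for the Tenant name. The sketch below shows that reasoning in Rust. It is not the repository's `validate_dns1035_label`, whose signature is not visible in this commit view; `check_tenant_name` and the constants are hypothetical names used only for illustration.

```rust
/// Maximum length of a Kubernetes DNS-1035 label.
const DNS_LABEL_MAX: usize = 63;
/// Suffix of the derived Service name mentioned in the changelog entry.
const CONSOLE_SUFFIX: &str = "-console";
/// Room left for the Tenant name: 63 - 8 = 55.
const TENANT_NAME_MAX: usize = DNS_LABEL_MAX - CONSOLE_SUFFIX.len();

/// Hypothetical check: DNS-1035 label syntax plus the reduced length cap.
pub fn check_tenant_name(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("name must not be empty".to_string());
    }
    if name.len() > TENANT_NAME_MAX {
        return Err(format!(
            "name is {} characters, limit is {}",
            name.len(),
            TENANT_NAME_MAX
        ));
    }

    let bytes = name.as_bytes();
    // DNS-1035: must start with a lowercase letter.
    if !bytes[0].is_ascii_lowercase() {
        return Err("name must start with a lowercase letter".to_string());
    }
    // DNS-1035: must end with a lowercase letter or digit.
    let last = bytes[bytes.len() - 1];
    if !(last.is_ascii_lowercase() || last.is_ascii_digit()) {
        return Err("name must end with a lowercase letter or digit".to_string());
    }
    // DNS-1035: only lowercase letters, digits, and '-' are allowed.
    if !bytes
        .iter()
        .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || *b == b'-')
    {
        return Err("name may contain only [a-z0-9-]".to_string());
    }
    Ok(())
}
```

A 55-character name plus `-console` is exactly 63 characters, so the derived Service name still fits in one DNS label; a 56-character name would not.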

Dockerfile

Lines changed: 28 additions & 6 deletions
@@ -4,25 +4,47 @@ ARG BASE_IMAGE=debian:bookworm-slim
44
# Use rust:bookworm so the binary is linked against glibc 2.36, matching final image.
55
ARG RUST_BUILD_IMAGE=rust:bookworm
66

7-
# When Docker build cannot reach crates.io (DNS/network), use host network:
7+
# cargo-chef version (pin for reproducible builds; override if needed)
8+
ARG CARGO_CHEF_VERSION=0.1.77
9+
10+
# When Docker build cannot reach crates.io (DNS/network), try:
811
# docker build --network=host -t rustfs/operator:dev .
12+
# For China mirrors, mount or COPY a .cargo/config.toml (see docs) before cargo install.
13+
14+
# Shared Cargo settings for slow / flaky networks (applies to all Rust stages)
15+
FROM ${RUST_BUILD_IMAGE} AS rust-base
16+
RUN mkdir -p /usr/local/cargo && \
17+
printf '%s\n' \
18+
'[http]' \
19+
'timeout = 300' \
20+
'multiplexing = false' \
21+
'' \
22+
'[net]' \
23+
'retry = 10' \
24+
> /usr/local/cargo/config.toml
25+
ENV CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse
26+
27+
# Install cargo-chef once; planner + cacher only COPY the binary (avoids two slow installs)
28+
FROM rust-base AS cargo-chef-installer
29+
ARG CARGO_CHEF_VERSION
30+
RUN cargo install cargo-chef --version "${CARGO_CHEF_VERSION}"
931

1032
# Stage 1: Generate recipe for dependency caching
11-
FROM ${RUST_BUILD_IMAGE} AS planner
33+
FROM rust-base AS planner
34+
COPY --from=cargo-chef-installer /usr/local/cargo/bin/cargo-chef /usr/local/cargo/bin/cargo-chef
1235
WORKDIR /app
13-
RUN cargo install cargo-chef
1436
COPY . .
1537
RUN cargo chef prepare --recipe-path recipe.json
1638

1739
# Stage 2: Build dependencies only (cached unless Cargo.lock changes)
18-
FROM ${RUST_BUILD_IMAGE} AS cacher
40+
FROM rust-base AS cacher
41+
COPY --from=cargo-chef-installer /usr/local/cargo/bin/cargo-chef /usr/local/cargo/bin/cargo-chef
1942
WORKDIR /app
20-
RUN cargo install cargo-chef
2143
COPY --from=planner /app/recipe.json recipe.json
2244
RUN cargo chef cook --release --recipe-path recipe.json
2345

2446
# Stage 3: Build the binary
25-
FROM ${RUST_BUILD_IMAGE} AS builder
47+
FROM rust-base AS builder
2648
WORKDIR /app
2749
COPY . .
2850
COPY --from=cacher /app/target target

README.md

Lines changed: 74 additions & 1 deletion
@@ -1,6 +1,64 @@
11
# RustFS Kubernetes Operator
22

3-
RustFS Kubernetes operator (under development; not production-ready).
3+
A Kubernetes operator for [RustFS](https://rustfs.com/) object storage, written in Rust with [kube-rs](https://github.com/kube-rs/kube). It reconciles a **`Tenant` custom resource** (`rustfs.com/v1alpha1`) and provisions ConfigMaps, Secrets, RBAC, Services, and StatefulSets so RustFS runs as an erasure-coded cluster inside your cluster.
4+
5+
**Status:** v0.1.0 pre-release — under active development, **not production-ready**.
6+
7+
## Features
8+
9+
- **Tenant CRD** — Declare pools, persistence, scheduling, credentials (Secret or env), TLS, and more; see [`examples/`](examples/).
10+
- **Controller** — Reconciliation loop with status conditions (`Ready` / `Progressing` / `Degraded`), events, and safe StatefulSet update checks.
11+
- **Operator HTTP console** — Optional management API (`cargo run -- console`, default port **9090**) used by [`console-web/`](console-web/) (Next.js UI).
12+
- **Tooling** — CRD YAML generation, Docker multi-stage image, Kind-focused scripts under [`scripts/`](scripts/).
13+
14+
RustFS **S3 API** and **RustFS Console UI** inside a Tenant are exposed on **9000** and **9001** respectively; the operator’s own HTTP API is separate (typically **9090**). See [`CLAUDE.md`](CLAUDE.md) for ports and env vars.
15+
16+
## Requirements
17+
18+
- **Rust** — Toolchain from [`rust-toolchain.toml`](rust-toolchain.toml) (stable; edition 2024).
19+
- **Kubernetes** — Target API **v1.30** (see `Cargo.toml` / `k8s-openapi` features); a reachable cluster for `server` mode.
20+
- **console-web** (optional) — **Node.js ≥ 20** and `npm install` in `console-web/` if you run frontend lint/format or UI dev.
21+
22+
## Quick start
23+
24+
```bash
25+
# Clone and build
26+
git clone https://github.com/rustfs/operator.git
27+
cd operator
28+
cargo build --release
29+
30+
# Emit Tenant CRD YAML (stdout or file)
31+
cargo run -- crd
32+
cargo run -- crd -f tenant-crd.yaml
33+
34+
# Run the controller (needs kubeconfig / in-cluster config)
35+
cargo run -- server
36+
37+
# Run the operator HTTP console API (default :9090)
38+
cargo run -- console
39+
```
40+
41+
**Docker**
42+
43+
```bash
44+
docker build -t rustfs/operator:dev .
45+
```
46+
47+
**End-to-end on Kind** (single-node or multi-node) — see [`scripts/README.md`](scripts/README.md).
48+
49+
## Development
50+
51+
From the repo root:
52+
53+
| Command | Purpose |
54+
|--------|---------|
55+
| `make pre-commit` | Full local gate: Rust `fmt` / `clippy` / `test` + `console-web` ESLint and Prettier (run after `npm install` in `console-web/`). |
56+
| `make fmt` / `make clippy` / `make test` | Individual Rust checks. |
57+
| `make console-lint` / `make console-fmt-check` | Frontend only. |
58+
59+
CI (`.github/workflows/ci.yml`) runs Rust tests (including `nextest`), `cargo fmt --check`, and `clippy`; it does **not** run `console-web` checks — use **`make pre-commit`** before opening a PR so frontend changes are validated.
60+
61+
Contribution workflow, commit style, and PR expectations: [`CONTRIBUTING.md`](CONTRIBUTING.md).
462

563
## Repository layout
664

@@ -13,4 +71,19 @@ RustFS Kubernetes operator (under development; not production-ready).
1371
- `deploy/k8s-dev/` — Development Kubernetes YAML
1472
- `deploy/kind/` — Kind cluster configs (e.g. 4-node)
1573
- **examples/** — Sample Tenant CRs
74+
- **console-web/** — Operator management UI (Next.js)
1675
- **docs/** — Architecture and development documentation
76+
77+
## Documentation
78+
79+
| Doc | Content |
80+
|-----|---------|
81+
| [CLAUDE.md](CLAUDE.md) | Architecture, reconcile loop, CRD fields, RustFS ports and env (maintainer / AI context). |
82+
| [CONTRIBUTING.md](CONTRIBUTING.md) | Quality gates, `make pre-commit`, PR rules. |
83+
| [docs/DEVELOPMENT.md](docs/DEVELOPMENT.md) | Local environment (kind, IDE, workflows). |
84+
| [docs/architecture-decisions.md](docs/architecture-decisions.md) | ADRs. |
85+
| [CHANGELOG.md](CHANGELOG.md) | Release notes. |
86+
87+
## License
88+
89+
Licensed under the **Apache License 2.0** — see [LICENSE](LICENSE).

console-web/Dockerfile

Lines changed: 4 additions & 1 deletion
@@ -3,7 +3,10 @@ FROM node:22-alpine AS builder
33

44
WORKDIR /app
55

6-
RUN corepack enable && corepack prepare pnpm@latest --activate
6+
# Pin pnpm to package.json "packageManager" (avoid corepack fetching pnpm@latest from npm;
7+
# that fetch can fail behind proxies / flaky TLS during docker build).
8+
ARG PNPM_VERSION=10.28.1
9+
RUN npm install -g pnpm@${PNPM_VERSION}
710

811
COPY package.json pnpm-lock.yaml* pnpm-workspace.yaml* ./
912
RUN pnpm install --frozen-lockfile

deploy/rustfs-operator/crds/tenant-crd.yaml

Lines changed: 44 additions & 36 deletions
@@ -96,33 +96,6 @@ spec:
9696
format: int32
9797
nullable: true
9898
type: integer
99-
securityContext:
100-
description: |-
101-
Override Pod SecurityContext when encryption is enabled.
102-
If not set, the default RustFS Pod SecurityContext is used
103-
(runAsUser/runAsGroup/fsGroup = 10001).
104-
nullable: true
105-
properties:
106-
fsGroup:
107-
description: GID applied to all volumes mounted in the Pod.
108-
format: int64
109-
nullable: true
110-
type: integer
111-
runAsGroup:
112-
description: GID to run the container process as.
113-
format: int64
114-
nullable: true
115-
type: integer
116-
runAsNonRoot:
117-
description: 'Enforce non-root execution (default: true).'
118-
nullable: true
119-
type: boolean
120-
runAsUser:
121-
description: UID to run the container process as.
122-
format: int64
123-
nullable: true
124-
type: integer
125-
type: object
12699
vault:
127100
description: 'Vault-specific settings (required when `backend: vault`).'
128101
nullable: true
@@ -145,12 +118,21 @@ spec:
145118
type: integer
146119
type: object
147120
authType:
148-
default: token
149-
description: |-
150-
Authentication method: `token` (default, implemented) or `approle`
151-
(type defined in rustfs-kms but backend not yet functional).
121+
description: Authentication method. Defaults to `token` when not set.
122+
enum:
123+
- token
124+
- approle
125+
- null
152126
nullable: true
153127
type: string
128+
customCertificates:
129+
description: |-
130+
Enable custom TLS certificates for the Vault connection.
131+
When `true`, the operator mounts TLS certificate files from the KMS Secret
132+
and configures the corresponding environment variables.
133+
The Secret must contain: `vault-ca-cert`, `vault-client-cert`, `vault-client-key`.
134+
nullable: true
135+
type: boolean
154136
endpoint:
155137
description: Vault server endpoint (e.g. `https://vault.example.com:8200`).
156138
type: string
@@ -167,7 +149,7 @@ spec:
167149
nullable: true
168150
type: string
169151
tlsSkipVerify:
170-
description: 'Enable TLS verification for Vault connection (default: true).'
152+
description: Skip TLS certificate verification for Vault connection.
171153
nullable: true
172154
type: boolean
173155
required:
@@ -1199,7 +1181,7 @@ spec:
11991181
format: int32
12001182
type: integer
12011183
x-kubernetes-validations:
1202-
- message: servers must be gather than 0
1184+
- message: servers must be greater than 0
12031185
rule: self > 0
12041186
tolerations:
12051187
description: Tolerations allow pods to schedule onto nodes with matching taints.
@@ -1314,12 +1296,12 @@ spec:
13141296
- servers
13151297
type: object
13161298
x-kubernetes-validations:
1317-
- messageExpression: '"pool " + self.name + " with 2 servers must have at least 4 volumes in total"'
1299+
- messageExpression: '"pool " + self.name + " must have at least 4 total volumes (servers × volumesPerServer)"'
13181300
reason: FieldValueInvalid
1319-
rule: '!(self.servers * self.persistence.volumesPerServer < 4 && self.servers == 2)'
1301+
rule: self.servers * self.persistence.volumesPerServer >= 4
13201302
- messageExpression: '"pool " + self.name + " with 3 servers must have at least 6 volumes in total"'
13211303
reason: FieldValueInvalid
1322-
rule: '!(self.servers * self.persistence.volumesPerServer < 4 && self.servers == 3)'
1304+
rule: self.servers != 3 || self.servers * self.persistence.volumesPerServer >= 6
13231305
type: array
13241306
x-kubernetes-validations:
13251307
- message: pools must be configured
@@ -1330,6 +1312,32 @@ spec:
13301312
scheduler:
13311313
nullable: true
13321314
type: string
1315+
securityContext:
1316+
description: |-
1317+
Override the default Pod SecurityContext (runAsUser/runAsGroup/fsGroup = 10001).
1318+
Applies to all RustFS pods in this Tenant.
1319+
nullable: true
1320+
properties:
1321+
fsGroup:
1322+
description: GID applied to all volumes mounted in the Pod.
1323+
format: int64
1324+
nullable: true
1325+
type: integer
1326+
runAsGroup:
1327+
description: GID to run the container process as.
1328+
format: int64
1329+
nullable: true
1330+
type: integer
1331+
runAsNonRoot:
1332+
description: 'Enforce non-root execution (default: true).'
1333+
nullable: true
1334+
type: boolean
1335+
runAsUser:
1336+
description: UID to run the container process as.
1337+
format: int64
1338+
nullable: true
1339+
type: integer
1340+
type: object
13331341
serviceAccountName:
13341342
nullable: true
13351343
type: string
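
The two new CEL rules on each pool in the CRD diff above (`servers * volumesPerServer >= 4`, and `servers != 3 || servers * volumesPerServer >= 6`) can be restated in Rust. This is a minimal sketch under stated assumptions: the commit adds a shared `validate_pool_total_volumes` function, but its signature and error type are not shown here, so `check_pool_total_volumes` below is a hypothetical stand-in that takes the two integers directly and returns a `String` error.

```rust
/// Hypothetical mirror of the two CEL rules on a pool.
/// `servers` and `volumes_per_server` use i32 to match the CRD's int32 fields.
pub fn check_pool_total_volumes(servers: i32, volumes_per_server: i32) -> Result<(), String> {
    // Widen to i64 so the product cannot overflow for any pair of i32 inputs.
    let total = i64::from(servers) * i64::from(volumes_per_server);

    // 3-server rule: servers != 3 || total >= 6.
    // Checked first so a 3 x 1 pool gets the more specific message.
    if servers == 3 && total < 6 {
        return Err(format!(
            "pool with 3 servers must have at least 6 volumes in total, got {}",
            total
        ));
    }

    // General rule: every pool needs at least 4 volumes in total.
    if total < 4 {
        return Err(format!(
            "pool must have at least 4 total volumes (servers x volumesPerServer), got {}",
            total
        ));
    }
    Ok(())
}
```

For example, a 2 × 2 pool and a 3 × 2 pool pass, while 2 × 1 and 3 × 1 are rejected. In the CRD both rules are evaluated independently, so a 3 × 1 pool would fail both; this sketch returns only the first failure.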
