Merge branch 'main' into docs-2026-03-26-architecture-rbd

zdover23 · web-flow · commit 6b7abdfdd30e · 2026-03-27T01:34:51.000+10:00
diff --git a/docs/architecture/arbiter.md b/docs/architecture/arbiter.md
@@ -0,0 +1,198 @@
+---
+title: Arbiter
+---
+
+# Arbiter
+
+## About This Project
+
+The external-arbiter-operator (Arbiter) works with Rook-provisioned Ceph
+clusters to deploy external arbiters (monitors) that are not managed by Rook
+but that participate in consensus.
+
+The operator also monitors the remote cluster to verify its availability and
+ensure that the tenant has sufficient permissions to handle the deployment of
+Arbiter.
+
+## Requirements and Setup
+
+### Required Tools
+
+The following tools are required on your development machine:
+
+- `sed`
+- `openssl`
+- `make`
+- `git`
+- `golang`
+- `lima` (or another method to provision Kubernetes locally, such as Minikube)
+- `kubectl`
+- `docker` (or any compatible container engine, such as Podman)
+- `helm`
+
+The remaining dependencies are provisioned via Go tools, including the
+Kubebuilder toolset.
+
+## Quick Start
+
+This walkthrough shows how to prepare the environment, run the operator
+locally, and deploy an external monitor.
+
+### Clone and Setup
+
+```bash
+# Clone the Rook repository: https://github.com/rook/rook
+
+# Install the remaining dependencies
+make deps
+
+# Create a disk for the Ceph OSD
+limactl disk create osd --size=8G
+
+# Create VM instance
+limactl create --name=k8s ./contrib/vm.yaml
+
+# Start VM
+limactl start k8s
+
+# Use kubeconfig provided by VM
+export KUBECONFIG="${HOME}/.lima/k8s/copied-from-guest/kubeconfig.yaml"
+```
+
+### Install Prerequisites
+
+```bash
+# Install cert-manager
+kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.19.2/cert-manager.yaml
+
+# Install Rook operator
+kubectl apply -f ./rook/deploy/examples/crds.yaml
+kubectl apply -f ./rook/deploy/examples/common.yaml
+kubectl apply -f ./rook/deploy/examples/operator.yaml
+kubectl apply -f ./rook/deploy/examples/csi-operator.yaml
+
+# Create Ceph cluster
+kubectl apply -f ./rook/deploy/examples/cluster-test.yaml
+
+# (Optional) Install Ceph toolbox
+kubectl apply -f ./rook/deploy/examples/toolbox.yaml
+```
+
+### Build and Install Operator
+
+```bash
+# Build image
+limactl shell k8s sudo nerdctl --namespace k8s.io build \
+  -t localhost:5000/cobaltcore-dev/external-arbiter-operator:latest \
+  -f ./Dockerfile .
+
+# Dry run operator install via Helm
+helm install --dry-run --create-namespace --namespace arbiter-operator \
+  --values ./contrib/charts/external-arbiter-operator/local.yaml \
+  arbiter-operator ./contrib/charts/external-arbiter-operator
+
+# Install operator via Helm chart
+helm install --create-namespace --namespace arbiter-operator \
+  --values ./contrib/charts/external-arbiter-operator/local.yaml \
+  arbiter-operator ./contrib/charts/external-arbiter-operator
+```
+
+### Configure and Deploy Arbiter
+
+```bash
+# Create namespace, user, role, rolebinding, kubeconfig and secret for arbiter
+./hack/configure-k8s-user.sh
+
+# Create secret with remote cluster access configuration
+kubectl apply -f ./contrib/k8s/examples/secret.yaml -n arbiter-operator
+
+# Create remote cluster resource
+kubectl apply -f ./contrib/k8s/examples/remote-cluster.yaml -n arbiter-operator
+
+# Create remote arbiter resource
+kubectl apply -f ./contrib/k8s/examples/remote-arbiter.yaml -n arbiter-operator
+
+# Watch until Arbiter is ready
+kubectl get remotearbiter -n arbiter-operator -w
+
+# Check that Arbiter has joined quorum
+kubectl exec deployment/rook-ceph-tools -n rook-ceph -it -- ceph mon dump
+```
+
+### Cleanup
+
+```bash
+# Remove Helm chart
+helm uninstall --namespace arbiter-operator arbiter-operator
+
+# Stop VM
+limactl stop k8s
+
+# Delete VM
+limactl delete k8s
+```
+
+## Make Targets
+
+Useful `make` targets for development:
+
+```bash
+# Build binary
+make
+
+# Prettify project, run linters, etc.
+make pretty
+
+# Run tests
+make test
+
+# Regenerate Kubernetes resources
+make gen
+
+# Copy CRD definitions to Helm chart
+make helm
+```
+
+## Configuration
+
+### Deployment Configuration
+
+Deployment manifests are managed by Helm. The `values.yaml` file lists all
+available configuration options.
+
+### Resource Configuration
+
+The following example resources are provided:
+
+- `secret.yaml` - Kubeconfig secret for arbiter installation
+- `remote-cluster.yaml` - RemoteCluster resource definition
+- `remote-arbiter.yaml` - RemoteArbiter resource definition
+
+## How to Run
+
+### Prerequisites
+
+Before running the operator, ensure the following conditions are met:
+
+1. A Ceph cluster operated by Rook is already up and running on the source
+   Kubernetes cluster.
+1. Resources (pods, services) from the target (arbiter) cluster are reachable
+   from the source (operator/Rook) cluster and vice versa.
+
+### Deployment Steps
+
+1. Create a user on the target cluster.
+1. Create the target namespace on the target cluster.
+1. Grant the user permissions to manage deployments, secrets, configmaps, their
+   statuses, and finalizers.
+1. Provision the target user kubeconfig on the source cluster via secret.
+1. Deploy the operator on the source cluster.
+1. Create a RemoteCluster resource on the source cluster, referencing the target
+   user kubeconfig secret.
+1. Create a RemoteArbiter resource on the source cluster, referencing the
+   RemoteCluster.
+1. Watch until resources are ready.
+1. Verify that the arbiter has joined the quorum by running `ceph mon dump`.
+
+## See Also
+[Arbiter Repository](https://github.com/cobaltcore-dev/external-arbiter-operator?tab=readme-ov-file)
diff --git a/docs/architecture/ceph.md b/docs/architecture/ceph.md
@@ -135,6 +135,134 @@ capable of addressing diverse organizational storage requirements through a
 single infrastructure platform. This convergence of capabilities, combined with
 proven integration with major virtualization and cloud platforms, establishes
 Ceph block devices as a viable solution for modern data center storage needs.
+
+### The Ceph Storage Cluster
+
+At its core, Ceph provides a highly scalable storage cluster based on
+RADOS (Reliable Autonomic Distributed Object Store), a distributed storage
+service that uses the intelligence in each node to secure data and provide it
+to clients. A Ceph Storage Cluster consists of four daemon types: Ceph
+Monitors, which maintain the master copy of the cluster map; Ceph OSD Daemons,
+which store data, check their own state and that of other OSDs, and report
+back to the monitors; Ceph Managers, which serve as endpoints for monitoring
+and orchestration; and Ceph Metadata Servers (MDS), which manage file metadata
+when CephFS provides file services.
+
+Storage cluster clients and Ceph OSD Daemons use the CRUSH (Controlled
+Replication Under Scalable Hashing) algorithm to compute data location
+information, which avoids the bottleneck of a central lookup table. This
+algorithmic approach underpins Ceph's high-level features, including a native
+interface to the storage cluster via librados and the service interfaces
+built on top of it.
+
+### Data Storage and Organization
+
+The Ceph Storage Cluster receives data from clients through various
+interfaces (Ceph Block Device, Ceph Object Storage, CephFS, or a custom
+implementation using librados) and stores it as RADOS objects. Each object
+resides on an Object Storage Device (OSD), with Ceph OSD Daemons controlling
+read, write, and replication operations. The default BlueStore backend stores
+objects in a monolithic, database-like fashion within a flat namespace, meaning
+objects have no hierarchical directory structure. Each object has an
+identifier, binary data, and metadata in the form of name/value pairs; the
+semantics of the object data are left entirely to the client.
+
+### Eliminating Centralization
+
+Traditional architectures rely on centralized components (gateways, brokers, or
+APIs) that act as a single point of entry, which creates a single point of
+failure and caps performance. Ceph eliminates these centralized components, so
+clients interact directly with Ceph OSDs. OSDs create object replicas on
+other nodes to ensure data safety and high availability, and a cluster of
+monitors keeps the cluster map itself highly available. The CRUSH algorithm
+replaces centralized lookup tables: it distributes the placement work across
+all OSD daemons and the clients that communicate with them, and it uses
+intelligent data replication to provide the resiliency needed for hyper-scale
+storage.
+
+### Cluster Map and High Availability
+
+For proper functioning, Ceph clients and OSDs require current information
+about the cluster topology. This information is stored in the Cluster Map,
+which is actually a collection of five maps:
+
+- Monitor Map: the cluster fsid and the position, name, address, and port of
+  each monitor.
+- OSD Map: the cluster fsid, the list of pools, replica sizes, PG numbers, and
+  the status of each OSD.
+- PG Map: PG versions, timestamps, and the details of each placement group.
+- CRUSH Map: the storage devices, the failure domain hierarchy, and the rules
+  for traversing that hierarchy when storing data.
+- MDS Map: the MDS map epoch, the metadata storage pool, and the list of
+  metadata servers.
+
+Each map keeps a history of its operational state changes. Ceph Monitors
+maintain the master copy of the cluster map, which covers the cluster members,
+their states, changes, and the overall health of the cluster.
+
+Ceph uses monitor clusters for reliability and fault tolerance. To establish
+consensus about cluster state, Ceph employs the Paxos algorithm, requiring a
+majority of monitors to agree (one in single-monitor clusters, two in
+three-monitor clusters, three in five-monitor clusters, and so forth). This
+prevents issues when monitors fall behind due to latency or faults.
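+
+The majority rule above can be written as a short formula. This is a sketch
+of the arithmetic only, not of the Paxos protocol itself. The symbols `n`
+(number of monitors), `q` (quorum size), and `f` (tolerated monitor failures)
+are introduced here for illustration.
+
+```latex
+q = \left\lfloor \frac{n}{2} \right\rfloor + 1,
+\qquad
+f = n - q
+
+% n = 1: q = 1, f = 0
+% n = 3: q = 2, f = 1
+% n = 4: q = 3, f = 1
+% n = 5: q = 3, f = 2
+```
+
+Under this formula a four-monitor cluster tolerates no more failures than a
+three-monitor cluster, which is one reason an odd number of monitors is the
+usual choice.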
+
+### Authentication and Security
+
+The cephx authentication system authenticates users and daemons and protects
+against man-in-the-middle attacks, though it does not address transport
+encryption or encryption at rest. cephx uses shared secret keys, so that the
+client and the monitor can each prove to the other that they hold a copy of
+the key without revealing it. The protocol operates in a manner similar to
+Kerberos, but every monitor can authenticate users and distribute keys, so
+there is no single point of failure. A monitor issues a session key encrypted
+with the user's permanent secret key, and the client uses that session key to
+request services. The monitor then provides a ticket that authenticates the
+client against the OSDs that handle the data. Because monitors and OSDs share
+a secret, the ticket is valid for any OSD or metadata server in the cluster.
+Tickets expire, so an attacker cannot reuse an expired ticket or session key
+obtained surreptitiously. This prevents attackers from forging or altering
+messages, as long as the user's secret key is not divulged before it expires.
+
+### Smart Daemons and Hyperscale
+
+Ceph's architecture makes OSD Daemons and clients cluster-aware, unlike
+centralized storage clusters requiring double dispatches that bottleneck at
+petabyte-to-exabyte scale. Each Ceph OSD Daemon knows other OSDs in the
+cluster, enabling direct interaction with other OSDs and monitors. This
+awareness allows clients to interact directly with OSDs, and because monitors
+and OSD daemons interact directly, OSDs leverage aggregate cluster CPU and RAM
+resources.
+
+This distributed intelligence provides several benefits:
+
+- OSDs service clients directly, which improves performance by avoiding the
+  connection limits of a centralized interface.
+- OSDs report their membership and status (up or down), and neighboring OSDs
+  detect and report failures.
+- Data scrubbing maintains consistency by comparing object metadata across
+  replicas. Deep scrubbing compares the data bit-for-bit against checksums to
+  find bad drive sectors.
+- Replication is a collaboration between clients and OSDs. Clients use CRUSH
+  to determine object locations, mapping objects to pools and placement
+  groups, and then write to the primary OSD, which replicates to the secondary
+  OSDs.
+
+### Dynamic Cluster Management
+
+Pools are logical partitions for storing objects, with clients retrieving
+cluster maps from monitors and writing RADOS objects to pools. CRUSH
+dynamically maps placement groups (PGs) to OSDs, with clients storing objects
+by having CRUSH map each RADOS object to a PG. This abstraction layer between
+OSDs and clients enables adaptive cluster growth, shrinkage, and data
+redistribution when topology changes. The indirection allows dynamic
+rebalancing when new OSDs come online.
+
+Clients compute object locations rather than looking them up; the only inputs
+are the object ID and the pool name. Ceph hashes the object ID, takes the hash
+modulo the number of PGs in the pool to get the PG ID, looks up the pool ID
+from the pool name, and prepends the pool ID to the PG ID. This computation is
+faster than a query session, and CRUSH then lets the client determine where
+the object is expected to be stored and contact the primary OSD to store or
+retrieve it.
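+
+The computation above can be summarized as follows. This is a simplified
+sketch: `hash`, `oid`, `pg_num`, and `pool_id` are illustrative names, and
+the numbers in the example are made up rather than taken from a real cluster.
+
+```latex
+\mathrm{pg} = \mathrm{hash}(\mathrm{oid}) \bmod \mathrm{pg\_num},
+\qquad
+\mathrm{pgid} = \mathrm{pool\_id}\,.\,\mathrm{pg}
+
+% Hypothetical example: pool_id = 4, pg_num = 128, hash(oid) = 1082
+% pg   = 1082 mod 128 = 58
+% pgid = 4.58
+```
+
+CRUSH then takes the PG ID and the current cluster map as input and returns
+the ordered set of OSDs for that placement group, the first of which is the
+primary.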
+
+### Client Interfaces
+
+Ceph provides three types of clients: Ceph Block Device (RBD) offers resizable,
+thin-provisioned, snapshottable block devices that are striped across the
+cluster for high performance; Ceph Object Storage (RGW) provides RESTful APIs
+compatible with Amazon S3 and OpenStack Swift; and CephFS provides a
+POSIX-compliant filesystem that can be mounted through the kernel client or
+FUSE. Applications that need direct access can use librados, which provides
+parallel access to the cluster and supports pool operations, snapshots,
+copy-on-write cloning, object read/write operations, extended attributes,
+key/value pairs, and object classes.
+
+Taken together, Ceph's distributed design removes the central bottlenecks of
+traditional storage systems. Algorithmic data placement, autonomous daemon
+operations, and direct client-to-OSD interactions let the cluster scale out
+while maintaining reliability and performance.
 
 ## See Also
 The architecture of the Ceph cluster is explained in [the Architecture
diff --git a/docs/architecture/chorus.md b/docs/architecture/chorus.md
@@ -0,0 +1,22 @@
+---
+title: Chorus
+---
+
+# Chorus
+
+Chorus is data replication software designed for Object Storage systems,
+supporting S3 and OpenStack Swift APIs. It enables zero-downtime migration
+between storage systems, maintains synchronized backups for disaster recovery,
+and verifies migration integrity through consistency checks.
+
+Chorus operates through two main components: Chorus Proxy, an S3 proxy that
+captures changes, and Chorus Worker, which processes replication tasks and
+webhook events. Users configure storage credentials, designating one endpoint
+as "main" while the others become "followers." Requests are routed through the
+Chorus S3 API to the main storage and replicated asynchronously to the
+follower endpoints.
+
+The system supports user-level and bucket-level replication policies, allowing
+users to pause and resume replication via web admin UI or CLI. Chorus handles
+initial replication of existing data in the background and can accept change
+events via webhooks when proxy deployment isn't feasible, supporting S3 bucket
+notifications and Swift access-log events.