Commit 38ab064

committed
feat: add best-practices
1 parent 05a3c3e commit 38ab064

8 files changed

Lines changed: 323 additions & 14 deletions

content/en/_index.md

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,13 +13,14 @@ title: Project Capsule
1313
Demo <i class="fas fa-arrow-alt-circle-right ms-2"></i>
1414
</a>
1515
</div>
16+
1617
{{< blocks/link-down color="info" >}}
1718
{{< /blocks/cover >}}
1819

19-
20-
<a href="/adopters">
2120
{{< blocks/section color="white" type="row" >}}
2221

22+
## Developed in 🇮🇹 / 🇨🇭 / 🇧🇬 { class="text-center mb-4" }
23+
2324
{{< adopters-slider >}}
2425

2526
{{< /blocks/section >}}
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
---
2+
title: Best Practices
3+
weight: 2
4+
description: Best Practices when running Capsule in production
5+
---

content/en/docs/overview/architecture.md renamed to content/en/docs/operating/best-practices/architecture.md

Lines changed: 41 additions & 3 deletions
@@ -1,6 +1,6 @@
11
---
22
title: Architecture
3-
weight: 10
3+
weight: 1
44
description: Architecture references and considerations
55
---
66

@@ -14,6 +14,43 @@ In Capsule, we introduce a new persona called the `Tenant Owner`. The goal is to
1414

1515
Capsule provides robust tools to strictly enforce tenant boundaries, ensuring that each tenant operates within its defined limits. This separation of duties promotes both security and efficient resource management.
1616

17+
### Key Decisions
18+
19+
Introducing a new separation of duties can lead to a significant paradigm shift. This has technical implications and may also impact your organizational structure. Therefore, when designing a multi-tenant platform pattern, carefully consider the following aspects. As a **Cluster Administrator**, ask yourself:
20+
21+
* 🔑 **How much ownership can be delegated to Tenant Owners (Platform Users)?**
22+
23+
The answer to this question may be influenced by the following aspects:
24+
25+
* **Are the Cluster Administrators willing to grant permissions to Tenant Owners?**
26+
* _If not, you might have a know-how problem, and your organisation is probably not yet pushing Kubernetes itself enough as a key strategic platform. The key here is enabling Platform Users through good UX and know-how transfer._
27+
28+
* **Who is responsible for the deployed workloads within the Tenants?**
29+
* _If Platform Administrators are still handling this, a true “shift left” has not yet been achieved._
30+
31+
* **Who gets paged during a production outage within a Tenant’s application?**
32+
* _You’ll need robust monitoring that enables Tenant Owners to clearly understand and manage what’s happening inside their own tenant._
33+
34+
* **Are your customers technically capable of working directly with the Kubernetes API?**
35+
* _If not, you may need to build a more user-friendly platform with better UX — for example, a multi-tenant ArgoCD setup, or UI layers like Headlamp._
36+
37+
38+
## Layouts
39+
40+
Let's discuss different Tenant Layouts that could be used. These are just approaches we have seen; you might also find that a combination of them fits your use case.
41+
42+
### Tenant As A Service
43+
44+
With this approach, you essentially just provide your Customers with a Tenant on your cluster. The rest is their responsibility. This results in a shared responsibility model, which works when the Tenant Owners are also responsible for everything they provision within their Tenant's namespaces.
45+
46+
![Tenant As A Service Layout](/images/content/architecture/layout-taas.drawio.png)
47+
48+
49+
| Platform Administrators | Platform Users (Tenant Users) |
50+
| :---------------------- | :----------------------------------------- |
51+
|||
52+
53+
1754
## Scheduling
1855

1956
Workload distribution across your compute infrastructure can be approached in various ways, depending on your specific priorities. Regardless of the use case, it's essential to preserve maximum flexibility for your platform administrators. This means ensuring that:
@@ -32,7 +69,7 @@ Strong tenant isolation, ensuring that any noisy neighbor effects remain confine
3269

3370
### Shared
3471

35-
With this approach you share the nodes amongst all Tenants, therefor giving you more potential for optimizing resources on a node level. It's a common pattern to separate the controllers needed to power your distro (operators) form the actual workload. This ensures smooth operations for the clust
72+
With this approach you share the nodes amongst all Tenants, therefore giving you more potential for optimizing resources on a node level. It's a common pattern to separate the controllers needed to power your Distribution (operators) from the actual workload. This ensures smooth operations for the cluster
3673

3774
**Overview**:
3875

@@ -43,7 +80,8 @@ With this approach you share the nodes amongst all Tenants, therefor giving you
4380

4481
![Shared Nodepool](/images/content/scheduling-shared.drawio.png)
4582

46-
There's some further aspects you must think about with shared approaches:
83+
84+
We provide the concept of [ResourcePools](/docs/resourcepools/) to manage resources across namespaces. There are some further aspects you must think about with shared approaches:
4785

4886
* [PriorityClasses](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/)
4987
* [ResourceQuotas](https://kubernetes.io/docs/concepts/policy/resource-quotas/)

content/en/docs/operating/control-pod-security.md renamed to content/en/docs/operating/best-practices/control-pod-security.md

Lines changed: 93 additions & 9 deletions
@@ -1,6 +1,6 @@
11
---
2-
title: Pod Security
3-
weight: 10
2+
title: Pod Security Standards
3+
weight: 3
44
description: Control the security of the pods running in the tenant namespaces
55
---
66

@@ -62,9 +62,10 @@ metadata:
6262
```
6363
6464
### Capsule
65+
6566
According to the regular Kubernetes segregation model, the cluster admin has to operate either at cluster level or at namespace level. Since Capsule introduces a further segregation level (the _Tenant_ abstraction), the cluster admin can implement Pod Security Standards at tenant level by simply forcing specific labels on all the namespaces created in the tenant.
6667
67-
As cluster admin, create a tenant with additional labels:
68+
You can distribute these profiles per namespace. Here's what this could look like:
6869
6970
```yaml
7071
apiVersion: capsule.clastix.io/v1beta2
@@ -73,14 +74,26 @@ metadata:
7374
name: solar
7475
spec:
7576
namespaceOptions:
76-
additionalMetadata:
77+
additionalMetadataList:
78+
- namespaceSelector:
79+
matchExpressions:
80+
- key: projectcapsule.dev/low_security_profile
81+
operator: NotIn
82+
values: ["system"]
7783
labels:
78-
pod-security.kubernetes.io/enforce: baseline
79-
pod-security.kubernetes.io/audit: restricted
84+
pod-security.kubernetes.io/enforce: restricted
8085
pod-security.kubernetes.io/warn: restricted
81-
owners:
82-
- kind: User
83-
name: alice
86+
pod-security.kubernetes.io/audit: restricted
87+
- namespaceSelector:
88+
matchExpressions:
89+
- key: company.com/env
90+
operator: In
91+
values: ["system"]
92+
labels:
93+
pod-security.kubernetes.io/enforce: privileged
94+
pod-security.kubernetes.io/warn: privileged
95+
pod-security.kubernetes.io/audit: privileged
96+
8497
```
8598

8699
All namespaces created by the tenant owner will inherit the Pod Security labels:
@@ -152,7 +165,78 @@ kubectl --kubeconfig alice-solar.kubeconfig label ns solar-production \
152165
Error from server (Label pod-security.kubernetes.io/audit is forbidden for namespaces in the current Tenant ...
153166
```
154167

168+
## User Namespaces
169+
170+
{{% alert title="Info" color="info" %}}
171+
The FeatureGate `UserNamespacesSupport` is active by default since [Kubernetes 1.33](https://kubernetes.io/blog/2025/04/25/userns-enabled-by-default/). However, every pod must still [opt in](#admission).
172+
173+
If you also enable the FeatureGate `UserNamespacesPodSecurityStandards`, you may relax the Pod Security Standards for your workloads. [Read More](https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/#integration-with-pod-security-admission-checks)
174+
{{% /alert %}}
175+
176+
A process running as root in a container can run as a different (non-root) user in the host; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace. [Read More](https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/)
177+
178+
### Kubelet
179+
180+
On your Kubelet, you must enable the following [FeatureGates](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/):
181+
182+
* `UserNamespacesSupport`
183+
* `UserNamespacesPodSecurityStandards` (Optional)
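
As a minimal sketch, the gates listed above could be set in the kubelet configuration file. How that file is managed (node image, bootstrap tooling, or a managed offering) is an assumption left to your distribution, and the fragment should be merged into your existing configuration rather than replace it:

```yaml
# Sketch only: merge into the existing KubeletConfiguration of each node.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Active by default since Kubernetes 1.33; set explicitly on older versions.
  UserNamespacesSupport: true
  # Optional: only needed if you want to relax Pod Security Standards
  # for pods that run in their own user namespace.
  UserNamespacesPodSecurityStandards: true
```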
184+
185+
### Sysctls
186+
187+
```yaml
188+
user.max_user_namespaces: "11255"
189+
```
190+
191+
### Admission (Kyverno)
192+
193+
To make sure all workloads are forced to use dedicated User Namespaces, we recommend mutating pods at admission. See the following example.
194+
195+
#### Kyverno
196+
197+
```yaml
198+
apiVersion: kyverno.io/v1
199+
kind: ClusterPolicy
200+
metadata:
201+
name: add-hostusers-spec
202+
annotations:
203+
policies.kyverno.io/title: Add HostUsers
204+
policies.kyverno.io/category: Security
205+
policies.kyverno.io/subject: Pod,User Namespace
206+
kyverno.io/kubernetes-version: "1.31"
207+
policies.kyverno.io/description: >-
208+
Do not use the host's user namespace. A new userns is created for the pod.
209+
Setting false is useful for mitigating container breakout vulnerabilities even allowing users to run their containers as root
210+
without actually having root privileges on the host. This field is
211+
alpha-level and is only honored by servers that enable the
212+
UserNamespacesSupport feature.
213+
spec:
214+
rules:
215+
- name: add-host-users
216+
match:
217+
any:
218+
- resources:
219+
kinds:
220+
- Pod
221+
namespaceSelector:
222+
matchExpressions:
223+
- key: capsule.clastix.io/tenant
224+
operator: Exists
225+
preconditions:
226+
all:
227+
- key: "{{request.operation || 'BACKGROUND'}}"
228+
operator: AnyIn
229+
value:
230+
- CREATE
231+
- UPDATE
232+
mutate:
233+
patchStrategicMerge:
234+
spec:
235+
hostUsers: false
236+
```
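
For illustration, this is roughly what a pod looks like once it has opted in, whether the field was set by the author or added by a mutation such as the policy above. The pod name and image are hypothetical placeholders; `solar-production` is the tenant namespace used in the earlier examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo            # hypothetical name
  namespace: solar-production  # tenant namespace from the examples above
spec:
  # Run the pod in its own user namespace instead of the host's.
  hostUsers: false
  containers:
    - name: app
      image: registry.example.com/solar/app:1.0  # placeholder image
```

You can check the result on a running pod with `kubectl get pod userns-demo -o jsonpath='{.spec.hostUsers}'`.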
237+
155238
## Pod Security Policies
239+
156240
As stated in the documentation, *"PodSecurityPolicies enable fine-grained authorization of pod creation and updates. A Pod Security Policy is a cluster-level resource that controls security sensitive aspects of the pod specification. The `PodSecurityPolicy` objects define a set of conditions that a pod must run with in order to be accepted into the system, as well as defaults for the related fields."*
157241

158242
Using the [Pod Security Policies](https://kubernetes.io/docs/concepts/security/pod-security-policy), the cluster admin can impose limits on pod creation, for example the types of volume that can be consumed, the linux user that the process runs as in order to avoid running things as root, and more. From multi-tenancy point of view, the cluster admin has to control how users run pods in their tenants with a different level of permission on tenant basis.
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
---
2+
title: General Advice
3+
weight: 2
4+
description: General advice to consider when planning your Kubernetes Distribution
5+
---
6+
7+
This is general advice to consider when planning your Kubernetes Distribution. It is partly relevant for Multi-Tenancy with Capsule.
8+
9+
### Authentication
10+
11+
User authentication for the platform should be handled via a central OIDC-compatible identity provider system (e.g., Keycloak, Azure AD, Okta, or any other OIDC-compliant provider).
12+
The rationale is that other central platform components — such as ArgoCD, Grafana, Headlamp, or Harbor — should also integrate with the same authentication mechanism. This enables a unified login experience and reduces administrative complexity in managing users and permissions.
13+
14+
[Capsule relies on native Kubernetes RBAC](/docs/operating/authentication/), so it's important to consider how the Kubernetes API handles user authentication.
15+
16+
### OCI Pull-Cache
17+
18+
By default, Kubernetes clusters pull images directly from upstream registries like `docker.io`, `quay.io`, `ghcr.io`, or `gcr.io`. In production environments, this can lead to issues — especially because Docker Hub enforces rate limits that may cause image pull failures with just a few nodes or frequent deployments (e.g., when pods are rescheduled).
19+
20+
To ensure availability, performance, and control over container images, it's essential to provide an on-premise OCI mirror.
21+
This mirror should be configured at the container runtime (CRI) level by defining it as a mirror endpoint in `registries.conf` for default registries (e.g., `docker.io`).
22+
This way, all nodes automatically benefit from caching without requiring developers to change image URLs.
23+
24+
### Secrets Management
25+
26+
In more complex environments with multiple clusters and applications, managing secrets manually via YAML or Helm is no longer practical.
27+
Instead, a centralized secrets management system should be established, such as Vault, AWS Secrets Manager, Azure Key Vault, or the open-source project [OpenBao](https://openbao.org/) (a community-driven fork of Vault).
28+
29+
To integrate these external secret stores with Kubernetes, the [External Secrets Operator (ESO)](https://external-secrets.io/latest/) is a recommended solution. It automatically syncs defined secrets from external sources as Kubernetes secrets, and supports dynamic rotation, access control, and auditing.
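
A minimal sketch of what such a sync could look like in a tenant namespace. The store name, secret names, and remote key are hypothetical, a matching `SecretStore` is assumed to exist already, and the `apiVersion` depends on the ESO release you run (newer releases serve `external-secrets.io/v1`):

```yaml
apiVersion: external-secrets.io/v1beta1  # assumption: adjust to your ESO release
kind: ExternalSecret
metadata:
  name: db-credentials           # hypothetical name
  namespace: solar-production    # tenant namespace from the Capsule examples
spec:
  refreshInterval: 1h            # how often the external value is re-read
  secretStoreRef:
    kind: SecretStore
    name: tenant-store           # hypothetical store, assumed to exist
  target:
    name: db-credentials         # Kubernetes Secret created and kept in sync
  data:
    - secretKey: password        # key inside the Kubernetes Secret
      remoteRef:
        key: solar/database      # hypothetical path in the external store
        property: password
```

Using a namespaced `SecretStore` rather than a cluster-wide one keeps the store credentials scoped to the tenant's own namespace.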
30+
31+
If no external secret store is available, there should at least be a secure way to store sensitive data in Git.
32+
In our ecosystem, we provide a solution based on SOPS (Secrets OPerationS) for this use case.
33+
34+
[👉 Demonstration](https://killercoda.com/peakscale/course/playgrounds/sops-secrets)
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
---
2+
title: Container Images
3+
weight: 5
4+
description: Multi-Tenant Container Images considerations
5+
---
6+
7+
> [Until this issue is resolved (might be in Kubernetes 1.34)](https://github.com/kubernetes/enhancements/issues/2535)
8+
9+
it's recommended to use the [ImagePullPolicy](https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy) `Always` for private registries on shared nodes. This ensures that a pod cannot start from a private image that is already cached on the node without presenting valid pull credentials for it.
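
As a sketch, a workload pulling from a private registry would set the policy per container. The registry, image, and secret names below are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: private-app              # hypothetical name
  namespace: solar-production
spec:
  imagePullSecrets:
    - name: registry-credentials # hypothetical pull secret in the namespace
  containers:
    - name: app
      image: registry.example.com/solar/app:1.0  # placeholder private image
      # Always contact the registry, so the pull credentials are verified
      # even when the image layers are already present on the node.
      imagePullPolicy: Always
```

With `Always`, layers that are already cached are not downloaded again; the kubelet still resolves the image against the registry, which is where the credentials check happens.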
