
Commit 37d33d0

docs: align resource policy and status page docs

1 parent 27867c2 · commit 37d33d0
2 files changed: 23 additions & 18 deletions


AGENTS.md (16 additions & 13 deletions)

````diff
@@ -78,36 +78,39 @@ Expected .onion addresses are documented in `../www/onion.makeitwork.cloud/index
 
 ## Resource Management
 
-**Do NOT set resource limits** - only set requests. This is a single-node CRC cluster where limits cause problems:
+**Single-node CRC policy:** avoid container CPU/memory reservations by default.
 
-- **CPU limits** cause throttling even when the node has spare capacity
-- **Memory limits** prevent pods from using available memory and cause unnecessary OOMs
-- **Requests** are sufficient for scheduling and QoS classification
+- Prefer `resources: {}` or no `resources` block on app containers
+- Avoid both `requests` and `limits` unless a workload has a proven stability need
+- High requests on single-node CRC commonly trigger `Insufficient cpu/memory` scheduling failures
+- CPU limits cause throttling; memory limits can cause avoidable OOM kills
 
-When adding new workloads:
+When adding new workloads, default to no container requests/limits:
 ```yaml
-resources:
-  requests:
-    cpu: "100m"
-    memory: "128Mi"
-  # NO limits section
+containers:
+  - name: app
+    image: example/image:tag
+    resources: {}
 ```
 
-For operators installed via OLM (Subscription), limits are baked into the CSV and cannot be easily changed. For operators installed via kustomize remote refs, use JSON patches to remove limits:
+For operators installed via OLM (Subscription), tune through supported CR/Subscription fields where available (for example `spec.config.resources: {}` or operator-specific `*_resource_requirements: {}`). If the operator ignores these fields, accept operator defaults.
+
+For operators installed via kustomize remote refs, use JSON patches to remove the entire `resources` block:
 
 ```yaml
 patches:
   - patch: |
       - op: remove
-        path: /spec/template/spec/containers/0/resources/limits/cpu
+        path: /spec/template/spec/containers/0/resources
     target:
      kind: Deployment
      name: controller-manager
 ```
 
-Add KubeLinter ignore annotation when removing limits:
+If KubeLinter checks require explicit ignores for this cluster policy:
 ```yaml
 annotations:
+  ignore-check.kube-linter.io/unset-cpu-requirements: "No requests on single-node cluster"
   ignore-check.kube-linter.io/unset-memory-requirements: "No limits on single-node cluster"
 ```
 
````
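The kustomize patch in the AGENTS.md hunk relies on RFC 6902 `remove` semantics: removing the parent `resources` key drops requests and limits in one operation. A minimal Python sketch of that behavior (the Deployment dict and the `manager` container are hypothetical stand-ins, and the path walker skips RFC 6902 `~0`/`~1` escaping):

```python
def json_patch_remove(doc, path):
    """Apply a single RFC 6902 'remove' op to nested dicts/lists (no ~ escaping)."""
    parts = [p for p in path.split("/") if p]
    node = doc
    for part in parts[:-1]:
        # List segments are numeric indices, dict segments are keys.
        node = node[int(part)] if isinstance(node, list) else node[part]
    last = parts[-1]
    if isinstance(node, list):
        node.pop(int(last))
    else:
        del node[last]
    return doc

# Hypothetical operator Deployment with baked-in requests AND limits.
deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "manager",
         "resources": {"limits": {"cpu": "500m"},
                       "requests": {"cpu": "100m"}}},
    ]}}},
}

# One remove of the parent key clears requests and limits together.
json_patch_remove(deployment, "/spec/template/spec/containers/0/resources")
container = deployment["spec"]["template"]["spec"]["containers"][0]
print("resources" in container)  # → False
```

Removing the whole block (rather than `/resources/limits/cpu`, as the old patch did) also avoids patch failures when an upstream release renames or drops individual limit keys.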

README.md (7 additions & 5 deletions)

````diff
@@ -23,9 +23,9 @@ workloads/          # CRs and resources that depend on operator CRDs
 ├── ansible/          # AWX instance + GitHub SSO + Tor + TunnelBinding
 ├── arc/              # DinD runners + image registry + pull-through cache
 ├── argocd-proxy/     # Tor hidden service + TunnelBinding for ArgoCD
-├── grafana/          # Grafana instance + GitHub SSO + Tor + TunnelBinding
+├── grafana/          # Internal Grafana + public status Grafana + probes + TunnelBindings
 ├── makeitwork-proxy/ # Tor hidden service for makeitwork.cloud
-├── uptime-kuma/      # Uptime monitoring + Tor + TunnelBinding
+├── uptime-kuma/      # Legacy uptime stack (status host migrated to Grafana)
 └── warp/             # Cloudflare WARP connector for private network access
 ```
 
@@ -55,13 +55,15 @@ Operators must be installed before workloads to ensure CRDs exist.
 - **Cloudflare Tunnels**: External apps via cloudflare-operator with TunnelBindings per app
 - **Tor Hidden Services**: Centralized tor-controller with OnionService CRDs per workload
 - **Let's Encrypt Certs**: Wildcard `*.apps.makeitwork.cloud` via cert-manager DNS-01 (Cloudflare)
+- **Public Status Page**: `status.makeitwork.cloud` served by dedicated anonymous Grafana instance with blackbox probe metrics
 - **Pull-Through Cache**: Docker registry mirror for ARC runners to reduce rate limits
 - **App-of-Apps**: Each workload is a separate ArgoCD Application for independent sync
 
 ## Requirements
 
 - OpenShift GitOps operator
 - OpenShift cert-manager operator
+- CRC with monitoring enabled (`crc config set enable-cluster-monitoring true`)
 - `sops-age-keys` secret in `openshift-gitops` namespace (for SOPS decryption)
 
 ## CI/CD
@@ -75,11 +77,11 @@ The `ci-deployer` service account provides cluster-admin access for CI/CD workflows
 
 ## Resource Management
 
-This is a single-node CRC cluster. **Do not set resource limits** - only requests:
+This is a single-node CRC cluster. Prefer **no container requests/limits** unless there is a proven stability need:
 
+- High requests commonly trigger `Insufficient cpu/memory` and block scheduling
 - CPU limits cause throttling even with spare capacity
-- Memory limits cause unnecessary OOMs
-- Requests are sufficient for scheduling
+- Memory limits can cause avoidable OOM kills
 
 See `AGENTS.md` for detailed guidance on resource configuration.
````
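One side effect of the "no requests, no limits" policy described in both files is that pods land in the BestEffort QoS class. A simplified sketch of the Kubernetes classification rules for cpu/memory (an illustration of the rules, not the kubelet's actual implementation):

```python
# Simplified Kubernetes pod QoS classification (cpu/memory only):
# - BestEffort: no container sets any requests or limits
# - Guaranteed: every container sets cpu and memory limits, and any
#   requests that are set equal the limits
# - Burstable: everything else

def qos_class(containers):
    any_requests, any_limits = False, False
    guaranteed = True
    for c in containers:
        res = c.get("resources") or {}
        req = res.get("requests") or {}
        lim = res.get("limits") or {}
        for name in ("cpu", "memory"):
            if name in req:
                any_requests = True
            if name in lim:
                any_limits = True
            else:
                guaranteed = False  # missing a limit rules out Guaranteed
            if name in req and req[name] != lim.get(name):
                guaranteed = False  # request != limit rules out Guaranteed
    if not any_requests and not any_limits:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"

# The cluster policy above (`resources: {}`) yields BestEffort pods:
print(qos_class([{"name": "app", "resources": {}}]))  # → BestEffort
# A workload with a "proven stability need" that sets only requests
# becomes Burstable:
print(qos_class([{"name": "app",
                  "resources": {"requests": {"cpu": "100m"}}}]))  # → Burstable
```

On a single-node CRC cluster this trade-off is deliberate: BestEffort pods schedule without reserving capacity, at the cost of being first to be evicted under node pressure.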
