Commit 994f92f

Add envoy gateway docs
1 parent 8f34014 commit 994f92f

2 files changed

Lines changed: 310 additions & 7 deletions

README.md

Lines changed: 11 additions & 7 deletions
@@ -1,6 +1,6 @@
11
# Theia Deployment
22

3-
This repository manages automated deployments of [Theia Cloud](https://github.com/eclipse-theia/theia-cloud) to Kubernetes clusters using GitHub Actions. Theia Cloud provides browser-based development environments, allowing students and developers to work in containerized IDEs without local setup.
3+
This repository manages automated deployments of [EduIDE Cloud](https://github.com/EduIDE/EduIDE-Cloud) to Kubernetes clusters using GitHub Actions. EduIDE Cloud provides browser-based development environments, allowing students and developers to work in containerized IDEs without local setup.
44

55
## What is This Repository?
66

@@ -37,6 +37,7 @@ This repository serves as the infrastructure-as-code for deploying and managing
3737
3838
└── docs/ # Detailed documentation
3939
├── deployment-workflows.md # How deployments work
40+
├── envoy-gateway-setup.md # Envoy Gateway and shared Gateway API setup
4041
├── adding-environments.md # Adding new environments
4142
├── keycloak-setup.md # Authentication configuration
4243
├── tum-certificates.md # TUM-specific SSL certificate process
@@ -128,6 +129,7 @@ Configuration files for each environment are located in the [deployments/](deplo
128129
# If cert-manager will solve HTTP-01 challenges through Gateway API, enable:
129130
# --set config.enableGatewayAPI=true
130131
```
132+
See [Envoy Gateway Setup](docs/envoy-gateway-setup.md) for the full cluster bootstrap and shared Gateway configuration.
131133

132134
2. **Install Theia Cloud base charts**:
133135
```bash
@@ -158,8 +160,8 @@ Configuration files for each environment are located in the [deployments/](deplo
158160
```
159161

160162
Normal deployments consume released OCI charts from `ghcr.io/eduide/charts`.
161-
The `theia-cloud` dependency version in `charts/theia-cloud-combined/Chart.yaml` controls the main application chart, while `theia-cloud-base` and `theia-cloud-crds` are pinned separately in the workflow at `1.2.0-next.0` and `1.4.0-next.0`.
162-
For PR previews, you can set `helm_chart_tag` to a value like `pr-123` to pull preview OCI charts published from `theia-cloud-helm` pull requests as versions such as `<chart-version>.pr-123`.
163+
The `theia-cloud` dependency version in [`charts/theia-cloud-combined/Chart.yaml`](charts/theia-cloud-combined/Chart.yaml) controls the main application chart, while `theia-cloud-base` and `theia-cloud-crds` are pinned separately in the workflow at `1.2.0-next.0` and `1.4.0-next.0`.
164+
For PR previews, you can set `helm_chart_tag` to a value like `pr-123` to pull preview OCI charts published from [EduIDE-Helm](https://github.com/EduIDE/EduIDE-Helm) pull requests as versions such as `<chart-version>.pr-123`.
163165

164166
When using GitHub Actions, shared-gateway settings are passed as hardcoded inputs
165167
by the caller workflows (`deploy-pr.yml`, `deploy-staging.yml`, `deploy-production.yml`):
@@ -200,13 +202,14 @@ See [Deployment Workflows](docs/deployment-workflows.md#release-process-for-pinn
200202
- **Deploy a PR to test environment**: See [Deployment Workflows](docs/deployment-workflows.md#pull-request-deployments)
201203
- **Bump release image tags**: See [Deployment Workflows](docs/deployment-workflows.md#release-process-for-pinned-image-tags)
202204
- **Add a new environment**: See [Adding Environments](docs/adding-environments.md)
205+
- **Set up Envoy Gateway**: See [Envoy Gateway Setup](docs/envoy-gateway-setup.md)
203206
- **Configure Keycloak authentication**: See [Keycloak Setup](docs/keycloak-setup.md)
204207
- **Request TUM wildcard certificates**: See [TUM Certificates](docs/tum-certificates.md)
205208
- **Set up monitoring**: See [Monitoring Setup](docs/monitoring-setup.md)
206209

207210
## AppDefinitions
208211

209-
*AppDefinitions* define the IDE environments that users work in. Custom AppDefinitions are built in a three-stage pipeline at [artemis-theia-blueprints](https://github.com/ls1intum/artemis-theia-blueprints).
212+
*AppDefinitions* define the IDE environments that users work in. Custom AppDefinitions are built in a three-stage pipeline at [artemis-theia-blueprints](https://github.com/EduIDE/EduIDE).
210213

211214
To install or update AppDefinitions:
212215

@@ -234,16 +237,17 @@ The AppDefinitions chart configuration is documented in [charts/theia-appdefinit
234237
Detailed documentation is available in the [docs/](docs/) directory:
235238

236239
- [Deployment Workflows](docs/deployment-workflows.md) - How automated deployments work
240+
- [Envoy Gateway Setup](docs/envoy-gateway-setup.md) - How to bootstrap Envoy Gateway and the shared Gateway API entrypoint
237241
- [Adding Environments](docs/adding-environments.md) - Step-by-step guide to add new environments
238242
- [Keycloak Setup](docs/keycloak-setup.md) - Authentication and authorization configuration
239243
- [TUM Certificates](docs/tum-certificates.md) - TUM-specific SSL certificate process
240244
- [Monitoring Setup](docs/monitoring-setup.md) - Prometheus and Grafana installation
241245

242246
## Related Projects
243247

244-
- [Theia Cloud](https://github.com/eclipse-theia/theia-cloud) - Main Theia Cloud project
245-
- [Theia Cloud Helm Charts](https://github.com/eclipse-theia/theia-cloud-helm) - Official Helm charts
246-
- [Artemis Theia Blueprints](https://github.com/ls1intum/artemis-theia-blueprints) - Custom IDE images and configurations
248+
- [EduIDE Cloud](https://github.com/EduIDE/EduIDE-Cloud) - The Theia Cloud fork deployed by this repository
249+
- [EduIDE Helm](https://github.com/EduIDE/EduIDE-Helm) - Helm charts consumed by this repository
250+
- [Artemis Theia Blueprints](https://github.com/EduIDE/EduIDE) - Custom IDE images and configurations
247251
- [Theia Cloud Observability](https://github.com/eclipsesource/theia-cloud-observability) - Monitoring and observability
248252

249253
## Support

docs/envoy-gateway-setup.md

Lines changed: 299 additions & 0 deletions
@@ -0,0 +1,299 @@
1+
# Envoy Gateway Setup
2+
3+
This repository deploys Theia Cloud through Gateway API resources backed by Envoy Gateway. The setup is split across two repositories:
4+
5+
- [EduIDE-Helm](https://github.com/EduIDE/EduIDE-Helm) renders the Theia Cloud `HTTPRoute` resources and, for simple installations, can also render a namespace-local `Gateway`.
6+
- [theia-deployment](https://github.com/EduIDE/EduIDE-deployment) uses those charts internally and adds the [`theia-shared-gateway`](https://github.com/EduIDE/EduIDE-deployment/tree/main/charts/theia-shared-gateway) chart, which owns one cluster-level Gateway shared by multiple Theia namespaces.
7+
8+
For the Artemis/EduIDE deployments, the shared gateway model is the expected setup. Tenant releases should create only their namespace-local routes and workloads; the shared gateway release owns the edge Gateway, listener hostnames, GatewayClass customization, and TLS material.
9+
10+
## Architecture
11+
12+
The traffic path is:
13+
14+
1. DNS points Theia hostnames to the Envoy Gateway load balancer address.
15+
2. Envoy Gateway watches Gateway API resources and programs Envoy.
16+
3. The `theia-shared-gateway` release creates the shared `Gateway` in `gateway-system`.
17+
4. Each Theia tenant release creates `HTTPRoute` resources in its own namespace.
18+
5. Those `HTTPRoute` resources attach to the shared Gateway through `theia-cloud.gateway.parentRefs`.
19+
6. When a new IDE session is created, the Theia Cloud operator edits the instances `HTTPRoute` to add a rule for that session.
20+
21+
This replaces the older ingress-controller style setup with Gateway API. The main practical benefit is that route updates can be applied dynamically without making every tenant release own a separate edge gateway.
22+
23+
## Prerequisites
24+
25+
Before deploying Theia Cloud, the cluster needs:
26+
27+
- Gateway API CRDs
28+
- Envoy Gateway
29+
- cert-manager
30+
- cert-manager Gateway API support, if ACME HTTP-01 challenges should be solved through Gateway API
31+
- a load balancer implementation for the Envoy data plane, for example a cloud load balancer or MetalLB
32+
- DNS records for landing, service, instance, and webview hostnames
33+
- TLS certificate material, either managed through cert-manager or provided as a wildcard certificate secret
34+
35+
Use the official installation documentation for exact versions and compatibility:
36+
37+
- [Envoy Gateway Helm installation](https://gateway.envoyproxy.io/docs/install/install-helm/)
38+
- [cert-manager Gateway API HTTP-01 solver](https://cert-manager.io/docs/configuration/acme/http01/)
39+
40+
## Install Envoy Gateway
41+
42+
Install Gateway API CRDs and Envoy Gateway once per cluster. A typical Helm-based installation looks like this:
43+
44+
```bash
45+
helm upgrade --install eg oci://docker.io/envoyproxy/gateway-helm \
46+
--namespace envoy-gateway-system \
47+
--create-namespace
48+
```
49+
50+
If your cluster already has Gateway API CRDs managed separately, follow the Envoy Gateway documentation for the matching `--skip-crds` flow.
51+
52+
After installation, verify that the controller is running:
53+
54+
```bash
55+
kubectl get pods -n envoy-gateway-system
56+
kubectl get gatewayclasses
57+
kubectl get crd | grep -E 'gateway.networking.k8s.io|gateway.envoyproxy.io'
58+
```
59+
60+
The default GatewayClass name expected by the Theia charts is `envoy`. If you use another class name, set it consistently in both the shared gateway values and the tenant values:
61+
62+
```yaml
63+
gateway:
64+
className: envoy
65+
66+
theia-cloud:
67+
gateway:
68+
className: envoy
69+
```
70+
71+
## Install cert-manager With Gateway API Support
72+
73+
cert-manager is still responsible for certificate resources. If certificates are issued through Gateway API HTTP-01 challenges, install or upgrade cert-manager with Gateway API support enabled.
74+
75+
The exact values depend on the cert-manager version. For current cert-manager versions, use the file-based `config.enableGatewayAPI` value described in the cert-manager documentation.
76+
77+
Example shape:
78+
79+
```bash
80+
helm upgrade --install cert-manager oci://quay.io/jetstack/charts/cert-manager \
81+
--namespace cert-manager \
82+
--create-namespace \
83+
--set crds.enabled=true \
84+
--set config.enableGatewayAPI=true
85+
```
86+
87+
If cert-manager was already running before Gateway API CRDs were installed, restart cert-manager after enabling Gateway API support so it discovers the new resource types.
88+
89+
```bash
90+
kubectl rollout restart deployment/cert-manager -n cert-manager
91+
kubectl rollout restart deployment/cert-manager-webhook -n cert-manager
92+
kubectl rollout restart deployment/cert-manager-cainjector -n cert-manager
93+
```
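
For orientation, an ACME issuer that solves HTTP-01 challenges through Gateway API has roughly the shape below. This is a sketch using the standard cert-manager `gatewayHTTPRoute` solver, not the resource the shared gateway chart renders. The issuer name, email address, and account-key secret name are placeholders, and the `parentRefs` assume the shared Gateway name and namespace used elsewhere in this document.

```yaml
# Sketch of an ACME ClusterIssuer that solves HTTP-01 through Gateway API.
# Issuer name, email, and account-key secret name are hypothetical.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-gateway                    # hypothetical name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.org                   # placeholder
    privateKeySecretRef:
      name: letsencrypt-gateway-account-key    # hypothetical name
    solvers:
      - http01:
          gatewayHTTPRoute:
            # cert-manager attaches its temporary solver HTTPRoute here
            parentRefs:
              - name: theia-shared-gateway
                namespace: gateway-system
                kind: Gateway
```

The solver only works if cert-manager runs with Gateway API support enabled and the referenced Gateway exposes an HTTP listener that accepts the solver route.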
94+
95+
## Deploy the Shared Gateway
96+
97+
The shared Gateway is deployed from the [`theia-shared-gateway`](https://github.com/EduIDE/EduIDE-deployment/tree/main/charts/theia-shared-gateway) chart in this repository:
98+
99+
```bash
100+
helm upgrade --install theia-shared-gateway ./charts/theia-shared-gateway \
101+
--namespace gateway-system \
102+
--create-namespace \
103+
-f deployments/shared-gateway/values.yaml
104+
```
105+
106+
For the dedicated production cluster, use:
107+
108+
```bash
109+
helm upgrade --install theia-shared-gateway ./charts/theia-shared-gateway \
110+
--namespace gateway-system \
111+
--create-namespace \
112+
-f deployments/shared-gateway-prod/values.yaml
113+
```
114+
115+
The deployment workflow can also install this release automatically when the caller workflow passes:
116+
117+
```yaml
118+
with:
119+
deploy_shared_gateway: true
120+
shared_gateway_values_file: deployments/shared-gateway/values.yaml
121+
shared_gateway_namespace: gateway-system
122+
```
123+
124+
The workflow injects `THEIA_WILDCARD_CERTIFICATE_CERT` and `THEIA_WILDCARD_CERTIFICATE_KEY` into the shared gateway chart as `wildcardTLSSecret.certificate` and `wildcardTLSSecret.key`.
125+
126+
## Shared Gateway Values
127+
128+
The shared gateway chart can create:
129+
130+
- a `Gateway`
131+
- an optional `GatewayClass`
132+
- an optional Envoy Gateway `EnvoyProxy`
133+
- optional cert-manager `Certificate` resources
134+
- an optional Gateway API ACME `ClusterIssuer`
135+
- an optional static wildcard TLS secret
136+
137+
For shared test/staging clusters, [`deployments/shared-gateway/values.yaml`](https://github.com/EduIDE/EduIDE-deployment/blob/main/deployments/shared-gateway/values.yaml) mainly defines HTTPS listeners for all test and staging hostnames. It assumes the required TLS secrets already exist or are supplied through the workflow.
138+
139+
For production, [`deployments/shared-gateway-prod/values.yaml`](https://github.com/EduIDE/EduIDE-deployment/blob/main/deployments/shared-gateway-prod/values.yaml) additionally creates:
140+
141+
- a `GatewayClass` named `envoy`
142+
- an `EnvoyProxy` that customizes the Envoy data-plane service
143+
- a Gateway API ACME `ClusterIssuer`
144+
- cert-manager `Certificate` resources for concrete production hostnames
145+
- the static wildcard webview TLS secret from deployment secrets
146+
147+
The production `EnvoyProxy` currently contains MetalLB-specific annotations and a fixed load-balancer IP:
148+
149+
```yaml
150+
envoyProxy:
151+
spec:
152+
provider:
153+
type: Kubernetes
154+
kubernetes:
155+
envoyService:
156+
annotations:
157+
metallb.io/address-pool: ingress
158+
metallb.io/loadBalancerIPs: 131.159.88.82
159+
```
160+
161+
Adjust this section for a different cluster. On cloud providers, this may be replaced by provider-specific load balancer annotations or omitted entirely.
162+
163+
## Configure Tenant Theia Releases
164+
165+
Each tenant environment should attach its routes to the shared Gateway instead of creating a namespace-local Gateway:
166+
167+
```yaml
168+
theia-cloud:
169+
gateway:
170+
enabled: true
171+
create: false
172+
routes:
173+
enabled: true
174+
parentRefs:
175+
- name: theia-shared-gateway
176+
namespace: gateway-system
177+
sectionName: test1-landing
178+
- name: theia-shared-gateway
179+
namespace: gateway-system
180+
sectionName: test1-service
181+
- name: theia-shared-gateway
182+
namespace: gateway-system
183+
sectionName: test1-instances
184+
- name: theia-shared-gateway
185+
namespace: gateway-system
186+
sectionName: test1-webview
187+
```
188+
189+
The `sectionName` values must match listener names in the shared gateway values file. If a route references a listener name that does not exist, or if the listener hostname does not match the route hostname, the `HTTPRoute` will not attach.
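
To make the matching rule concrete, the listener targeted by the first `parentRef` above could look roughly like this once rendered as a Gateway API resource. This is a sketch built from standard Gateway API fields only. The hostname `test1.example.org` and the secret name `test1-landing-tls` are placeholders, and the shared gateway values file may express the same fields with a different layout.

```yaml
# Sketch of a rendered shared Gateway with one listener.
# Hostname and secret name are hypothetical placeholders.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: theia-shared-gateway
  namespace: gateway-system
spec:
  gatewayClassName: envoy
  listeners:
    - name: test1-landing            # must equal parentRefs[].sectionName
      hostname: test1.example.org    # must cover the route hostname
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          - name: test1-landing-tls  # Secret in gateway-system
      allowedRoutes:
        namespaces:
          from: All                  # lets tenant namespaces attach routes
```

A route in a tenant namespace attaches only if the listener name, the hostname, and the `allowedRoutes` policy all permit it.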
190+
191+
Disable tenant-local certificate resources when using the shared gateway:
192+
193+
```yaml
194+
theia-certificates:
195+
certificates:
196+
enabled: false
197+
wildcardTLSSecret:
198+
enabled: false
199+
adminApiTokenSecret:
200+
enabled: true
201+
```
202+
203+
Do not set `theia-cloud.gateway.instancesWildcardSecretNames` in tenant values when `gateway.create: false`. That map is only used when the Theia Cloud chart renders its own `Gateway`. In the shared-gateway setup, wildcard TLS secrets are owned by the shared gateway release in `gateway-system`.
204+
205+
## Add or Change Hostnames
206+
207+
For every environment, keep these values aligned:
208+
209+
- tenant `hosts.configuration.landing`
210+
- tenant `hosts.configuration.service`
211+
- tenant `hosts.configuration.instance`
212+
- tenant `hosts.allWildcardInstances`
213+
- tenant `theia-cloud.gateway.parentRefs[*].sectionName`
214+
- shared gateway `gateway.listeners[*].name`
215+
- shared gateway `gateway.listeners[*].hostname`
216+
- DNS records for the same hostnames
217+
- TLS certificate DNS names
218+
219+
Each environment usually needs listeners for:
220+
221+
- landing page hostname
222+
- service API hostname
223+
- session instance hostname
224+
- webview wildcard hostname
225+
226+
If cert-manager should issue concrete host certificates through HTTP-01, also add matching HTTP listeners on port `80` so cert-manager can attach solver routes to the Gateway.
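
A port `80` listener for the HTTP-01 solver could be sketched as follows, again using only standard Gateway API listener fields. The listener name and hostname are placeholders, and the actual key layout in the shared gateway values file is an assumption.

```yaml
# Sketch of an additional HTTP listener for ACME HTTP-01 challenges.
# Listener name and hostname are hypothetical placeholders.
listeners:
  - name: test1-landing-http       # hypothetical listener name
    hostname: test1.example.org    # same hostname as the HTTPS listener
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: All                  # the solver route's namespace must be allowed
```

A narrower `allowedRoutes` selector also works, as long as it still admits the namespace in which cert-manager creates its solver route.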
227+
228+
## Manual Steps That Automation Does Not Fully Own
229+
230+
Some cluster-level setup still has to be done manually or by separate infrastructure automation:
231+
232+
- Install or upgrade Envoy Gateway and Gateway API CRDs.
233+
- Install or upgrade cert-manager with Gateway API support.
234+
- Ensure the Envoy Gateway load balancer receives the intended external IP or hostname.
235+
- Point DNS records at the Envoy Gateway load balancer.
236+
- Provide wildcard certificate secrets for webview hosts, or configure cert-manager to issue suitable certificates.
237+
- For production-style MetalLB clusters, reserve the configured load-balancer IP and keep `envoyProxy.spec.provider.kubernetes.envoyService.annotations` in sync.
238+
- Create or update Keycloak clients separately; see [`docs/keycloak-setup.md`](https://github.com/EduIDE/EduIDE-deployment/blob/main/docs/keycloak-setup.md).
239+
240+
The GitHub Actions workflow installs Theia Cloud base charts, CRDs, monitoring, the optional shared gateway release, and tenant releases. It does not install Envoy Gateway itself.
241+
242+
## Validation
243+
244+
After deploying the shared gateway and a tenant release, check:
245+
246+
```bash
247+
kubectl get gatewayclass
248+
kubectl get gateway -n gateway-system
249+
kubectl get httproute -A
250+
kubectl describe gateway theia-shared-gateway -n gateway-system
251+
kubectl describe httproute landing-route -n <tenant-namespace>
252+
kubectl describe httproute service-route -n <tenant-namespace>
253+
kubectl describe httproute theia-cloud-demo-ws-route -n <tenant-namespace>
254+
```
255+
256+
The important conditions are:
257+
258+
- the shared `Gateway` is accepted and programmed
259+
- tenant `HTTPRoute` resources are accepted
260+
- each route has a resolved parent reference
261+
- the Envoy data-plane service has an external address
262+
- TLS secrets referenced by HTTPS listeners exist in `gateway-system`
263+
264+
For a quick end-to-end check, open the landing hostname and then start an IDE session. The operator should add a rule to the instances route, and the generated session URL should resolve through the shared Gateway.
265+
266+
## Common Failure Modes
267+
268+
`HTTPRoute` does not attach:
269+
270+
- the `sectionName` in tenant `parentRefs` does not match a listener name
271+
- the route hostname is not allowed by the listener hostname
272+
- cross-namespace routes are blocked by `allowedRoutes`
273+
- Gateway API CRDs are missing or too old for the rendered resources
274+
275+
TLS fails:
276+
277+
- the listener references a secret that does not exist in `gateway-system`
278+
- the certificate does not cover the concrete or wildcard hostname
279+
- cert-manager Gateway API support is not enabled for HTTP-01 challenges
280+
281+
The load balancer has no address:
282+
283+
- Envoy Gateway is installed but the cluster has no load balancer implementation
284+
- MetalLB address pools or fixed IP annotations do not match the cluster
285+
- cloud-provider load balancer annotations are missing or invalid
286+
287+
The Theia Cloud chart renders a duplicate Gateway:
288+
289+
- tenant values are missing `theia-cloud.gateway.create=false`
290+
- the release still carries legacy namespace-local Gateway or certificate values
291+
292+
## References
293+
294+
- [`charts/theia-shared-gateway/README.md`](https://github.com/EduIDE/EduIDE-deployment/blob/main/charts/theia-shared-gateway/README.md)
295+
- [`deployments/shared-gateway/values.yaml`](https://github.com/EduIDE/EduIDE-deployment/blob/main/deployments/shared-gateway/values.yaml)
296+
- [`deployments/shared-gateway-prod/values.yaml`](https://github.com/EduIDE/EduIDE-deployment/blob/main/deployments/shared-gateway-prod/values.yaml)
297+
- [`docs/adding-environments.md`](https://github.com/EduIDE/EduIDE-deployment/blob/main/docs/adding-environments.md)
298+
- [`docs/deployment-workflows.md`](https://github.com/EduIDE/EduIDE-deployment/blob/main/docs/deployment-workflows.md)
299+
- [`charts/theia-cloud/values.yaml`](https://github.com/EduIDE/EduIDE-Helm/blob/main/charts/theia-cloud/values.yaml)
