| 1 | +# Envoy Gateway Setup |
| 2 | + |
| 3 | +This repository deploys Theia Cloud through Gateway API resources backed by Envoy Gateway. The setup is split across two repositories: |
| 4 | + |
| 5 | +- [EduIDE-Helm](https://github.com/EduIDE/EduIDE-Helm) renders the Theia Cloud `HTTPRoute` resources and, for simple installations, can also render a namespace-local `Gateway`. |
| 6 | +- [EduIDE-deployment](https://github.com/EduIDE/EduIDE-deployment) uses those charts internally and adds the [`theia-shared-gateway`](https://github.com/EduIDE/EduIDE-deployment/tree/main/charts/theia-shared-gateway) chart, which owns one cluster-level Gateway shared by multiple Theia namespaces. |
| 7 | + |
| 8 | +For the Artemis/EduIDE deployments, the shared gateway model is the expected setup. Tenant releases should create only their namespace-local routes and workloads; the shared gateway release owns the edge Gateway, listener hostnames, GatewayClass customization, and TLS material. |
| 9 | + |
| 10 | +## Architecture |
| 11 | + |
| 12 | +The traffic path is: |
| 13 | + |
| 14 | +1. DNS points Theia hostnames to the Envoy Gateway load balancer address. |
| 15 | +2. Envoy Gateway watches Gateway API resources and programs Envoy. |
| 16 | +3. The `theia-shared-gateway` release creates the shared `Gateway` in `gateway-system`. |
| 17 | +4. Each Theia tenant release creates `HTTPRoute` resources in its own namespace. |
| 18 | +5. Those `HTTPRoute` resources attach to the shared Gateway through `theia-cloud.gateway.parentRefs`. |
| 19 | +6. The Theia Cloud operator later edits the tenant's instances `HTTPRoute`, adding a rule for each newly created IDE session. |
| 20 | + |
| 21 | +This replaces the older ingress-controller-style setup with Gateway API. The main practical benefit is that routes can be updated dynamically while one shared release owns the edge Gateway, so tenant releases do not each need a separate one. |
| 22 | + |
| 23 | +## Prerequisites |
| 24 | + |
| 25 | +Before deploying Theia Cloud, the cluster needs: |
| 26 | + |
| 27 | +- Gateway API CRDs |
| 28 | +- Envoy Gateway |
| 29 | +- cert-manager |
| 30 | +- cert-manager Gateway API support, if ACME HTTP-01 challenges should be solved through Gateway API |
| 31 | +- a load balancer implementation for the Envoy data plane, for example a cloud load balancer or MetalLB |
| 32 | +- DNS records for landing, service, instance, and webview hostnames |
| 33 | +- TLS certificate material, either managed through cert-manager or provided as a wildcard certificate secret |
| 34 | + |
| 35 | +Use the official installation documentation for exact versions and compatibility: |
| 36 | + |
| 37 | +- [Envoy Gateway Helm installation](https://gateway.envoyproxy.io/docs/install/install-helm/) |
| 38 | +- [cert-manager Gateway API HTTP-01 solver](https://cert-manager.io/docs/configuration/acme/http01/) |
| 39 | + |
| 40 | +## Install Envoy Gateway |
| 41 | + |
| 42 | +Install Gateway API CRDs and Envoy Gateway once per cluster. A typical Helm-based installation looks like this: |
| 43 | + |
| 44 | +```bash |
| 45 | +helm upgrade --install eg oci://docker.io/envoyproxy/gateway-helm \ |
| 46 | + --namespace envoy-gateway-system \ |
| 47 | + --create-namespace |
| 48 | +``` |
| 49 | + |
| 50 | +If your cluster already has Gateway API CRDs managed separately, follow the Envoy Gateway documentation for the matching `--skip-crds` flow. |
| 51 | + |
| 52 | +After installation, verify that the controller is running: |
| 53 | + |
| 54 | +```bash |
| 55 | +kubectl get pods -n envoy-gateway-system |
| 56 | +kubectl get gatewayclasses |
| 57 | +kubectl get crd | grep -E 'gateway.networking.k8s.io|gateway.envoyproxy.io' |
| 58 | +``` |
| 59 | + |
| 60 | +The default GatewayClass name expected by the Theia charts is `envoy`. If you use another class name, set it consistently in both the shared gateway values and the tenant values: |
| 61 | + |
| 62 | +```yaml |
| 63 | +gateway: |
| 64 | + className: envoy |
| 65 | + |
| 66 | +theia-cloud: |
| 67 | + gateway: |
| 68 | + className: envoy |
| 69 | +``` |
| 70 | + |
| 71 | +## Install cert-manager With Gateway API Support |
| 72 | + |
| 73 | +cert-manager is still responsible for certificate resources. If certificates are issued through Gateway API HTTP-01 challenges, install or upgrade cert-manager with Gateway API support enabled. |
| 74 | + |
| 75 | +The exact values depend on the cert-manager version. For current cert-manager versions, use the file-based `config.enableGatewayAPI` value described in the cert-manager documentation. |
| 76 | + |
| 77 | +Example shape: |
| 78 | + |
| 79 | +```bash |
| 80 | +helm upgrade --install cert-manager oci://quay.io/jetstack/charts/cert-manager \ |
| 81 | + --namespace cert-manager \ |
| 82 | + --create-namespace \ |
| 83 | + --set crds.enabled=true \ |
| 84 | + --set config.enableGatewayAPI=true |
| 85 | +``` |
| 86 | + |
| 87 | +If cert-manager was already running before Gateway API CRDs were installed, restart cert-manager after enabling Gateway API support so it discovers the new resource types. |
| 88 | + |
| 89 | +```bash |
| 90 | +kubectl rollout restart deployment/cert-manager -n cert-manager |
| 91 | +kubectl rollout restart deployment/cert-manager-webhook -n cert-manager |
| 92 | +kubectl rollout restart deployment/cert-manager-cainjector -n cert-manager |
| 93 | +``` |
| 94 | + |
| 95 | +## Deploy the Shared Gateway |
| 96 | + |
| 97 | +The shared Gateway is deployed from the [`theia-shared-gateway`](https://github.com/EduIDE/EduIDE-deployment/tree/main/charts/theia-shared-gateway) chart in this repository: |
| 98 | + |
| 99 | +```bash |
| 100 | +helm upgrade --install theia-shared-gateway ./charts/theia-shared-gateway \ |
| 101 | + --namespace gateway-system \ |
| 102 | + --create-namespace \ |
| 103 | + -f deployments/shared-gateway/values.yaml |
| 104 | +``` |
| 105 | + |
| 106 | +For the dedicated production cluster, use: |
| 107 | + |
| 108 | +```bash |
| 109 | +helm upgrade --install theia-shared-gateway ./charts/theia-shared-gateway \ |
| 110 | + --namespace gateway-system \ |
| 111 | + --create-namespace \ |
| 112 | + -f deployments/shared-gateway-prod/values.yaml |
| 113 | +``` |
| 114 | + |
| 115 | +The deployment workflow can also install this release automatically when the caller workflow passes: |
| 116 | + |
| 117 | +```yaml |
| 118 | +with: |
| 119 | + deploy_shared_gateway: true |
| 120 | + shared_gateway_values_file: deployments/shared-gateway/values.yaml |
| 121 | + shared_gateway_namespace: gateway-system |
| 122 | +``` |
| 123 | + |
| 124 | +The workflow injects `THEIA_WILDCARD_CERTIFICATE_CERT` and `THEIA_WILDCARD_CERTIFICATE_KEY` into the shared gateway chart as `wildcardTLSSecret.certificate` and `wildcardTLSSecret.key`. |
| 125 | + |
| 126 | +## Shared Gateway Values |
| 127 | + |
| 128 | +The shared gateway chart can create: |
| 129 | + |
| 130 | +- a `Gateway` |
| 131 | +- an optional `GatewayClass` |
| 132 | +- an optional Envoy Gateway `EnvoyProxy` |
| 133 | +- optional cert-manager `Certificate` resources |
| 134 | +- an optional Gateway API ACME `ClusterIssuer` |
| 135 | +- an optional static wildcard TLS secret |
| 136 | + |
| 137 | +For shared test/staging clusters, [`deployments/shared-gateway/values.yaml`](https://github.com/EduIDE/EduIDE-deployment/blob/main/deployments/shared-gateway/values.yaml) mainly defines HTTPS listeners for all test and staging hostnames. It assumes the required TLS secrets already exist or are supplied through the workflow. |
| 138 | + |
| 139 | +For production, [`deployments/shared-gateway-prod/values.yaml`](https://github.com/EduIDE/EduIDE-deployment/blob/main/deployments/shared-gateway-prod/values.yaml) additionally creates: |
| 140 | + |
| 141 | +- a `GatewayClass` named `envoy` |
| 142 | +- an `EnvoyProxy` that customizes the Envoy data-plane service |
| 143 | +- a Gateway API ACME `ClusterIssuer` |
| 144 | +- cert-manager `Certificate` resources for concrete production hostnames |
| 145 | +- the static wildcard webview TLS secret from deployment secrets |
| 146 | + |
| 147 | +The production `EnvoyProxy` currently contains MetalLB-specific annotations and a fixed load-balancer IP: |
| 148 | + |
| 149 | +```yaml |
| 150 | +envoyProxy: |
| 151 | + spec: |
| 152 | + provider: |
| 153 | + type: Kubernetes |
| 154 | + kubernetes: |
| 155 | + envoyService: |
| 156 | + annotations: |
| 157 | + metallb.io/address-pool: ingress |
| 158 | + metallb.io/loadBalancerIPs: 131.159.88.82 |
| 159 | +``` |
| 160 | + |
| 161 | +Adjust this section for a different cluster. On cloud providers, this may be replaced by provider-specific load balancer annotations or omitted entirely. |
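After changing the load-balancer settings, it can help to confirm which address the Gateway actually reports. The following is a minimal sketch, not part of the repository: the function names are hypothetical, the Gateway name and namespace default to the values used elsewhere in this document, and `192.0.2.10` in the usage comment is a placeholder address.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical and not shipped with the charts.

# Print the first address the Gateway reports in its status.
# Prints nothing (kubectl may log an error) while no address is assigned.
gateway_address() {
  kubectl get gateway "${1:-theia-shared-gateway}" \
    --namespace "${2:-gateway-system}" \
    -o jsonpath='{.status.addresses[0].value}'
}

# Compare the reported address with the one DNS is expected to point at.
# Returns 0 on a match, 1 on a mismatch, 2 when no address is reported.
check_gateway_address() {
  local expected="$1" actual
  actual="$(gateway_address)"
  if [ -z "${actual}" ]; then
    echo "no address assigned yet" >&2
    return 2
  fi
  if [ "${actual}" != "${expected}" ]; then
    echo "expected ${expected}, got ${actual}" >&2
    return 1
  fi
  echo "gateway address ${actual} matches"
}

# Usage (placeholder address):
#   check_gateway_address 192.0.2.10
```

A return code of 2 usually points at the load balancer implementation rather than at the Gateway resources themselves.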
| 162 | + |
| 163 | +## Configure Tenant Theia Releases |
| 164 | + |
| 165 | +Each tenant environment should attach its routes to the shared Gateway instead of creating a namespace-local Gateway: |
| 166 | + |
| 167 | +```yaml |
| 168 | +theia-cloud: |
| 169 | + gateway: |
| 170 | + enabled: true |
| 171 | + create: false |
| 172 | + routes: |
| 173 | + enabled: true |
| 174 | + parentRefs: |
| 175 | + - name: theia-shared-gateway |
| 176 | + namespace: gateway-system |
| 177 | + sectionName: test1-landing |
| 178 | + - name: theia-shared-gateway |
| 179 | + namespace: gateway-system |
| 180 | + sectionName: test1-service |
| 181 | + - name: theia-shared-gateway |
| 182 | + namespace: gateway-system |
| 183 | + sectionName: test1-instances |
| 184 | + - name: theia-shared-gateway |
| 185 | + namespace: gateway-system |
| 186 | + sectionName: test1-webview |
| 187 | +``` |
| 188 | + |
| 189 | +The `sectionName` values must match listener names in the shared gateway values file. If a route references a listener name that does not exist, or if the listener hostname does not match the route hostname, the `HTTPRoute` will not attach. |
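The hostname rule can be checked by hand before deploying. The sketch below is an illustration under stated assumptions, not the controller's implementation: it follows the Gateway API convention that a listener hostname such as `*.example.org` matches any hostname with that suffix and at least one additional label, it only handles concrete (non-wildcard) route hostnames, it expects lowercase input, and the function name and hostnames are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: approximates listener/route hostname matching for
# concrete route hostnames. Function name and hostnames are hypothetical.

# listener_allows_host LISTENER_HOSTNAME ROUTE_HOSTNAME
# Returns 0 when the listener hostname admits the route hostname.
listener_allows_host() {
  local listener="$1" route="$2" suffix
  # A listener without a hostname accepts every route hostname.
  [ -n "${listener}" ] || return 0
  case "${listener}" in
    '*.'*)
      # Strip the leading "*" and require at least one character
      # in front of the remaining ".suffix".
      suffix="${listener#\*}"
      case "${route}" in
        ?*"${suffix}") return 0 ;;
        *) return 1 ;;
      esac
      ;;
    *)
      # Non-wildcard listeners need an exact match.
      [ "${listener}" = "${route}" ]
      ;;
  esac
}

# Usage (hypothetical hostnames):
#   listener_allows_host '*.webview.example.org' 'abc.webview.example.org' && echo ok
```

Note that a wildcard listener does not match its own parent domain, which is a common reason for a webview or instance route staying detached.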
| 190 | + |
| 191 | +Disable tenant-local certificate resources when using the shared gateway: |
| 192 | + |
| 193 | +```yaml |
| 194 | +theia-certificates: |
| 195 | + certificates: |
| 196 | + enabled: false |
| 197 | + wildcardTLSSecret: |
| 198 | + enabled: false |
| 199 | + adminApiTokenSecret: |
| 200 | + enabled: true |
| 201 | +``` |
| 202 | + |
| 203 | +Do not set `theia-cloud.gateway.instancesWildcardSecretNames` in tenant values when `theia-cloud.gateway.create` is `false`. That map is only used when the Theia Cloud chart renders its own `Gateway`. In the shared-gateway setup, wildcard TLS secrets are owned by the shared gateway release in `gateway-system`. |
| 204 | + |
| 205 | +## Add or Change Hostnames |
| 206 | + |
| 207 | +For every environment, keep these values aligned: |
| 208 | + |
| 209 | +- tenant `hosts.configuration.landing` |
| 210 | +- tenant `hosts.configuration.service` |
| 211 | +- tenant `hosts.configuration.instance` |
| 212 | +- tenant `hosts.allWildcardInstances` |
| 213 | +- tenant `theia-cloud.gateway.parentRefs[*].sectionName` |
| 214 | +- shared gateway `gateway.listeners[*].name` |
| 215 | +- shared gateway `gateway.listeners[*].hostname` |
| 216 | +- DNS records for the same hostnames |
| 217 | +- TLS certificate DNS names |
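One part of this alignment, the `sectionName` values versus the listener names, is easy to check mechanically. The following sketch assumes both sets of names have already been extracted into newline-separated lists (for example from the values files or from the live resources); the function name is hypothetical and the names in the usage comment mirror the `test1-*` examples above.

```bash
#!/usr/bin/env bash
# Sketch only: the helper name is hypothetical.

# unmatched_section_names SECTION_NAMES LISTENER_NAMES
# Both arguments are newline-separated lists. Prints every sectionName
# that has no listener with exactly the same name.
unmatched_section_names() {
  local section_names="$1" listener_names="$2" name
  printf '%s\n' "${section_names}" | while IFS= read -r name; do
    [ -n "${name}" ] || continue
    printf '%s\n' "${listener_names}" | grep -Fxq -- "${name}" \
      || printf '%s\n' "${name}"
  done
}

# Usage:
#   unmatched_section_names "test1-landing
#   test1-webview" "test1-landing
#   test1-service"
```

Empty output means every referenced `sectionName` has a listener of the same name. Hostnames, DNS records, and certificate names still need to be compared separately.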
| 218 | + |
| 219 | +Each environment usually needs listeners for: |
| 220 | + |
| 221 | +- landing page hostname |
| 222 | +- service API hostname |
| 223 | +- session instance hostname |
| 224 | +- webview wildcard hostname |
| 225 | + |
| 226 | +If cert-manager should issue concrete host certificates through HTTP-01, also add matching HTTP listeners on port `80` so cert-manager can attach solver routes to the Gateway. |
| 227 | + |
| 228 | +## Manual Steps That Automation Does Not Fully Own |
| 229 | + |
| 230 | +Some cluster-level setup still has to be done manually or by separate infrastructure automation: |
| 231 | + |
| 232 | +- Install or upgrade Envoy Gateway and Gateway API CRDs. |
| 233 | +- Install or upgrade cert-manager with Gateway API support. |
| 234 | +- Ensure the Envoy Gateway load balancer receives the intended external IP or hostname. |
| 235 | +- Point DNS records at the Envoy Gateway load balancer. |
| 236 | +- Provide wildcard certificate secrets for webview hosts, or configure cert-manager to issue suitable certificates. |
| 237 | +- For production-style MetalLB clusters, reserve the configured load-balancer IP and keep `envoyProxy.spec.provider.kubernetes.envoyService.annotations` in sync. |
| 238 | +- Create or update Keycloak clients separately; see [`docs/keycloak-setup.md`](https://github.com/EduIDE/EduIDE-deployment/blob/main/docs/keycloak-setup.md). |
| 239 | + |
| 240 | +The GitHub Actions workflow installs Theia Cloud base charts, CRDs, monitoring, the optional shared gateway release, and tenant releases. It does not install Envoy Gateway itself. |
| 241 | + |
| 242 | +## Validation |
| 243 | + |
| 244 | +After deploying the shared gateway and a tenant release, check: |
| 245 | + |
| 246 | +```bash |
| 247 | +kubectl get gatewayclass |
| 248 | +kubectl get gateway -n gateway-system |
| 249 | +kubectl get httproute -A |
| 250 | +kubectl describe gateway theia-shared-gateway -n gateway-system |
| 251 | +kubectl describe httproute landing-route -n <tenant-namespace> |
| 252 | +kubectl describe httproute service-route -n <tenant-namespace> |
| 253 | +kubectl describe httproute theia-cloud-demo-ws-route -n <tenant-namespace> |
| 254 | +``` |
| 255 | + |
| 256 | +The important conditions are: |
| 257 | + |
| 258 | +- the shared `Gateway` is accepted and programmed |
| 259 | +- tenant `HTTPRoute` resources are accepted |
| 260 | +- each route has a resolved parent reference |
| 261 | +- the Envoy data-plane service has an external address |
| 262 | +- TLS secrets referenced by HTTPS listeners exist in `gateway-system` |
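The first three conditions can be checked from a script instead of reading `kubectl describe` output. This is a hedged sketch: the function names are hypothetical, the Gateway name and namespace default to the values used in this document, and it assumes a Gateway API version in which the Gateway reports `Accepted` and `Programmed` conditions and each `HTTPRoute` reports `Accepted` and `ResolvedRefs` per parent.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.
GATEWAY_NAME="${GATEWAY_NAME:-theia-shared-gateway}"
GATEWAY_NAMESPACE="${GATEWAY_NAMESPACE:-gateway-system}"

# Block until the shared Gateway reports Accepted and Programmed,
# or fail when the timeout expires.
wait_for_shared_gateway() {
  local timeout="${1:-120s}" condition
  for condition in Accepted Programmed; do
    kubectl wait "gateway/${GATEWAY_NAME}" \
      --namespace "${GATEWAY_NAMESPACE}" \
      --for="condition=${condition}" \
      --timeout="${timeout}" || return 1
  done
}

# Print one line per HTTPRoute in a tenant namespace, listing the
# Accepted and ResolvedRefs status for each parent listener.
show_route_status() {
  local namespace="$1"
  kubectl get httproute --namespace "${namespace}" -o jsonpath='{range .items[*]}{.metadata.name}{range .status.parents[*]}{"\t"}{.parentRef.sectionName}{" accepted="}{.conditions[?(@.type=="Accepted")].status}{" resolvedRefs="}{.conditions[?(@.type=="ResolvedRefs")].status}{end}{"\n"}{end}'
}

# Usage:
#   wait_for_shared_gateway 60s && show_route_status <tenant-namespace>
```

`HTTPRoute` conditions live under `status.parents`, which is why the routes are inspected with a JSONPath query rather than `kubectl wait`.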
| 263 | + |
| 264 | +For a quick end-to-end check, open the landing hostname and then start an IDE session. The operator should add a rule to the instances route, and the generated session URL should resolve through the shared Gateway. |
| 265 | + |
| 266 | +## Common Failure Modes |
| 267 | + |
| 268 | +`HTTPRoute` does not attach: |
| 269 | + |
| 270 | +- the `sectionName` in tenant `parentRefs` does not match a listener name |
| 271 | +- the route hostname is not allowed by the listener hostname |
| 272 | +- cross-namespace routes are blocked by `allowedRoutes` |
| 273 | +- Gateway API CRDs are missing or too old for the rendered resources |
| 274 | + |
| 275 | +TLS fails: |
| 276 | + |
| 277 | +- the listener references a secret that does not exist in `gateway-system` |
| 278 | +- the certificate does not cover the concrete or wildcard hostname |
| 279 | +- cert-manager Gateway API support is not enabled for HTTP-01 challenges |
| 280 | + |
| 281 | +The load balancer has no address: |
| 282 | + |
| 283 | +- Envoy Gateway is installed but the cluster has no load balancer implementation |
| 284 | +- MetalLB address pools or fixed IP annotations do not match the cluster |
| 285 | +- cloud-provider load balancer annotations are missing or invalid |
| 286 | + |
| 287 | +The Theia Cloud chart renders a duplicate Gateway: |
| 288 | + |
| 289 | +- tenant values do not set `theia-cloud.gateway.create=false` |
| 290 | +- the release still carries legacy namespace-local Gateway or certificate values |
| 291 | + |
| 292 | +## References |
| 293 | + |
| 294 | +- [`charts/theia-shared-gateway/README.md`](https://github.com/EduIDE/EduIDE-deployment/blob/main/charts/theia-shared-gateway/README.md) |
| 295 | +- [`deployments/shared-gateway/values.yaml`](https://github.com/EduIDE/EduIDE-deployment/blob/main/deployments/shared-gateway/values.yaml) |
| 296 | +- [`deployments/shared-gateway-prod/values.yaml`](https://github.com/EduIDE/EduIDE-deployment/blob/main/deployments/shared-gateway-prod/values.yaml) |
| 297 | +- [`docs/adding-environments.md`](https://github.com/EduIDE/EduIDE-deployment/blob/main/docs/adding-environments.md) |
| 298 | +- [`docs/deployment-workflows.md`](https://github.com/EduIDE/EduIDE-deployment/blob/main/docs/deployment-workflows.md) |
| 299 | +- [`charts/theia-cloud/values.yaml`](https://github.com/EduIDE/EduIDE-Helm/blob/main/charts/theia-cloud/values.yaml) |