This document describes what the AWS reference architecture produces: the AWS resources and Kubernetes components needed to deploy the Poolside platform and/or models.
A single `terraform apply` creates the complete AWS foundation and Kubernetes prerequisites for a Poolside deployment, then deploys the Poolside platform and the (optional) local inference stack on GPU EKS worker nodes.
The infrastructure is organized into layers:
A dedicated VPC with three subnet tiers across multiple availability zones:
- Public subnets: NAT gateways and the internet-facing Application Load Balancer (ALB)
- Private worker subnets: EKS worker nodes (CPU and GPU), RDS instance, with outbound internet via NAT
- Private control plane subnets: EKS control plane ENIs (AWS-managed)
An S3 VPC gateway endpoint routes S3 traffic directly, bypassing NAT gateways to reduce data transfer costs for image pulls and model artifact downloads.
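A minimal sketch of how the gateway endpoint attaches to the private route tables; the resource names here are illustrative, not the module's actual identifiers:

```hcl
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.this.id # assumed VPC resource name
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"

  # Attaching the private route tables lets image-layer pulls (ECR stores
  # layers in S3) and model artifact downloads skip the NAT gateways.
  route_table_ids = aws_route_table.private[*].id # assumed route table names
}
```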
A managed Kubernetes cluster with:
- OIDC provider for IAM Roles for Service Accounts (IRSA)
- Managed EKS add-ons: vpc-cni, kube-proxy, coredns, metrics-server, snapshot-controller, aws-ebs-csi-driver
- Public API endpoint protected by a mandatory CIDR allowlist, plus a private endpoint for in-VPC traffic
- Access entries for cluster admin principals (API mode, not aws-auth ConfigMap)
- Envelope encryption of Kubernetes Secrets using a customer-managed KMS key
- CPU node group (always created): runs the Poolside platform services, ALB controller, External Secrets Operator, and cluster add-ons
- GPU node group (optional, full profile only): runs model inference workloads via the NVIDIA GPU Operator. Supports EC2 capacity reservations for guaranteed GPU instance availability.
Data, storage, and security resources:
- RDS PostgreSQL: application database with an AWS-managed master password (stored in Secrets Manager, never in Terraform state), multi-AZ by default, Performance Insights enabled, CloudWatch log exports
- S3 buckets: data bucket (model artifacts, telemetry, repositories) and access log bucket. Both SSE-KMS encrypted with public access blocked.
- ECR repositories: one per container image the Helm chart needs, namespaced under the deployment name
- KMS keys: EKS secret encryption, RDS storage encryption, S3 object encryption, EBS volume encryption, and application-level encryption (used by core-api for encrypting sensitive data in the database)
- IAM roles with least-privilege policies: node group instance roles, IRSA workload roles (core-api, inference, external-secrets, ALB controller, EBS CSI, VPC CNI), plus the EKS cluster role
- Permissions boundary support: an optional `permissions_boundary_arn` threads through every IAM role for regulated environments
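A sketch of how the boundary threads through, using the core-api IRSA role as an example; the resource names and the OIDC provider reference are assumptions, not the module's actual identifiers:

```hcl
variable "permissions_boundary_arn" {
  type    = string
  default = null # when unset, roles are created without a boundary
}

# Assumed to exist elsewhere in the stack: the cluster's IRSA OIDC provider.
data "aws_iam_policy_document" "core_api_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.eks.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "${replace(aws_iam_openid_connect_provider.eks.url, "https://", "")}:sub"
      values   = ["system:serviceaccount:poolside:core-api"]
    }
  }
}

resource "aws_iam_role" "core_api" {
  name                 = "poolside-core-api" # illustrative name
  assume_role_policy   = data.aws_iam_policy_document.core_api_trust.json
  permissions_boundary = var.permissions_boundary_arn # same pattern on every role
}
```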
Before Helm runs, Terraform creates the Kubernetes resources the chart expects to find:
- Namespaces: `poolside` (platform) and `poolside-models` (inference)
- gp3 StorageClass (cluster default): EBS-backed, KMS-encrypted, WaitForFirstConsumer binding (see the sketch after this list)
- Custom CA bundle ConfigMap (optional): for environments with TLS-intercepting proxies or private PKI
- AWS Load Balancer Controller: Helm-managed, creates ALBs from Kubernetes Ingress resources
- External Secrets Operator: syncs the RDS master password from Secrets Manager into a Kubernetes Secret (`poolside-db-secret`)
- NVIDIA GPU Operator (full profile only): installs GPU device drivers and the Kubernetes device plugin
Pods reach KMS and S3 via IRSA, so no static-key or AWS-credentials Kubernetes Secrets are created.
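For the StorageClass, a rough sketch of what Terraform creates via the kubernetes provider; the KMS key reference is an assumed resource name:

```hcl
resource "kubernetes_storage_class_v1" "gp3" {
  metadata {
    name = "gp3"
    annotations = {
      "storageclass.kubernetes.io/is-default-class" = "true" # cluster default
    }
  }

  storage_provisioner = "ebs.csi.aws.com"
  volume_binding_mode = "WaitForFirstConsumer"

  parameters = {
    type      = "gp3"
    encrypted = "true"
    kmsKeyId  = aws_kms_key.ebs.arn # assumed name of the EBS encryption key
  }
}
```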
Terraform also owns the Helm installs via the helm-wrapper module.
Two releases come from the Poolside bundle:
- poolside-deployment: platform workloads:
  - core-api: Poolside API server (chat, completions, agent orchestration, repository indexing)
  - core-api-worker: async worker pool running the same `forge_api` binary in `worker` mode
  - core-api-temporal-server: embedded Temporal server
  - web-assistant: Svelte SPA frontend served by Caddy
  - public-docs: static docs served from the same ALB under `/docs`
- inference-stack (full profile only): one deployment per enabled model subchart, plus an Envoy proxy and an extproc sidecar for request dispatch
Values for both charts are composed by the poolside-values module
from reference-stack outputs (database endpoints, S3 bucket names,
KMS ARNs, ECR registry URIs, IRSA role ARNs). The three-layer
composition (reference-stack → poolside-values → helm-wrapper)
isolates chart-specific knowledge to a single module; see
customizing.md for the operator-visible knobs.
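Conceptually, the composition looks something like the sketch below; the module paths, input names, and output names are illustrative assumptions, not the actual interfaces (see customizing.md for the real knobs):

```hcl
module "reference_stack" {
  source = "./modules/reference-stack" # assumed path
  # VPC, EKS, RDS, S3, ECR, KMS, IAM roles ...
}

module "poolside_values" {
  source = "./modules/poolside-values" # assumed path

  # Chart-specific knowledge lives here: reference-stack outputs are mapped
  # into Helm values for poolside-deployment and inference-stack.
  db_endpoint       = module.reference_stack.db_endpoint
  data_bucket       = module.reference_stack.data_bucket_name
  ecr_registry      = module.reference_stack.ecr_registry_uri
  core_api_role_arn = module.reference_stack.core_api_irsa_role_arn
}

module "helm_wrapper" {
  source = "./modules/helm-wrapper" # assumed path

  # The wrapper only installs releases; it carries no chart-specific knowledge.
  releases = {
    "poolside-deployment" = module.poolside_values.platform_values
    "inference-stack"     = module.poolside_values.inference_values
  }
}
```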
All application pods use IRSA for AWS API access (S3, KMS).
The core-api, web-assistant, and public-docs workloads share a single internet-facing ALB via the `group.name` annotation. TLS termination uses an ACM certificate (looked up by domain name). After deployment, the operator must create a DNS record that points the public hostname at the ALB's DNS name.
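The certificate lookup and the post-deploy DNS step look roughly like this, assuming Route 53 and illustrative variable names (`public_hostname` is the documented input; `hosted_zone_id` and `alb_dns_name` are assumptions):

```hcl
# Terraform resolves an already-issued certificate covering the public hostname.
data "aws_acm_certificate" "public" {
  domain   = var.public_hostname
  statuses = ["ISSUED"]
}

# Post-deploy step: point the public hostname at the ALB created by the
# AWS Load Balancer Controller (DNS name taken from `kubectl get ingress`
# or the AWS console). Any DNS provider works; Route 53 is shown here.
resource "aws_route53_record" "public" {
  zone_id = var.hosted_zone_id
  name    = var.public_hostname
  type    = "CNAME" # use an alias A record instead if the hostname is a zone apex
  ttl     = 300
  records = [var.alb_dns_name]
}
```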
The platform supports any OIDC-compliant identity provider. Optionally, Terraform can create an AWS Cognito user pool and client. The Cognito endpoint, client ID, and client secret are output for the first-time IdP binding in the Poolside Console.
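A hedged sketch of the optional Cognito resources and the outputs used for the first-time IdP binding; the resource names and callback path are assumptions:

```hcl
resource "aws_cognito_user_pool" "poolside" {
  name = "${var.deployment_name}-users" # assumed deployment_name variable
}

resource "aws_cognito_user_pool_client" "console" {
  name                                 = "poolside-console"
  user_pool_id                         = aws_cognito_user_pool.poolside.id
  generate_secret                      = true
  allowed_oauth_flows                  = ["code"]
  allowed_oauth_flows_user_pool_client = true
  allowed_oauth_scopes                 = ["openid", "email", "profile"]
  callback_urls                        = ["https://${var.public_hostname}/auth/callback"] # assumed path
}

# Values the operator copies into the Poolside Console for the IdP binding.
output "cognito_endpoint" {
  value = aws_cognito_user_pool.poolside.endpoint
}

output "cognito_client_id" {
  value = aws_cognito_user_pool_client.console.id
}

output "cognito_client_secret" {
  value     = aws_cognito_user_pool_client.console.client_secret
  sensitive = true
}
```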
High-availability defaults:
- Multi-AZ by default: VPC subnets, NAT gateways, EKS control plane, and RDS are spread across availability zones
- Single NAT gateway option: available via `single_nat_gateway = true` for cost-sensitive deployments (trades AZ-level NAT redundancy for lower cost)
- RDS Multi-AZ: synchronous standby replica with automatic failover (RTO < 60 seconds). Can be disabled for non-production use.
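For example, a cost-reduced non-production configuration (tfvars-style; `single_nat_gateway` is documented above, while the RDS multi-AZ toggle name is an assumption):

```hcl
single_nat_gateway = true  # one shared NAT gateway instead of one per AZ
rds_multi_az       = false # drop the synchronous standby (assumed variable name)
```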
A few deliberate design choices that affect operator workflow:
- Public EKS API endpoint required. The cluster is created with both a public API endpoint (gated by `cluster_endpoint_public_access_cidrs`) and a private endpoint. Terraform talks to the public one; in-cluster workloads use the private one. Fully private API access isn't supported by the reference architecture because the documented operator workflow assumes Helm and Terraform run from outside the VPC. If your organization requires private-only API access, you'll need to run Terraform and Helm from inside the VPC (bastion, peered VPC, or transit gateway).
- Single ALB, HTTPS-only. The Poolside Console, core-api, and public-docs share one internet-facing ALB joined via the `group.name` annotation. There is no HTTP-only fallback. An ACM certificate covering `public_hostname` must be issued in `var.region` before `terraform plan`.
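For example, gating the public endpoint to a known operator or CI egress range (tfvars-style; the CIDR is a placeholder):

```hcl
cluster_endpoint_public_access_cidrs = [
  "203.0.113.0/24", # range that runs terraform apply and helm
]
```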
