
Architecture overview

This document describes what the AWS reference architecture produces: the AWS resources and Kubernetes components needed to deploy the Poolside platform and, optionally, local model inference.


What Terraform creates

A single terraform apply creates the complete AWS foundation and Kubernetes requirements for a Poolside deployment, then deploys the Poolside platform and (optional) local inference stack on GPU EKS worker nodes.

The infrastructure is organized into layers:

Network

A dedicated VPC with three subnet tiers across multiple availability zones:

  • Public subnets: NAT gateways and the internet-facing Application Load Balancer (ALB)
  • Private worker subnets: EKS worker nodes (CPU and GPU), RDS instance, with outbound internet via NAT
  • Private control plane subnets: EKS control plane ENIs (AWS-managed)

An S3 VPC gateway endpoint routes S3 traffic directly, bypassing NAT gateways to reduce data transfer costs for image pulls and model artifact downloads.
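The endpoint described above might look like the following in Terraform (resource and variable names here are illustrative, not the module's actual identifiers):

```hcl
# Illustrative sketch only: names and references are placeholders.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"

  # Associating the endpoint with the private worker route tables lets
  # image-layer pulls and model artifact downloads bypass the NAT gateways.
  route_table_ids = aws_route_table.private[*].id
}
```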

EKS cluster

A managed Kubernetes cluster with:

  • OIDC provider for IAM Roles for Service Accounts (IRSA)
  • Managed EKS add-ons: vpc-cni, kube-proxy, coredns, metrics-server, snapshot-controller, aws-ebs-csi-driver
  • Public API endpoint protected by a mandatory CIDR allowlist, plus a private endpoint for in-VPC traffic
  • Access entries for cluster admin principals (API mode, not aws-auth ConfigMap)
  • Envelope encryption of Kubernetes Secrets using a customer-managed KMS key
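Taken together, the cluster settings above map onto the AWS provider's aws_eks_cluster resource roughly as follows (a sketch with placeholder values, not the module's real code):

```hcl
# Hedged sketch: attribute names follow the aws_eks_cluster resource,
# but variable and resource names are assumptions.
resource "aws_eks_cluster" "this" {
  name     = var.deployment_name
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids              = var.control_plane_subnet_ids
    endpoint_public_access  = true
    endpoint_private_access = true
    public_access_cidrs     = var.cluster_endpoint_public_access_cidrs
  }

  # Envelope encryption of Kubernetes Secrets with a customer-managed KMS key.
  encryption_config {
    provider {
      key_arn = aws_kms_key.eks.arn
    }
    resources = ["secrets"]
  }

  # API-mode access entries rather than the aws-auth ConfigMap.
  access_config {
    authentication_mode = "API"
  }
}
```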

Node groups

  • CPU node group (always created): runs the Poolside platform services, ALB controller, External Secrets Operator, and cluster addons
  • GPU node group (optional, full profile only): runs model inference workloads via the NVIDIA GPU Operator. Supports EC2 capacity reservations for guaranteed GPU instance availability.
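One way the capacity-reservation support could be wired is a launch template pinned to a reservation, referenced by the managed node group (variable names and the instance type are placeholders, not module inputs):

```hcl
# Illustrative only: a GPU node group targeting an EC2 capacity reservation.
resource "aws_launch_template" "gpu" {
  name_prefix = "${var.deployment_name}-gpu-"

  capacity_reservation_specification {
    capacity_reservation_target {
      capacity_reservation_id = var.gpu_capacity_reservation_id
    }
  }
}

resource "aws_eks_node_group" "gpu" {
  cluster_name    = aws_eks_cluster.this.name
  node_group_name = "gpu"
  node_role_arn   = aws_iam_role.gpu_nodes.arn
  subnet_ids      = var.private_worker_subnet_ids
  instance_types  = ["p5.48xlarge"] # placeholder GPU instance type

  launch_template {
    id      = aws_launch_template.gpu.id
    version = "$Latest"
  }

  scaling_config {
    desired_size = 1
    min_size     = 0
    max_size     = 2
  }
}
```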

Data stores

  • RDS PostgreSQL: application database with AWS-managed master password (stored in Secrets Manager, never in Terraform state), multi-AZ by default, Performance Insights enabled, CloudWatch log exports
  • S3 buckets: data bucket (model artifacts, telemetry, repositories) and access log bucket. Both SSE-KMS encrypted with public access blocked.
  • ECR repositories: one per container image the Helm chart needs, namespaced under the deployment name
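The managed-master-password pattern on the RDS instance corresponds to the manage_master_user_password flag, which keeps the credential in Secrets Manager and out of Terraform state. A sketch with illustrative identifiers:

```hcl
# Sketch only: identifiers and sizes are placeholders.
resource "aws_db_instance" "app" {
  identifier        = "${var.deployment_name}-db"
  engine            = "postgres"
  instance_class    = var.db_instance_class
  allocated_storage = 100

  # AWS generates and stores the master password in Secrets Manager,
  # so it never appears in Terraform state.
  manage_master_user_password = true
  username                    = "poolside"

  multi_az                        = true
  storage_encrypted               = true
  kms_key_id                      = aws_kms_key.rds.arn
  performance_insights_enabled    = true
  enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]
}
```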

Security

  • KMS keys: EKS secret encryption, RDS storage encryption, S3 object encryption, EBS volume encryption, and application-level encryption (used by core-api for encrypting sensitive data in the database)
  • IAM roles with least-privilege policies: node group instance roles, IRSA workload roles (core-api, inference, external-secrets, ALB controller, EBS CSI, VPC CNI), plus the EKS cluster role
  • Permissions boundary support: an optional permissions_boundary_arn threads through every IAM role for regulated environments
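Threading an optional boundary through every role typically looks like this (role and policy names are illustrative):

```hcl
# Illustrative: each IAM role accepts the same optional boundary ARN.
variable "permissions_boundary_arn" {
  type    = string
  default = null
}

resource "aws_iam_role" "core_api_irsa" {
  name                 = "${var.deployment_name}-core-api"
  assume_role_policy   = data.aws_iam_policy_document.core_api_trust.json
  permissions_boundary = var.permissions_boundary_arn # null means no boundary
}
```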

Cluster prerequisites (Kubernetes resources created by Terraform)

Before Helm runs, Terraform creates the Kubernetes resources the chart expects to find:

  • Namespaces: poolside (platform) and poolside-models (inference)
  • gp3 StorageClass (cluster default): EBS-backed, KMS-encrypted, WaitForFirstConsumer binding
  • Custom CA bundle ConfigMap (optional): for environments with TLS-intercepting proxies or private PKI
  • AWS Load Balancer Controller: Helm-managed, creates ALBs from Kubernetes Ingress resources
  • External Secrets Operator: syncs the RDS master password from Secrets Manager into a Kubernetes Secret (poolside-db-secret)
  • NVIDIA GPU Operator (full profile only): installs GPU device drivers and the Kubernetes device plugin
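Because Terraform creates these objects, the default gp3 StorageClass can be sketched with the Terraform kubernetes provider (assumed here; parameter names follow the EBS CSI driver's StorageClass parameters):

```hcl
# Sketch of the cluster-default gp3 StorageClass described above.
resource "kubernetes_storage_class_v1" "gp3" {
  metadata {
    name = "gp3"
    annotations = {
      "storageclass.kubernetes.io/is-default-class" = "true"
    }
  }

  storage_provisioner    = "ebs.csi.aws.com"
  volume_binding_mode    = "WaitForFirstConsumer"
  allow_volume_expansion = true

  parameters = {
    type      = "gp3"
    encrypted = "true"
    kmsKeyId  = aws_kms_key.ebs.arn
  }
}
```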

Pods reach KMS and S3 via IRSA, so no static-key or AWS-credentials Kubernetes Secrets are created.

What Helm installs

Terraform also owns the Helm installs, via the helm-wrapper module. Two releases come from the Poolside bundle:

  • poolside-deployment: platform workloads:
    • core-api: Poolside API server (chat, completions, agent orchestration, repository indexing)
    • core-api-worker: async worker pool running the same forge_api binary in worker mode
    • core-api-temporal-server: embedded Temporal server
    • web-assistant: Svelte SPA frontend served by Caddy
    • public-docs: static docs served at the same ALB under /docs
  • inference-stack (full profile only): one deployment per enabled model subchart, plus an envoy proxy and an extproc sidecar for request dispatch

Values for both charts are composed by the poolside-values module from reference-stack outputs (database endpoints, S3 bucket names, KMS ARNs, ECR registry URIs, IRSA role ARNs). The three-layer composition (reference-stack → poolside-values → helm-wrapper) isolates chart-specific knowledge to a single module; see customizing.md for the operator-visible knobs.
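The composition could be pictured like this (module sources, output names, and attributes are illustrative stand-ins, not the real interface):

```hcl
# Hedged sketch of the layering: reference-stack outputs feed the
# values module, whose rendered YAML feeds the Helm release.
module "poolside_values" {
  source = "./modules/poolside-values" # assumed path

  database_endpoint = module.reference_stack.rds_endpoint
  data_bucket       = module.reference_stack.data_bucket_name
  kms_key_arn       = module.reference_stack.app_kms_key_arn
  ecr_registry      = module.reference_stack.ecr_registry_uri
  irsa_role_arns    = module.reference_stack.irsa_role_arns
}

resource "helm_release" "poolside_deployment" {
  name      = "poolside-deployment"
  chart     = var.poolside_chart_path
  namespace = "poolside"

  values = [module.poolside_values.platform_values_yaml]
}
```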

All application pods use IRSA for AWS API access (S3, KMS).

Ingress

The core-api, web-assistant, and public-docs workloads share a single internet-facing ALB via the group.name annotation. TLS termination uses an ACM certificate (looked up by domain name). After deployment, the operator must create a DNS record that points the public hostname at the ALB domain name.
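The certificate lookup and the operator's post-deploy DNS step could be expressed as follows (a sketch, not part of the reference stack; the record is shown as Route 53, though the document only requires some DNS record):

```hcl
# Illustrative: look up an already-issued certificate by domain.
data "aws_acm_certificate" "public" {
  domain   = var.public_hostname
  statuses = ["ISSUED"]
}

# Operator-side step: point the public hostname at the ALB hostname.
resource "aws_route53_record" "app" {
  zone_id = var.hosted_zone_id # assumed variable
  name    = var.public_hostname
  type    = "CNAME"
  ttl     = 300
  records = [var.alb_dns_name] # the ALB hostname the controller reports
}
```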

Authentication

The platform supports any OIDC-compliant identity provider. Optionally, Terraform can create an AWS Cognito user pool and client. The Cognito endpoint, client ID, and client secret are output for the first-time IdP binding in the Poolside Console.
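A minimal sketch of the optional Cognito resources and their outputs (pool name, callback path, and OAuth settings are assumptions):

```hcl
# Sketch only: the optional Cognito user pool and app client.
resource "aws_cognito_user_pool" "idp" {
  name = "${var.deployment_name}-users"
}

resource "aws_cognito_user_pool_client" "console" {
  name                                 = "poolside-console"
  user_pool_id                         = aws_cognito_user_pool.idp.id
  generate_secret                      = true
  allowed_oauth_flows                  = ["code"]
  allowed_oauth_scopes                 = ["openid", "email", "profile"]
  allowed_oauth_flows_user_pool_client = true
  callback_urls                        = ["https://${var.public_hostname}/auth/callback"] # placeholder path
}

# Surfaced for the first-time IdP binding in the Poolside Console.
output "cognito_client_secret" {
  value     = aws_cognito_user_pool_client.console.client_secret
  sensitive = true
}
```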

High availability

  • Multi-AZ by default: VPC subnets, NAT gateways, EKS control plane, and RDS are spread across availability zones
  • Single NAT gateway option: available via single_nat_gateway = true for cost-sensitive deployments (trades AZ-level NAT redundancy for lower cost)
  • RDS Multi-AZ: synchronous standby replica with automatic failover (RTO < 60 seconds). Can be disabled for non-production use.
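In tfvars form, the two cost/availability levers might look like this (only single_nat_gateway is named by this document; the RDS variable name is an assumption):

```hcl
single_nat_gateway = true  # one NAT gateway instead of one per AZ
db_multi_az        = false # assumed variable name; disables the RDS standby
```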

Architectural limitations

A few deliberate design choices that affect operator workflow:

  • Public EKS API endpoint required. The cluster is created with both a public API endpoint (gated by cluster_endpoint_public_access_cidrs) and a private endpoint. Terraform talks to the public one; in-cluster workloads use the private one. Fully-private API access isn't supported by the reference architecture because the documented operator workflow assumes Helm and Terraform run from outside the VPC. If your organization requires private-only API access, you'll need to run Terraform and Helm from inside the VPC (bastion, peered VPC, or transit gateway).
  • Single ALB, HTTPS-only. The Poolside Console, core-api, and public-docs share one internet-facing ALB joined via the group.name annotation. There is no HTTP-only fallback. An ACM certificate covering public_hostname must already be issued in var.region before terraform plan runs.