|
| 1 | +--- |
| 2 | +title: Architecture |
| 3 | +--- |
| 4 | + |
| 5 | +Ory Talos is an API credential service. It issues and verifies API keys, derives short-lived JWT and macaroon tokens from those |
| 6 | +keys, and lets credential holders revoke their own keys with proof of possession. This page covers the editions, the deployment |
| 7 | +shapes, and the design choices that matter when you adopt or operate Talos. |
| 8 | + |
| 9 | +## What Talos does |
| 10 | + |
| 11 | +Talos exposes two surfaces: |
| 12 | + |
| 13 | +- An admin surface for managing credentials: issue, rotate, revoke, import, derive tokens, list, and get. Verification |
| 14 | + (`apiKeys:verify` and `apiKeys:batchVerify`) also lives on the admin surface, because verifying a credential is a high-trust |
| 15 | + operation that needs the same network protection as management. Talos ships no admin authentication, so you must restrict who |
| 16 | + can reach this surface yourself, for example by keeping it on an internal network behind an authenticating proxy. |
| 17 | +- A self-service surface for the one credential-holder operation: self-revocation (`apiKeys:selfRevoke`). The caller proves |
| 18 | + possession by presenting the credential, so this surface needs no admin authentication. |
| 19 | + |
| 20 | +The JWKS endpoint (`GET /v2alpha1/derivedKeys/jwks.json`) publishes the public keys that verify derived JWTs. It carries no |
| 21 | +secrets, so every surface exposes it and callers can fetch it from any process. |
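
As a rough illustration of how a caller might use the JWKS document, the sketch below picks the verification key whose `kid` matches a token header. It assumes only the standard JWK Set shape from RFC 7517 (a top-level `keys` array). The key IDs, key type, and key material are placeholders, not values Talos returns, and `find_signing_key` is a hypothetical helper.

```python
import json
from typing import Optional

# Hypothetical JWKS body shaped per RFC 7517. The key type, IDs, and
# key material are placeholders, not real Talos output.
JWKS_JSON = """
{
  "keys": [
    {"kty": "OKP", "crv": "Ed25519", "kid": "key-a", "use": "sig", "x": "placeholder"},
    {"kty": "OKP", "crv": "Ed25519", "kid": "key-b", "use": "sig", "x": "placeholder"}
  ]
}
"""


def find_signing_key(jwks: dict, kid: str) -> Optional[dict]:
    """Return the JWK whose `kid` matches the token header, or None."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None


jwks = json.loads(JWKS_JSON)
selected = find_signing_key(jwks, "key-b")   # key present in the set
missing = find_signing_key(jwks, "unknown")  # no such key: caller should reject the token
```

A real verifier would pass the selected key to a JWT library for the signature check. This sketch covers only the lookup step.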
| 22 | + |
| 23 | +Run both surfaces in one process, or split them so the public self-revoke endpoint doesn't share a listener with management |
| 24 | +endpoints. See [separate admin and public APIs](../operate/deploy/deployment-modes.md) for the production topology. |
| 25 | + |
| 26 | +## Editions |
| 27 | + |
| 28 | +Talos ships in two editions. The OSS edition is single-tenant, supports only SQLite, and treats rate-limit policies as metadata. |
| 29 | +The commercial edition adds multi-tenancy, enforced rate limits, observability, and Postgres, MySQL, and CockroachDB backends. |
| 30 | + |
| 31 | +| Capability | OSS | Commercial | |
| 32 | +| ------------------------------------------------ | ------------------------------------------------------ | -------------------------------------- | |
| 33 | +| All admin and self-service endpoints | yes | yes | |
| 34 | +| Single-process `serve` | yes | yes | |
| 35 | +| Split deployment (`serve admin`, `serve public`) | yes | yes | |
| 36 | +| Edge proxy (`talos proxy`) | no | yes | |
| 37 | +| Helm charts | no | yes | |
| 38 | +| Cache backends | `noop` only | `memory`, `redis` | |
| 39 | +| Multi-tenancy (network ID derived from hostname) | no | yes | |
| 40 | +| Rate limit enforcement | no (policies are stored and reported as metadata only) | yes | |
| 41 | +| Prometheus `/metrics` endpoint on port 4422 | no | yes | |
| 42 | +| OpenTelemetry tracing | no | yes | |
| 43 | +| Database backends | SQLite | SQLite, PostgreSQL, MySQL, CockroachDB | |
| 44 | + |
| 45 | +The configuration schema marks commercial-only blocks (`serve.metrics`, `tracing`, `cache`, `rate_limit`, `multitenancy`, and the |
| 46 | +Redis sub-block) with `x-license-required`. OSS builds parse these blocks but never activate them: the metrics route is a no-op, |
| 47 | +no tracer or tenant routing is created, and rate-limit policies stay metadata. Setting `cache.type` to `memory` or `redis` fails |
| 48 | +because both backends require a license; OSS supports only `noop`. |
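
The cache rule above can be modeled in a few lines. This is a sketch of the described behavior, not Talos code: `check_cache_type` and the `licensed` flag are hypothetical names, and accepting `noop` in a licensed build is an assumption the text does not confirm.

```python
# Backends the text says require a license.
COMMERCIAL_ONLY_CACHE_TYPES = {"memory", "redis"}


def check_cache_type(cache_type: str, licensed: bool) -> None:
    """Raise ValueError if the backend is unknown or needs a missing license."""
    if cache_type == "noop":
        # Assumption: `noop` is accepted regardless of edition.
        return
    if cache_type in COMMERCIAL_ONLY_CACHE_TYPES:
        if not licensed:
            raise ValueError(f"cache.type {cache_type!r} requires a license")
        return
    raise ValueError(f"unknown cache.type {cache_type!r}")


def is_rejected(cache_type: str, licensed: bool) -> bool:
    """Convenience wrapper: True when the configuration would fail."""
    try:
        check_cache_type(cache_type, licensed)
    except ValueError:
        return True
    return False
```

With this model, `is_rejected("redis", licensed=False)` is true and `is_rejected("noop", licensed=False)` is false, matching the OSS behavior described above.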
| 49 | + |
| 50 | +## Deployment topologies |
| 51 | + |
| 52 | +- Single process. Run `talos serve`. Both surfaces share one listener and database. This is the OSS default and works for |
| 53 | + development and small deployments. See the [deployment overview](../operate/deploy/index.md). |
| 54 | +- Split admin and public. Run `talos serve admin` for the admin API (management plus verification) and `talos serve public` for |
| 55 | + self-revoke, against a shared database. The admin process stays on an internal network behind an authenticating proxy; the |
| 56 | + public process accepts public traffic. Available in OSS and commercial. See |
| 57 | + [separate admin and public APIs](../operate/deploy/deployment-modes.md). |
| 58 | +- Edge proxy. Run `talos proxy` (commercial only) as a sidecar in front of a central Talos cluster. The proxy caches valid |
| 59 | + verification responses locally and forwards everything else to the upstream. See [edge proxy](../operate/deploy/edge-proxy.mdx). |
| 60 | + |
| 61 | +## Design principles |
| 62 | + |
| 63 | +- Stateless verification for derived tokens. JWT and macaroon verification reads neither the database nor the cache. Talos checks |
| 64 | + signatures against the configured JWKS or shared secret. This lets the edge proxy and admin process scale independently of the |
| 65 | + database. |
| 66 | +- Single source of truth for tenancy. Talos derives the network ID from the request context: from the hostname in commercial |
| 67 | + deployments, always `uuid.Nil` in OSS. It never reads the network ID from request bodies or persisted records. See the |
| 68 | + [security model](./security-model.md) for the full rationale. |
| 69 | +- Pluggable persistence and cache. Storage and cache backends are interfaces. The commercial edition supplies additional |
| 70 | + implementations without changing the OSS surface. |
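
The tenancy principle above can be sketched as a function that takes only the request hostname and never the request body. How Talos maps hostnames to network IDs is not described on this page, so the `tenants` lookup table, the example hostname, and the function name are all hypothetical stand-ins. `uuid.UUID(int=0)` is the Python counterpart of Go's `uuid.Nil`.

```python
import uuid
from typing import Mapping, Optional

# All-zero UUID, the Python counterpart of Go's uuid.Nil.
NIL_NETWORK_ID = uuid.UUID(int=0)


def network_id_for_request(
    hostname: str,
    tenants: Optional[Mapping[str, uuid.UUID]],
) -> uuid.UUID:
    """Derive the network ID from request context only.

    `tenants` is a hypothetical hostname-to-network table. Passing None
    models the OSS edition, which always uses the nil UUID.
    """
    if tenants is None:
        return NIL_NETWORK_ID
    try:
        return tenants[hostname.lower()]
    except KeyError:
        # Unknown hosts are rejected rather than mapped to a default tenant.
        raise LookupError(f"no network configured for host {hostname!r}")


# Hypothetical tenant table for a multi-tenant deployment.
acme_id = uuid.UUID("11111111-1111-1111-1111-111111111111")
tenants = {"acme.example.com": acme_id}

oss_id = network_id_for_request("anything.example.com", None)
commercial_id = network_id_for_request("ACME.example.com", tenants)
```

Because the function signature has no body or record parameter, a caller cannot select a tenant by putting a network ID in the payload.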
| 71 | + |
| 72 | +## Scalability |
| 73 | + |
| 74 | +The tiers below are approximate shapes. Exact numbers depend on key formats, cache hit ratio, and database choice. |
| 75 | + |
| 76 | +| Tier | Process layout | Cache | Database | |
| 77 | +| ------ | ----------------------------------------------------------------------------------------------- | -------------------------------------------------------- | -------------------------------------------- | |
| 78 | +| Small | One `talos serve` instance | `noop` (OSS) or `memory` (commercial) | SQLite (OSS) or any backend (commercial) | |
| 79 | +| Medium | A few `talos serve admin` instances behind a load balancer, scaled horizontally for verify load | `redis` for shared state across verify nodes | PostgreSQL or CockroachDB | |
| 80 | +| Large | Regional `talos proxy` sidecars in front of a central Talos cluster | Local cache in each proxy plus a shared `redis` upstream | CockroachDB or PostgreSQL with read replicas | |
| 81 | + |
| 82 | +Verification is the hot path. Admin operations aren't. Size for verify throughput first. |
| 83 | + |
| 84 | +## Observability |
| 85 | + |
| 86 | +Both editions emit structured JSON logs to stderr (set `log.format` to `text` for plain text). The commercial edition also exports |
| 87 | +Prometheus metrics on a dedicated port and OpenTelemetry traces via OTLP. See [monitoring](../operate/monitoring/index.md) for |
| 88 | +setup, configuration, and the available metrics and spans. |
| 89 | + |
| 90 | +## Ports |
| 91 | + |
| 92 | +| Port | Purpose | Edition | |
| 93 | +| ---- | -------------------------------------------------------------------------------------------- | --------------- | |
| 94 | +| 4420 | HTTP API and health checks (`serve.http.port`) | OSS, commercial | |
| 95 | +| 4422 | Health checks; Prometheus `/metrics` scrape endpoint, commercial only (`serve.metrics.port`) | OSS, commercial | |