| 1 | +# ADR-0002: Environment-Per-Database Isolation |
| 2 | + |
| 3 | +**Status**: Accepted |
| 4 | +**Date**: 2026-04-19 |
| 5 | +**Deciders**: ObjectStack Protocol Architects |
| 6 | +**Supersedes**: The v3.4/v4.0 "per-organization database" tenant model |
| 7 | +**Consumers**: `@objectstack/service-tenant`, `@objectstack/spec/cloud`, future `service-subscription`, `service-quota`, `service-audit-log`, `service-dlp-policy`, `service-solution-history` |
| 8 | + |
| 9 | +--- |
| 10 | + |
| 11 | +## Context |
| 12 | + |
| 13 | +The v3.4 / v4.0 multi-tenant model in `@objectstack/service-tenant` provisions **one physical database per organization**, registered in `sys_tenant_database`. Logical separation between *environments* (dev / test / prod / sandbox) is achieved by an `env_id` column carried on every row in every data-plane table. |
| 14 | + |
| 15 | +Operating this model in production surfaced five classes of recurring problems: |
| 16 | + |
| 17 | +1. **Leaky logical isolation.** Every query must carry `WHERE env_id = ?`. A single missing predicate in a hand-written query, a migration, a background job, or a badly written skill can corrupt production from a developer shell. |
| 18 | +2. **Coupled schema evolution.** A Solution can't upgrade its schema in `dev` without affecting `prod` — the tables are the same physical tables. This blocks blue/green schema rollouts, destructive migrations, and safe rollback. |
| 19 | +3. **Complex backup / DR.** Backing up or restoring just `prod` requires per-row filtering during dump/restore. Point-in-time recovery of one environment leaks into others. |
| 20 | +4. **Difficult Solution publishing.** "Promote Solution X from dev to prod" degenerates into row-level copy jobs with `env_id` rewriting — slow, fragile, and nearly impossible to make atomic. |
| 21 | +5. **No physical boundary for security / compliance.** Per-environment encryption keys, IP allow-lists, retention policies, and audit isolation all require a per-environment DB to be credible. |
| 22 | + |
| 23 | +Meanwhile, the ecosystem has moved on: |
| 24 | + |
| 25 | +- **Turso / libSQL**, **Neon**, **Supabase branches**, **PlanetScale branches**, and **Cloudflare D1** all make "a database per environment" a near-free operation (milliseconds to provision, cents per month to idle). |
| 26 | +- **Power Platform**, **Salesforce**, and **ServiceNow** all expose environments as first-class primitives backed by isolated storage. |
| 27 | +- **Kubernetes namespaces** are the pattern developers reach for; the data layer should match. |
| 28 | + |
| 29 | +## Decision |
| 30 | + |
| 31 | +We upgrade the multi-tenant architecture from **per-organization database** to **per-environment database**, with a hard split between Control Plane and Data Plane: |
| 32 | + |
| 33 | +### Control Plane (shared, single database) |
| 34 | + |
| 35 | +Registers environments and how to reach them — **never** stores business data: |
| 36 | + |
| 37 | +| Table | Purpose | |
| 38 | +|---------------------------|------------------------------------------------------------| |
| 39 | +| `sys_environment` | One row per environment — `(organization_id, slug)` UNIQUE | |
| 40 | +| `sys_environment_database`| Physical DB addressing (1:1 with `sys_environment`) | |
| 41 | +| `sys_database_credential` | Rotatable encrypted secrets (N:1 with `sys_environment_database`) | |
| 42 | +| `sys_environment_member` | Per-environment RBAC (`(environment_id, user_id)` UNIQUE) | |
| 43 | + |
| 44 | +### Data Plane (one database per environment) |
| 45 | + |
| 46 | +Each environment owns its own physical database containing: |
| 47 | + |
| 48 | +- All `sys_` data-plane objects — `sys_package_installation`, `sys_solution_history`, … |
| 49 | +- All business objects — `account`, `contact`, user tables, … |
| 50 | +- **Zero** `environment_id` columns. The environment is **implicit** in the connection. |
| 51 | + |
| 52 | +### Session → Routing |
| 53 | + |
| 54 | +`better-auth` sessions carry a single `active_environment_id`. The tenant router resolves: |
| 55 | + |
| 56 | +``` |
| 57 | +session.active_environment_id |
| 58 | + → sys_environment (→ organization_id) |
| 59 | + → sys_environment_database (url, driver, region) |
| 60 | + → sys_database_credential (active secret, decrypted) |
| 61 | + → data-plane driver |
| 62 | +``` |
| 63 | + |
| 64 | +Switching environments ⇒ swapping DB connections. There is no in-process filter that can be forgotten. |
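The resolution chain above can be sketched as a pure lookup over the three control-plane tables. This is a minimal illustration, not the router's actual code: the row shapes, field names, and `resolveTarget` function are assumptions made for the example, and the secret is held in plaintext here where the real router would decrypt it.

```typescript
// Hypothetical row shapes; the real schemas live in environment.zod.ts.
interface EnvironmentRow { id: string; organizationId: string; slug: string }
interface EnvironmentDatabaseRow { id: string; environmentId: string; url: string; driver: string; region: string }
interface CredentialRow { id: string; environmentDatabaseId: string; status: "active" | "revoked"; secret: string }

interface ControlPlane {
  environments: EnvironmentRow[];
  databases: EnvironmentDatabaseRow[];
  credentials: CredentialRow[];
}

interface ResolvedTarget {
  organizationId: string;
  url: string;
  driver: string;
  region: string;
  secret: string;
}

// Walks session -> sys_environment -> sys_environment_database -> sys_database_credential.
// Every step fails loudly, so there is no code path that silently falls back to another environment.
function resolveTarget(cp: ControlPlane, session: { active_environment_id: string }): ResolvedTarget {
  const env = cp.environments.find((e) => e.id === session.active_environment_id);
  if (!env) throw new Error(`unknown environment: ${session.active_environment_id}`);

  const db = cp.databases.find((d) => d.environmentId === env.id);
  if (!db) throw new Error(`no database registered for environment ${env.id}`);

  const cred = cp.credentials.find((c) => c.environmentDatabaseId === db.id && c.status === "active");
  if (!cred) throw new Error(`no active credential for database ${db.id}`);

  return { organizationId: env.organizationId, url: db.url, driver: db.driver, region: db.region, secret: cred.secret };
}

// Sample control-plane contents (made-up identifiers and URLs).
const controlPlane: ControlPlane = {
  environments: [
    { id: "env_prod", organizationId: "acme", slug: "prod" },
    { id: "env_dev", organizationId: "acme", slug: "dev" },
  ],
  databases: [
    { id: "db_prod", environmentId: "env_prod", url: "libsql://acme-prod.example.invalid", driver: "turso", region: "us-east" },
    { id: "db_dev", environmentId: "env_dev", url: "libsql://acme-dev.example.invalid", driver: "turso", region: "us-east" },
  ],
  credentials: [
    { id: "cred_1", environmentDatabaseId: "db_dev", status: "revoked", secret: "dev-token-1" },
    { id: "cred_2", environmentDatabaseId: "db_dev", status: "active", secret: "dev-token-2" },
    { id: "cred_3", environmentDatabaseId: "db_prod", status: "active", secret: "prod-token-1" },
  ],
};

const devTarget = resolveTarget(controlPlane, { active_environment_id: "env_dev" });
console.log(devTarget.url);
```

Because the result is a connection descriptor rather than a query predicate, a session pointed at `env_dev` has no handle on the `prod` database at all.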
| 65 | + |
| 66 | +### Provisioning API |
| 67 | + |
| 68 | +`EnvironmentProvisioningService` (new) exposes: |
| 69 | + |
| 70 | +- `provisionOrganization(req)` — atomically creates the org's **default** environment and its physical DB (replaces `provisionTenant`). |
| 71 | +- `provisionEnvironment(req)` — allocates any subsequent `dev` / `test` / `sandbox` / `preview` environment, each with its own DB and credential row. |
| 72 | +- `rotateCredential(envDbId, plaintext)` — issues a new `active` credential and revokes the previous one. |
| 73 | + |
| 74 | +Physical-DB allocation is delegated to pluggable `EnvironmentDatabaseAdapter` implementations (initially `turso`; `libsql` / `sqlite` / `postgres` drop in without core changes). |
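As a rough sketch of how the three operations and the adapter seam might fit together, the example below uses an in-memory adapter and synchronous calls. Only the names `EnvironmentDatabaseAdapter`, `provisionOrganization`, `provisionEnvironment`, and `rotateCredential` come from this ADR; the method signatures, the `allocate` method, the `ProvisioningSketch` class, and the `prod` default slug are assumptions for illustration, and a real adapter would be asynchronous and would encrypt secrets.

```typescript
type CredentialStatus = "active" | "revoked";
interface Credential { id: number; envDbId: string; secret: string; status: CredentialStatus }
interface EnvironmentRecord { organizationId: string; slug: string; envDbId: string; url: string }

// Hypothetical adapter contract: one call that allocates a physical database.
interface EnvironmentDatabaseAdapter {
  readonly driver: string;
  allocate(dbName: string): { url: string; secret: string };
}

// Stand-in backend so the sketch runs without any external service.
class InMemoryAdapter implements EnvironmentDatabaseAdapter {
  readonly driver = "memory";
  allocate(dbName: string): { url: string; secret: string } {
    return { url: `memory://${dbName}`, secret: `secret-for-${dbName}` };
  }
}

class ProvisioningSketch {
  readonly environments = new Map<string, EnvironmentRecord>();
  readonly credentials: Credential[] = [];
  private nextCredentialId = 1;
  private readonly adapter: EnvironmentDatabaseAdapter;

  constructor(adapter: EnvironmentDatabaseAdapter) {
    this.adapter = adapter;
  }

  // Creates the environment row, its database, and its first active credential.
  provisionEnvironment(req: { organizationId: string; slug: string }): string {
    const key = `${req.organizationId}/${req.slug}`;
    // Mirrors the (organization_id, slug) UNIQUE constraint on sys_environment.
    if (this.environments.has(key)) throw new Error(`environment already exists: ${key}`);
    const { url, secret } = this.adapter.allocate(`${req.organizationId}-${req.slug}`);
    const envDbId = `envdb:${key}`;
    this.environments.set(key, { organizationId: req.organizationId, slug: req.slug, envDbId, url });
    this.credentials.push({ id: this.nextCredentialId++, envDbId, secret, status: "active" });
    return envDbId;
  }

  // The default environment is assumed to use the slug "prod" in this sketch.
  provisionOrganization(req: { organizationId: string }): string {
    return this.provisionEnvironment({ organizationId: req.organizationId, slug: "prod" });
  }

  // Revokes whatever is currently active, then records the new secret as active.
  rotateCredential(envDbId: string, plaintext: string): Credential {
    for (const c of this.credentials) {
      if (c.envDbId === envDbId && c.status === "active") c.status = "revoked";
    }
    const next: Credential = { id: this.nextCredentialId++, envDbId, secret: plaintext, status: "active" };
    this.credentials.push(next);
    return next;
  }
}
```

Swapping `InMemoryAdapter` for another backend changes only the object passed to the constructor, which is the property the adapter seam is meant to provide.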
| 75 | + |
| 76 | +### Deprecation & Migration |
| 77 | + |
| 78 | +- **v4.x** keeps `sys_tenant_database` registered as a deprecation shim (TSDoc `@deprecated`, runtime log warning). The new control-plane objects ship alongside it as additive, non-breaking changes. |
| 79 | +- `migrations/v4-to-v5-env-migration.ts` ships in v4.x as a **skeleton** (stable public API) and is executed during the v5.0 upgrade. |
| 80 | +- **v5.0** removes `sys_tenant_database` and its reader code entirely. |
| 81 | + |
| 82 | +The migration is **non-destructive** and **idempotent**: each legacy org's database is reused as its new `prod` environment DB — no data movement, no cutover window. |
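The non-destructive and idempotent properties can be illustrated with a small sketch. It is not the contents of `v4-to-v5-env-migration.ts`; the row shapes and the `migrateLegacyTenants` function are hypothetical and only show the shape of the guarantee: legacy rows are read but never modified, and a second run creates nothing.

```typescript
// Hypothetical shapes for a legacy sys_tenant_database row and a migrated environment.
interface LegacyTenantDatabase { organizationId: string; url: string }
interface MigratedEnvironment { organizationId: string; slug: string; url: string }

// Registers each legacy org database as that org's `prod` environment.
// Returns how many environments were created by this run.
function migrateLegacyTenants(
  legacy: readonly LegacyTenantDatabase[],
  environments: Map<string, MigratedEnvironment>,
): number {
  let created = 0;
  for (const row of legacy) {
    const key = `${row.organizationId}/prod`;
    // Idempotence: an org that already has a prod environment is skipped.
    if (environments.has(key)) continue;
    // Non-destructive: the existing database URL is reused, so no data is copied or moved.
    environments.set(key, { organizationId: row.organizationId, slug: "prod", url: row.url });
    created++;
  }
  return created;
}

const legacyRows: LegacyTenantDatabase[] = [
  { organizationId: "acme", url: "libsql://acme.example.invalid" },
  { organizationId: "globex", url: "libsql://globex.example.invalid" },
];
const migrated = new Map<string, MigratedEnvironment>();

const firstRun = migrateLegacyTenants(legacyRows, migrated);
const secondRun = migrateLegacyTenants(legacyRows, migrated);
console.log(firstRun, secondRun);
```

Running the function again after a partial failure picks up only the organizations that were not yet registered, which is what makes a retry safe.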
| 83 | + |
| 84 | +## Consequences |
| 85 | + |
| 86 | +### Positive |
| 87 | + |
| 88 | +- **Hard isolation.** Prod and dev are separate databases; no `WHERE` clause can be forgotten. |
| 89 | +- **Independent schema evolution.** Solutions upgrade their schema in `dev`, validate, then promote via a single DB-level backup/restore into `prod`. |
| 90 | +- **Trivial backup / DR.** Per-environment backup = native DB backup. PITR stays within one environment. |
| 91 | +- **First-class Solution publishing.** "Publish" becomes a schema + metadata export from `dev` and an idempotent apply into `prod`, operating on cleanly-scoped DBs. |
| 92 | +- **Per-environment security posture.** Each environment owns its own credential, its own network ACL, its own quotas, its own retention. |
| 93 | +- **Pluggable backend.** Driver-agnostic — new backends register an `EnvironmentDatabaseAdapter` without core changes. |
| 94 | +- **Future-proof.** Quotas (`sys_quota`), subscriptions (`sys_subscription`), audit (`sys_audit_log`), DLP (`sys_dlp_policy`), and solution history (`sys_solution_history`) slot in naturally as subsequent PRs. |
| 95 | + |
| 96 | +### Negative / Trade-offs |
| 97 | + |
| 98 | +- **More databases to operate.** Every org now has ≥1 DB; heavy users of `sandbox` / `preview` environments may have 5–20. Mitigated by Turso/libSQL free-tier economics and lazy provisioning. |
| 99 | +- **Cross-environment reporting** (e.g. "how many leads across all of Acme's envs?") becomes an explicit federation query. Acceptable — such queries are rare and better expressed at the BI layer. |
| 100 | +- **Cold starts.** A dormant environment may need to be resumed on first access. Mitigated by the router's TTL cache and the adapter's warm-up hook. |
| 101 | +- **Connection sprawl.** A node handling many environments holds N connections. Mitigated by an LRU connection pool with per-env TTL (already present in the v3.4 router). |
| 102 | +- **Irrevocable breaking change at v5.0.** v4.x ships the shim and migration; v5.0 removes legacy code. Customers must run the migration before upgrading. |
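The connection-sprawl mitigation mentioned above, an LRU pool with a per-environment TTL, could look roughly like the following. This is a sketch under assumptions, not the v3.4 router's implementation: the `EnvConnectionPool` class, its options, and the `PooledConnection` shape are invented for the example, and the clock is injected so the behavior can be exercised without waiting.

```typescript
interface PooledConnection { envId: string; close(): void }

interface PoolOptions {
  maxSize: number;                            // maximum number of live connections
  ttlMs: number;                              // idle time after which a connection is reopened
  open: (envId: string) => PooledConnection;  // factory for a new connection
  now: () => number;                          // injectable clock
}

class EnvConnectionPool {
  // A Map iterates in insertion order, so the first key is always the least recently used.
  private readonly entries = new Map<string, { conn: PooledConnection; lastUsed: number }>();
  private readonly opts: PoolOptions;

  constructor(opts: PoolOptions) {
    this.opts = opts;
  }

  get size(): number {
    return this.entries.size;
  }

  acquire(envId: string): PooledConnection {
    const t = this.opts.now();
    const hit = this.entries.get(envId);

    if (hit && t - hit.lastUsed <= this.opts.ttlMs) {
      // Fresh hit: re-insert to mark this environment as most recently used.
      this.entries.delete(envId);
      this.entries.set(envId, { conn: hit.conn, lastUsed: t });
      return hit.conn;
    }

    if (hit) {
      // Stale hit: the TTL has elapsed, so close and drop it before reopening.
      hit.conn.close();
      this.entries.delete(envId);
    }

    const conn = this.opts.open(envId);
    this.entries.set(envId, { conn, lastUsed: t });

    // Evict least recently used connections until the pool is within its bound.
    while (this.entries.size > this.opts.maxSize) {
      const oldestKey = this.entries.keys().next().value;
      if (oldestKey === undefined) break;
      this.entries.get(oldestKey)?.conn.close();
      this.entries.delete(oldestKey);
    }
    return conn;
  }
}
```

With `maxSize` set to the number of environments a node is expected to serve concurrently, the pool bounds open connections at that figure regardless of how many environments exist in total.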
| 103 | + |
| 104 | +### Neutral |
| 105 | + |
| 106 | +- No change to the Zod-first, `.describe()`-on-every-field, and `sys_`-prefix invariants. |
| 107 | +- No change to the public `ObjectKernel` / plugin lifecycle. |
| 108 | +- No change to `better-auth` session shape beyond renaming `active_organization_id` → `active_environment_id` (v5.0). |
| 109 | + |
| 110 | +## Alternatives Considered |
| 111 | + |
| 112 | +1. **Stay with per-org DB + `env_id` column.** Rejected — the failure modes above are structural, not implementation bugs. |
| 113 | +2. **Schema-per-environment inside one DB.** Works for Postgres but not Turso/libSQL/SQLite, and defeats the backup/DR argument. Rejected. |
| 114 | +3. **Row-level security via Postgres RLS.** Strengthens the `env_id` approach but still leaves schema evolution coupled and DR complex. Rejected. |
| 115 | +4. **One global DB + tenant column.** Not reconsidered; already discarded in v3.4's ADR-0001. |
| 116 | + |
| 117 | +## References |
| 118 | + |
| 119 | +- `packages/spec/src/cloud/environment.zod.ts` — protocol schemas |
| 120 | +- `packages/services/service-tenant/src/objects/sys-environment*.object.ts` — control-plane objects |
| 121 | +- `packages/services/service-tenant/src/environment-provisioning.ts` — provisioning service |
| 122 | +- `packages/services/service-tenant/migrations/v4-to-v5-env-migration.ts` — migration skeleton |
| 123 | +- Power Platform environments: <https://learn.microsoft.com/power-platform/admin/environments-overview> |
| 124 | +- Salesforce sandboxes: <https://help.salesforce.com/s/articleView?id=data.sandboxes.htm> |
| 125 | +- Turso multi-DB pricing: <https://turso.tech/pricing> |