┌─────────────────────────────────────────────────────────────────────┐
│                            BROWSER (SPA)                            │
│                                                                     │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌─────────────────┐   │
│  │ Scene3D  │   │   HUD    │   │  Voice   │   │  HandControls   │   │
│  │  (R3F)   │◀──│ (state)  │──▶│  Module  │   │   (MediaPipe)   │   │
│  └──────────┘   └──────────┘   └──────────┘   └─────────────────┘   │
│       ▲              │              │                  │            │
│       │              ▼              ▼                  ▼            │
│       │       ┌──────────────────────────────────────────┐          │
│       │       │  services/geminiService (fetch client)   │          │
│       │       └──────────────────────────────────────────┘          │
│       │                            │                                │
│       │                            ▼                                │
│       │               ┌─────────────────────────┐   AudioWorklet    │
│       │               │  /api/* (server proxy)  │   (PCM encoder)   │
│       │               └─────────────────────────┘                   │
│       │                            │                                │
└───────┼────────────────────────────┼────────────────────────────────┘
        │                            │
        │                            ▼
        │           ┌─────────────────────────────────┐
        │           │       VERCEL EDGE RUNTIME       │
        │           │                                 │
        │           │  analyze ▸ diagnostics ▸ chat   │
        │           │  live-token ▸ systems/*         │
        │           └────┬──────────────────────┬─────┘
        │                │                      │
        │                ▼                      ▼
        │       ┌──────────────┐        ┌───────────────┐
        └──────▶│  Gemini API  │        │   Supabase    │
          WS    │  (REST +     │        │  (Postgres +  │
        direct  │   Live)      │        │   RLS)        │
                └──────────────┘        └───────────────┘
user → HUD.triggerAnalysis
→ services/geminiService.analyzeSystem
→ POST /api/analyze
→ Edge Function checks rate limit (Supabase RPC)
→ Edge Function calls Gemini 3.1 Pro with strict JSON schema
→ normalizeAnalysis() validates + slugs IDs + resolves connections
→ response streams back to Scene3D
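
The validation step in this flow ("validates + slugs IDs + resolves connections") can be sketched roughly as below. This is a minimal illustration, not the project's `normalizeAnalysis()`: the `Component` shape, the `slugify` helper, and the choice to drop invalid entries rather than reject the whole response are all assumptions, since the real `SystemAnalysis` type is not shown here.

```typescript
// Hypothetical component shape; the real SystemAnalysis type may differ.
interface Component {
  id: string;
  name: string;
  connections: string[];
}

// Lowercase, collapse runs of non-alphanumerics into "-", trim stray dashes.
function slugify(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

function normalizeComponents(raw: unknown): Component[] {
  if (!Array.isArray(raw)) throw new Error("components must be an array");

  // Pass 1: validate required fields and assign unique slug IDs.
  const seen = new Set<string>();
  const components: Component[] = [];
  const rawConnections: unknown[] = [];
  for (const item of raw) {
    const name: unknown = item?.name;
    const rawId: unknown = item?.id;
    if (typeof name !== "string" || name.trim() === "") continue; // drop nameless entries
    const source = typeof rawId === "string" && rawId !== "" ? rawId : name;
    const base = slugify(source) || "component";
    let id = base;
    for (let n = 2; seen.has(id); n++) id = `${base}-${n}`; // de-duplicate
    seen.add(id);
    components.push({ id, name: name.trim(), connections: [] });
    rawConnections.push(item?.connections);
  }

  // Pass 2: keep only connections that point at a known, different component.
  components.forEach((component, i) => {
    const candidate = rawConnections[i];
    const list: unknown[] = Array.isArray(candidate) ? candidate : [];
    const resolved = list
      .filter((x): x is string => typeof x === "string")
      .map((x) => slugify(x))
      .filter((target) => seen.has(target) && target !== component.id);
    component.connections = [...new Set(resolved)];
  });
  return components;
}
```

Resolving connections in a second pass means a component can reference one that appears later in the model output, and dangling references are dropped before they reach the renderer.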
user taps mic
→ services/geminiService.fetchLiveToken
→ POST /api/live-token
→ Edge Function mints ephemeral token (10 min TTL)
→ browser opens WebSocket directly to Google Live endpoint using token
→ AudioWorklet (pcm-processor) downsamples mic to 16 kHz Int16
→ frames posted via WS; responses decoded at 24 kHz for playback
→ tool_call `generate_system` triggers analyze flow via onCommand()
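
The downsampling step can be illustrated with a small pure function. This is a sketch under assumptions, not the code in `pcm-processor`: the function name is hypothetical, and the block-averaging filter is just one simple way to reduce the sample rate. In a real worklet this logic would run inside `AudioWorkletProcessor.process()` on each input block.

```typescript
// Convert Float32 samples in [-1, 1] at inputRate to 16-bit PCM at targetRate.
function downsampleToInt16(
  input: Float32Array,
  inputRate: number,
  targetRate = 16000,
): Int16Array {
  if (targetRate > inputRate) throw new Error("targetRate must not exceed inputRate");
  const ratio = inputRate / targetRate;
  const outLength = Math.floor(input.length / ratio);
  const out = new Int16Array(outLength);
  for (let i = 0; i < outLength; i++) {
    // Average the input samples covered by this output sample (a crude low-pass).
    const start = Math.floor(i * ratio);
    const end = Math.min(input.length, Math.floor((i + 1) * ratio));
    let sum = 0;
    for (let j = start; j < end; j++) sum += input[j];
    const sample = Math.max(-1, Math.min(1, sum / (end - start)));
    // Scale asymmetrically so -1 maps to -32768 and +1 maps to 32767.
    out[i] = Math.round(sample < 0 ? sample * 0x8000 : sample * 0x7fff);
  }
  return out;
}
```

With a 48 kHz microphone the ratio is exactly 3, so every three input samples collapse into one 16-bit output sample.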
user clicks Share
→ services/storage.saveSystem → POST /api/systems/save
→ Supabase INSERT into `systems` with random share_hash
→ client copies `https://.../?s=<hash>` to clipboard
→ recipient loads page; App reads ?s= and calls GET /api/systems/load
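
The link-building and link-reading halves of this flow might look like the sketch below. The hex encoding, the 12-byte length, and all function names are assumptions for illustration; the document only states that `share_hash` is random and travels in the `?s=` query parameter. `example.com` in the usage note is a placeholder origin.

```typescript
// Encode bytes as lowercase hex (two characters per byte).
function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Generate a random hash using the Web Crypto API available in browsers and edge runtimes.
function newShareHash(byteLength = 12): string {
  return toHex(crypto.getRandomValues(new Uint8Array(byteLength)));
}

// Build the shareable URL: <origin>/?s=<hash>
function buildShareUrl(origin: string, hash: string): string {
  const url = new URL("/", origin);
  url.searchParams.set("s", hash);
  return url.toString();
}

// Read the hash back on page load; reject anything that is not plain hex.
function readShareHash(href: string): string | null {
  const hash = new URL(href).searchParams.get("s");
  return hash !== null && /^[0-9a-f]+$/.test(hash) ? hash : null;
}
```

For example, `buildShareUrl("https://example.com", "abc123")` yields `https://example.com/?s=abc123`, and `readShareHash` on that URL returns `"abc123"`. Validating the format on read keeps malformed values from reaching the load endpoint.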
├── api/ Vercel Edge Functions
│ ├── _shared/ Shared helpers (cors, ratelimit, validate, schemas, gemini)
│ ├── systems/ save, load, list
│ ├── analyze.ts Gemini Pro → SystemAnalysis
│ ├── diagnostics.ts Gemini Flash → Diagnostic issues
│ ├── chat.ts Gemini Flash → chat reply
│ ├── live-token.ts Ephemeral Live API token
│ └── health.ts /api/health for uptime checks
├── components/
│ ├── Interface/ HUD, VoiceModule, Toast, ErrorBoundary, GlassCard
│ └── Simulation/ Scene3D, HandControls
├── hooks/ useLiveSession, useToast, useKeyboardShortcuts
├── services/ geminiService (fetch client), storage, supabaseClient
├── public/ manifest, favicon, robots, sitemap, worklets/
├── supabase/ config.toml + migrations/
├── tests/ vitest setup + specs
└── docs/ ARCHITECTURE, DEPLOYMENT, API, CONTRIBUTING
- Server proxy over direct-from-browser calls. Prevents API key leakage and gives us a natural rate-limit choke point. Cost: one extra hop (~40ms at the edge).
- Ephemeral tokens for Live API. Avoids proxying WebSocket traffic (harder on serverless), while still never exposing the long-lived Gemini key.
- Supabase as the single backend store. Covers DB (systems), rate-limit state, and future Auth: one vendor instead of Redis + Auth0 + Firestore.
- AudioWorklet over ScriptProcessor. Runs off the main thread with predictable timing and no UI jank; ScriptProcessorNode is also deprecated in the Web Audio API.
- Strict JSON schema on Gemini output with server-side validation (`normalizeAnalysis`) - the WebGL layer can never crash from a missing field.
- Adaptive quality tier in R3F. `PerformanceMonitor` downgrades shadows, environment, and star count on low-FPS devices.
- Error boundaries around each major subsystem. A bad shader crash doesn't take down the HUD; a gesture failure doesn't take down the 3D view.
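
The adaptive quality tier can be pictured as a settings table plus a small decision function. The sketch below is an assumption-heavy illustration: the settings values, the FPS thresholds, and the `nextTier` helper are hypothetical, and it presumes the `PerformanceMonitor` mentioned above is the component from `@react-three/drei`, whose `onDecline` and `onIncline` callbacks would drive a function like this.

```typescript
type Tier = 1 | 2;

interface QualitySettings {
  shadows: boolean;
  environment: boolean;
  starCount: number;
}

// Illustrative values only; the project's real settings are not shown in this document.
const QUALITY: Record<Tier, QualitySettings> = {
  2: { shadows: true, environment: true, starCount: 5000 },
  1: { shadows: false, environment: false, starCount: 1500 },
};

// Hysteresis: drop a tier below 45 FPS, climb back only above 58 FPS,
// so the scene does not flip between tiers on every small fluctuation.
function nextTier(current: Tier, averageFps: number): Tier {
  if (current === 2 && averageFps < 45) return 1;
  if (current === 1 && averageFps > 58) return 2;
  return current;
}
```

The gap between the two thresholds is the design point: a device hovering around 50 FPS stays on whichever tier it is already using instead of oscillating.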
- `GEMINI_API_KEY` is server-only (Vercel env var, never in the bundle).
- Supabase service role key is server-only; browser uses the anon key protected by RLS.
- All user input is length-capped and control-character-stripped before hitting Gemini.
- Rate limits are per-IP (hashed with salt before storage).
- Permissions-Policy restricts camera/mic to same-origin.
- No inline `eval`/`dangerouslySetInnerHTML`.
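
The input-handling rule above (length cap plus control-character stripping) could be implemented along these lines. The function name, the 2000-character default, and the choice to turn tabs and newlines into spaces before stripping are assumptions made for this sketch, not documented behavior.

```typescript
// Strip control characters and cap the length before text is sent to the model.
function sanitizeUserInput(text: string, maxLength = 2000): string {
  const cleaned = text
    .replace(/[\t\n\r]+/g, " ") // whitespace controls become a single space
    .replace(/[\u0000-\u001F\u007F]/g, "") // remaining C0 controls and DEL are removed
    .trim();
  // Slice by code point so a surrogate pair is never cut in half.
  return Array.from(cleaned).slice(0, maxLength).join("");
}
```

Running this on the server side of `/api/*` matters more than running it in the browser, since a client can bypass any check that lives only in the bundle.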
| Metric | Target |
|---|---|
| First Contentful Paint | < 1.5s |
| Largest Contentful Paint | < 2.5s |
| Time to Interactive | < 3.5s |
| Analyze (Gemini Pro) | P50 8s / P99 20s |
| Diagnostics (Gemini Flash) | P50 2s / P99 5s |
| Voice round-trip | P50 700ms / P99 1.5s |
| 3D frame budget @ tier 2 | 16.6 ms (60 FPS) |
| 3D frame budget @ tier 1 | 33.3 ms (30 FPS on low-end) |
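
To check measured latencies against the P50/P99 targets in the table, a nearest-rank percentile is usually enough. The helpers below are a hypothetical sketch (none of these names come from the project); `frameBudgetMs` simply restates the arithmetic behind the frame-budget rows, where 1000 ms / 60 FPS is about 16.7 ms and 1000 ms / 30 FPS is about 33.3 ms.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(0, rank - 1)];
}

// Per-frame time budget in milliseconds for a target frame rate.
function frameBudgetMs(fps: number): number {
  return 1000 / fps;
}

// True when both the median and the tail latency are within their targets.
function meetsTarget(samplesMs: number[], p50TargetMs: number, p99TargetMs: number): boolean {
  return percentile(samplesMs, 50) <= p50TargetMs && percentile(samplesMs, 99) <= p99TargetMs;
}
```

For instance, a set of voice round-trip samples would be checked with `meetsTarget(samples, 700, 1500)`.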