This document defines the standard architecture patterns for building products in the AI Product OS.
All architecture and engineering agents must reference this guide when designing systems.
- Start with a single Next.js application containing frontend, backend API routes, and cron endpoints
- Avoid premature microservices - they add complexity without MVP-stage benefits
- Split services only when you have clear scalability requirements backed by metrics
- Platform: Default to Vercel for Next.js deployments (serverless functions + edge runtime)
- Constraints:
- 10-second timeout for Hobby tier, 60s for Pro (design accordingly)
- No persistent processes or long-running background jobs
- Stateless functions only
- Fan-Out Pattern: Cron jobs should trigger, not process. Use worker invocations for per-entity operations.
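One way to design around the function timeout is to stop work shortly before the deadline and hand the remainder to a later invocation. The sketch below is only an illustration: `processWithinBudget`, its parameters, and the injectable clock are hypothetical names, not part of Vercel or Next.js.

```typescript
// Hypothetical helper: process items until the time budget is nearly spent,
// then return the unprocessed items so a later invocation can pick them up.
interface BudgetResult<T> {
  processed: T[];
  remaining: T[];
}

async function processWithinBudget<T>(
  items: T[],
  handle: (item: T) => Promise<void>,
  budgetMs: number,
  safetyMarginMs: number,
  now: () => number = Date.now // injectable clock (assumption, eases testing)
): Promise<BudgetResult<T>> {
  const deadline = now() + budgetMs - safetyMarginMs;
  const processed: T[] = [];
  for (const item of items) {
    // Stop before starting new work once the deadline has been reached
    if (now() >= deadline) break;
    await handle(item);
    processed.push(item);
  }
  return { processed, remaining: items.slice(processed.length) };
}
```

A worker could call this with a budget just under its tier's timeout and re-enqueue `remaining` through whatever trigger mechanism the product already uses.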
- Define schema before writing application code
- Use relational databases (PostgreSQL via Supabase) for structured data
- Enable Row-Level Security (RLS) from day one, even in MVP
- Write schema.sql files that can be run idempotently
- Framework: Next.js 16+ (App Router)
- Language: TypeScript (strict mode)
- Styling: Tailwind CSS 4+ (utility-first, responsive design)
- UI Components:
- Framer Motion (animations)
- Lucide React (icons)
- Radix UI or Shadcn (accessible primitives)
- State Management: React hooks + optimistic UI patterns
- Client-Side Routing: Next.js App Router navigation
- API Layer: Next.js API Routes (serverless functions)
- Database: Supabase (PostgreSQL + Auth + Storage)
- ORM/Client: @supabase/supabase-js (official client)
- Authentication: Supabase GoTrue (email/password, OAuth providers)
- Cron Jobs: Vercel Cron or external triggers (Upstash QStash)
- Primary: Google Gemini (@google/genai SDK)
- Use gemini-2.5-flash for speed-critical operations (<2s response)
- Use gemini-2.5-pro for complex reasoning
- Always use Structured Outputs (JSON Schema) to guarantee valid responses
- Alternatives: OpenAI GPT-4, Anthropic Claude
- Prompt Engineering: Store prompts in code, version control them
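Structured outputs make malformed responses far less likely, but a runtime shape check is still a cheap safeguard before persisting anything. The following is a minimal sketch of such a check; the `ClassificationResult` type and the category values are assumptions, chosen to mirror a schema with a required enum `category` and an optional numeric `confidence`.

```typescript
// Hypothetical result shape: required enum `category`, optional `confidence`.
const CATEGORIES = ['option1', 'option2', 'option3'] as const;
type Category = (typeof CATEGORIES)[number];

interface ClassificationResult {
  category: Category;
  confidence?: number;
}

// Type guard: true only when the parsed value matches the expected shape.
function isValidResult(value: unknown): value is ClassificationResult {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const categoryOk =
    typeof candidate.category === 'string' &&
    (CATEGORIES as readonly string[]).includes(candidate.category);
  const confidenceOk =
    candidate.confidence === undefined ||
    (typeof candidate.confidence === 'number' &&
      Number.isFinite(candidate.confidence));
  return categoryOk && confidenceOk;
}
```

When the guard returns false, the caller can fall back to a safe default instead of writing an unexpected shape to the database.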
- Product Analytics: PostHog (web + server-side events)
- Error Tracking: Vercel Runtime Logs (MVP), Sentry (production)
- Performance: Built-in Next.js analytics, Web Vitals
// app/api/[resource]/route.ts
export async function GET(req: Request) {
// List/Read operations
// MUST include .limit() on queries
}
export async function POST(req: Request) {
// Create operations
// Validate input, call AI if needed, persist to DB
}
export async function PUT(req: Request) {
// Update operations
// Use query params for IDs: /api/tasks?id=123
}
export async function DELETE(req: Request) {
// Delete operations (use soft deletes where possible)
}

- Validation: Check auth, validate input shape/size
- External Calls: AI APIs, third-party services (MUST await)
- Database Operation: Single source of truth
- Response: Return JSON with success/error structure
- Telemetry: Log event to PostHog before returning
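The validation and response steps above can be sketched as a pure function that a route handler calls before any AI or database work. Everything named here (`ApiResult`, `TaskInput`, `validateTaskInput`, and the 500-character limit) is a hypothetical illustration, not an existing API.

```typescript
// Hypothetical response envelope with a success/error structure.
type ApiResult<T> =
  | { success: true; data: T }
  | { success: false; error: string; status: number };

interface TaskInput {
  title: string;
}

const MAX_TITLE_LENGTH = 500; // assumed size limit, tune per product

// Validate shape and size of an untrusted request body.
function validateTaskInput(body: unknown): ApiResult<TaskInput> {
  if (typeof body !== 'object' || body === null) {
    return { success: false, error: 'Body must be a JSON object', status: 400 };
  }
  const title = (body as Record<string, unknown>).title;
  if (typeof title !== 'string' || title.trim().length === 0) {
    return { success: false, error: 'title is required', status: 400 };
  }
  if (title.length > MAX_TITLE_LENGTH) {
    return { success: false, error: 'title is too long', status: 400 };
  }
  return { success: true, data: { title: title.trim() } };
}
```

Keeping validation free of framework types makes it easy to unit test and to reuse across POST and PUT handlers.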
CREATE TABLE [entity_name] (
id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
user_id UUID REFERENCES auth.users(id), -- If user-scoped
-- Business fields
[field_name] TEXT NOT NULL,
-- Metadata
created_at TIMESTAMP WITH TIME ZONE DEFAULT timezone('utc', now()) NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT timezone('utc', now()) NOT NULL
);
-- Enable RLS
ALTER TABLE [entity_name] ENABLE ROW LEVEL SECURITY;
-- Create policy
CREATE POLICY "Users can only access their own data"
ON [entity_name]
FOR ALL
USING (auth.uid() = user_id);

Use PostgreSQL ENUMs for constrained string fields:
CREATE TYPE task_status AS ENUM ('todo', 'in_progress', 'done');
CREATE TYPE priority AS ENUM ('low', 'medium', 'high');

-- Index frequently queried columns
CREATE INDEX idx_tasks_user_id ON tasks(user_id);
CREATE INDEX idx_tasks_status ON tasks(status);
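CREATE INDEX idx_tasks_created_at ON tasks(created_at DESC);

import { GoogleGenAI, Type } from '@google/genai';

// Server-only: the API key must never reach the client bundle
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({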
model: 'gemini-2.5-flash',
contents: prompt,
config: {
responseMimeType: "application/json",
responseSchema: {
type: Type.OBJECT,
properties: {
category: {
type: Type.STRING,
enum: ['option1', 'option2', 'option3']
},
confidence: { type: Type.NUMBER }
},
required: ["category"]
}
}
});

let result: any = null;
let isFallback = false;
try {
// Strip markdown codeblocks; response.text can be undefined, so default to ''
const cleanText = (response.text ?? '')
.replace(/```json\n?/g, '')
.replace(/```\n?/g, '')
.trim();
result = JSON.parse(cleanText);
// Validate shape
if (!isValidResult(result)) {
isFallback = true;
}
} catch (e) {
console.error("AI parsing failed:", e);
isFallback = true;
}
if (isFallback) {
// Apply safe default to prevent data loss
result = { category: 'uncategorized', title: rawInput };
}

// ANTI-PATTERN: Master cron processes all users synchronously
export async function GET() {
const users = await fetchAllUsers();
for (const user of users) {
await processUser(user); // Sequential, hits timeout
}
}
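// CORRECT: Master cron triggers, workers process
export async function GET(req: Request) {
  const users = await fetchAllUsers();
  // Server-side fetch needs an absolute URL, so resolve it against the incoming request
  const workerUrl = new URL('/api/worker', req.url);
  // Fan out to individual worker invocations
  await Promise.allSettled(
    users.map(user =>
      fetch(workerUrl, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ userId: user.id })
      })
    )
  );
  return new Response('Triggered');
}

// ANTI-PATTERN: N+1 queries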
for (const userId of userIds) {
const data = await db.from('tasks').select().eq('user_id', userId);
}
// CORRECT: Single batched query
const allData = await db
  .from('tasks')
  .select()
  .in('user_id', userIds);

// Client-side
import { supabase } from '@/lib/supabase';
const { data, error } = await supabase.auth.signUp({
email,
password
});
// Server-side (API route)
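const token = req.headers.get('authorization')?.split(' ')[1];
// getUser returns { data: { user }, error }; user is null when the token is invalid
const { data: { user }, error: authError } = await supabase.auth.getUser(token);

-- Read own data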
CREATE POLICY "Users read own tasks"
ON tasks FOR SELECT
USING (auth.uid() = user_id);
-- Insert own data
CREATE POLICY "Users insert own tasks"
ON tasks FOR INSERT
WITH CHECK (auth.uid() = user_id);
-- Update own data
CREATE POLICY "Users update own tasks"
ON tasks FOR UPDATE
USING (auth.uid() = user_id);

function classifyError(error: any): 'transient' | 'permanent' | 'unknown' {
const status = error.response?.status;
// Permanent errors (don't retry, may suspend user)
if ([400, 401, 403, 404].includes(status)) return 'permanent';
// Transient errors (safe to retry)
if ([429, 500, 502, 503, 504].includes(status)) return 'transient';
return 'unknown';
}
// Only apply account-level consequences for permanent errors
if (classifyError(twilioError) === 'permanent') {
await db.from('users').update({ is_active: false }).eq('id', userId);
}

// REQUIRED: Always enforce limits
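import { NextResponse } from 'next/server';

export async function GET(req: Request) {
  const url = new URL(req.url);
  // Fall back to defaults when params are missing, non-numeric, or below 1
  const page = Math.max(parseInt(url.searchParams.get('page') || '1', 10) || 1, 1);
  const limit = Math.min(
    Math.max(parseInt(url.searchParams.get('limit') || '100', 10) || 100, 1),
    100 // Hard cap
  );
  const { data, error } = await supabase
    .from('tasks')
    .select()
    .range((page - 1) * limit, page * limit - 1);
  if (error) {
    return NextResponse.json({ error: error.message }, { status: 500 });
  }
  return NextResponse.json({ data, page, limit });
}

// REQUIRED: Both page limit AND temporal bound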
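let pageCount = 0;
let pageToken: string | undefined;
const MAX_PAGES = 5;
const LOOKBACK_DAYS = 30;
// do...while so the first page is fetched before any page token exists
do {
  const messages = await gmail.users.messages.list({
    userId: 'me',
    pageToken,
    q: `newer_than:${LOOKBACK_DAYS}d` // Temporal bound
  });
  // The googleapis client wraps the payload in `data`
  pageToken = messages.data.nextPageToken ?? undefined;
  pageCount++;
} while (pageToken && pageCount < MAX_PAGES);

Structure events by lifecycle stage: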
- landing_page_viewed
- signup_started
- signup_completed
- first_action_completed

- [feature]_submitted (user initiates action)
- [feature]_completed (system confirms success)
- [feature]_failed (system encounters error)

- ai_fallback_triggered (AI processing failed)
- api_timeout (external service slow)
- retry_exhausted (permanent failure)
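The [feature]_submitted / _completed / _failed naming convention can be enforced at compile time rather than by review. This is a small sketch under that assumption; `Lifecycle`, `FeatureEvent`, and `featureEvent` are hypothetical names and not part of the PostHog SDK.

```typescript
// Hypothetical helper that derives lifecycle event names for a feature.
type Lifecycle = 'submitted' | 'completed' | 'failed';
type FeatureEvent<F extends string> = `${F}_${Lifecycle}`;

// Builds names such as "task_submitted"; a misspelled stage fails to compile.
function featureEvent<F extends string>(
  feature: F,
  stage: Lifecycle
): FeatureEvent<F> {
  return `${feature}_${stage}` as FeatureEvent<F>;
}
```

The returned string can be passed directly as the event name when capturing, which keeps frontend and backend events consistent.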
// Frontend (posthog-js)
import { usePostHog } from 'posthog-js/react';
const posthog = usePostHog();
posthog.capture('task_submitted', {
input_length: text.length,
timestamp: Date.now()
});
// Backend (posthog-node)
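import { PostHog } from 'posthog-node';
// Key and host names match the environment variables listed in this guide
const client = new PostHog(process.env.NEXT_PUBLIC_POSTHOG_KEY!, {
  host: process.env.NEXT_PUBLIC_POSTHOG_HOST
});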
client.capture({
distinctId: userId,
event: 'task_categorized',
properties: {
category,
ai_latency_ms: latency
}
});
await client.shutdown();

# Environment Variables (set in Vercel dashboard)
NEXT_PUBLIC_SUPABASE_URL=https://xxx.supabase.co
NEXT_PUBLIC_SUPABASE_ANON_KEY=eyJxxx
SUPABASE_SERVICE_ROLE_KEY=eyJxxx # Server-only
GEMINI_API_KEY=AIzxxx # Server-only
NEXT_PUBLIC_POSTHOG_KEY=phc_xxx
NEXT_PUBLIC_POSTHOG_HOST=https://app.posthog.com

- Schema Applied: Run schema.sql against Supabase before first deploy
- Env Vars Set: All required secrets configured in platform
- Build Success: npm run build completes without errors
- RLS Enabled: Row-level security active on all tables
- Telemetry Wired: PostHog events firing on critical paths
- Error Handling: All API routes return proper 400/500 responses
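The "Env Vars Set" item in the checklist above can be checked in code instead of by eye. The sketch below assumes the variable names from this guide's environment list; `findMissingEnvVars` is a hypothetical helper.

```typescript
// Variable names taken from the environment list in this guide.
const REQUIRED_ENV_VARS = [
  'NEXT_PUBLIC_SUPABASE_URL',
  'NEXT_PUBLIC_SUPABASE_ANON_KEY',
  'SUPABASE_SERVICE_ROLE_KEY',
  'GEMINI_API_KEY',
  'NEXT_PUBLIC_POSTHOG_KEY',
  'NEXT_PUBLIC_POSTHOG_HOST',
] as const;

// Hypothetical helper: returns the names that are unset or blank.
function findMissingEnvVars(
  env: Record<string, string | undefined>
): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}
```

Calling it with `process.env` at startup and failing fast on a non-empty result surfaces a missing secret at deploy time rather than on the first user request.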
- Fire-and-forget promises in serverless functions
- Unbounded database queries without .limit()
- Synchronous loops over async operations (use Promise.all())
- Processing all users in a single cron execution
- Using AI snippets/previews instead of full payloads
- Treating all third-party errors as permanent failures
- Skipping RLS because "it's just an MVP"
- Adding telemetry after QA instead of during implementation
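To avoid treating every third-party error as permanent, a retry decision can be separated from the call itself. The sketch below is one possible shape: the status codes follow the transient/permanent split used in this guide, while `backoffDelayMs`, `shouldRetry`, and their default values are hypothetical.

```typescript
// Status codes treated as transient (safe to retry) in this guide.
const TRANSIENT_STATUSES = [429, 500, 502, 503, 504];

// Hypothetical helper: exponential backoff for a zero-based attempt index,
// capped at maxDelayMs.
function backoffDelayMs(
  attempt: number,
  baseDelayMs = 500,
  maxDelayMs = 8000
): number {
  return Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
}

// Hypothetical helper: retry only transient failures, and only while
// another attempt remains. Unknown statuses are not retried.
function shouldRetry(
  status: number | undefined,
  attempt: number,
  maxAttempts = 3
): boolean {
  if (status === undefined) return false;
  return TRANSIENT_STATUSES.includes(status) && attempt < maxAttempts - 1;
}
```

A caller would wait `backoffDelayMs(attempt)` between attempts and apply account-level consequences only after a permanent error, never after a transient one that simply ran out of retries.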
When designing architecture, answer these questions:
- Scalability: Will this work with 10x users? 100x?
- Failure Modes: What happens if the AI times out? If the DB is slow?
- Cost: Does this architecture fit within platform free tiers?
- Observability: Can I debug this in production with logs alone?
- Security: Is user data properly isolated and encrypted?
- Next.js App Router: https://nextjs.org/docs/app
- Supabase Docs: https://supabase.com/docs
- Vercel Limits: https://vercel.com/docs/limits
- PostHog Docs: https://posthog.com/docs
This guide is a living document. Update it when postmortems reveal new patterns or anti-patterns.