Project Context: A voice-powered technical interview preparation platform using Next.js 16, Azure AI services (OpenAI GPT-4o-mini, Speech TTS/STT), TypeScript, Prisma 7, and PostgreSQL.
- Azure Services Integration
- Next.js 16 & React 19
- Offline-First Architecture
- AI Prompt Engineering
- Real-Time Audio Processing
- TypeScript & Type Safety
- Infrastructure as Code (Terraform)
- Developer Experience
Problem:
- Azure Speech SDK runs in the browser and requires authentication
- Exposing API keys client-side is a major security vulnerability
- Direct API key usage violates security best practices
Solution:
// Server-side token endpoint (app/api/speech/token/route.ts)
export async function GET() {
const { token, region, expiresAt } = await generateSpeechToken();
return NextResponse.json({ token, region, expiresAt });
}
// Client-side usage
const { token, region } = await fetch('/api/speech/token').then(r => r.json());
const speechConfig = SpeechSDK.SpeechConfig.fromAuthorizationToken(token, region);Lessons:
- ✅ Always use authorization tokens for client-side Azure SDK usage (10-minute lifetime)
- ✅ Implement token refresh logic when sessions exceed 10 minutes
- ✅ Keep API keys server-side only via environment variables
⚠️ Token generation adds latency - cache tokens client-side until near expiry
Problem:
- API versions change frequently (monthly preview releases)
- New features require specific API versions (e.g., prompt caching)
- Documentation doesn't always specify minimum version requirements
Solution:
// lib/azure-openai.ts
const openaiClient = new AzureOpenAI({
apiKey: process.env.AZURE_OPENAI_API_KEY,
endpoint: process.env.AZURE_OPENAI_ENDPOINT,
apiVersion: process.env.AZURE_OPENAI_API_VERSION || '2024-10-21', // Latest GA
});Lessons:
- ✅ Use latest GA version (2024-10-21 as of Jan 2026) for stability
- ✅ Make API version configurable via environment variables
- ✅ Test with preview versions for new features (e.g.,
2024-10-01-previewfor prompt caching) ⚠️ Monitor breaking changes - Azure OpenAI has monthly preview releases
Key Findings:
| Feature | Minimum API Version | Status |
|---|---|---|
| JSON Mode | 2023-12-01-preview | GA |
| Structured Outputs | 2024-08-01-preview | GA |
| Prompt Caching | 2024-10-01-preview | Preview |
| gpt-4o-mini | 2024-08-01-preview | GA |
References:
Problem:
- Not all Azure regions support neural TTS voices
- OpenAI voices only available in specific regions (North Central US, Sweden Central)
- Region mismatch between Speech and OpenAI resources causes deployment issues
Solution:
# infra/variables.tf
variable "location" {
description = "Primary Azure region"
default = "southcentralus" # Has gpt-4o-mini
}
variable "openai_location" {
description = "OpenAI-specific region (may differ)"
default = "southcentralus" # Verify model availability
}Lessons:
- ✅ Check model availability per region before deployment
- ✅ Use different regions for OpenAI vs Speech if needed
- ✅ Document region dependencies in Terraform comments
⚠️ Central US lacks gpt-4o-mini - use South Central US, East US, or West US
References:
Problem:
- Initial cost estimates were 4-5x lower than actual pricing (Jan 2026 price updates)
- No built-in cost alerts for OpenAI token usage
- Speech Service charges accumulate quickly with TTS/STT usage
Reality Check:
INITIAL ESTIMATE (Outdated):
- OpenAI: $0.03/$0.12 per 1M tokens → $6-20/month
ACTUAL PRICING (Jan 2026):
- OpenAI: $0.165/$0.66 per 1M tokens → $12-28/month
- Speech TTS: $16/1M characters (was documented as $4)
- Speech STT: $1/hour real-time (unchanged)
Solution:
// Implement usage tracking
export async function generateQuestions(config) {
const startTime = Date.now();
const response = await client.chat.completions.create({...});
// Track token usage
console.log('Token usage:', {
prompt_tokens: response.usage?.prompt_tokens,
completion_tokens: response.usage?.completion_tokens,
total_tokens: response.usage?.total_tokens,
estimated_cost: calculateCost(response.usage),
});
return response;
}Lessons:
- ✅ Always verify pricing via Microsoft Learn MCP - docs can be outdated
- ✅ Use F0 (Free) tier during development (500K TTS chars + 5 hours STT/month)
- ✅ Enable prompt caching for 50% input token savings (requires 2024-10-01-preview+)
- ✅ Set Azure cost alerts at $10, $25, $50 thresholds
- ✅ Log token usage on every API call for monitoring
⚠️ Prompt caching only works with identical prefixes (system prompts, context)
References:
Problem:
- Breaking change in Next.js 16:
paramsis now aPromise - Code that worked in Next.js 13-15 throws
TypeErrorin Next.js 16 - No clear migration guide in early documentation
Before (Next.js 13-15):
// ❌ BROKEN in Next.js 16
export default async function InterviewPage({
params,
}: {
params: { id: string };
}) {
const session = await prisma.interviewSession.findUnique({
where: { id: params.id }, // TypeError: Cannot read property 'id' of Promise
});
}After (Next.js 16):
// ✅ CORRECT for Next.js 16+
export default async function InterviewPage({
params,
}: {
params: Promise<{ id: string }>;
}) {
const { id } = await params; // Must await!
const session = await prisma.interviewSession.findUnique({
where: { id },
});
}Lessons:
- ✅ Always
await paramsin dynamic routes (pages with[id],[slug], etc.) - ✅ Update TypeScript types to
Promise<{ ... }>for params - ✅ Apply to ALL dynamic routes - API routes, pages,
generateMetadata() ⚠️ This is a source breaking change - code won't compile until fixed
References:
Problem:
- Default to Server Components, but need Client Components for:
- Event handlers (
onClick,onChange) - React hooks (
useState,useEffect,useContext) - Browser APIs (
window,navigator,localStorage) - Third-party libraries (Azure Speech SDK, audio recording)
- Event handlers (
Solution:
// ✅ GOOD - Server Component (default)
// app/interview/[id]/page.tsx
export default async function InterviewPage({ params }) {
const { id } = await params;
const session = await prisma.interviewSession.findUnique({ where: { id } });
const questions = await prisma.interviewQuestion.findMany({ where: { sessionId: id } });
return <InterviewRoomClient session={session} questions={questions} />;
}
// ✅ GOOD - Client Component (when needed)
// app/interview/[id]/client.tsx
'use client';
export function InterviewRoomClient({ session, questions }) {
const [isRecording, setIsRecording] = useState(false);
// ... event handlers, hooks, browser APIs
}Lessons:
- ✅ Keep data fetching in Server Components for direct DB/API access
- ✅ Use Client Components only for interactivity - keeps bundle size small
- ✅ Pass data as props from Server → Client Components
- ✅ Split into server + client wrappers for complex pages
⚠️ 'use client'boundary is sticky - all imports become client-side
Metrics:
- Initial load reduced by ~40% using Server Components
- Better SEO with fully-rendered HTML
- Faster Time to Interactive (TTI)
Problem:
- Next.js 16 uses Turbopack by default (no
--turbopackflag needed) - Some npm packages incompatible with Turbopack (rare edge cases)
- Hot reload behavior differs from Webpack
Solution:
// package.json
{
"scripts": {
"dev": "next dev", // Uses Turbopack by default
"dev:webpack": "next dev --webpack" // Fallback if needed
}
}Lessons:
- ✅ Turbopack is faster - 5-10x faster cold starts, 3x faster HMR
- ✅ No configuration needed - works out of the box
- ✅ Keep Webpack fallback for compatibility testing
⚠️ Some edge cases exist - test thoroughly before production
Performance Gains:
| Metric | Webpack | Turbopack | Improvement |
|---|---|---|---|
| Cold Start | ~8-12s | ~2-3s | 5-6x faster |
| HMR | ~1-2s | ~200-500ms | 3-4x faster |
| Full Reload | ~5-8s | ~1-2s | 4x faster |
Problem:
- Users lose work if network drops during interview
- Browser
localStoragetoo small (5-10 MB limit) - Need structured storage for responses, transcriptions, audio blobs
Solution:
// lib/offline-storage.ts
import { openDB } from 'idb';
const DB_NAME = 'ai-tech-interview';
const DB_VERSION = 1;
const db = await openDB(DB_NAME, DB_VERSION, {
upgrade(db) {
// Pending responses queue
db.createObjectStore('pending-responses', { keyPath: 'id', autoIncrement: true });
// Cached sessions
db.createObjectStore('sessions', { keyPath: 'id' });
// Cached questions
db.createObjectStore('questions', { keyPath: 'id' });
},
});
// Save response when offline
export async function savePendingResponse(response: PendingResponse) {
const db = await getDB();
await db.add('pending-responses', {
...response,
status: 'pending',
retryCount: 0,
createdAt: new Date().toISOString(),
});
}Lessons:
- ✅ Use
idbwrapper library - cleaner API than raw IndexedDB - ✅ Store audio as Blobs directly in IndexedDB (no size limit like localStorage)
- ✅ Implement retry logic with exponential backoff (1s, 2s, 4s, 8s)
- ✅ Show pending count in UI to keep users informed
⚠️ IndexedDB is asynchronous - all operations return Promises⚠️ Test quota limits - browsers may clear data when storage is low
References:
Problem:
navigator.onLineis unreliable (reports online even with no internet)- Need to detect actual connectivity to backend API
- Avoid sync storms when multiple tabs reconnect simultaneously
Solution:
// lib/offline-storage.ts
export function isOnline(): boolean {
if (typeof window === 'undefined') return true; // SSR safe
return navigator.onLine;
}
export function onNetworkChange(callback: (online: boolean) => void): () => void {
if (typeof window === 'undefined') return () => {};
const handleOnline = () => callback(true);
const handleOffline = () => callback(false);
window.addEventListener('online', handleOnline);
window.addEventListener('offline', handleOffline);
return () => {
window.removeEventListener('online', handleOnline);
window.removeEventListener('offline', handleOffline);
};
}
// hooks/useOfflineSupport.ts
useEffect(() => {
const cleanup = onNetworkChange(async (online) => {
setState((prev) => ({ ...prev, isOnline: online }));
if (online && !syncInProgress.current) {
await triggerSync(); // Auto-sync when reconnecting
}
});
return cleanup;
}, []);Lessons:
- ✅ Listen to online/offline events but verify with API health check
- ✅ Auto-sync on reconnect for seamless UX
- ✅ Show offline indicator prominently in UI
- ✅ Use sync flags to prevent concurrent syncs
⚠️ navigator.onLineis a hint only - always verify with actual request⚠️ Implement debouncing for rapid network toggles
UX Impact:
- Users can complete interviews even with spotty WiFi
- Zero data loss if network drops mid-response
- Transparent background sync when reconnected
Problem:
- Audio recordings (Blob objects) need offline persistence
- Blobs can be 1-10 MB per response (30-60s audio at 128kbps)
- Need efficient storage and retrieval
Solution:
// Store audio blob directly
export async function savePendingResponse(response: {
questionId: string;
sessionId: string;
audioBlob: Blob;
transcription: string;
durationSeconds: number;
}) {
const db = await getDB();
await db.add('pending-responses', {
questionId: response.questionId,
sessionId: response.sessionId,
audioBlob: response.audioBlob, // Store Blob directly!
transcription: response.transcription,
durationSeconds: response.durationSeconds,
status: 'pending',
retryCount: 0,
});
}
// Convert to FormData for upload
export async function syncPendingData() {
const responses = await db.getAll('pending-responses');
for (const response of responses) {
const formData = new FormData();
formData.append('audio', response.audioBlob, 'recording.webm');
formData.append('transcription', response.transcription);
// ... other fields
const result = await fetch('/api/responses', {
method: 'POST',
body: formData,
});
}
}Lessons:
- ✅ IndexedDB supports Blobs natively - no need for base64 encoding
- ✅ Store original Blob - don't convert to string (avoids 33% size increase)
- ✅ Use FormData for upload - handles multipart encoding automatically
- ✅ Set content type on Blob creation (
audio/webm,audio/mp4) ⚠️ Monitor quota usage - browsers have ~50-100 GB limit but may clear old data
Problem:
- LLMs are non-deterministic - output format varies without strict guidance
- JSON parsing fails if LLM returns malformed JSON or extra text
- Need reliable structured data for database insertion
Solution:
// lib/azure-openai.ts
const response = await client.chat.completions.create({
model: deployment,
response_format: { type: 'json_object' }, // Force JSON mode
messages: [
{
role: 'system',
content: `Return a JSON object with this EXACT structure:
{
"topics": [
{
"name": "string",
"description": "string",
"priority": 1-3,
"category": "technical|system-design|domain|soft-skills|methodology",
"keywords": ["string"]
}
]
}
RULES:
- Return ONLY valid JSON
- No markdown code blocks
- No explanatory text before or after JSON`
},
{ role: 'user', content: userPrompt },
],
temperature: 0.5, // Lower = more consistent
});Lessons:
- ✅ Use
response_format: { type: 'json_object' }(requires API version 2023-12-01+) - ✅ Provide exact JSON schema in system prompt with examples
- ✅ Use CAPITAL INSTRUCTIONS for critical rules (LLMs pay more attention)
- ✅ Set temperature 0.3-0.5 for structured outputs (vs 0.7-1.0 for creative)
- ✅ Validate with Zod after parsing for runtime type safety
⚠️ JSON mode doesn't guarantee schema - still need validation⚠️ System prompt must mention "JSON" for json_object mode to work
References:
Problem:
- Initial approach: Generate all questions in one prompt → inconsistent distribution
- Need minimum 2 questions per topic → LLM often skips topics
- Question quality varied wildly across categories
Evolution:
Attempt 1 (Failed):
// ❌ Generated 12 questions total, but 8 were on one topic, 0-1 on others
const prompt = `Generate 10-15 interview questions for: ${roleTitle}`;Attempt 2 (Better):
// ✅ Extract topics first, then generate questions per topic
// Step 1: Topic extraction
const topics = await extractTopicsFromJobDescription(roleTitle, jobDescription);
// Step 2: Generate questions with topic distribution rules
const prompt = `
## CRITICAL REQUIREMENTS
1. Generate MINIMUM ${minQuestionsPerTopic} questions per topic
2. Total questions: ${minTotalQuestions}-${maxTotalQuestions}
3. Cover ALL ${topics.length} topics provided
## TOPICS TO COVER
${topics.map((t, i) => `${i + 1}. ${t.name} (Priority ${t.priority}): ${t.description}`).join('\n')}
## DISTRIBUTION RULES
- Priority 1 topics: 3-4 questions each
- Priority 2 topics: 2-3 questions each
- Priority 3 topics: 1-2 questions each
`;Lessons:
- ✅ Two-phase approach: Topic extraction → Question generation
- ✅ Explicit minimums for each topic in prompt (with bold/caps)
- ✅ Include examples of well-distributed output
- ✅ Use JSON arrays to enforce structure
⚠️ Longer prompts = higher cost but worth it for quality⚠️ Validate distribution post-generation and retry if needed
Results:
| Approach | Topics Covered | Distribution | Quality |
|---|---|---|---|
| Single prompt | 2-3 / 6 | Poor | Medium |
| Two-phase | 6 / 6 | Excellent | High |
Problem:
- Same evaluation criteria for junior vs senior candidates → unfair
- Junior candidates penalized for not discussing advanced concepts
- Senior candidates not challenged enough
Solution:
// lib/prompts.ts
export function getEvaluationPrompt(
seniorityLevel: SeniorityLevel,
question: string,
transcription: string,
expectedTopics: string[]
) {
const expectations = {
junior: {
depthLevel: 'Basic understanding with 1-2 examples',
technicalAccuracy: 'Core concepts correct, minor gaps acceptable',
clarity: 'Clear explanation of basics, simplified analogies',
},
mid: {
depthLevel: 'Solid understanding with trade-offs, 2-3 examples',
technicalAccuracy: 'Strong grasp of concepts + some edge cases',
clarity: 'Structured explanation with logical flow',
},
senior: {
depthLevel: 'Deep expertise, architectural thinking, 3+ examples',
technicalAccuracy: 'Comprehensive coverage + edge cases + best practices',
clarity: 'Executive summary + technical depth, anticipates questions',
},
};
const { depthLevel, technicalAccuracy, clarity } = expectations[seniorityLevel];
return `
You are evaluating a ${seniorityLevel.toUpperCase()} candidate's response.
## SENIORITY-ADJUSTED EXPECTATIONS
**Depth Level:** ${depthLevel}
**Technical Accuracy:** ${technicalAccuracy}
**Clarity:** ${clarity}
## QUESTION
${question}
## EXPECTED TOPICS
${expectedTopics.join(', ')}
## CANDIDATE RESPONSE
${transcription}
Evaluate based on the seniority expectations above.
`;
}Lessons:
- ✅ Adjust evaluation criteria per seniority level explicitly
- ✅ Define expectations in prompt (examples, depth, trade-offs)
- ✅ Use seniority-specific rubrics (junior = 70%+ good, senior = 85%+ good)
- ✅ Include positive framing ("for a junior role, this demonstrates...")
⚠️ Don't lower standards unfairly - adjust expectations, not quality bar
Impact:
- Junior candidates no longer penalized for missing advanced topics
- Senior candidates challenged appropriately
- Fairer, more actionable feedback
Problem:
- Users must explicitly grant microphone access (GDPR, privacy)
- Permission prompts are browser-specific and can't be styled
- Denial breaks entire interview flow
Solution:
// components/PermissionsCheck.tsx
'use client';
export function PermissionsCheck({ children }) {
const [permissionState, setPermissionState] = useState<'checking' | 'granted' | 'denied'>('checking');
useEffect(() => {
async function checkPermissions() {
try {
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
stream.getTracks().forEach(track => track.stop()); // Release immediately
setPermissionState('granted');
} catch (error) {
setPermissionState('denied');
}
}
checkPermissions();
}, []);
if (permissionState === 'denied') {
return (
<div className="permission-denied">
<h2>Microphone Access Required</h2>
<p>This app needs microphone access to record your interview responses.</p>
<button onClick={() => window.location.reload()}>
Grant Permission & Retry
</button>
</div>
);
}
return permissionState === 'granted' ? children : <LoadingSpinner />;
}Lessons:
- ✅ Check permissions early (before interview starts)
- ✅ Provide clear messaging about why mic is needed
- ✅ Handle denial gracefully with retry instructions
- ✅ Test across browsers (Chrome, Firefox, Edge, Safari)
⚠️ HTTPS required forgetUserMedia()(localhost is exempt)⚠️ Permission is per-origin - subdomains need separate grants
Browser Differences:
| Browser | Permission UI | Notes |
|---|---|---|
| Chrome | Top bar dropdown | Remembers choice per site |
| Firefox | Left of URL bar | Can be revoked easily |
| Safari | Dialog box | More restrictive, requires HTTPS |
| Edge | Similar to Chrome | Same underlying engine |
Problem:
MediaRecorderproduces WebM in Chrome/Firefox, MP4 in Safari- Azure Speech SDK accepts WAV, MP3, OGG, WebM (but prefers WAV/MP3)
- Need cross-browser audio handling
Solution:
// hooks/useAudioRecorder.ts
export function useAudioRecorder() {
const startRecording = async () => {
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
// Determine best supported format
const mimeTypes = [
'audio/webm;codecs=opus', // Chrome, Firefox (best compression)
'audio/mp4', // Safari
'audio/ogg;codecs=opus', // Fallback
'audio/wav', // Universal fallback
];
const supportedType = mimeTypes.find(type => MediaRecorder.isTypeSupported(type));
if (!supportedType) {
throw new Error('No supported audio format found');
}
const recorder = new MediaRecorder(stream, {
mimeType: supportedType,
audioBitsPerSecond: 128000, // 128 kbps
});
// ...
};
}Lessons:
- ✅ Check
MediaRecorder.isTypeSupported()before recording - ✅ Prefer WebM with Opus codec (best quality/size ratio)
- ✅ Set bitrate to 128 kbps (good quality, reasonable size)
- ✅ Store audio with correct MIME type for later processing
⚠️ Safari uses MP4 - ensure backend handles both⚠️ Opus codec not universally supported - have fallbacks
File Size Comparison (60s audio):
| Format | Codec | Bitrate | File Size |
|---|---|---|---|
| WebM | Opus | 128 kbps | ~960 KB |
| MP4 | AAC | 128 kbps | ~960 KB |
| WAV | PCM | 1411 kbps | ~10 MB |
Problem:
- SDK requires specific audio format (16 kHz, 16-bit, mono)
- Browser audio is often 48 kHz, stereo
- Need format conversion for accurate transcription
Solution:
// hooks/useSpeechRecognition.ts
import * as SpeechSDK from 'microsoft-cognitiveservices-speech-sdk';
export function useSpeechRecognition() {
const startRecognition = async (onTranscript: (text: string) => void) => {
// Get token from backend
const { token, region } = await fetch('/api/speech/token').then(r => r.json());
const speechConfig = SpeechSDK.SpeechConfig.fromAuthorizationToken(token, region);
speechConfig.speechRecognitionLanguage = 'en-US';
// Use default microphone (SDK handles format conversion internally)
const audioConfig = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
const recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
// Real-time transcription
recognizer.recognizing = (s, e) => {
console.log('Interim:', e.result.text); // Partial results
};
recognizer.recognized = (s, e) => {
if (e.result.reason === SpeechSDK.ResultReason.RecognizedSpeech) {
onTranscript(e.result.text); // Final result
}
};
recognizer.startContinuousRecognitionAsync();
return () => {
recognizer.stopContinuousRecognitionAsync();
};
};
};Lessons:
- ✅ Let SDK handle format conversion via
AudioConfig.fromDefaultMicrophoneInput() - ✅ Use continuous recognition for long responses (vs single-shot)
- ✅ Listen to both
recognizingandrecognizedevents for UX feedback - ✅ Enable punctuation and profanity filtering in config
⚠️ SDK is 2+ MB - consider lazy loading⚠️ Transcription latency is ~1-2 seconds (acceptable for interviews)
Problem:
- Prisma 7 requires explicit output path in schema
- Generated client must be gitignored but importable
- Type safety across server/client boundary
Solution:
// prisma/schema.prisma
generator client {
provider = "prisma-client"
output = "../src/generated/prisma" // Custom location
}
datasource db {
provider = "postgresql"
}// src/lib/prisma.ts
import { PrismaClient } from '@/generated/prisma/client';
// Singleton pattern (Next.js hot reload safe)
const globalForPrisma = globalThis as unknown as {
prisma: PrismaClient | undefined;
};
export const prisma = globalForPrisma.prisma ?? new PrismaClient();
if (process.env.NODE_ENV !== 'production') {
globalForPrisma.prisma = prisma;
}Lessons:
- ✅ Always set custom output path to avoid conflicts
- ✅ Use singleton pattern to prevent multiple instances during dev
- ✅ Add generated folder to .gitignore (but keep schema.prisma in git)
- ✅ Run
npx prisma generateafter every schema change ⚠️ Prisma Client must be regenerated after schema changes (not automatic)⚠️ Import from @/generated/prisma not @prisma/client
Problem:
- Untyped API responses lead to runtime errors
- Frontend and backend types can drift
- No compile-time guarantees for API contracts
Solution:
// types/api.ts
export interface ApiResponse<T = unknown> {
success: boolean;
data?: T;
error?: {
code: string;
message: string;
details?: unknown;
};
}
// API Route
export async function POST(request: NextRequest): Promise<NextResponse<ApiResponse<Session>>> {
try {
const body = await request.json();
const session = await prisma.interviewSession.create({ data: body });
return NextResponse.json<ApiResponse<Session>>({
success: true,
data: session,
});
} catch (error) {
return NextResponse.json<ApiResponse<never>>({
success: false,
error: {
code: 'CREATE_FAILED',
message: error.message,
},
}, { status: 500 });
}
}
// Client usage
const response = await fetch('/api/sessions', { method: 'POST', body: JSON.stringify(data) });
const result: ApiResponse<Session> = await response.json();
if (!result.success) {
throw new Error(result.error?.message ?? 'Unknown error');
}
const session = result.data; // Type-safe!Lessons:
- ✅ Wrap all API responses in consistent envelope (
ApiResponse<T>) - ✅ Use discriminated unions (
success: true/falsedetermines shape) - ✅ Type both success and error cases explicitly
- ✅ Share types between client and server via
types/folder ⚠️ Runtime validation still needed (use Zod for input validation)
Problem:
- Azure Cognitive Services require globally unique names
- Manual naming leads to conflicts
- Need predictable but unique resource names
Solution:
# infra/main.tf
resource "random_string" "suffix" {
length = 6
special = false
upper = false
}
resource "azurerm_cognitive_account" "openai" {
name = "${var.project_name}-openai-${random_string.suffix.result}"
# e.g., "ai-interview-openai-a3b7k2"
location = var.openai_location
resource_group_name = azurerm_resource_group.main.name
kind = "OpenAI"
sku_name = "S0"
custom_subdomain_name = "${var.project_name}-openai-${random_string.suffix.result}"
}Lessons:
- ✅ Use random suffix for globally unique names
- ✅ Lowercase + alphanumeric only (no special chars)
- ✅ Keep project prefix for organization (billing tags)
- ✅ Set
custom_subdomain_nameto match resource name ⚠️ Random suffix changes on recreate - uselifecycle { ignore_changes }if needed⚠️ Account ID suffix (e.g.,${local.account_id}) may be too long for some resources
Problem:
- Manual copying of credentials is error-prone
- Need to populate
.env.localafterterraform apply - Keys are sensitive and shouldn't be displayed in terminal
Solution:
# infra/outputs.tf
output "env_file_content" {
description = "Content for .env.local file"
sensitive = true
value = <<-EOT
# Azure OpenAI Configuration
AZURE_OPENAI_ENDPOINT=${azurerm_cognitive_account.openai.endpoint}
AZURE_OPENAI_API_KEY=${azurerm_cognitive_account.openai.primary_access_key}
AZURE_OPENAI_DEPLOYMENT=${var.openai_deployment_name}
AZURE_OPENAI_API_VERSION=2024-10-21
# Azure Speech Service Configuration
AZURE_SPEECH_KEY=${azurerm_cognitive_account.speech.primary_access_key}
AZURE_SPEECH_REGION=${azurerm_cognitive_account.speech.location}
EOT
}# Generate .env.local automatically
cd infra
terraform output -raw env_file_content > ../.env.localLessons:
- ✅ Create env_file_content output with complete template
- ✅ Mark output as sensitive to hide from console
- ✅ Use
-rawflag to avoid JSON escaping - ✅ Include comments in output for clarity
⚠️ Commit .env.local.example (without keys) to git, not .env.local
Problem:
- Prisma Client creates multiple instances during Next.js hot reload
- "Too many clients" error in development
- Connection pool exhaustion
Solution:
// src/lib/prisma.ts
const globalForPrisma = globalThis as unknown as {
prisma: PrismaClient | undefined;
};
// Reuse existing client in development (hot reload safe)
export const prisma = globalForPrisma.prisma ?? new PrismaClient();
if (process.env.NODE_ENV !== 'production') {
globalForPrisma.prisma = prisma;
}Lessons:
- ✅ Use global singleton pattern for Prisma in development
- ✅ Store instance on
globalThisto survive hot reloads - ✅ Only apply in development (
NODE_ENV !== 'production') ⚠️ Restart dev server if you see "Too many clients" errors
Problem:
- Multiple environments (local, dev, staging, prod)
- Different credentials per environment
- Risk of committing secrets to git
Solution:
# .env.local.example (committed to git)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-key-here
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
AZURE_OPENAI_API_VERSION=2024-10-21
AZURE_SPEECH_KEY=your-key-here
AZURE_SPEECH_REGION=eastus
# .gitignore
.env.local
.env.*.local// Validate required env vars at startup
function getEnvVar(name: string): string {
const value = process.env[name];
if (!value) {
throw new Error(`Missing required environment variable: ${name}`);
}
return value;
}
const config = {
openai: {
endpoint: getEnvVar('AZURE_OPENAI_ENDPOINT'),
apiKey: getEnvVar('AZURE_OPENAI_API_KEY'),
deployment: getEnvVar('AZURE_OPENAI_DEPLOYMENT'),
},
};Lessons:
- ✅ Use .env.local for development (gitignored)
- ✅ Commit .env.local.example with placeholder values
- ✅ Validate required vars at startup (fail fast)
- ✅ Use Terraform outputs to auto-generate .env.local
⚠️ Never commit actual keys - use git pre-commit hooks to prevent
- Cold Start (Turbopack): 2-3s (vs 8-12s with Webpack)
- Hot Reload: 200-500ms (vs 1-2s with Webpack)
- Initial Load (SSR): 40% reduction with Server Components
- Time to Interactive: 1.5-2s
- Development: $0.30-1.00/month (using F0 free tiers)
- Production (50 sessions/month): $13-23/month
- Token Usage: ~10-15K tokens per interview session
- Audio Storage: ~2-5 MB per session (offline first)
- Offline Support: 100% data retention during network drops
- Sync Success Rate: 95%+ on reconnect
- Speech Recognition Accuracy: 85-90% (Azure STT)
- Question Generation Quality: 90%+ relevance to role
-
Azure AI Services
- Always use token-based auth for client-side SDKs
- Monitor pricing closely - docs can be outdated
- Verify model availability per region before deployment
-
Next.js 16 Migration
paramsis now a Promise - must await in all dynamic routes- Turbopack is default and significantly faster
- Server Components are powerful but understand the boundary
-
Offline-First Architecture
- IndexedDB + Blobs = reliable offline storage
- Auto-sync on reconnect for seamless UX
- Network state detection is critical
-
AI Prompt Engineering
- Two-phase generation (topics → questions) for quality
- Force JSON mode with explicit schemas
- Seniority-aware prompts for fair evaluation
-
Type Safety
- Prisma 7 custom output path + singleton pattern
- ApiResponse wrapper for all endpoints
- Validate at boundaries (Zod for inputs)
-
Infrastructure
- Terraform outputs → .env.local automation
- Random suffixes for globally unique Azure resources
- Always test in target region first