
Commit 214099e

docs(feat): add monitoring and optimization workflow guides
Add two guides covering the "monitor & improve" stages (skeleton format): **Monitoring & Operating guide:** - Reframed title from "Monitoring" to "Monitoring & Operating" - Operating voice AI systems introduction (real-time performance, cost, quality) - Tools at a glance (Boards, Insights API, Analytics API, Langfuse, Webhook-to-External) - Placeholder sections for tool details, alerting strategies, best practices - Focus on operational reliability and continuous visibility **Optimization workflows guide:** - Optimization as continuous improvement loop (not a dedicated tool) - 7-step workflow (Detect → Extract → Hypothesize → Change → Test → Deploy → Verify) - Optimization mindset and why it matters - Placeholder sections for detailed steps, common scenarios, best practices - Cross-functional workflow using tools from all previous stages Both pages use skeleton format with complete intros and VAPI validation questions, awaiting tool clarification and detailed content development in iteration 2.
1 parent 8d6dc9d commit 214099e

2 files changed

Lines changed: 323 additions & 0 deletions

File tree

fern/observability/monitoring.mdx
fern/observability/optimization-workflows.mdx

Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
---
title: Monitoring & Operating
subtitle: Visualize trends, track operational health, and ensure production reliability
slug: observability/monitoring
---

## What is monitoring and operating?

**Monitoring & Operating** means running your voice AI system in production with continuous visibility into its health and performance. This stage answers critical operational questions:

- How many calls are happening right now?
- What's my average call cost this week?
- Is my success rate dropping?
- Are any assistants experiencing unusual error rates?
- When should I be alerted about problems?

**Operating a voice AI system** requires more than traditional software monitoring. Voice AI systems have unique operational characteristics:

- **Real-time performance matters** — Latency, interruption handling, and voice quality directly impact user experience
- **Cost scales with usage** — Every call has LLM, TTS, and STT costs that must be tracked
- **Quality is subjective** — Success isn't just "200 OK" — it's whether the conversation achieved its goal
- **Failures are multi-layered** — Issues can occur in the LLM, voice pipeline, tool execution, or external integrations

**The goal**: Catch problems early (before customers complain), understand operational patterns, and maintain production reliability.

---

## Monitoring & Operating tools at a glance

| Tool | What it does | Best for |
|------|--------------|----------|
| **Boards** | Drag-and-drop visual dashboards with charts, metrics, and global filters. Queries scalar Structured Output fields. | Real-time operational visibility, team dashboards, custom reporting |
| **Insights API** | [TBD: Programmatic querying and alerting capabilities?] | [TBD: Automated alerts, custom monitoring logic?] |
| **Analytics API** | [TBD: Aggregated operational metrics?] | [TBD: Cost tracking, performance monitoring?] |
| **Langfuse Integration** | Real-time observability platform integration for call monitoring and tracing | End-to-end observability, LLM performance tracking, distributed tracing |
| **Webhook-to-External** | Export call data to third-party monitoring platforms (Datadog, Braintrust, Grafana, custom dashboards) | Enterprise monitoring stacks, unified observability across systems, custom alerting |

<span className="vapi-validation">Confirm this list of monitoring tools is complete and accurate. Need clarification on: What are the key capabilities and use cases for Insights API vs Analytics API? How do they differ? When should users choose one over the other? What monitoring capabilities does Langfuse provide beyond basic call data? Are there other built-in or recommended monitoring integrations? What's the roadmap for built-in alerting capabilities?</span>

---

## Boards

**[Placeholder - Full detail section]**

**[Build your first dashboard in Boards quickstart](/observability/boards-quickstart)**

---

## Analytics API

**[Placeholder - Full detail section]**

<span className="internal-note">What's the difference between Analytics API and Insights API? What are Analytics API's key capabilities? When should users choose Analytics API vs Insights API vs Boards?</span>

---

## Insights API

**[Placeholder - Full detail section]**

<Warning>
**Insights API is currently undocumented**. If you need flexible querying or programmatic alerting, contact Vapi support for guidance.
</Warning>

<span className="internal-note">Should Insights API be formally documented? What's the relationship between Insights API and Analytics API? Is Insights API the primary alerting mechanism, or are built-in alerts planned?</span>

---

## Langfuse Integration

**[Placeholder - Full detail section]**

<span className="vapi-validation">What are Langfuse's key capabilities for Vapi users? Does it provide real-time alerting? What metrics/traces does it capture? Are there setup requirements or limitations?</span>

---

## Webhook-to-External Monitoring

**[Placeholder - Full detail section]**

<span className="vapi-validation">What are recommended third-party monitoring platforms for Vapi (Datadog, Braintrust, etc.)? Are there integration guides or examples? What webhook events are most useful for monitoring?</span>
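
As an illustration of the export pattern, here is a minimal sketch (TypeScript with Express) of a webhook receiver that forwards end-of-call data to an external metrics endpoint. The event type check and payload field names (`message.cost`, `message.durationSeconds`, `message.endedReason`), plus the ingest URL, are assumptions for illustration; verify them against the webhook payloads your server actually receives.

```typescript
import express from "express";

const app = express();
app.use(express.json());

app.post("/vapi/webhook", async (req, res) => {
  const { message } = req.body;

  // Forward only end-of-call summaries; ignore other event types.
  // NOTE: field names below are assumptions -- check your real payloads.
  if (message?.type === "end-of-call-report") {
    await fetch("https://metrics.example.com/ingest", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        callId: message.call?.id,
        costUsd: message.cost,
        durationSeconds: message.durationSeconds,
        endedReason: message.endedReason,
      }),
    });
  }

  // Acknowledge quickly so the sender does not retry.
  res.status(200).json({ received: true });
});

app.listen(3000);
```

The same handler shape works for Datadog, Grafana, or Braintrust; only the ingest call changes.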

---

## Alerting Strategies

**[Placeholder - Full detail section]**

<span className="internal-note">Are built-in alerts on the roadmap?</span>

---

## Monitoring Best Practices

**[Placeholder - Full detail section]**

Topics to cover:
- Define baseline metrics
- Set alert thresholds (critical, warning, informational) (see the sketch after this list)
- Monitor continuously, not reactively
- Create role-specific dashboards
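
To make the threshold tiers concrete, here is an illustrative encoding of alert rules evaluated against aggregated metrics. The metric names and threshold values are placeholders, not Vapi defaults; derive real values from your own baseline data.

```typescript
type Severity = "critical" | "warning" | "info";

interface AlertRule {
  metric: string; // e.g. a scalar Structured Output field or call stat
  threshold: number;
  comparison: "above" | "below";
  severity: Severity;
}

// Placeholder thresholds; set these from your observed baselines.
const rules: AlertRule[] = [
  { metric: "successRate", threshold: 0.8, comparison: "below", severity: "critical" },
  { metric: "successRate", threshold: 0.9, comparison: "below", severity: "warning" },
  { metric: "avgCostUsd", threshold: 0.5, comparison: "above", severity: "warning" },
  { metric: "avgLatencyMs", threshold: 1500, comparison: "above", severity: "info" },
];

// Return every rule that fires; in practice you would keep only the
// most severe match per metric before paging anyone.
function evaluate(current: Record<string, number>): AlertRule[] {
  return rules.filter((r) => {
    const value = current[r.metric];
    if (value === undefined) return false;
    return r.comparison === "below" ? value < r.threshold : value > r.threshold;
  });
}
```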

---

## What you'll learn in detailed guides

- [Boards quickstart](/observability/boards-quickstart) — Create custom dashboards in minutes
- (Planned) Langfuse integration guide — Set up real-time observability
- (Planned) Webhook monitoring guide — Export to external platforms
- (Planned) Analytics API reference — Programmatic monitoring

---

## Key takeaway

**Monitor continuously**. Production issues caught early (via dashboards or alerts) are easier to fix than issues discovered through customer complaints.

Operating a voice AI system requires proactive monitoring. Set up visibility on day one of production launch.

---

## Next steps

<CardGroup cols={2}>
  <Card
    title="Boards quickstart"
    icon="chart-line"
    href="/observability/boards-quickstart"
  >
    Build your first monitoring dashboard
  </Card>

  <Card
    title="Optimization workflows"
    icon="arrow-trend-up"
    href="/observability/optimization-workflows"
  >
    Next stage: Use monitoring data to improve
  </Card>

  <Card
    title="Back to overview"
    icon="arrow-left"
    href="/observability/framework"
  >
    Return to observability framework
  </Card>
</CardGroup>

fern/observability/optimization-workflows.mdx

Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
---
title: Optimization workflows
subtitle: Use observability data to continuously improve your assistant
slug: observability/optimization-workflows
---

## What is optimization?

**Optimization** is the continuous improvement loop: using observability data to refine prompts, improve tool calls, and enhance conversation flows.

Unlike the previous stages (INSTRUMENT, TEST, EXTRACT, MONITOR), **OPTIMIZE is not a dedicated tool or feature** — it's a workflow that combines tools from all previous stages to drive systematic improvement.

**The optimization mindset**: Voice AI quality improves through iteration, not perfection. The best teams:

- Start with "good enough" (not perfect)
- Deploy to production with instrumentation and monitoring
- Use real-world data to identify improvement opportunities
- Test changes before deploying
- Track impact systematically

**Why optimization matters**: Without a systematic optimization workflow, teams fall into one of three traps:

- ❌ Over-engineer before launch (trying to predict every edge case)
- ❌ React to problems ad-hoc (fixing symptoms, not root causes)
- ❌ Stagnate after launch (no process for continuous improvement)

**The goal**: Establish a repeatable workflow that turns observability data into measurable improvements.

---

## Optimization workflow at a glance

| Stage | Tools used | What you do |
|-------|-----------|-------------|
| **1. Detect patterns** | Boards, Insights API, Analytics API | Spot trends in monitoring dashboards (success rate dropping, cost increasing, etc.) |
| **2. Extract details** | Webhooks, Structured Outputs, Transcripts | Pull call data to understand WHY the pattern exists |
| **3. Form hypothesis** | Manual analysis | Identify root cause (e.g., "prompt doesn't handle edge case X") |
| **4. Make changes** | Assistant configuration | Update prompts, tools, routing logic based on hypothesis |
| **5. Test changes** | Evals, Simulations | Validate improvement before deploying to production |
| **6. Deploy** | API, Dashboard | Push updated assistant to production |
| **7. Verify** | Boards, Insights API | Track target metric to confirm improvement |

This is a **continuous cycle**, not a one-time activity:

```
MONITOR → EXTRACT → Analyze → Revise → TEST → Deploy → MONITOR (repeat)
```

<span className="vapi-validation">Confirm this optimization workflow accurately reflects how Vapi customers typically iterate on their assistants. Are there tools or stages we're missing? Should we emphasize certain steps more than others?</span>

---

## The optimization loop in detail

**[Placeholder - Full detail sections]**

### Step 1: Detect patterns from monitoring

<span className="internal-note">Placeholder for: How to use Boards/analytics to spot trends (success rate drops, cost spikes, etc.). Include example scenario.</span>

---

### Step 2: Extract detailed data

<span className="internal-note">Placeholder for: Methods for pulling call transcripts, structured outputs, tool call logs. Show how to filter/export data for analysis.</span>
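
As a sketch of this step, the snippet below pulls recent calls for one assistant so transcripts and structured outputs can be analyzed offline. It assumes Vapi's List Calls endpoint (`GET /call`) with `assistantId`, `createdAtGt`, and `limit` query parameters; confirm the parameter names against the current API reference.

```typescript
const VAPI_API_KEY = process.env.VAPI_API_KEY!;

// Fetch up to 100 calls for an assistant created after `sinceIso`.
// Parameter names are best guesses -- verify against the API reference.
async function fetchRecentCalls(assistantId: string, sinceIso: string) {
  const params = new URLSearchParams({
    assistantId,
    createdAtGt: sinceIso,
    limit: "100",
  });
  const res = await fetch(`https://api.vapi.ai/call?${params}`, {
    headers: { Authorization: `Bearer ${VAPI_API_KEY}` },
  });
  if (!res.ok) throw new Error(`List calls failed: ${res.status}`);
  return res.json();
}

// Example: pull the last week of calls, then inspect transcripts and
// structured outputs offline to understand WHY the pattern exists.
const since = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000).toISOString();
const calls = await fetchRecentCalls("your-assistant-id", since);
```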

---

### Step 3: Form a hypothesis

<span className="internal-note">Placeholder for: Common hypothesis patterns (prompt issues, tool description problems, routing logic, verbosity, etc.). Show example hypothesis formation process.</span>

---

### Step 4: Make targeted changes

<span className="internal-note">Placeholder for: How to revise prompts, update tool descriptions, refine conversation flows. Include before/after examples.</span>

---

### Step 5: Test before deploying

<span className="internal-note">Placeholder for: Creating Evals for specific failure cases, regression testing strategies. Show example test structure.</span>

---

### Step 6: Deploy

<span className="internal-note">Placeholder for: Deployment strategies (direct deploy, staged rollout, A/B testing). Include decision framework for choosing strategy.</span>
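
For the direct-deploy case, a minimal sketch might look like the following. It assumes a `PATCH /assistant/{id}` endpoint and a `model.messages` payload carrying the system prompt; confirm the exact shape (and whether a partial `model` object merges with or replaces existing settings) against the current API reference before relying on it.

```typescript
const VAPI_API_KEY = process.env.VAPI_API_KEY!;

// Push a revised system prompt to a production assistant.
// Payload shape is an assumption -- check the API reference first.
async function deployPromptUpdate(assistantId: string, systemPrompt: string) {
  const res = await fetch(`https://api.vapi.ai/assistant/${assistantId}`, {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${VAPI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: {
        provider: "openai",
        model: "gpt-4o",
        messages: [{ role: "system", content: systemPrompt }],
      },
    }),
  });
  if (!res.ok) throw new Error(`Deploy failed: ${res.status}`);
  return res.json();
}
```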

---

### Step 7: Verify improvement

<span className="internal-note">Placeholder for: Time windows for verification (immediate, 24h, 1 week), what to track, when to roll back.</span>
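
To sketch the verification decision, the helper below compares a target metric across before/after windows and flags a rollback when the change made things worse by more than a tolerance. The window lengths and tolerance are illustrative choices, not Vapi recommendations.

```typescript
interface VerificationResult {
  delta: number; // after minus before, in metric units (e.g. success rate)
  improved: boolean;
  shouldRollBack: boolean;
}

// Compare a metric across two observation windows. `tolerance` absorbs
// normal day-to-day noise before declaring a regression.
function verifyChange(
  before: number,
  after: number,
  tolerance = 0.02,
): VerificationResult {
  const delta = after - before;
  return {
    delta,
    improved: delta > tolerance,
    shouldRollBack: delta < -tolerance,
  };
}

// Example: success rate was 0.84 the week before the change, 0.91 after.
const result = verifyChange(0.84, 0.91);
// → { delta: 0.07, improved: true, shouldRollBack: false }
```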

---

## Common optimization scenarios

**[Placeholder - Table of common patterns, root causes, and optimization actions]**

<span className="vapi-validation">What are the most common optimization scenarios Vapi customers encounter? What issues drive the most improvement iterations? Are there voice-specific optimization patterns we should highlight?</span>

---

## Optimization best practices

**[Placeholder - Full detail sections]**

Topics to cover:
- Start with high-impact, low-effort changes
- Track improvement over time (optimization log; sketched below)
- Don't optimize prematurely (wait for data)
- Make one change at a time (clear cause-and-effect)
- Maintain regression tests
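
One lightweight way to keep that optimization log is a typed record per change; the structure below is a suggestion, not a Vapi feature. One entry per change preserves the cause-and-effect trail the practices above call for.

```typescript
interface OptimizationLogEntry {
  date: string; // when the change shipped
  pattern: string; // what monitoring surfaced
  hypothesis: string; // suspected root cause
  change: string; // the single change made
  evalResult: "pass" | "fail"; // pre-deploy test outcome
  metricBefore: number; // e.g. success rate over the prior week
  metricAfter?: number; // filled in during verification
  rolledBack?: boolean;
}

// Hypothetical example entry, for illustration only.
const log: OptimizationLogEntry[] = [
  {
    date: "2025-01-15",
    pattern: "Success rate dropped 6% week-over-week",
    hypothesis: "Prompt doesn't handle callers asking for refunds",
    change: "Added refund-handling instructions to system prompt",
    evalResult: "pass",
    metricBefore: 0.84,
    metricAfter: 0.91,
  },
];
```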

<span className="internal-note">Should we include specific guidance on optimization cadence (weekly reviews, monthly deep dives, quarterly retrospectives)?</span>

---

## What you'll learn in detailed guides

**Optimization is cross-functional** — it references tools from all previous stages:

- [Evals quickstart](/observability/evals-quickstart) — Test improvements before deploying
- [Boards quickstart](/observability/boards-quickstart) — Track metrics over time
- [Structured outputs quickstart](/assistants/structured-outputs-quickstart) — Extract failure data for analysis
- (Planned) Optimization playbook — Common scenarios and solutions
- (Planned) Advanced optimization — A/B testing, staged rollouts, impact measurement

---

## Key takeaway

**Optimize continuously**. The best teams treat observability as a loop: instrument → test → deploy → monitor → identify improvements → repeat. Data-driven iteration beats guesswork.

Start your optimization practice on day one. Don't wait until you have problems — establish the workflow while things are working, so you're ready when issues arise.

---

## Next steps

<CardGroup cols={2}>
  <Card
    title="Boards quickstart"
    icon="chart-line"
    href="/observability/boards-quickstart"
  >
    Set up monitoring to detect patterns
  </Card>

  <Card
    title="Evals quickstart"
    icon="clipboard-check"
    href="/observability/evals-quickstart"
  >
    Build tests to validate improvements
  </Card>

  <Card
    title="Production readiness"
    icon="check-circle"
    href="/observability/production-readiness"
  >
    Validate you're ready to optimize in production
  </Card>

  <Card
    title="Back to overview"
    icon="arrow-left"
    href="/observability/framework"
  >
    Return to observability framework
  </Card>
</CardGroup>
