Commit 8d6dc9d

docs(feat): add extraction patterns and production readiness guides
Add two guides covering data extraction architecture and deployment validation:

**Extraction patterns guide:**
- Three extraction patterns at a glance (Dashboard Native, Webhook-to-External, Hybrid)
- Pattern descriptions with architectural trade-offs
- Feature mapping to patterns (Structured Outputs, Scorecards, APIs, Langfuse)
- When to use each pattern
- Schema design implications
- Migration paths between patterns

**Production readiness guide:**
- Progressive validation approach (INSTRUMENT+TEST → EXTRACT+MONITOR → OPTIMIZE)
- Stage-by-stage checklist with required and recommended items
- Production readiness gates (first deploy, scaled deploy, mature observability)
- Common readiness mistakes and fixes
- Deployment workflow timeline

Includes VAPI validation questions throughout for pattern accuracy and naming consistency.
1 parent e96b044 commit 8d6dc9d

2 files changed

Lines changed: 770 additions & 0 deletions

Lines changed: 364 additions & 0 deletions
@@ -0,0 +1,364 @@
1+
---
2+
title: Choosing your extraction pattern
3+
subtitle: Understand the three architectural patterns for getting data out of Vapi
4+
slug: observability/extraction-patterns
5+
---
6+
7+
## Why extraction is an architectural choice
8+
9+
Unlike traditional observability platforms (Datadog, New Relic), where data flows automatically from instrumentation to monitoring, **Vapi requires you to choose how data gets extracted** for analysis.
10+
11+
This design reflects Vapi's architecture:
12+
13+
- **Scalar Structured Outputs** (strings, numbers, booleans) flow automatically to Boards and Insights API
14+
- **Object Structured Outputs** (nested data) require webhook extraction
15+
- **Scorecard results** don't appear in native analytics (webhook-only)
16+
17+
**Your extraction pattern choice determines**:
18+
19+
- What schema types you can use (scalar vs object fields)
20+
- What tools you can use for monitoring (Boards vs external BI)
21+
- How much engineering effort is required
22+
- Whether you can export to existing data infrastructure
23+
24+
<span className="vapi-validation">Confirm this framing is accurate and doesn't oversimplify</span>
25+
26+
---
27+
28+
## The three extraction patterns at a glance
29+
30+
Vapi offers three architectural patterns for extracting observability data from your calls. Each pattern represents a different trade-off between simplicity and flexibility:
31+
32+
| Pattern | Description | Engineering effort | Data richness | Typical users |
33+
|---------|-------------|-------------------|---------------|---------------|
34+
| **Dashboard Native** | Use Vapi's built-in Boards with scalar Structured Outputs for real-time dashboards | ⚡ Minimal (no infrastructure) | Basic (scalar fields only) | Solo founders, non-technical teams, startups |
35+
| **Webhook-to-External** | Build custom post-call processing that captures data via webhooks and exports to your data warehouse | 🛠️ High (requires backend infrastructure) | Rich (full object schemas, nested data) | Engineering teams, enterprises with existing data platforms |
36+
| **Hybrid** | Combine both approaches: use Boards for operational metrics, webhooks for deep analysis | ⚙️ Medium (partial infrastructure) | Flexible (mix of scalar and object data) | Growing teams balancing simplicity and power |
37+
38+
**How to choose**: Start with Dashboard Native (fastest setup). Migrate to Hybrid or Webhook-to-External as your analytics needs grow or when you need features like Scorecard visualization or external BI tools.
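
The selection guidance above can be read as a small decision rule. The sketch below is one possible encoding of it, not an official Vapi recommendation engine: the `Needs` fields and the `choose_pattern` function are hypothetical names introduced for illustration, and the rule deliberately ignores engineering capacity, which you would weigh separately.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical summary of a team's analytics requirements."""
    needs_external_bi: bool = False       # e.g. Tableau or Looker on your own warehouse
    uses_scorecards: bool = False         # Scorecard results are webhook-only
    needs_nested_data: bool = False       # object/array Structured Outputs
    needs_data_sovereignty: bool = False  # compliance or custom retention
    wants_boards: bool = True             # keep Vapi Boards for operational dashboards


def choose_pattern(needs: Needs) -> str:
    """Map requirements to one of the three patterns described in this guide."""
    requires_webhooks = (
        needs.needs_external_bi
        or needs.uses_scorecards
        or needs.needs_nested_data
        or needs.needs_data_sovereignty
    )
    if not requires_webhooks:
        # Scalar outputs and Boards cover everything: start simple.
        return "Dashboard Native"
    if needs.needs_data_sovereignty or not needs.wants_boards:
        # All data must live on your infrastructure, or Boards adds no value.
        return "Webhook-to-External"
    # Webhooks are needed, but Boards still serves the operations team.
    return "Hybrid"
```

For example, `choose_pattern(Needs(uses_scorecards=True))` lands on Hybrid, because Scorecard results need webhooks while Boards can still carry the operational metrics.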
39+
40+
---
41+
42+
## EXTRACT stage features at a glance
43+
44+
| Feature | What it extracts | Extraction method | Pattern compatibility |
45+
| ------------------------------- | -------------------------------------------------------------- | ---------------------------------------------------------------- | --------------------------- |
46+
| **Structured Outputs (Scalar)** | Business metrics using scalar fields (individual booleans, strings, numbers) | Automatic → Boards + Insights API | Dashboard Native, Hybrid |
47+
| **Structured Outputs (Object)** | Rich nested data using object/array schemas | Webhooks only | Webhook-to-External, Hybrid |
48+
| **Scorecards** | AI-powered quality evaluation results | Webhooks only (not visible in Boards) | Webhook-to-External, Hybrid |
49+
| **Insights API** | [TBD: What does Insights API extract/provide?] | [TBD: Automatic for scalars? Separate feature?] | [TBD] |
50+
| **Analytics API** | [TBD: What does Analytics API extract/provide?] | [TBD: How does it differ from Insights API?] | [TBD] |
51+
| **Langfuse Integration** | Real-time observability data to external platform | Direct integration (real-time, no webhooks/post-call processing) | All patterns |
52+
53+
54+
<span className="vapi-validation">Confirm this list is complete and accurate? Need help explaining and contrasting Insights API and Analytics API. Are you ok with having Langfuse be included here in extraction phase or should we only mention in monitoring phase?</span>
55+
56+
---
57+
58+
## The three extraction patterns
59+
60+
### Pattern 1: Dashboard Native
61+
62+
**What it is**: This pattern uses Vapi's built-in Boards platform to automatically visualize scalar Structured Outputs (strings, numbers, booleans) without any external infrastructure. Data flows from your assistant configuration directly to Boards, where you can build real-time dashboards using a drag-and-drop visual builder.
63+
64+
<span className="vapi-validation">Validate that Structured Outputs (scalar) are the only instrumentation that will work with native Vapi Boards</span>
65+
66+
**Architecture**: Structured Outputs (scalar only) → Boards
67+
68+
**Who it's for**:
69+
70+
- Non-technical teams or solo founders
71+
- Teams without backend engineering resources
72+
- Startups with simple analytics needs
73+
- Quick operational dashboards (call volume, cost, success rate)
74+
75+
**How it works**:
76+
77+
1. Configure Structured Outputs using **scalar fields only** (no nested objects)
78+
2. Data automatically flows to Vapi Boards
79+
3. Build dashboards using drag-and-drop visual builder
80+
4. Monitor via Boards web interface
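
Because step 1 depends on keeping every field scalar, it can help to check a schema before you deploy it. The following is a minimal sketch that assumes your Structured Output is described by a JSON Schema object with a top-level `properties` map; `non_scalar_fields` is a hypothetical helper, not part of any Vapi SDK, and treating `integer` as a number is an assumption.

```python
# JSON Schema types treated as scalar. This guide names strings, numbers, and
# booleans; counting "integer" as a number is an assumption of this sketch.
SCALAR_TYPES = {"string", "number", "integer", "boolean"}


def non_scalar_fields(schema: dict) -> list[str]:
    """Return the top-level property names whose type is not scalar."""
    properties = schema.get("properties", {})
    return [
        name
        for name, spec in properties.items()
        if spec.get("type") not in SCALAR_TYPES
    ]


# Illustrative schema for an appointment-booking assistant.
booking_schema = {
    "type": "object",
    "properties": {
        "appointment_booked": {"type": "boolean"},
        "appointment_service": {"type": "string"},
        "appointment_details": {"type": "object"},  # would not reach Boards
    },
}

offending = non_scalar_fields(booking_schema)
```

An empty result means every field in the schema is a candidate for Boards. Here `offending` contains `appointment_details`, which would need to be flattened or moved to a webhook-based pattern.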
81+
82+
**Capabilities**:
83+
84+
- ✅ Real-time dashboards with no code
85+
- ✅ Built-in formulas and aggregations (Math.js)
86+
- ✅ Global filters and time range controls
87+
- ❌ Can't export to external BI tools (Tableau, Power BI)
88+
- ❌ Can't use object-type schemas (limits extraction richness)
89+
- ❌ Can't visualize Scorecard results
90+
91+
**When to use**:
92+
93+
- You're just starting with observability
94+
- You don't have engineering resources for webhook infrastructure
95+
- Your analytics needs are simple (operational metrics, not complex business intelligence)
96+
- You need visibility fast with minimal setup
97+
98+
**When NOT to use**:
99+
100+
- You need to export to external BI tools (Tableau, Power BI, Looker) → use **Webhook-to-External**
101+
- You're using Scorecards for quality monitoring (results not visible in Boards) → use **Webhook-to-External** or **Hybrid**
102+
- Compliance requires data sovereignty or custom retention → use **Webhook-to-External**
103+
- You need rich nested data schemas (objects, arrays) → use **Webhook-to-External** or **Hybrid**
104+
105+
**Example use case**: A solopreneur runs an AI receptionist for their dental practice and wants to track daily call volume, booking rate, and missed calls. They use Boards to see trends and spot issues.
106+
107+
<span className="vapi-validation">Pay close attention to this section because a number of assumptions are being made. Corrections and disambiguation needed.</span>
108+
109+
---
110+
111+
### Pattern 2: Webhook-to-External
112+
113+
**What it is**: This pattern uses Vapi's webhook functionality to send post-call data to a custom endpoint you build and host. You configure a webhook URL at the org, squad, or assistant level, and Vapi sends complete call data (including object-type Structured Outputs and Scorecard results) to your server after each call, where you can process and store it in your data warehouse.
114+
115+
<span className="vapi-validation">Naming consistency question: We've used "webhook", "webhook-to-external", and "Webhook-to-External" throughout the docs. Should we standardize on one name for this pattern? Recommendation: "Webhook-to-External" (capitalized, hyphenated) to parallel "Dashboard Native". Confirm preferred naming.</span>
116+
117+
**Architecture**: Structured Outputs (any type) → Webhooks → Your data warehouse → Your BI tools
118+
119+
**Who it's for**:
120+
121+
- Engineering teams with data infrastructure
122+
- Enterprises with existing analytics platforms
123+
- Teams needing custom business intelligence
124+
- Organizations requiring data sovereignty or compliance
125+
126+
**How it works**:
127+
128+
1. Configure Structured Outputs using **rich object schemas** (nested data, arrays, complex types)
129+
2. Set up webhook endpoint on your servers to receive call data
130+
3. Process webhooks and store in your data warehouse (BigQuery, Snowflake, Postgres)
131+
4. Connect BI tools (Tableau, Looker, Metabase) to your warehouse
132+
5. Build custom analytics on your infrastructure
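
Steps 2 and 3 can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not Vapi's documented payload contract: the `call_id` field name is hypothetical, SQLite stands in for a real warehouse such as Postgres, and `init_db`, `store_call`, `WebhookHandler`, and `serve` are names invented for this example. Check the actual webhook payload shape before relying on any field.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the landing table. SQLite is a stand-in for your warehouse."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS calls ("
        "call_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def store_call(conn: sqlite3.Connection, body: bytes) -> str:
    """Parse one webhook body and upsert it, keeping the full nested payload."""
    event = json.loads(body)
    call_id = event["call_id"]  # hypothetical field name, verify against real payloads
    conn.execute(
        "INSERT OR REPLACE INTO calls (call_id, payload) VALUES (?, ?)",
        (call_id, json.dumps(event)),
    )
    conn.commit()
    return call_id


class WebhookHandler(BaseHTTPRequestHandler):
    conn = None  # assigned by serve() before requests are handled

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            store_call(self.conn, self.rfile.read(length))
            self.send_response(204)  # accepted, nothing to return
        except (KeyError, TypeError, ValueError):
            self.send_response(400)  # malformed JSON or missing identifier
        self.end_headers()


def serve(port: int = 8000, db_path: str = "calls.db") -> None:
    """Run a single-threaded receiver; blocks until interrupted."""
    WebhookHandler.conn = init_db(db_path)
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the receiver. Storing the raw payload first and transforming it later keeps the receiver small and lets you re-derive tables if your schemas change. A production endpoint would also need request authentication, retries, and monitoring, which this sketch omits.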
133+
134+
**Capabilities**:
135+
136+
- ✅ Full control over data storage and processing
137+
- ✅ Integration with existing BI and alerting systems
138+
- ✅ Rich nested data schemas (not limited to scalars)
139+
- ✅ Can access Scorecard results via webhooks
140+
- ❌ Requires backend engineering (webhook receiver, database, ETL)
141+
- ❌ Higher operational complexity (hosting, monitoring webhooks)
142+
143+
**When to use**:
144+
145+
- You have engineering resources to build webhook infrastructure
146+
- You need to integrate Vapi data with existing business systems (CRM, data warehouse)
147+
- You require custom analytics beyond Vapi's built-in capabilities
148+
- Compliance or data sovereignty requires you to control data storage
149+
150+
**When NOT to use**:
151+
152+
- You have no backend engineering team or resources → use **Dashboard Native**
153+
- Your analytics needs are simple and Boards provides sufficient visibility → use **Dashboard Native**
154+
- You want to start simple and may add external integration later → use **Dashboard Native** or **Hybrid**
155+
- You need instant operational dashboards without warehouse ETL delays → consider **Hybrid** instead
156+
157+
**Example use case**: An enterprise healthcare org uses Vapi for patient intake. It needs to sync extracted patient info to Epic EHR, analyze call quality trends in Tableau, and alert on-call staff via PagerDuty. It uses webhooks to export all call data to Snowflake, then integrates downstream systems from there.
158+
159+
---
160+
161+
### Pattern 3: Hybrid
162+
163+
**What it is**: This pattern combines the Dashboard Native and Webhook-to-External approaches by maintaining two parallel data flows: scalar Structured Outputs go to Boards for real-time operational dashboards, while rich object schemas and Scorecard results are exported via webhooks to your external data warehouse. This lets operations teams use Boards while analytics teams get full-fidelity data in external BI tools.
164+
165+
**Architecture**:
166+
167+
- **Operational track**: Scalar Structured Outputs → Boards (real-time dashboards)
168+
- **Analytics track**: Object Structured Outputs + Scorecards → Webhooks → External warehouse
169+
170+
**Who it's for**:
171+
172+
- Teams with some engineering resources
173+
- Organizations balancing simplicity and power
174+
- Teams iterating from simple to complex analytics
175+
- Use cases needing both real-time ops dashboards AND deep analysis
176+
177+
**How it works**:
178+
179+
1. Configure **two sets of Structured Outputs**:
180+
- Scalar fields for operational metrics (cost, volume, basic success metrics)
181+
- Object fields for rich analysis (full conversation context, detailed scoring)
182+
2. Scalar data flows to Boards for real-time visibility
183+
3. Object data + Scorecards exported via webhooks for deep analysis
184+
4. Operations team uses Boards, analytics team uses external BI
185+
186+
**Capabilities**:
187+
188+
- ✅ Best of both worlds: simple dashboards + powerful analytics
189+
- ✅ Incremental complexity (start with Boards, add webhooks later)
190+
- ✅ Team separation (ops uses Boards, analysts use BI tools)
191+
- ❌ More complex schema design (must plan for both tracks)
192+
- ❌ Partial engineering effort (still need webhook infrastructure)
193+
194+
**When to use**:
195+
196+
- You're scaling from simple to complex analytics needs
197+
- Different teams have different analytics requirements (ops vs analysts)
198+
- You want real-time operational visibility without waiting for warehouse ETL
199+
- You're not sure yet whether Boards alone will be sufficient long-term
200+
201+
**When NOT to use**:
202+
203+
- Your needs clearly fit one pattern—all simple (use **Dashboard Native**) or all complex (use **Webhook-to-External**)
204+
- You want to minimize schema design complexity → use single-pattern approach
205+
- Small team where everyone uses the same analytics tools → use **Dashboard Native** or **Webhook-to-External** consistently
206+
- You're confident Boards will never be sufficient → skip straight to **Webhook-to-External**
207+
208+
**Example use case**: A growing SaaS company uses Vapi for sales qualification calls. The sales ops team monitors daily metrics in Boards (call volume, booking rate). The data team exports full conversation analysis via webhooks to BigQuery for prompt optimization and quarterly business reviews.
209+
210+
---
211+
212+
{/* ## Decision framework: Choosing your pattern
213+
214+
<Tabs>
215+
<Tab title="By team capability">
216+
| Capability | Recommended Pattern |
217+
|------------|-------------------|
218+
| No backend engineering | **Dashboard Native** |
219+
| Backend team, no data warehouse | **Dashboard Native** (start here, migrate to Hybrid later) <span className="internal-note assumption"> Assumes backend teams without existing warehouse should start simple. Alternative: Could recommend Webhook-to-External with lightweight warehouse (Postgres) if team has capacity.</span> |
220+
| Backend team + data warehouse | **Webhook-to-External** or **Hybrid** |
221+
| Enterprise with existing BI stack | **Webhook-to-External** |
222+
</Tab>
223+
224+
<Tab title="By analytics needs">
225+
| Need | Recommended Pattern |
|------|-------------------|
| Simple operational metrics (volume, cost, success rate) | **Dashboard Native** |
| Need to export to Tableau/PowerBI/Looker | **Webhook-to-External** |
| Real-time ops + deep analysis | **Hybrid** |
| Compliance requires data control | **Webhook-to-External** |
| Using Scorecards for quality monitoring | **Webhook-to-External** or **Hybrid** (Scorecard results not in Boards) |
231+
</Tab>
232+
233+
<Tab title="By business context">
234+
| Context | Recommended Pattern |
235+
|---------|-------------------|
236+
| Startup / MVP stage | **Dashboard Native** |
237+
| Growing team (10-50 people) | **Hybrid** |
238+
| Enterprise (50+ people) | **Webhook-to-External** or **Hybrid** |
239+
| Must integrate with CRM/ERP | **Webhook-to-External** |
240+
| Need instant visibility, minimal engineering | **Dashboard Native** |
241+
</Tab>
242+
</Tabs>
243+
244+
<span className="vapi-validation">Are these recommendations aligned with how VAPI sees
245+
customer segments?</span>
246+
247+
--- */}
248+
249+
---
250+
251+
252+
## Common migration paths
253+
254+
<span className="internal-note"> Are reverse migrations possible/recommended?
255+
(Webhook-to-External → Hybrid or Hybrid → Dashboard Native)? Do teams ever
256+
simplify their extraction approach, or is migration always toward more
257+
complexity?</span>
258+
259+
### Dashboard Native → Hybrid
260+
261+
**When to migrate**: You need deeper analysis but want to keep operational dashboards
262+
263+
**What changes**: Add object-type Structured Outputs + webhook infrastructure. Existing scalar outputs continue flowing to Boards.
264+
265+
**Impact**: Minimal disruption—operations team keeps using Boards, analytics team gets external warehouse access.
266+
267+
---
268+
269+
### Hybrid → Webhook-to-External
270+
271+
**When to migrate**: The external warehouse has become the single source of truth and Boards no longer provides value
272+
273+
**What changes**: Migrate all data extraction to webhooks, rebuild operational dashboards in external BI tool (Looker, Tableau, Metabase).
274+
275+
**Impact**: Medium effort—requires dashboard migration, but unifies analytics platform.
276+
277+
---
278+
279+
### Dashboard Native → Webhook-to-External
280+
281+
**When to migrate**: Compliance requirement, CRM integration, or sudden need for external data control
282+
283+
**What changes**: Full replacement—redesign schemas for richness, build webhook infrastructure, rebuild all dashboards externally.
284+
285+
**Impact**: High effort—complete platform migration, but necessary for regulatory or integration requirements.
286+
287+
---
288+
289+
## Schema design implications
290+
291+
<span className="internal-note"> This section should probably be in Structured Outputs doc pages; not here.</span>
292+
293+
Your extraction pattern choice **determines how you design Structured Output schemas** in the INSTRUMENT stage.
294+
295+
### Dashboard Native: Scalar fields only
296+
297+
**Constraint**: Only scalar types (boolean, string, number) flow to Boards. Nested objects are invisible to dashboards.
298+
299+
**Design strategy**: Flatten nested data into scalar fields. For example:
300+
301+
- Use flat scalar fields: `appointment_date` (string), `appointment_time` (string), `appointment_service` (string)
301+
- Avoid a single nested field: `appointment_details` (object with nested date/time/service)
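
The flattening strategy can be illustrated with a short helper. This is a sketch for reasoning about field names, assuming you control how the schema is authored; `flatten` is a hypothetical function, and the sample values are made up.

```python
def flatten(prefix: str, nested: dict) -> dict:
    """Turn a nested object into prefixed top-level fields.

    Nested dicts are flattened recursively. Other values, including lists,
    are copied as-is, so a list would still need separate handling.
    """
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}_{key}"
        if isinstance(value, dict):
            flat.update(flatten(name, value))
        else:
            flat[name] = value
    return flat


# Illustrative nested value that Boards could not display directly.
appointment_details = {"date": "2025-03-14", "time": "09:30", "service": "cleaning"}

flat_fields = flatten("appointment", appointment_details)
```

Here `flat_fields` holds `appointment_date`, `appointment_time`, and `appointment_service`, matching the scalar field names in the list above.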
303+
304+
**Tradeoff**: Simpler schemas, but you lose the structure that nested data provides.
305+
306+
---
307+
308+
### Webhook-to-External: Full schema flexibility
309+
310+
**Freedom**: Use rich nested schemas—objects, arrays, complex types. Your data warehouse can query anything.
311+
312+
**Design strategy**: Structure data naturally. Nested customer objects, conversation analysis arrays, quality metric hierarchies.
313+
314+
**Tradeoff**: More expressive data model, but requires webhook infrastructure.
315+
316+
---
317+
318+
### Hybrid: Two-schema strategy
319+
320+
**Operational track** (Boards): Scalar fields for real-time metrics (success rate, call volume, cost)
321+
322+
**Analytics track** (Webhooks): Rich nested schemas for deep analysis (full conversation context, sentiment timelines, topic extraction)
323+
324+
**Design strategy**: Duplicate key metrics across both schemas. Operational team gets instant visibility; analytics team gets comprehensive data.
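
Keeping two schemas in sync is easy to get wrong, so a small consistency check can help. The sketch below assumes both schemas are JSON Schema objects that expose key metrics as top-level `properties`; `missing_key_metrics` and the sample schemas are hypothetical.

```python
def missing_key_metrics(
    key_metrics: list[str], operational: dict, analytics: dict
) -> dict[str, list[str]]:
    """Report which key metrics are absent from each track's schema."""

    def missing(schema: dict) -> list[str]:
        properties = schema.get("properties", {})
        return [metric for metric in key_metrics if metric not in properties]

    return {"operational": missing(operational), "analytics": missing(analytics)}


# Illustrative schemas: the analytics track forgot to duplicate call_cost.
operational_schema = {
    "properties": {
        "call_success": {"type": "boolean"},
        "call_cost": {"type": "number"},
    }
}
analytics_schema = {
    "properties": {
        "call_success": {"type": "boolean"},
        "conversation": {"type": "object"},
    }
}

gaps = missing_key_metrics(
    ["call_success", "call_cost"], operational_schema, analytics_schema
)
```

In this example `gaps["analytics"]` lists `call_cost`, a sign that numbers in Boards and in the warehouse could not be reconciled for that metric.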
325+
326+
**Tradeoff**: Schema design complexity (must maintain two structures), but provides best of both worlds.
327+
328+
**[See schema examples and design patterns in Structured Outputs guide](/assistants/structured-outputs-quickstart)**
329+
330+
331+
332+
---
333+
334+
## Next steps
335+
336+
<CardGroup cols={2}>
337+
<Card
338+
title="Structured outputs"
339+
icon="database"
340+
href="/assistants/structured-outputs-quickstart"
341+
>
342+
Learn how to instrument your assistant with schemas
343+
</Card>
344+
345+
<Card
346+
title="Boards quickstart"
347+
icon="chart-line"
348+
href="/observability/boards-quickstart"
349+
>
350+
Build your first dashboard (Dashboard Native pattern)
351+
</Card>
352+
353+
<Card title="Back to overview" icon="arrow-left" href="/observability/framework">
354+
Return to the observability maturity model
355+
</Card>
356+
357+
<Card
358+
title="Production readiness"
359+
icon="check-circle"
360+
href="/observability/production-readiness"
361+
>
362+
Validate you're ready for production
363+
</Card>
364+
</CardGroup>
