| title | Deep research agent using Vercel's AI SDK |
|---|---|
| sidebarTitle | Deep research agent |
| description | Deep research agent which generates comprehensive PDF reports using Vercel's AI SDK. |
| tag | v4 |
import RealtimeLearnMore from "/snippets/realtime-learn-more.mdx";
import UpgradeToV4Note from "/snippets/upgrade-to-v4-note.mdx";
Acknowledgements: This example project is derived from the brilliant [deep research guide](https://aie-feb-25.vercel.app/docs/deep-research) by [Nico Albanese](https://x.com/nicoalbanese10).

This full-stack project is an intelligent deep research agent that autonomously conducts multi-layered web research, generating comprehensive reports which are then converted to PDF and uploaded to storage.
<video controls className="w-full aspect-video" src="https://github.com/user-attachments/assets/aa86d2b2-7aa7-4948-82ff-5e1e80cf8e37" />
Tech stack:
- Next.js for the web app
- Vercel's AI SDK for AI model integration and structured generation
- Trigger.dev for task orchestration, execution and real-time progress updates
- OpenAI's GPT-4o model for intelligent query generation, content analysis, and report creation
- Exa API for semantic web search with live crawling
- LibreOffice for PDF generation
- Cloudflare R2 to store the generated reports
Features:
- Recursive research: AI generates search queries, evaluates their relevance, asks follow-up questions and searches deeper based on initial findings.
- Real-time progress: Live updates are shown on the frontend using Trigger.dev Realtime as research progresses.
- Intelligent source evaluation: AI evaluates search result relevance before processing.
- Research report generation: The completed research is converted to a structured HTML report using a detailed system prompt.
- PDF creation and upload to cloud storage: The completed reports are converted to PDF using LibreOffice and uploaded to Cloudflare R2.
<Card title="View the Vercel AI SDK deep research agent repo" icon="GitHub" href="https://github.com/triggerdotdev/examples/tree/main/vercel-ai-sdk-deep-research-agent">
  Click here to view the full code for this project in our examples repository on GitHub. You can fork it and use it as a starting point for your own project.
</Card>
The research process is orchestrated through three connected Trigger.dev tasks:
- deepResearchOrchestrator - Main task that coordinates the entire research workflow.
- generateReport - Processes research data into a structured HTML report using OpenAI's GPT-4o model.
- generatePdfAndUpload - Converts HTML to PDF using LibreOffice and uploads to R2 cloud storage.
Each task uses triggerAndWait() to create a dependency chain, ensuring proper sequencing while maintaining isolation and error handling.
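
The dependency chain described above can be sketched roughly as follows. To keep the example runnable on its own, `TaskLike`, `TaskResult` and `makeTask` are local stand-ins that only mimic the general shape of a `triggerAndWait()` result (`ok` plus `output` or `error`); they are not the Trigger.dev SDK types. The payload fields (`html`, `pdfUrl`) and the placeholder URL are hypothetical.

```typescript
// Local stand-ins for illustration only; these are NOT the Trigger.dev SDK types.
type TaskResult<T> = { ok: true; output: T } | { ok: false; error: unknown };

interface TaskLike<In, Out> {
  triggerAndWait(payload: In): Promise<TaskResult<Out>>;
}

// Hypothetical helper: wraps a plain async function so it behaves like a task
// whose failures are reported in the result instead of being thrown.
function makeTask<In, Out>(run: (payload: In) => Promise<Out>): TaskLike<In, Out> {
  return {
    async triggerAndWait(payload: In): Promise<TaskResult<Out>> {
      try {
        return { ok: true, output: await run(payload) };
      } catch (error) {
        return { ok: false, error };
      }
    },
  };
}

// Hypothetical payload shape for the collected research.
type Research = { query: string; learnings: string[] };

const generateReport = makeTask(async (r: Research) => ({
  html: `<h1>${r.query}</h1><ul>${r.learnings.map((l) => `<li>${l}</li>`).join("")}</ul>`,
}));

const generatePdfAndUpload = makeTask(async (p: { html: string }) => {
  if (!p.html) throw new Error("Nothing to convert");
  return { pdfUrl: "https://example.com/report.pdf" }; // placeholder URL
});

// Orchestrator: each step waits for the previous one and stops on failure.
async function deepResearchOrchestrator(research: Research): Promise<string> {
  const report = await generateReport.triggerAndWait(research);
  if (!report.ok) throw new Error("Report generation failed");

  const upload = await generatePdfAndUpload.triggerAndWait(report.output);
  if (!upload.ok) throw new Error("PDF generation or upload failed");

  return upload.output.pdfUrl;
}
```

Checking `ok` after each step keeps a failure in one child task from silently feeding bad input into the next one.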
The core research logic uses a recursive depth-first search approach. A query is recursively expanded and the results are collected.
Key parameters:
- depth: Controls recursion levels (default: 2)
- breadth: Number of queries per level (default: 2, halved each recursion)
Level 0 (Initial Query): "AI safety in autonomous vehicles"
│
├── Level 1 (depth = 1, breadth = 2):
│ ├── Sub-query 1: "Machine learning safety protocols in self-driving cars"
│ │ ├── → Search Web → Evaluate Relevance → Extract Learnings
│ │ └── → Follow-up: "How do neural networks handle edge cases?"
│ │
│ └── Sub-query 2: "Regulatory frameworks for autonomous vehicle testing"
│ ├── → Search Web → Evaluate Relevance → Extract Learnings
│ └── → Follow-up: "What are current safety certification requirements?"
│
└── Level 2 (depth = 2, breadth = 1):
├── From Sub-query 1 follow-up:
│ └── "Neural network edge case handling in autonomous systems"
│ └── → Search Web → Evaluate → Extract → DEPTH LIMIT REACHED
│
└── From Sub-query 2 follow-up:
└── "Safety certification requirements for self-driving vehicles"
└── → Search Web → Evaluate → Extract → DEPTH LIMIT REACHED
Process flow:
- Query generation: OpenAI's GPT-4o generates multiple search queries from the input
- Web search: Each query searches the web via the Exa API with live crawling
- Relevance evaluation: OpenAI's GPT-4o evaluates if results help answer the query
- Learning extraction: Relevant results are analyzed for key insights and follow-up questions
- Recursive deepening: Follow-up questions become new queries for the next depth level
- Accumulation: All learnings, sources, and queries are accumulated across recursion levels
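
The process flow above can be sketched as a single recursive function. This is a minimal sketch rather than the project's actual code: the `ResearchDeps` interface and its method names are hypothetical, the model and search calls are injected so the example runs without API keys, and rounding the halved breadth up with `Math.ceil` is an assumption.

```typescript
// Hypothetical shapes; the real project defines its own types.
type SearchResult = { url: string; content: string };
type Research = { queries: string[]; sources: SearchResult[]; learnings: string[] };

// Injected dependencies (hypothetical names) standing in for the model and search calls.
interface ResearchDeps {
  generateQueries(prompt: string, count: number): Promise<string[]>;
  searchWeb(query: string): Promise<SearchResult[]>;
  isRelevant(query: string, result: SearchResult): Promise<boolean>;
  extractLearning(
    query: string,
    result: SearchResult,
  ): Promise<{ learning: string; followUps: string[] }>;
}

async function deepResearch(
  prompt: string,
  deps: ResearchDeps,
  depth = 2,
  breadth = 2,
  acc: Research = { queries: [], sources: [], learnings: [] },
): Promise<Research> {
  if (depth === 0) return acc; // depth limit reached

  const queries = await deps.generateQueries(prompt, breadth);
  for (const query of queries) {
    acc.queries.push(query);
    const results = await deps.searchWeb(query);

    for (const result of results) {
      // Skip sources that do not help answer the query.
      if (!(await deps.isRelevant(query, result))) continue;
      acc.sources.push(result);

      const { learning, followUps } = await deps.extractLearning(query, result);
      acc.learnings.push(learning);

      // Depth-first: follow-up questions seed the next level before the
      // next sibling query is processed. Breadth is halved, rounded up.
      if (followUps.length > 0) {
        await deepResearch(followUps.join("\n"), deps, depth - 1, Math.ceil(breadth / 2), acc);
      }
    }
  }
  return acc;
}
```

Passing the same accumulator through every call is one simple way to collect learnings, sources and queries across all recursion levels.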
We use the useRealtimeTaskTrigger React hook to trigger the deep-research task and subscribe to its updates.
Frontend (React Hook):
const triggerInstance = useRealtimeTaskTrigger<typeof deepResearchOrchestrator>("deep-research", {
  accessToken: triggerToken,
});
const { progress, label } = parseStatus(triggerInstance.run?.metadata);

As the research progresses, the metadata is set within the tasks and the frontend is kept updated with every new status:
Task Metadata:
metadata.set("status", {
  progress: 25,
  label: `Searching the web for: "${query}"`,
});

- Deep research task: Core logic in src/trigger/deepResearch.ts - orchestrates the recursive research process. Here you can change the model, the depth and the breadth of the research.
- Report generation: src/trigger/generateReport.ts - creates structured HTML reports from research data. The system prompt is defined in the code - this can be updated to be more or less detailed.
- PDF generation: src/trigger/generatePdfAndUpload.ts - converts reports to PDF and uploads to R2. This is a simple example of how to use LibreOffice to convert HTML to PDF.
- Research agent UI: src/components/DeepResearchAgent.tsx - handles form submission and real-time progress display using the useRealtimeTaskTrigger hook.
- Progress component: src/components/progress-section.tsx - displays live research progress.
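
The frontend snippet earlier on this page calls a `parseStatus` helper that is not shown. A minimal sketch of what such a helper might look like, assuming the `status` metadata key holds the `{ progress, label }` object set by the tasks; the fallback values, the clamping and the validation logic are assumptions, not the project's actual implementation:

```typescript
type Status = { progress: number; label: string };

// Assumed fallback, shown before the first status update arrives.
const DEFAULT_STATUS: Status = { progress: 0, label: "Starting..." };

// Defensively reads the "status" key from run metadata, which may be
// undefined or hold unexpected values while the run is starting.
function parseStatus(metadata: unknown): Status {
  if (typeof metadata !== "object" || metadata === null) return DEFAULT_STATUS;

  const status = (metadata as Record<string, unknown>).status;
  if (typeof status !== "object" || status === null) return DEFAULT_STATUS;

  const { progress, label } = status as Record<string, unknown>;
  return {
    // Clamp to 0..100 so a progress bar never overflows.
    progress:
      typeof progress === "number" && Number.isFinite(progress)
        ? Math.min(100, Math.max(0, progress))
        : DEFAULT_STATUS.progress,
    label: typeof label === "string" ? label : DEFAULT_STATUS.label,
  };
}
```

Accepting `unknown` keeps the helper safe to call on every render, including before the run has produced any metadata.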