---
layout: default
title: "Chapter 1: Getting Started with Deer Flow"
parent: Deer Flow Tutorial
nav_order: 1
---
Welcome to Deer Flow! This chapter will guide you through installing and setting up Deer Flow, understanding its core concepts, and creating your first distributed workflow. In this chapter you will cover:
- Deer Flow installation and setup
- Core concepts (workflows, tasks, dependencies)
- Basic workflow creation and execution
- Web interface and API usage
- First distributed workflow walkthrough
Deer Flow consists of several key components working together to orchestrate distributed workflows:
```mermaid
graph TB
    subgraph "Control Plane"
        A[Workflow Manager]
        B[Task Scheduler]
        C[Dependency Resolver]
        D[State Manager]
    end

    subgraph "Execution Plane"
        E[Worker Nodes]
        F[Task Executors]
        G[Resource Pool]
        H[Load Balancer]
    end

    subgraph "Data Plane"
        I[Workflow Store]
        J[Task Queue]
        K[Result Store]
        L[Metrics Store]
    end

    subgraph "Integration Layer"
        M[REST API]
        N[Web UI]
        O[CLI Tools]
        P[SDK Libraries]
    end

    A --> B
    B --> C
    C --> D
    D --> E
    E --> F
    F --> G
    G --> H
    A --> I
    A --> J
    A --> K
    A --> L
    A --> M
    A --> N
    A --> O
    A --> P
```
- Workflow Manager: Orchestrates workflow execution
- Task Scheduler: Assigns tasks to workers
- Worker Nodes: Execute tasks in parallel
- State Manager: Tracks workflow and task states
- Result Store: Persists task outputs and results
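To make these components concrete, here is a minimal, purely illustrative in-memory model of the control path (manager schedules tasks onto a queue, a worker drains it and records results). The class and method names below are hypothetical sketches, not Deer Flow's actual API:

```python
from collections import deque

class Task:
    """Toy task record; real tasks live in the Workflow Store."""
    def __init__(self, task_id, command):
        self.id = task_id
        self.command = command
        self.state = "PENDING"

class MiniManager:
    """Workflow Manager + Task Scheduler collapsed into one toy object."""
    def __init__(self):
        self.queue = deque()   # stands in for the task queue (Redis)
        self.results = {}      # stands in for the result store

    def submit(self, tasks):
        for t in tasks:
            t.state = "SCHEDULED"
            self.queue.append(t)

    def run_worker(self):
        # A worker node drains the queue and records each task's output.
        while self.queue:
            t = self.queue.popleft()
            t.state = "RUNNING"
            self.results[t.id] = f"ran: {t.command}"
            t.state = "COMPLETED"

task = Task("hello_task", "echo 'Hello, Deer Flow!'")
mgr = MiniManager()
mgr.submit([task])
mgr.run_worker()
print(task.state)                   # COMPLETED
print(mgr.results["hello_task"])    # ran: echo 'Hello, Deer Flow!'
```

In the real system the queue, state, and results are durable and distributed; the flow of a task through scheduling, execution, and result storage is the same idea.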
```bash
# Clone the repository
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow

# Start all services
docker-compose up -d

# Check service status
docker-compose ps

# View logs
docker-compose logs -f deer-flow-server
```

What this starts:
- Deer Flow server (port 8080)
- Redis for task queuing
- PostgreSQL for data storage
- Web interface
```bash
# Install system dependencies
sudo apt update
sudo apt install python3.10 python3.10-venv postgresql redis-server

# Clone and setup
git clone https://github.com/bytedance/deer-flow.git
cd deer-flow

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Setup database
sudo -u postgres createdb deerflow
sudo -u postgres psql -c "CREATE USER deerflow WITH PASSWORD 'deerflow123';"
sudo -u postgres psql -c "GRANT ALL PRIVILEGES ON DATABASE deerflow TO deerflow;"

# Start Redis
sudo systemctl start redis-server

# Start the server
python run_server.py
```

```yaml
# kubernetes/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deer-flow
spec:
  replicas: 3
  selector:
    matchLabels:
      app: deer-flow
  template:
    metadata:
      labels:
        app: deer-flow
    spec:
      containers:
        - name: deer-flow
          image: bytedance/deer-flow:latest
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              value: "postgresql://deerflow:deerflow123@postgres:5432/deerflow"
            - name: REDIS_URL
              value: "redis://redis:6379"
          resources:
            limits:
              cpu: "1000m"
              memory: "1Gi"
            requests:
              cpu: "500m"
              memory: "512Mi"
```

```bash
# Database configuration
export DATABASE_URL="postgresql://deerflow:deerflow123@localhost:5432/deerflow"

# Redis configuration
export REDIS_URL="redis://localhost:6379"

# Server configuration
export SERVER_HOST="0.0.0.0"
export SERVER_PORT="8080"

# Worker configuration
export WORKER_CONCURRENCY="4"
export WORKER_PREFETCH="2"

# Logging
export LOG_LEVEL="INFO"
export LOG_FILE="/var/log/deer-flow.log"
```

```yaml
# config.yaml
server:
  host: "0.0.0.0"
  port: 8080
  workers: 4

database:
  url: "postgresql://deerflow:deerflow123@localhost:5432/deerflow"
  pool_size: 10
  max_overflow: 20

redis:
  url: "redis://localhost:6379"
  db: 0

worker:
  concurrency: 4
  prefetch: 2
  heartbeat: 30

logging:
  level: "INFO"
  file: "/var/log/deer-flow.log"
  format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

monitoring:
  enabled: true
  metrics_port: 9090
```

Once installed, access Deer Flow through:
```bash
# Web interface
open http://localhost:8080

# API endpoints
curl http://localhost:8080/api/health

# CLI tools
deer-flow --help
```

- Web UI: No authentication required (configure as needed)
- API: No authentication by default (add middleware for production)
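A practical note on the two configuration sources above: in deployments like this, environment variables usually take precedence over file-based settings. The merge logic below is a hypothetical sketch (not Deer Flow's actual loader), reusing the variable names from the examples:

```python
import os

FILE_CONFIG = {                 # values as they might come from config.yaml
    "server.port": 8080,
    "worker.concurrency": 4,
}

ENV_MAP = {                     # env var -> dotted config key
    "SERVER_PORT": "server.port",
    "WORKER_CONCURRENCY": "worker.concurrency",
}

def effective_config(file_cfg, env=os.environ):
    """Return file settings with any matching env vars layered on top."""
    cfg = dict(file_cfg)
    for var, key in ENV_MAP.items():
        if var in env:
            cfg[key] = int(env[var])   # both example values are integers
    return cfg

# With WORKER_CONCURRENCY=8 exported, the env value wins:
cfg = effective_config(FILE_CONFIG, env={"WORKER_CONCURRENCY": "8"})
print(cfg["worker.concurrency"])   # 8
print(cfg["server.port"])          # 8080 (file value kept)
```

Verify the actual precedence rules against the server's config code before relying on them in production.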
`simple_workflow.json`:

```json
{
  "name": "hello_world",
  "description": "A simple hello world workflow",
  "version": "1.0",
  "tasks": [
    {
      "id": "hello_task",
      "name": "Hello Task",
      "type": "shell",
      "command": "echo 'Hello, Deer Flow!'",
      "timeout": 30
    }
  ]
}
```

```bash
# Via API
curl -X POST http://localhost:8080/api/workflows \
  -H "Content-Type: application/json" \
  -d @simple_workflow.json

# Via CLI
deer-flow workflow submit simple_workflow.json
```

```python
# Via Python SDK
from deerflow import WorkflowClient

client = WorkflowClient("http://localhost:8080")
workflow_id = client.submit_workflow("simple_workflow.json")
print(f"Workflow submitted: {workflow_id}")
```

```bash
# Check workflow status
curl http://localhost:8080/api/workflows/{workflow_id}/status

# View task logs
curl http://localhost:8080/api/workflows/{workflow_id}/tasks/{task_id}/logs

# Get workflow results
curl http://localhost:8080/api/workflows/{workflow_id}/result
```

```
Workflow States
├── PENDING   - Workflow submitted, waiting to start
├── RUNNING   - Workflow is currently executing
├── COMPLETED - All tasks completed successfully
├── FAILED    - One or more tasks failed
├── CANCELLED - Workflow was cancelled
└── TIMEOUT   - Workflow exceeded time limit

Task States
├── PENDING   - Task waiting to be scheduled
├── SCHEDULED - Task assigned to a worker
├── RUNNING   - Task is currently executing
├── COMPLETED - Task completed successfully
├── FAILED    - Task execution failed
├── RETRY     - Task failed, will retry
├── CANCELLED - Task was cancelled
└── TIMEOUT   - Task exceeded time limit
```
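A client typically polls the status endpoint until the workflow reaches one of the terminal states above. Here is a minimal polling sketch; `get_status` is any callable returning the current state string (for example, a wrapper around `/api/workflows/{workflow_id}/status`), faked here with a canned sequence so the snippet runs standalone:

```python
import time

# Terminal workflow states from the table above; anything else means "keep waiting".
TERMINAL = {"COMPLETED", "FAILED", "CANCELLED", "TIMEOUT"}

def wait_for_workflow(get_status, poll_interval=0.01, timeout=5.0):
    """Poll get_status() until a terminal state or the deadline is reached."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_status()
        if state in TERMINAL:
            return state
        time.sleep(poll_interval)
    raise TimeoutError("workflow did not reach a terminal state")

# Simulated status sequence standing in for repeated API calls:
states = iter(["PENDING", "RUNNING", "RUNNING", "COMPLETED"])
final = wait_for_workflow(lambda: next(states))
print(final)   # COMPLETED
```

For long-running workflows, prefer server-side notifications or webhooks over tight polling loops.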
```json
{
  "tasks": [
    {
      "id": "shell_task",
      "name": "Shell Command",
      "type": "shell",
      "command": "echo 'Hello World'",
      "working_directory": "/tmp",
      "environment": {
        "MY_VAR": "my_value"
      },
      "timeout": 30
    },
    {
      "id": "http_task",
      "name": "HTTP Request",
      "type": "http",
      "method": "GET",
      "url": "https://api.example.com/data",
      "headers": {
        "Authorization": "Bearer token"
      },
      "timeout": 60
    },
    {
      "id": "python_task",
      "name": "Python Function",
      "type": "python",
      "module": "my_module",
      "function": "my_function",
      "args": ["arg1", "arg2"],
      "kwargs": {"key": "value"}
    }
  ]
}
```

```json
{
  "name": "data_processing_pipeline",
  "description": "Process and analyze data files",
  "version": "1.0.0",
  "author": "Data Team",
  "tags": ["data", "processing", "analytics"],
  "timeout": 3600,
  "retry_policy": {
    "max_attempts": 3,
    "backoff": "exponential",
    "initial_delay": 1
  },
  "notifications": {
    "on_success": ["email@company.com"],
    "on_failure": ["alerts@company.com"]
  }
}
```

```json
{
  "tasks": [
    {
      "id": "extract_data",
      "name": "Extract Data",
      "type": "shell",
      "command": "python extract.py"
    },
    {
      "id": "transform_data",
      "name": "Transform Data",
      "type": "shell",
      "command": "python transform.py",
      "depends_on": ["extract_data"]
    },
    {
      "id": "load_data",
      "name": "Load Data",
      "type": "shell",
      "command": "python load.py",
      "depends_on": ["transform_data"]
    }
  ]
}
```

```json
{
  "tasks": [
    {
      "id": "task_a",
      "name": "Task A",
      "type": "shell",
      "command": "echo 'Task A'"
    },
    {
      "id": "task_b",
      "name": "Task B",
      "type": "shell",
      "command": "echo 'Task B'",
      "depends_on": ["task_a"]
    },
    {
      "id": "task_c",
      "name": "Task C",
      "type": "shell",
      "command": "echo 'Task C'",
      "depends_on": ["task_a"]
    },
    {
      "id": "task_d",
      "name": "Task D",
      "type": "shell",
      "command": "echo 'Task D'",
      "depends_on": ["task_b", "task_c"]
    }
  ]
}
```

The Deer Flow web interface provides:
- Workflow Overview: List of all workflows with status
- Task Details: Individual task execution details
- Real-time Logs: Live streaming of task logs
- Performance Metrics: Execution times and success rates
- Dependency Graph: Visual representation of task relationships
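The diamond-shaped example above (task_b and task_c both depend on task_a, task_d on both) is exactly what a dependency resolver computes over. As a sketch of that resolution, Python's standard-library `graphlib` can derive the execution "waves" from the same `depends_on` fields; tasks in the same wave may run in parallel:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Same dependency structure as the diamond workflow above:
# node -> list of predecessors (its "depends_on" entries)
tasks = {
    "task_a": [],
    "task_b": ["task_a"],
    "task_c": ["task_a"],
    "task_d": ["task_b", "task_c"],
}

ts = TopologicalSorter(tasks)
ts.prepare()
batches = []
while ts.is_active():
    ready = sorted(ts.get_ready())   # everything in `ready` can run in parallel
    batches.append(ready)
    ts.done(*ready)

print(batches)   # [['task_a'], ['task_b', 'task_c'], ['task_d']]
```

`prepare()` also raises `CycleError` if the graph contains a cycle, which is the same class of validation a workflow engine performs at submission time.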
```bash
# Get workflow statistics
curl http://localhost:8080/api/workflows/stats

# List running workflows
curl http://localhost:8080/api/workflows?status=running

# Get task execution history
curl http://localhost:8080/api/tasks/history

# Monitor system health
curl http://localhost:8080/api/health
```

```bash
# Check if services are running
docker-compose ps

# Verify API connectivity
curl http://localhost:8080/api/health

# Check logs
docker-compose logs deer-flow-server
```

```bash
# Check database connectivity
docker-compose exec postgres psql -U deerflow -d deerflow -c "SELECT 1"

# Reset database
docker-compose down -v
docker-compose up -d
```

```bash
# Check worker status
curl http://localhost:8080/api/workers

# Restart workers
docker-compose restart deer-flow-worker

# Scale workers
docker-compose up -d --scale deer-flow-worker=3
```

```bash
# Check resource usage
docker stats

# Monitor queue length
curl http://localhost:8080/api/queue/stats

# Adjust worker concurrency
export WORKER_CONCURRENCY=8
docker-compose restart deer-flow-worker
```

- Definition: JSON specification of tasks and dependencies
- Execution: Orchestrated by the workflow manager
- State: Tracked throughout lifecycle
- Results: Stored and accessible via API
- Types: Shell, HTTP, Python, custom
- Dependencies: Define execution order
- Retries: Automatic retry on failure
- Timeouts: Prevent hanging tasks
- Scaling: Horizontal scaling for performance
- Isolation: Each worker runs in separate container
- Monitoring: Health checks and metrics
- Resource: CPU and memory allocation
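The "Retries" behavior above pairs with the `retry_policy` shown earlier (`max_attempts: 3, backoff: exponential, initial_delay: 1`). A minimal sketch of the delay schedule that policy implies, assuming the common doubling rule `delay = initial_delay * 2^(retry - 1)` (verify the actual base against your deployment):

```python
def backoff_delays(max_attempts, initial_delay=1.0, base=2.0):
    """Seconds to wait before each retry.

    With max_attempts total attempts there are max_attempts - 1 retries;
    retry i (1-indexed) waits initial_delay * base**(i - 1) seconds.
    """
    return [initial_delay * base ** i for i in range(max_attempts - 1)]

print(backoff_delays(3))        # [1.0, 2.0] — two retries after the first try
print(backoff_delays(4, 0.5))   # [0.5, 1.0, 2.0]
```

Production retry loops usually add jitter and a delay cap on top of this schedule to avoid thundering-herd retries.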
```python
# Workflow metrics
workflow_metrics = {
    'total_workflows': 0,
    'running_workflows': 0,
    'completed_workflows': 0,
    'failed_workflows': 0,
    'average_execution_time': 0.0
}

# Task metrics
task_metrics = {
    'total_tasks': 0,
    'running_tasks': 0,
    'completed_tasks': 0,
    'failed_tasks': 0,
    'retry_rate': 0.0
}

# System metrics
system_metrics = {
    'cpu_usage': 0.0,
    'memory_usage': 0.0,
    'queue_length': 0,
    'worker_count': 0
}
```

Congratulations! 🎉 You've successfully:
- ✅ Installed Deer Flow using Docker
- ✅ Configured the system components
- ✅ Created and executed your first workflow
- ✅ Explored the web interface and API
- ✅ Understood workflow and task states
- ✅ Set up basic monitoring
Ready to create more complex workflows? Let's explore Chapter 2: Workflow Basics to learn about task types, dependencies, and advanced workflow patterns.
Practice what you've learned:
- Experiment with different installation methods
- Create workflows with multiple task types
- Set up dependencies between tasks
- Monitor workflow execution in the web UI
- Explore the API endpoints for automation
What's the first distributed workflow you want to build? 🔀
Most teams struggle here because the hard part is not writing more code, but drawing clear boundaries between tasks and workflows so behavior stays predictable as complexity grows.
In practical terms, this chapter helps you avoid three common failures:
- coupling core logic too tightly to one implementation path
- missing the handoff boundaries between setup, execution, and validation
- shipping changes without a clear rollback or observability strategy
After working through this chapter, you should be able to reason about this getting-started setup as an operating subsystem inside a distributed workflow orchestration platform, with explicit contracts for inputs, state transitions, and outputs.
Use the configuration and endpoint examples above as your checklist when adapting these patterns to your own repository.
Under the hood, a Deer Flow workflow run usually follows a repeatable control path:
- Context bootstrap: initialize runtime config and prerequisites for each task.
- Input normalization: shape incoming data so each task receives stable contracts.
- Core execution: run the main logic branch and propagate intermediate state through the flow.
- Policy and safety checks: enforce limits, auth scopes, and failure boundaries.
- Output composition: return canonical result payloads for downstream consumers.
- Operational telemetry: emit logs/metrics needed for debugging and performance tuning.
When debugging, walk this sequence in order and confirm each stage has explicit success/failure conditions.
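That stage-by-stage walk can be sketched as a tiny driver that runs ordered checks and stops at the first failure. The stage names and checks below are illustrative placeholders, not Deer Flow internals:

```python
def walk_stages(stages):
    """Run (name, check) pairs in order; return the first failing stage name, or None."""
    for name, check in stages:
        ok, detail = check()
        print(f"{name}: {'OK' if ok else 'FAIL'} ({detail})")
        if not ok:
            return name
    return None

# Hypothetical checks, one per stage of the control path above:
stages = [
    ("context bootstrap", lambda: (True, "config loaded")),
    ("input normalization", lambda: (True, "payload validated")),
    ("core execution", lambda: (False, "task exited non-zero")),
    ("output composition", lambda: (True, "unreached")),
]
first_failure = walk_stages(stages)
print(first_failure)   # core execution
```

In practice each check would query a real signal (service health endpoint, queue depth, task logs) rather than returning a canned tuple.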
Use the following upstream sources to verify implementation details while reading this chapter:
- Official Documentation (github.com)
- GitHub Repository (github.com)
- API Reference (github.com)
- Community & Issues (github.com)
- Workflow Examples (github.com)
- AI Codebase Knowledge Builder (github.com)
Suggested trace strategy:
- search upstream code for the task and workflow definitions to map concrete implementation paths
- compare docs claims against actual runtime/config code before reusing patterns in production