Install and configure TelemetryFlow Hermes for autonomous incident response.
| Requirement | Version | Check |
|---|---|---|
| Python 3 | 3.8+ | python3 --version |
| Hermes Agent | Latest | hermes --version |
| TelemetryFlow Platform | Running | curl $TELEMETRYFLOW_API_URL/health |
| ClickHouse | Running | Via TFO API |
| Telegram Bot Token | 4 tokens | @BotFather |
| kubectl | 1.28+ | kubectl version |
git clone https://github.com/telemetryflow/telemetryflow-hermes.git
cd telemetryflow-hermesmake envEdit ~/.hermes/.env with your credentials. See Environment Variables for the complete reference.
Minimum required variables:
# TFO Platform Authentication (pick one method)
TELEMETRYFLOW_API_KEY=tfs_your_api_key_here
# TelemetryFlow Platform Connection
TELEMETRYFLOW_API_URL=http://localhost:3000/api/v2
TELEMETRYFLOW_ORGANIZATION_ID=your-org-uuid
TELEMETRYFLOW_WORKSPACE_ID=your-workspace-uuid
# LLM Provider (for Hermes agent reasoning)
ZHIPU_API_KEY=your_zhipu_key # Triage/Reviewer/Remediator
ANTHROPIC_API_KEY=your_anthropic_key # InvestigatorCreate 4 bots via @BotFather on Telegram. Each agent needs its own bot token (Telegram allows 1 connection per token).
make initThis installs the Hermes agent and configures everything:
- 4 agent profiles (triage, investigator, reviewer, remediator)
- 29 skills across 18 categories
- 6 cron jobs
- 40 plugin tools covering all 20 TFO Platform modules
- 3 lifecycle hooks
- ClickHouse read-only security
- 4 Telegram gateways
make verifyThis runs end-to-end verification:
- Hermes installation
- API connectivity
- ClickHouse access
- Telegram gateway
- Tool availability
make deployStarts all 4 Telegram gateways in background.
Send a test alert to your Triage agent's Telegram bot:
ALERT: payments-api p95 latency breach
Service: payments-api
Metric: http_server_duration_p95
Value: 640ms (threshold: 200ms)
Severity: HIGH
Time: 2026-06-04T03:47:00Z
The agent pipeline will:
graph LR
A["Alert received"] --> B["Triage classifies"]
B --> C["Investigator gathers evidence"]
C --> D["Reviewer verifies"]
D --> E["Remediator proposes action"]
E --> F["Human approves"]
Expected timeline: ~23 seconds from alert to proposed remediation.
- Architecture Overview — understand the system design
- Agent Configuration — customize each agent's behavior
- Tool Reference — all 40 tools and parameters
- Deployment Guide — production deployment