Full deployment with external LLM providers (Anthropic, Zhipu/OpenCode Go, etc.).
graph TB
subgraph "Agent Host"
HERMES["Hermes Agent<br/>4 Profiles"]
TG["Telegram Gateway<br/>4 Bot Tokens"]
end
subgraph "LLM Providers (External)"
ANTH["Anthropic API<br/>claude-sonnet-4-5"]
ZHIPU["OpenCode Go / Zhipu<br/>glm-5.1"]
end
subgraph "TelemetryFlow Platform"
TFO_API["TFO API<br/>:3000"]
TFO_CH["ClickHouse<br/>:9000"]
TFO_PG["PostgreSQL<br/>:5432"]
TFO_REDIS["Redis<br/>:6379"]
end
subgraph "Kubernetes Cluster"
K8S["kubectl<br/>Remediation actions"]
end
subgraph "Telegram"
TG_BOT["4 Telegram Bots"]
HUMAN["On-Call Engineer"]
end
HERMES -->|"Investigator"| ANTH
HERMES -->|"Triage/Reviewer/Remediator"| ZHIPU
HERMES -->|"REST API<br/>TELEMETRYFLOW_API_KEY"| TFO_API
TFO_API --> TFO_CH
TFO_API --> TFO_PG
TFO_API --> TFO_REDIS
HERMES -->|"scale/restart/rollback"| K8S
HERMES --> TG
TG <-->|"bidirectional"| TG_BOT
TG_BOT <--> HUMAN
git clone https://github.com/telemetryflow/telemetryflow-hermes.git
cd telemetryflow-hermes
make envEdit ~/.hermes/.env:
# Authentication (Method A recommended)
TELEMETRYFLOW_API_KEY=tfs_your_api_key_here
# Connection
TELEMETRYFLOW_API_URL=http://localhost:3000/api/v2
TELEMETRYFLOW_ORGANIZATION_ID=your-org-uuid
TELEMETRYFLOW_WORKSPACE_ID=your-workspace-uuid
# LLM Providers
ANTHROPIC_API_KEY=sk-ant-your-key # Investigator
ZHIPU_API_KEY=your-zhipu-key # Triage/Reviewer/RemediatorIn TelemetryFlow UI:
graph LR
UI["TFO UI"] --> SETTINGS["Settings"]
SETTINGS --> APIK["API Keys"]
APIK --> GEN["Generate New Key"]
GEN --> SCOPES["Set Scopes:<br/>llm:chat, llm:read, llm:write,<br/>llm:insights, telemetry:read"]
SCOPES --> COPY["Copy key → .env"]
Required scopes:
llm:chat— Send chat messagesllm:read— Read providers and conversationsllm:write— Manage providersllm:insights— Generate insightstelemetry:read— Query ClickHouse
# Run on ClickHouse server as admin
clickhouse-client < security/clickhouse-readonly.sqlThis creates hermes_readonly with SELECT grants on 20 telemetry tables. See ClickHouse Security.
Create 4 bots via @BotFather on Telegram:
/newbot → triage_bot
/newbot → investigator_bot
/newbot → reviewer_bot
/newbot → remediator_bot
For each bot, get the chat ID:
- Message the bot
- Visit
https://api.telegram.org/bot<TOKEN>/getUpdates - Copy the
chat.idvalue
make initThis installs the Hermes agent and configures:
- 4 agent profiles (triage, investigator, reviewer, remediator)
- 29 skills across 18 categories
- 6 cron jobs
- 37 plugin tools
- 3 lifecycle hooks
- ClickHouse read-only security
- 4 Telegram gateways
make verifyVerification checks:
graph TD
V1["Hermes installation"]
V2["Config files present"]
V3["API keys set"]
V4["TelemetryFlow API reachable"]
V5["ClickHouse accessible"]
V6["Telegram gateways"]
V7["Tools available"]
V1 --> V2 --> V3 --> V4 --> V5 --> V6 --> V7
V7 --> OK["ALL CHECKS PASSED"]
make deployStarts 4 background processes:
hermes -p triage gateway starthermes -p investigator gateway starthermes -p reviewer gateway starthermes -p remediator gateway start
Send a test alert to the Triage bot on Telegram:
ALERT: payments-api p95 latency breach
Service: payments-api
Metric: http_server_duration_p95
Value: 640ms (threshold: 200ms)
Severity: HIGH
Time: 2026-06-04T03:47:00Z
Expected flow: Triage → Investigator → Reviewer → Remediator → Human approval (~23 seconds).
| Target | Description |
|---|---|
make init |
First-time setup (install hermes + configure everything) |
make configure |
Install profiles, skills, plugins, cron, hooks |
make env |
Setup .env from .env.example |
make deploy |
Start all 4 Telegram agent gateways |
make stop |
Stop all gateways |
make status |
Check gateway status |
make verify |
End-to-end verification |
make doctor |
Run hermes doctor --fix |
make docker-build |
Build Docker image |
make docker-up |
Start Docker containers |
make docker-down |
Stop Docker containers |
make clean |
Remove installed components |
make reset |
Clean + re-configure |
Use systemd or screen to keep gateways running:
# Using screen
screen -dmS hermes-triage hermes -p triage gateway start
screen -dmS hermes-investigator hermes -p investigator gateway start
screen -dmS hermes-reviewer hermes -p reviewer gateway start
screen -dmS hermes-remediator hermes -p remediator gateway start
# Or for SSH disconnects
sudo loginctl enable-linger $USERAll agent activity is logged to ~/.hermes/logs/:
| Log File | Contents |
|---|---|
agent.log |
Agent activity and decisions |
gateway.log |
Telegram gateway events |
errors.log |
Error traces |
investigations.log |
Investigation start/end (from hooks) |
remediations.log |
Remediation outcomes (from hooks) |
alerts.log |
Alert received events (from hooks) |
# Check all gateway status
make status
# Watch recent investigation logs
tail -f ~/.hermes/logs/investigations.log
# Check errors
grep -i error ~/.hermes/logs/gateway.log | tail -20