TelemetryFlow Hermes — Self-Improving AI Agent for Observability Incident Response

Contributing to TelemetryFlow Hermes

Thank you for your interest in contributing! This guide covers everything you need to start contributing to TelemetryFlow Hermes.

Code of Conduct
Getting Started
Development Setup
Project Structure
Making Changes
Testing
Submitting Changes
Style Guide
Architecture Guidelines
Release Process

Code of Conduct

Be respectful, constructive, and professional. We follow the Contributor Covenant code of conduct.

Getting Started

Prerequisites

Requirement	Version	Install
Python 3	3.8+	`python3 --version`
pytest	Latest	`pip install pytest pytest-cov`
ruff	Latest	`pip install ruff`
Hermes Agent	Latest	See Quick Start

Fork and Clone

# Fork the repository on GitHub, then:
git clone https://github.com/YOUR_USERNAME/telemetryflow-hermes.git
cd telemetryflow-hermes

Development Setup

# Install dev dependencies
pip install pytest pytest-cov ruff bandit mypy

# Verify setup
make test

# Run linter
make lint

# Run CI pipeline locally
make ci-pipeline

Project Structure

telemetryflow-hermes/
├── plugins/telemetryflow/tools/    # 40 tool implementations (Python stdlib only)
├── profiles/                       # Agent profiles (triage, investigator, reviewer, remediator)
├── skills/                         # 29 skills across 18 categories
├── tests/                          # Test suite (472 tests, 97% coverage)
│   ├── conftest.py                 # Shared fixtures
│   ├── mocks/                      # Mock objects (MockTFOApi, response factories)
│   ├── unit/                       # Unit tests per tool (34 files)
│   └── integration/                # Integration tests
├── docs/                           # Documentation wiki (28+ pages)
├── cron/                           # Scheduled investigation jobs
├── scripts/                        # Deployment scripts
├── security/                       # ClickHouse read-only SQL
├── hooks/                          # Lifecycle hooks
├── Dockerfile                      # Multi-stage Docker (python:3.13-slim-trixie)
├── docker-compose.yaml             # 4 profiles: core, monitoring, tools, all
├── run-container.sh                # Build, tag, push, compose orchestration
├── Makefile                        # fmt, lint, test, build, ci targets
├── pyproject.toml                  # pytest, ruff, coverage config
└── .github/workflows/              # CI (ci.yml), Docker (docker.yml), Release (release.yml)

Making Changes

Tool Development

All tools follow the same pattern:

1. Create the tool in plugins/telemetryflow/tools/<tool_name>.py
2. Use _shared.py helpers: tfo_request(), clickhouse_query(), parse_args(), output_json()
3. Register in plugin.yaml with name, description, command, args
4. If it's a write operation, add requires_approval: true
5. Write tests in tests/unit/test_<tool_name>.py

Tool template:

#!/usr/bin/env python3
"""Description of what the tool does."""

import sys
import os

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from _shared import tfo_request, parse_args, output_json


def main():
    args = parse_args()
    required_param = args.get("required_param")
    if not required_param:
        print("ERROR: --required_param is required", file=sys.stderr)
        sys.exit(1)

    result = tfo_request("/api/v2/endpoint", method="GET", params={
        "param": required_param,
    })
    if result is not None:
        output_json(result)


if __name__ == "__main__":
    main()
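
The template assumes that parse_args() returns a dict-like object keyed by flag name. The real helper lives in _shared.py and is not shown in this guide, so the following is only a minimal sketch of how such a helper could turn `--key value` pairs into a dict using the standard library alone. The name `parse_cli_pairs` and its handling of bare flags are assumptions made for illustration.

```python
import sys
from typing import Dict, List, Optional


def parse_cli_pairs(argv: Optional[List[str]] = None) -> Dict[str, str]:
    """Hypothetical stand-in for _shared.parse_args(): map --key value pairs to a dict."""
    tokens = list(sys.argv[1:] if argv is None else argv)
    parsed: Dict[str, str] = {}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("--"):
            key = token[2:]
            # A flag followed by a non-flag token takes that token as its value.
            if i + 1 < len(tokens) and not tokens[i + 1].startswith("--"):
                parsed[key] = tokens[i + 1]
                i += 2
                continue
            # ASSUMPTION: a bare flag (no value) is recorded as the string "true".
            parsed[key] = "true"
        i += 1
    return parsed
```

With a helper shaped like this, `args.get("required_param")` returns None when the flag is absent, which is what the template's validation branch relies on.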

Skill Development

Skills are Markdown files with YAML frontmatter:

---
name: skill-name
description: >
  When to activate this skill.
version: 1.2.0
author: agent
---

## Procedure

1. Step one
2. Step two

## Pitfalls

- Common mistake to avoid

## Verification

- How to verify success
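
Because a folded `description: >` block absorbs any indented line that follows it, an accidentally indented key silently becomes part of the description text. The sketch below is one way to check a skill file for unindented frontmatter keys without a YAML library. Treating the four keys from the template as required is an assumption, and `top_level_keys` and `missing_keys` are hypothetical helper names.

```python
from typing import List

# ASSUMPTION: the four keys shown in the skill template are all required.
REQUIRED_KEYS = ("name", "description", "version", "author")


def top_level_keys(markdown_text: str) -> List[str]:
    """Return the unindented keys found between the first pair of '---' lines."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Indented lines belong to the previous key (for example a folded description).
        if line and not line[0].isspace() and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def missing_keys(markdown_text: str) -> List[str]:
    """List the required keys that do not appear at the top level of the frontmatter."""
    found = set(top_level_keys(markdown_text))
    return [key for key in REQUIRED_KEYS if key not in found]
```

This is a line-based check, not a YAML parser, so it only suits simple frontmatter like the template's.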

Documentation

All docs use GitHub-flavored Markdown with Mermaid diagrams for architecture visualizations. See docs/README.md for the wiki index.

Testing

Running Tests

# All tests
make test

# Unit tests only
pytest tests/unit -v

# Integration tests only
pytest tests/integration -v

# With coverage (95%+ required)
make test-cov

# Specific test file
pytest tests/unit/test_query_metrics.py -v

Test Coverage Requirements

Layer	Minimum Coverage
Shared utilities (`_shared.py`)	95%
Individual tools	90%
Overall	95%
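
The thresholds in the table can be read as a simple per-file rule plus an overall floor. As an illustration only (the project's actual enforcement is configured in pyproject.toml and is not shown here), a check over per-file percentages might look like the sketch below. The function name and the input shape are assumptions.

```python
from typing import Dict, List

SHARED_MIN = 95.0   # _shared.py
TOOL_MIN = 90.0     # individual tools
OVERALL_MIN = 95.0  # whole project


def coverage_failures(per_file: Dict[str, float], overall: float) -> List[str]:
    """Return a message for every file, and the overall figure, below its minimum."""
    failures = []
    for path, percent in sorted(per_file.items()):
        minimum = SHARED_MIN if path.endswith("_shared.py") else TOOL_MIN
        if percent < minimum:
            failures.append(f"{path}: {percent:.1f}% < {minimum:.0f}%")
    if overall < OVERALL_MIN:
        failures.append(f"overall: {overall:.1f}% < {OVERALL_MIN:.0f}%")
    return failures
```

Note that a tool can sit at 90% and pass its own bar while still pulling the overall figure under 95%, so both checks matter.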

Writing Tests

Follow the pattern in tests/unit/test_query_metrics.py:

"""Tests for <tool_name>.py tool."""

import json
from unittest import mock
import pytest


def _import_tool():
    import importlib
    import <tool_module>
    importlib.reload(<tool_module>)
    return <tool_module>


class Test<ToolName>:
    def test_basic(self, mock_env, mock_urlopen, capture_stdout):
        _, mock_resp = mock_urlopen
        mock_resp.read.return_value = json.dumps({"status": "ok"}).encode("utf-8")

        with mock.patch("sys.argv", ["tool.py", "--param", "value"]):
            tool = _import_tool()
            tool.main()

        output = json.loads(capture_stdout.getvalue())
        assert "status" in output

    def test_error_handling(self, mock_env, mock_urlopen_error, mock_exit):
        with mock.patch("sys.argv", ["tool.py", "--param", "value"]):
            tool = _import_tool()
            tool.main()
            mock_exit.assert_called_with(1)

    def test_missing_required_param(self, mock_env, mock_exit):
        with mock.patch("sys.argv", ["tool.py"]):
            tool = _import_tool()
            tool.main()
            mock_exit.assert_called_with(1)

Available Fixtures

Fixture	Description
`mock_env`	Sets all `TELEMETRYFLOW_*` environment variables
`mock_urlopen`	Mocks `urllib.request.urlopen` with configurable response
`mock_urlopen_error`	Mocks HTTP error (404)
`mock_urlopen_conn_error`	Mocks connection error
`capture_stdout`	Captures stdout output
`mock_exit`	Mocks `sys.exit` to prevent test termination
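
The fixtures themselves are defined in tests/conftest.py and are not reproduced in this guide. To show the mechanism that a fixture like mock_urlopen presumably wraps, here is a standard-library-only sketch that patches urllib.request.urlopen with a canned JSON response. `fetch_status` and `run_with_mocked_urlopen` are hypothetical names, and the URL is a placeholder.

```python
import json
import urllib.request
from unittest import mock


def fetch_status(url: str) -> dict:
    """Tiny stand-in for a tool: GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def run_with_mocked_urlopen(payload: dict) -> dict:
    """Patch urlopen the way a fixture might, then call the code under test."""
    mock_resp = mock.MagicMock()
    mock_resp.read.return_value = json.dumps(payload).encode("utf-8")
    # urlopen() is used as a context manager, so __enter__ must return the response.
    mock_resp.__enter__.return_value = mock_resp
    with mock.patch("urllib.request.urlopen", return_value=mock_resp) as mock_open:
        result = fetch_status("https://example.invalid/api/v2/endpoint")
        # No network traffic happened; the patched function was called exactly once.
        mock_open.assert_called_once()
    return result
```

The patch targets the attribute on the urllib.request module, so it only takes effect for code that calls `urllib.request.urlopen(...)` rather than a name imported earlier with `from urllib.request import urlopen`.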

Submitting Changes

Pull Request Process

1. Create a feature branch: git checkout -b feature/amazing-feature
2. Write code following the style guide below
3. Write tests and maintain 95%+ coverage
4. Run CI locally: make ci-pipeline
5. Update documentation if adding new features
6. Commit: git commit -m 'Add amazing feature'
7. Push: git push origin feature/amazing-feature
8. Open a Pull Request against main

PR Checklist

All tests pass (make test)
Coverage remains ≥95% (make test-cov)
Linter passes (make lint)
No secrets committed
Documentation updated (if applicable)
CHANGELOG.md updated (if applicable)

Style Guide

Python

Python 3.8+ compatible; avoid the walrus operator (:=) in critical paths
stdlib only — no external pip dependencies in tools
No comments unless requested
Type hints are welcome but not required
Line length: 120 characters max
Formatter: ruff (configured in pyproject.toml)

Naming Conventions

Type	Convention	Example
Tool files	`snake_case.py`	`query_metrics.py`
Test files	`test_<tool>.py`	`test_query_metrics.py`
Skill files	`SKILL.md`	`skills/observability/k8s-pod-debug/SKILL.md`
Profile dirs	`kebab-case/`	`profiles/triage/`
Environment variables	`TELEMETRYFLOW_*`	`TELEMETRYFLOW_API_KEY`

Markdown

GitHub-flavored Markdown for all documentation
Mermaid diagrams for architecture and flow visualization
Tables for reference data
Code blocks with language annotation

Architecture Guidelines

Zero Dependencies Rule

All plugin tools must use Python standard library only. No requests, httpx, click, or any external package. This ensures:

Maximum portability (no virtualenv needed)
Zero supply chain risk
Instant deployment on any system with Python 3
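
One way to catch a violation before review is to scan a tool's source for imports of the packages named above. The sketch below uses the standard ast module. The deny-list approach and the function name `banned_imports` are assumptions for illustration, not part of the project's CI.

```python
import ast
from typing import List, Set

# Packages named in the rule above; extend as needed.
BANNED = {"requests", "httpx", "click"}


def banned_imports(source: str, banned: Set[str] = BANNED) -> List[str]:
    """Return the sorted top-level package names from `banned` that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as "from . import _shared".
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in banned:
                found.add(root)
    return sorted(found)
```

A deny-list only catches packages someone thought to list. On Python 3.10 and later, comparing against `sys.stdlib_module_names` would give a stricter allow-list, but that attribute is not available on 3.8.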

TFO API Communication

All tools communicate with TelemetryFlow Platform through _shared.py:

graph LR
    Tool["Tool<br/>query_metrics.py"] --> Shared["_shared.py<br/>tfo_request()"]
    Shared --> API["TFO API<br/>/api/v2/*"]
    API --> CH["ClickHouse"]

Never connect to ClickHouse directly. Always go through the TFO API for:

Authentication and authorization
Workspace scoping
Audit logging
Rate limiting
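
To make the request path concrete, here is a minimal sketch of how a helper in the spirit of tfo_request() could build an authenticated request with urllib alone. The real implementation is in _shared.py and may differ. The function name `build_tfo_request`, the bearer-token header, and the Accept header are all assumptions.

```python
import os
import urllib.parse
import urllib.request
from typing import Dict, Optional


def build_tfo_request(path: str, params: Optional[Dict[str, str]] = None,
                      method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request against the configured TFO API base URL."""
    base_url = os.environ["TELEMETRYFLOW_API_URL"].rstrip("/")
    url = base_url + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    headers = {
        # ASSUMPTION: bearer-token auth; the real header scheme is defined in _shared.py.
        "Authorization": "Bearer " + os.environ["TELEMETRYFLOW_API_KEY"],
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method=method)
```

Separating request construction from sending keeps the URL and header logic testable without network access or mocks.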

Environment Variable Prefix

All environment variables use the TELEMETRYFLOW_ prefix:

TELEMETRYFLOW_API_KEY
TELEMETRYFLOW_API_URL
TELEMETRYFLOW_ORGANIZATION_ID
TELEMETRYFLOW_WORKSPACE_ID

Never use TFO_ or other abbreviations.
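
A tool can fail fast with a clear message when configuration is incomplete. The sketch below reports which of the variables listed above are unset or empty. Whether all four are mandatory for every tool is an assumption, and `missing_env_vars` is a hypothetical helper name.

```python
import os
from typing import List, Mapping

# ASSUMPTION: every tool needs all four variables listed in this section.
REQUIRED_VARS = (
    "TELEMETRYFLOW_API_KEY",
    "TELEMETRYFLOW_API_URL",
    "TELEMETRYFLOW_ORGANIZATION_ID",
    "TELEMETRYFLOW_WORKSPACE_ID",
)


def missing_env_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in `environ`."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

Passing the environment as a parameter lets tests supply a plain dict instead of patching os.environ.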

Release Process

1. Update VERSION in pyproject.toml and .github/workflows/ci.yml
2. Update CHANGELOG.md with the new version section
3. Commit: git commit -m "chore: bump version to X.Y.Z"
4. Tag: git tag vX.Y.Z
5. Push: git push origin main --tags
6. GitHub Actions automatically creates the release
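
A mismatch between the tag and the version string is an easy mistake in this process. As a hedged illustration (not part of the release workflow), the sketch below validates the vX.Y.Z tag shape and compares it with a version string. Both function names are hypothetical, and pre-release suffixes are deliberately not handled.

```python
import re
from typing import Optional, Tuple

TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def parse_release_tag(tag: str) -> Optional[Tuple[int, int, int]]:
    """Return (major, minor, patch) for a tag shaped like vX.Y.Z, else None."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def tag_matches_version(tag: str, version: str) -> bool:
    """True when the tag is well-formed and equals 'v' + the version string."""
    return parse_release_tag(tag) is not None and tag == "v" + version
```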

Built with ❤️ by Telemetri Data Indonesia
