Description
Add support for ClickHouse as a direct observability backend, enabling users to query OpenTelemetry traces from ClickHouse databases.
Motivation
ClickHouse is mentioned in the roadmap ("Support for additional backends (SigNoz, ClickHouse)"). Many organizations export OTel traces directly to ClickHouse for cost-effective, high-performance storage and analytics. Direct ClickHouse support enables querying traces without intermediate layers.
Use Cases
- Custom OTel Pipelines: Teams using OTel Collector → ClickHouse directly
- Cost Optimization: ClickHouse is cheaper than many SaaS solutions
- Data Lakes: Organizations storing traces in ClickHouse alongside other data
- High Volume: ClickHouse handles billions of spans efficiently
Implementation Requirements
1. Backend Implementation
- Create src/openllmetry_mcp/backends/clickhouse.py following the abstract interface in backends/base.py
- Implement the following methods:
  - search_traces() - Query traces using ClickHouse SQL
  - get_trace() - Retrieve full trace details
  - list_services() - List available services
  - get_aggregated_usage() - Aggregate token usage metrics using ClickHouse aggregation functions
2. Configuration
- Add clickhouse as a backend type option
- Required configuration:
  - BACKEND_URL: ClickHouse server URL (e.g., http://localhost:8123 or https://clickhouse.cloud)
  - BACKEND_API_KEY: API key or empty for local
  - CLICKHOUSE_DATABASE: Database name (default: otel)
  - CLICKHOUSE_TABLE: Traces table name (default: otel_traces)
  - CLICKHOUSE_USER: Username (optional)
  - CLICKHOUSE_PASSWORD: Password (optional)
- Update .env.example with ClickHouse configuration example
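As a rough illustration of how these variables might be read, here is a minimal sketch. The ClickHouseSettings class and load_clickhouse_settings function are hypothetical names for this example; the project's actual BackendConfig may be shaped differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ClickHouseSettings:
    """Hypothetical settings holder; the real BackendConfig may differ."""

    url: str
    database: str = "otel"
    table: str = "otel_traces"
    user: Optional[str] = None
    password: Optional[str] = None
    api_key: Optional[str] = None


def load_clickhouse_settings(env: Mapping[str, str] = os.environ) -> ClickHouseSettings:
    """Build settings from environment variables, applying the documented defaults."""
    url = env.get("BACKEND_URL", "").strip()
    if not url:
        raise ValueError("BACKEND_URL is required for the clickhouse backend")
    return ClickHouseSettings(
        url=url,
        # Empty strings are treated the same as unset variables.
        database=env.get("CLICKHOUSE_DATABASE") or "otel",
        table=env.get("CLICKHOUSE_TABLE") or "otel_traces",
        user=env.get("CLICKHOUSE_USER") or None,
        password=env.get("CLICKHOUSE_PASSWORD") or None,
        api_key=env.get("BACKEND_API_KEY") or None,
    )
```

Passing the environment as a mapping keeps the loader easy to unit test without touching the real process environment.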
3. Schema Support
Support common OpenTelemetry trace schemas in ClickHouse:
CREATE TABLE otel_traces (
    Timestamp DateTime64(9),
    TraceId String,
    SpanId String,
    ParentSpanId String,
    ServiceName LowCardinality(String),
    SpanName String,
    SpanKind LowCardinality(String),
    Duration UInt64,
    StatusCode LowCardinality(String),
    ResourceAttributes Map(String, String),
    SpanAttributes Map(String, String)
) ENGINE = MergeTree()
ORDER BY (ServiceName, Timestamp);
4. SQL Query Generation
Generate efficient ClickHouse SQL with proper parameterization to prevent SQL injection.
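One possible shape for this, sketched under assumptions: user-supplied values are passed as ClickHouse query parameters ({name:Type} placeholders), while database and table names, which come from configuration, are checked against a strict identifier pattern before being interpolated. The function names build_search_query and quote_identifier are hypothetical, and the selected columns follow the schema in section 3.

```python
import re
from typing import Any, Dict, Optional, Tuple

# Only plain identifiers are accepted for database and table names.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_identifier(name: str) -> str:
    """Validate and backtick-quote an identifier taken from configuration."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return f"`{name}`"


def build_search_query(
    database: str,
    table: str,
    service_name: Optional[str] = None,
    min_duration: Optional[int] = None,
    limit: int = 100,
) -> Tuple[str, Dict[str, Any]]:
    """Return (sql, parameters); filter values never appear in the SQL text."""
    conditions = []
    params: Dict[str, Any] = {"limit": int(limit)}
    if service_name is not None:
        conditions.append("ServiceName = {service:String}")
        params["service"] = service_name
    if min_duration is not None:
        conditions.append("Duration >= {min_duration:UInt64}")
        params["min_duration"] = int(min_duration)
    where = f" WHERE {' AND '.join(conditions)}" if conditions else ""
    sql = (
        "SELECT TraceId, SpanId, ServiceName, SpanName, Duration, Timestamp"
        f" FROM {quote_identifier(database)}.{quote_identifier(table)}"
        f"{where}"
        " ORDER BY Timestamp DESC"
        " LIMIT {limit:UInt32}"
    )
    return sql, params
```

With clickhouse-connect, the pair could then be executed as client.query(sql, parameters=params), which leaves value binding to the server.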
5. ClickHouse Client Integration
Use the official ClickHouse Python client:
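from urllib.parse import urlparse

from clickhouse_connect import get_client

class ClickHouseBackend(TraceBackend):
    def __init__(self, config: BackendConfig):
        # get_client expects a hostname rather than a full URL, so split BACKEND_URL first.
        url = urlparse(config.url)
        secure = url.scheme == "https"
        self.client = get_client(
            host=url.hostname,
            port=url.port or (8443 if secure else 8123),
            secure=secure,
            database=config.database,
            username=config.user,
            password=config.password,
        )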
6. OpenLLMetry Support
Extract gen_ai.* attributes from SpanAttributes map for proper LLM trace analysis.
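A minimal sketch of that extraction, assuming a span's SpanAttributes map has already been read into a Python dict of strings. The helper names are hypothetical, and the specific token attribute keys (gen_ai.usage.input_tokens / gen_ai.usage.output_tokens, with the older prompt_tokens / completion_tokens as a fallback) are an assumption about what the instrumentation emits; they are not specified in this issue.

```python
from typing import Dict, Mapping, Optional

GEN_AI_PREFIX = "gen_ai."


def extract_gen_ai(span_attributes: Mapping[str, str]) -> Dict[str, str]:
    """Keep only the gen_ai.* entries of a SpanAttributes map."""
    return {k: v for k, v in span_attributes.items() if k.startswith(GEN_AI_PREFIX)}


def _to_int(value: Optional[str]) -> int:
    """Map values are strings in the schema above; treat bad or missing values as 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def token_usage(span_attributes: Mapping[str, str]) -> Dict[str, int]:
    """Read token counts, accepting either attribute naming (assumed keys)."""
    attrs = extract_gen_ai(span_attributes)
    prompt = _to_int(
        attrs.get("gen_ai.usage.input_tokens") or attrs.get("gen_ai.usage.prompt_tokens")
    )
    completion = _to_int(
        attrs.get("gen_ai.usage.output_tokens") or attrs.get("gen_ai.usage.completion_tokens")
    )
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": prompt + completion,
    }
```

For get_aggregated_usage(), the same keys could instead be summed inside ClickHouse so that only totals leave the database; the Python version here is mainly useful for per-trace inspection.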
7. Documentation
- Add ClickHouse backend configuration to README.md
- Include schema requirements and setup
- Index recommendations for performance
- Connection examples (local and ClickHouse Cloud)
- Query performance tips
- Troubleshooting common issues
8. Testing
- Add unit tests in tests/backends/test_clickhouse.py
- Test SQL generation and injection prevention
- Test error handling and connection issues
- Integration test with local ClickHouse instance
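The unit tests could avoid a live server by stubbing the client. The sketch below assumes a hypothetical standalone list_services helper and a hypothetical BackendUnavailable error type; the only client behavior it relies on is that query() returns an object exposing result_rows, as clickhouse-connect does.

```python
from unittest.mock import MagicMock


class BackendUnavailable(Exception):
    """Hypothetical error type for connection or query failures."""


def list_services(client, database: str, table: str) -> list:
    """Return distinct service names; identifiers are assumed to be validated already."""
    sql = f"SELECT DISTINCT ServiceName FROM `{database}`.`{table}` ORDER BY ServiceName"
    try:
        rows = client.query(sql).result_rows
    except Exception as exc:  # a real backend would catch the client's own error types
        raise BackendUnavailable(str(exc)) from exc
    return [row[0] for row in rows]


def test_list_services_returns_names():
    client = MagicMock()
    client.query.return_value.result_rows = [("api",), ("worker",)]
    assert list_services(client, "otel", "otel_traces") == ["api", "worker"]


def test_list_services_wraps_connection_errors():
    client = MagicMock()
    client.query.side_effect = ConnectionError("connection refused")
    try:
        list_services(client, "otel", "otel_traces")
    except BackendUnavailable as exc:
        assert "connection refused" in str(exc)
    else:
        raise AssertionError("expected BackendUnavailable")
```

Both functions follow pytest naming, so they would be collected automatically if placed in tests/backends/test_clickhouse.py.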
Example Configuration
Local ClickHouse
BACKEND_TYPE=clickhouse
BACKEND_URL=http://localhost:8123
CLICKHOUSE_DATABASE=otel
CLICKHOUSE_TABLE=otel_traces
ClickHouse Cloud
BACKEND_TYPE=clickhouse
BACKEND_URL=https://abc123.us-east-1.aws.clickhouse.cloud:8443
CLICKHOUSE_DATABASE=default
CLICKHOUSE_TABLE=otel_traces
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=your_password_here
Dependencies
Add to pyproject.toml:
clickhouse-connect = "^0.7.0"
References
- backends/jaeger.py, backends/tempo.py, backends/traceloop.py
Acceptance Criteria