diff --git a/docs/observability/index.md b/docs/observability/index.md index 963e5c953..aba9bde6d 100644 --- a/docs/observability/index.md +++ b/docs/observability/index.md @@ -8,8 +8,8 @@ in-process behavior. Basic input and output monitoring is typically insufficient for agents with any significant level of complexity. Agent Development Kit (ADK) provides built-in observability through -[logging](/observability/logging/) and [traces](/observability/traces/) to help -you monitor and debug your agents. However, you may need to consider more +[logging](/observability/logging/), [metrics](/observability/metrics/), and +[traces](/observability/traces/) to help you monitor and debug your agents. However, you may need to consider more advanced [observability ADK Integrations](/integrations/?topic=observability) for monitoring and analysis. diff --git a/docs/observability/metrics.md b/docs/observability/metrics.md new file mode 100644 index 000000000..c424d8374 --- /dev/null +++ b/docs/observability/metrics.md @@ -0,0 +1,93 @@ +# Agent activity metrics + +
+ Supported in ADK Python v1.32.0
+
+Agent Development Kit (ADK) provides built-in, vendor-neutral metrics collection to help you understand the performance, cost, and usage patterns of your agents. While logs provide a detailed narrative of *what* happened, metrics give you aggregated, quantitative data to answer *how often* and *how fast* things are happening.
+
+## Metrics philosophy
+
+ADK's approach to metrics is designed to be lightweight, standardized, and entirely agnostic to your choice of monitoring backend.
+
+* **OpenTelemetry Semantic Conventions:** ADK implements the OpenTelemetry (OTel) [Semantic Conventions for GenAI](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/gen-ai/gen-ai-metrics.md). This ensures that metrics are recorded under standard, predictable attribute and metric names.
+* **OTLP Wire Format:** ADK emits data using the standard OTLP format, so your metrics can be sent to any OTel-compatible backend (e.g., Prometheus, Datadog, SigNoz, Google Cloud Monitoring).
+* **Cost and Performance Focused:** Metrics are typically cheaper to store and faster to query than logs or traces when you analyze large volumes of data. ADK tracks the most critical signals for LLM applications: token consumption, request latency, and tool execution reliability.
+* **Vendor-Neutral Export:** ADK does not lock you into a specific metrics pipeline. You instantiate standard OTel meter providers and export data wherever your infrastructure demands.
+
+---
+
+## Metrics schema
+
+When metrics are enabled, ADK automatically instruments the agent's lifecycle, workflow steps, and tool executions based on the OpenTelemetry GenAI Semantic Conventions. The following core metrics are emitted:
+
+| Metric Name | Type | Description | Key Attributes (Dimensions) |
+| :--- | :--- | :--- | :--- |
+| **`gen_ai.agent.invocation.duration`** | Histogram | The total time taken for an agent to process a prompt and return a response. | `gen_ai.agent.name`, `error.type` |
+| **`gen_ai.tool.execution.duration`** | Histogram | The execution latency of individual tools called by the agent. Useful for spotting slow external APIs. | `gen_ai.tool.name`, `error.type` |
+| **`gen_ai.agent.request.size`** | Histogram | The size or complexity of the incoming request sent to the agent. | `gen_ai.agent.name` |
+| **`gen_ai.agent.response.size`** | Histogram | The size or complexity of the final response generated by the agent. | `gen_ai.agent.name` |
+| **`gen_ai.agent.workflow.steps`** | Histogram | Tracks the number of iterative steps or reasoning loops an agent takes to complete a workflow. | `gen_ai.agent.name` |
+
+---
+
+## Metrics export setup
+
+### Metrics export in ADK Web
+
+If you are running your agent using the `adk web` or `adk api_server` CLI commands, you can configure metrics export with environment variables or a CLI flag.
+
+
+#### OTLP export
+
+To export metrics to an OTLP-compatible backend, set the standard OTel environment variables:
+
+```bash
+export OTEL_EXPORTER_OTLP_METRICS_ENDPOINT="http://your-collector:4318/v1/metrics"
+adk web path/to/your/agents_dir
+```
+
+> **Note:** You can also set the general `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable if you would like to send traces and logs to the same endpoint in addition to metrics.
+
+#### GCP export
+
+To enable metrics export to Google Cloud Monitoring, use the `--otel_to_cloud` flag:
+
+```bash
+adk web --otel_to_cloud path/to/your/agents_dir
+```
+
+### Programmatic metrics export
+
+You can also configure metrics export programmatically in your application code.
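Both programmatic setups below pass resource attributes through `OTEL_RESOURCE_ATTRIBUTES` as a comma-separated list of `key=value` pairs. If you keep those attributes in a dictionary, a small helper can build the string for you. The following is a minimal sketch, not part of ADK: `format_resource_attributes` and the sample attribute names are hypothetical, and the percent-encoding step assumes your OTel SDK decodes values as the OpenTelemetry specification describes, so verify that against the SDK you use.

```python
import os
from urllib.parse import quote


def format_resource_attributes(attributes: dict[str, str]) -> str:
    """Build a comma-separated key=value string for OTEL_RESOURCE_ATTRIBUTES.

    Hypothetical helper for illustration; it is not an ADK or OTel API.
    """
    pairs = []
    for key, value in attributes.items():
        # Percent-encode both sides so a "," or "=" inside a value
        # cannot be mistaken for a separator.
        pairs.append(f"{quote(str(key), safe='')}={quote(str(value), safe='')}")
    return ",".join(pairs)


# Sample attributes (assumed names, chosen only for the example).
resource_attributes = format_resource_attributes(
    {"deployment.environment": "staging", "team": "search,ranking"}
)

# Set the variable before the telemetry providers are initialized.
os.environ["OTEL_RESOURCE_ATTRIBUTES"] = resource_attributes
```

Set the variable before calling `maybe_set_otel_providers()`, as in the examples that follow, so the attributes are in place when the providers are created.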
+
+#### OTLP export setup
+
+To enable metrics and export them to an OpenTelemetry Collector (or an OTLP-compatible backend) programmatically:
+
+```python
+from google.adk.telemetry.setup import maybe_set_otel_providers
+import os
+
+os.environ["OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"] = "http://your-collector:4318/v1/metrics"
+os.environ["OTEL_SERVICE_NAME"] = "your-adk-agent"
+os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "key1=value1,key2=value2"
+maybe_set_otel_providers()
+```
+
+#### GCP export setup
+
+To export metrics to Google Cloud Monitoring programmatically, use the OpenTelemetry Google Cloud exporter. Here is an example in Python:
+
+```python
+from google.adk.telemetry.google_cloud import get_gcp_exporters
+from google.adk.telemetry.setup import maybe_set_otel_providers
+import os
+
+gcp_exporters = get_gcp_exporters(
+    enable_cloud_metrics=True,
+)
+os.environ["OTEL_SERVICE_NAME"] = "your-adk-agent"
+os.environ["OTEL_RESOURCE_ATTRIBUTES"] = "key1=value1,key2=value2"
+maybe_set_otel_providers([gcp_exporters])
+```
diff --git a/mkdocs.yml b/mkdocs.yml
index bbe5d6fe5..97c116dda 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -346,6 +346,7 @@ nav:
     - Observability:
       - observability/index.md
       - Logging: observability/logging.md
+      - Metrics: observability/metrics.md
       - Traces: observability/traces.md
     - Evaluation:
       - evaluate/index.md