Observing a Workflow with OpenTelemetry Collector

This guide shows how to stream OpenTelemetry (OTel) traces from your NeMo Agent Toolkit workflows to the generic OTel Collector, which can then export those traces to many destinations, including file stores (such as S3), Datadog, and Dynatrace.

In this guide, you will learn how to:

  • Deploy the generic OTel Collector with a configuration that saves traces to the local file system. You can modify the configuration to export to other systems.
  • Configure your workflow (YAML) or Python script to send traces to the OTel Collector.
  • Run the workflow and view traces in the local file.

Configure and deploy the OTel Collector

  1. Configure the OTel Collector using an otlp receiver and the exporter of your choice. For this example, create a file named otelcollectorconfig.yaml:

    receivers:
      otlp:
        protocols:
          http:
            endpoint: 0.0.0.0:4318
    
    processors:
      batch:
        send_batch_size: 100
        timeout: 10s
    
    exporters:
      file:
        path: /otellogs/llm_spans.json
        format: json
    
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [file]
  2. Install and run your configured OTel Collector, noting the endpoint URL (for example, http://localhost:4318). For this example, run the OTel Collector using Docker and the configuration file from step 1:

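    # Create a host directory that the collector container can write to
    mkdir -p otellogs
    chmod 777 otellogs
    # Quote the paths so a working directory containing spaces still mounts correctly
    docker run -v "$(pwd)/otelcollectorconfig.yaml":/etc/otelcol-contrib/config.yaml \
      -p 4318:4318 \
      -v "$(pwd)/otellogs":/otellogs/ \
      otel/opentelemetry-collector-contrib:0.128.0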
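Before wiring up a workflow, it can help to confirm that the collector accepts traces at all. The following is a minimal sketch, not part of the toolkit, that posts a single hand-built span to the collector using the JSON encoding of OTLP/HTTP. The function names, the service name `collector-smoke-test`, and the span name are illustrative assumptions; the base URL assumes the Docker command above with port 4318 published on localhost.

```python
import json
import secrets
import time
import urllib.request


def build_test_trace(service_name: str = "collector-smoke-test") -> dict:
    """Build a one-span OTLP/JSON payload. All names here are illustrative."""
    now_ns = time.time_ns()
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [{
                    "key": "service.name",
                    "value": {"stringValue": service_name},
                }]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-smoke-test"},
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 32 hex characters
                    "spanId": secrets.token_hex(8),    # 16 hex characters
                    "name": "smoke-test-span",
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    "startTimeUnixNano": str(now_ns - 1_000_000),
                    "endTimeUnixNano": str(now_ns),
                }],
            }],
        }]
    }


def send_test_trace(base_url: str = "http://localhost:4318") -> int:
    """POST the payload to the collector's OTLP/HTTP traces path.

    Assumes the collector from the previous step is reachable at base_url.
    Returns the HTTP status code of the response.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/traces",
        data=json.dumps(build_test_trace()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Calling `send_test_trace()` while the container is running should return a success status if the receiver is listening. Because the example configuration batches spans with a 10s timeout, the test span may take several seconds to appear in otellogs/llm_spans.json.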

Install the OpenTelemetry Subpackage

Install the OpenTelemetry package extras with one of the following commands, depending on whether you installed the NeMo Agent Toolkit from source or from a package.

::::{tab-set}
:sync-group: install-tool

:::{tab-item} source
:selected:
:sync: source

uv pip install -e ".[opentelemetry]"

:::

:::{tab-item} package
:sync: package

uv pip install "nvidia-nat[opentelemetry]"

:::

::::

Modify Workflow Configuration

Update your workflow configuration file to include the telemetry settings.

Example configuration:

general:
  telemetry:
    tracing:
      otelcollector:
        _type: otelcollector
        # The endpoint where you have deployed the OTel Collector.
        # Use an address the workflow can reach; 0.0.0.0 is the collector's bind address.
        endpoint: http://localhost:4318/v1/traces
        project: your_project_name
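The `endpoint` value is the collector's base URL followed by the OTLP/HTTP traces path, `/v1/traces`. As a small sketch of that relationship, the hypothetical helper below (not part of the toolkit) derives the full traces URL from a base URL and rejects values that are not http(s) URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def otlp_traces_endpoint(collector_url: str) -> str:
    """Return the traces URL for a collector base URL such as http://localhost:4318.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(collector_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an http(s) URL, got {collector_url!r}")
    # Drop any trailing slash, then append the traces path unless it is already there.
    path = parts.path.rstrip("/")
    if not path.endswith("/v1/traces"):
        path += "/v1/traces"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))
```

For example, passing `http://localhost:4318` yields `http://localhost:4318/v1/traces`, which is the form the workflow configuration expects.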

Run the workflow

nat run --config_file <path/to/your/config/file.yml> --input "your notional input"

As the workflow runs, spans are sent to the OTel Collector, which in turn exports them using the exporter you configured. In this example, you can view the exported traces in the local file:

cat otellogs/llm_spans.json
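The raw file can be hard to read once a workflow has produced many spans. The sketch below summarizes it by span name, assuming the file exporter writes one OTLP JSON object per line, each with a top-level `resourceSpans` list. The function names are illustrative, and the default path matches the example configuration.

```python
import json
from collections import Counter
from pathlib import Path


def iter_spans(lines):
    """Yield (service_name, span) pairs from file-exporter JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        batch = json.loads(line)
        for resource_spans in batch.get("resourceSpans", []):
            attributes = resource_spans.get("resource", {}).get("attributes", [])
            # service.name is a resource attribute; it may be absent.
            service = next(
                (a.get("value", {}).get("stringValue")
                 for a in attributes if a.get("key") == "service.name"),
                None,
            )
            for scope_spans in resource_spans.get("scopeSpans", []):
                for span in scope_spans.get("spans", []):
                    yield service, span


def summarize(path="otellogs/llm_spans.json"):
    """Count exported spans by name."""
    with Path(path).open(encoding="utf-8") as handle:
        return Counter(span.get("name") for _, span in iter_spans(handle))
```

Running `summarize()` from the directory that contains otellogs returns a `Counter` keyed by span name, which gives a quick view of which workflow steps produced traces.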