
n8n: METRIC_MAP contains three entries that don't match n8n's actual Prometheus output #23633

@kerr-bighealth

Description


Summary

Three entries in n8n/datadog_checks/n8n/metrics.py reference Prometheus metric names that n8n does not emit. As a result, the corresponding Datadog metrics never receive data, and one important histogram (workflow execution duration) is silently dropped.

Affected lines

metrics.py#L74-L76:

'workflow_executions_active': 'workflow.executions.active',
'workflow_executions_duration_seconds': 'workflow.executions.duration.seconds',
'workflow_executions': 'workflow.executions',

What n8n actually emits

Verified against n8n v2.19.4 (/metrics endpoint, with N8N_METRICS=true):

# HELP n8n_workflow_execution_duration_seconds Workflow execution duration in seconds.
# TYPE n8n_workflow_execution_duration_seconds histogram
n8n_workflow_execution_duration_seconds_bucket{...}
n8n_workflow_execution_duration_seconds_sum{...}
n8n_workflow_execution_duration_seconds_count{...}

# TYPE n8n_workflow_started_total counter
n8n_workflow_started_total{...}

# TYPE n8n_workflow_failed_total counter
n8n_workflow_failed_total{...}

# TYPE n8n_workflow_success_total counter
n8n_workflow_success_total{...}

# TYPE n8n_active_workflow_count gauge
n8n_active_workflow_count 40

No n8n_workflow_executions_* (plural) metric is emitted.

Diagnosis

  • L75 workflow_executions_duration_seconds — misspelling. n8n emits the singular workflow_execution_duration_seconds, consistent with Prometheus convention (singular subject + plural unit, e.g., http_request_duration_seconds). This is the bug with the highest user impact: workflow_execution_duration_seconds is the canonical workflow latency histogram.
  • L74 workflow_executions_active — n8n does not emit this. The closest concept (active workflows) is already mapped at L9 as active_workflow_count → active.workflow.count.
  • L76 workflow_executions — n8n does not emit a single counter for executions; it splits this into workflow_started_total / workflow_failed_total / workflow_success_total, all already correctly mapped at L77–L79.

Impact

With the n8n integration enabled, the agent's debug log shows:

DEBUG | (transform.py:79) | Skipping metric `n8n_workflow_execution_duration_seconds` as it is not defined in `metrics`

The execution duration histogram is silently dropped. Workflow latency dashboards/alerts can't be built from this integration without a workaround.
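Until the map is fixed, the histogram can be recovered per host via the instance config. A sketch, assuming the check is built on Datadog's OpenMetrics V2 base (which supports extra_metrics in the instance); the endpoint is illustrative, and whether the n8n_ prefix must appear in the raw name depends on how the check preprocesses metric names:

```
# conf.d/n8n.d/conf.yaml — workaround sketch, not a verified config
instances:
  - openmetrics_endpoint: http://<n8n-host>:5678/metrics
    extra_metrics:
      # map the raw metric the check currently skips to a Datadog name
      - workflow_execution_duration_seconds: workflow.execution.duration.seconds
```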

Proposed fix

  1. L75: rename key to workflow_execution_duration_seconds. The value should also be singular (workflow.execution.duration.seconds) to be consistent with the singular workflow_started/failed/success entries at L77–L79.
  2. L74 and L76: remove. They reference metrics that don't exist.
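Applied to the mapping dict, the fix would look roughly like this (a sketch of the relevant slice only; the started/failed/success lines stand in for the existing, already-correct entries at L77–L79):

```python
# n8n/datadog_checks/n8n/metrics.py — proposed slice (sketch)
METRIC_MAP = {
    # ...
    # L74 'workflow_executions_active' removed: n8n never emits it;
    # active workflows are already covered by 'active_workflow_count' (L9).
    # L75: singular on both sides, matching what n8n actually emits.
    'workflow_execution_duration_seconds': 'workflow.execution.duration.seconds',
    # L76 'workflow_executions' removed: n8n splits executions into the
    # started/failed/success counters already mapped below.
    'workflow_started': 'workflow.started',
    'workflow_failed': 'workflow.failed',
    'workflow_success': 'workflow.success',
    # ...
}
```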

Reproduce

# n8n container with N8N_METRICS=true
curl -s http://<n8n-host>:5678/metrics | grep -E '^n8n_(workflow|active)'

# Then on the DD agent sidecar
agent check n8n --log-level debug 2>&1 | grep workflow_execution
