chore(metrics): generate internal metric documentation from Rust source#25460

Open
thomasqueirozb wants to merge 46 commits into master from 25374-website-internal-metrics-generated

@thomasqueirozb
Member

@thomasqueirozb thomasqueirozb commented May 19, 2026

Summary

Implements #25374: internal metric documentation is now fully generated from Rust source rather than manually maintained in CUE. website/cue/reference/components/sources/internal_metrics.cue is now metadata-only and all metric output definitions come from the generated file.

  • CounterName, HistogramName, GaugeName in lib/vector-common/src/internal_event/metric_name.rs contain all information needed to generate internal metrics.
  • src/generate_schema.rs injects _metric_schemas: {counters, histograms, gauges} into the schema JSON produced by vector generate-schema. vdev build component-docs creates website/cue/reference/generated/internal_metric_descriptions.cue from the schema.
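
As a rough illustration of the data flow described above, the sketch below flattens a `_metric_schemas` object with `counters`, `histograms`, and `gauges` buckets into a single name-to-definition mapping, which is roughly the shape a docs generator needs before emitting CUE. The per-metric `description` field, the `flatten_metric_schemas` helper, and the sample entries are assumptions made for illustration; the real schema layout and the vdev implementation are not shown in this description.

```python
# Hypothetical sketch: the entry layout under "_metric_schemas" is assumed,
# not taken from the actual output of `vector generate-schema`.

# Map each bucket name in the schema to the metric type shown in the docs.
BUCKET_TO_TYPE = {
    "counters": "counter",
    "histograms": "histogram",
    "gauges": "gauge",
}


def flatten_metric_schemas(schema):
    """Return {metric_name: {"type": ..., "description": ...}}, sorted by name."""
    buckets = schema.get("_metric_schemas", {})
    metrics = {}
    for bucket, metric_type in BUCKET_TO_TYPE.items():
        for name, entry in buckets.get(bucket, {}).items():
            # A metric name should belong to exactly one bucket.
            if name in metrics:
                raise ValueError(f"metric {name!r} defined in more than one bucket")
            metrics[name] = {
                "type": metric_type,
                "description": entry.get("description", ""),
            }
    return dict(sorted(metrics.items()))


# Illustrative input only: metric names other than open_files are made up,
# and all descriptions are placeholders.
example_schema = {
    "_metric_schemas": {
        "counters": {"example_events_total": {"description": "Placeholder counter."}},
        "histograms": {"example_duration_seconds": {"description": "Placeholder histogram."}},
        "gauges": {"open_files": {"description": "Placeholder gauge."}},
    }
}

for name, definition in flatten_metric_schemas(example_schema).items():
    print(f"{name}: {definition['type']}")
```

Deriving the type from the bucket rather than from a hand-written field is what rules out the kind of mismatch fixed in this PR for open_files.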

Vector configuration

NA

How did you test this PR?

make generate-component-docs passes. cargo check --workspace --no-default-features passes. make check-clippy passes.

The following script compares docs.json between master and this branch to verify metric fidelity (of the 124 shared metrics, 121 are fully equal after normalization and 3 differ intentionally, as noted below):

verify_metrics.py
#!/usr/bin/env python3
"""
Verifies that every metric present in BOTH master and branch is identical
field by field after normalization. Metrics present in only one file are
expected and reported separately.
"""
import json, sys

with open("/private/tmp/docs.json.master") as f:
    master = json.load(f)
with open("/private/tmp/docs.json.branch") as f:
    branch = json.load(f)

def get_metrics(data):
    return data["components"]["sources"]["internal_metrics"]["output"]["metrics"]

def normalize(v):
    """Normalize strings: collapse whitespace, strip trailing periods, normalize quotes."""
    if isinstance(v, str):
        v = " ".join(v.split()).rstrip(".")
        # Normalize typographic quotes to ASCII (master went through Hugo, branch is raw Rust)
        v = v.replace("’", "'").replace("‘", "'")
        v = v.replace("“", '"').replace("”", '"')
        return v
    return v

def flat_diff(a, b, path=""):
    """Yield (path, master_val, branch_val) for every leaf that differs."""
    if isinstance(a, dict) and isinstance(b, dict):
        for k in sorted(set(a) | set(b)):
            yield from flat_diff(a.get(k), b.get(k), f"{path}.{k}" if path else k)
    elif normalize(a) != normalize(b):
        yield (path, a, b)

master_metrics = get_metrics(master)
branch_metrics = get_metrics(branch)

common = set(master_metrics) & set(branch_metrics)
only_master = set(master_metrics) - set(branch_metrics)
only_branch = set(branch_metrics) - set(master_metrics)

mismatches = []
for name in sorted(common):
    m = master_metrics[name]
    b = branch_metrics[name]
    diffs = list(flat_diff(m, b))
    if diffs:
        mismatches.append((name, diffs))

print(f"Metrics in both:        {len(common)}")
print(f"Only in master:         {len(only_master)}")
print(f"Only in branch (new):   {len(only_branch)}")
print()

if mismatches:
    print(f"{len(mismatches)} metric(s) differ:\n")
    for name, diffs in mismatches:
        print(f"  {name}:")
        for path, mv, bv in diffs:
            print(f"    {path}")
            print(f"      master: {mv!r}")
            print(f"      branch: {bv!r}")
else:
    print("PASS — all shared metrics are fully equal.")

print()
print("── Removed (only in master) ─────────────────────────────────────────────────")
for name in sorted(only_master):
    print(f"  - {name}")

print()
print("── Added (only in branch) ───────────────────────────────────────────────────")
for name in sorted(only_branch):
    print(f"  + {name}")

print()
print("── Changed (shared but differ) ──────────────────────────────────────────────")
if mismatches:
    for name, diffs in mismatches:
        fields = ", ".join(p for p, _, _ in diffs)
        print(f"  ~ {name}  [{fields}]")
else:
    print("  (none)")

if mismatches:
    sys.exit(1)
Output
Metrics in both:        124
Only in master:         10
Only in branch (new):   37

3 metric(s) differ:

  collect_duration_seconds:
    description
      master: 'The duration spent collecting of metrics for this component.'
      branch: 'The duration spent collecting metrics for this component.'
  http_server_handler_duration_seconds:
    description
      master: 'The duration spent handling a HTTP request.'
      branch: 'The duration spent handling an HTTP request.'
  open_files:
    type
      master: 'counter'
      branch: 'gauge'

── Removed (only in master) ─────────────────────────────────────────────────
  - buffer_received_event_bytes_total
  - buffer_sent_event_bytes_total
  - config_reloaded
  - http_requests_total
  - invalid_record_total
  - protobuf_decode_errors_total
  - send_errors_total
  - stdin_reads_failed_total
  - streams_total
  - timestamp_parse_errors_total

── Added (only in branch) ───────────────────────────────────────────────────
  + active_endpoints
  + adaptive_concurrency_back_pressure
  + adaptive_concurrency_past_rtt_mean
  + adaptive_concurrency_reached_limit
  + buffer_discarded_bytes_total
  + buffer_errors_total
  + buffer_max_byte_size
  + buffer_max_event_size
  + buffer_max_size_bytes
  + buffer_max_size_events
  + buffer_received_bytes_total
  + buffer_sent_bytes_total
  + component_allocated_bytes
  + component_allocated_bytes_total
  + component_deallocated_bytes_total
  + decoder_bom_removals_total
  + decoder_malformed_replacement_warnings_total
  + encoder_unmappable_replacement_warnings_total
  + http_client_error_rtt_seconds
  + http_client_errors_total
  + k8s_event_namespace_annotation_failures_total
  + k8s_event_node_annotation_failures_total
  + memory_enrichment_table_byte_size
  + memory_enrichment_table_failed_insertions
  + memory_enrichment_table_failed_reads
  + memory_enrichment_table_flushes_total
  + memory_enrichment_table_insertions_total
  + memory_enrichment_table_objects_count
  + memory_enrichment_table_reads_total
  + memory_enrichment_table_ttl_expirations
  + parse_errors_total
  + rewritten_timestamp_events_total
  + sqs_message_defer_succeeded_total
  + tag_cardinality_tracked_keys
  + tag_cardinality_untracked_events_total
  + websocket_bytes_sent_total
  + websocket_messages_sent_total

── Changed (shared but differ) ──────────────────────────────────────────────
  ~ collect_duration_seconds  [description]
  ~ http_server_handler_duration_seconds  [description]
  ~ open_files  [type]

The 3 differences are intentional: two grammar fixes (collecting of → collecting, a HTTP → an HTTP) and one type correction (open_files: counter → gauge).
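
To make the comparison rule concrete, here is a small self-contained sketch that reuses the `normalize` and `flat_diff` logic from verify_metrics.py on two made-up entries. The entries and their descriptions are hypothetical; only the counter/gauge type difference mirrors the open_files case above. Cosmetic differences (extra whitespace, a trailing period, a typographic apostrophe) are treated as equal, while a real change is still reported.

```python
def normalize(v):
    """Collapse whitespace, drop trailing periods, and map curly quotes to ASCII."""
    if isinstance(v, str):
        v = " ".join(v.split()).rstrip(".")
        v = v.replace("\u2019", "'").replace("\u2018", "'")
        v = v.replace("\u201c", '"').replace("\u201d", '"')
    return v


def flat_diff(a, b, path=""):
    """Yield (path, master_val, branch_val) for every leaf that differs."""
    if isinstance(a, dict) and isinstance(b, dict):
        for k in sorted(set(a) | set(b)):
            yield from flat_diff(a.get(k), b.get(k), f"{path}.{k}" if path else k)
    elif normalize(a) != normalize(b):
        yield (path, a, b)


# Hypothetical entries: the descriptions differ only cosmetically,
# the types differ for real.
master_entry = {
    "description": "Hypothetical metric\u2019s   description.",
    "type": "counter",
}
branch_entry = {
    "description": "Hypothetical metric's description",
    "type": "gauge",
}

diffs = list(flat_diff(master_entry, branch_entry))
print(diffs)  # [('type', 'counter', 'gauge')]
```

Note that only strings are normalized and only dicts are recursed into, so any other value (a list, for example) is compared as a whole.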

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

…rable macro to support deprecated = "..." on enum variants
…ags in Rust, eliminate enum metric blocks from internal_metrics.cue
@github-actions github-actions Bot added the labels "docs review on hold" (The documentation team reviews PRs only after a PR is approved by the COSE team.), "domain: external docs" (Anything related to Vector's external, public documentation), and "domain: vdev" (Anything related to the vdev tooling), and removed the label "docs review on hold", May 19, 2026
@github-actions
Contributor

Your preview site for vector.dev will be ready in a few minutes; please allow time for it to build.

Here's your preview link:
vector.dev preview

@github-actions
Contributor

Your preview site for the VRL Playground will be ready in a few minutes; please allow time for it to build.

Here's your preview link:
VRL Playground preview

@github-actions
Contributor

Your preview site for the Rust Doc will be ready in a few minutes; please allow time for it to build.

Here's your preview link:
Rust Doc preview

@thomasqueirozb thomasqueirozb added the no-changelog Changes in this PR do not need user-facing explanations in the release changelog label May 19, 2026
@thomasqueirozb
Member Author

@vectordotdev/documentation

These are the additions which should be reviewed:

  • active_endpoints
  • adaptive_concurrency_back_pressure
  • adaptive_concurrency_past_rtt_mean
  • adaptive_concurrency_reached_limit
  • buffer_discarded_bytes_total
  • buffer_errors_total
  • buffer_max_byte_size
  • buffer_max_event_size
  • buffer_max_size_bytes
  • buffer_max_size_events
  • buffer_received_bytes_total
  • buffer_sent_bytes_total
  • component_allocated_bytes
  • component_allocated_bytes_total
  • component_deallocated_bytes_total
  • decoder_bom_removals_total
  • decoder_malformed_replacement_warnings_total
  • encoder_unmappable_replacement_warnings_total
  • http_client_error_rtt_seconds
  • http_client_errors_total
  • k8s_event_namespace_annotation_failures_total
  • k8s_event_node_annotation_failures_total
  • memory_enrichment_table_byte_size
  • memory_enrichment_table_failed_insertions
  • memory_enrichment_table_failed_reads
  • memory_enrichment_table_flushes_total
  • memory_enrichment_table_insertions_total
  • memory_enrichment_table_objects_count
  • memory_enrichment_table_reads_total
  • memory_enrichment_table_ttl_expirations
  • parse_errors_total
  • rewritten_timestamp_events_total
  • sqs_message_defer_succeeded_total
  • tag_cardinality_tracked_keys
  • tag_cardinality_untracked_events_total
  • websocket_bytes_sent_total
  • websocket_messages_sent_total

All other metrics are pre-existing.

@thomasqueirozb thomasqueirozb marked this pull request as ready for review May 19, 2026 15:12
@thomasqueirozb thomasqueirozb requested review from a team as code owners May 19, 2026 15:12
Labels

domain: external docs (Anything related to Vector's external, public documentation)
domain: vdev (Anything related to the vdev tooling)
no-changelog (Changes in this PR do not need user-facing explanations in the release changelog)

Development

Successfully merging this pull request may close these issues.

Generate internal metric documentation from source code

2 participants