chore(metrics): generate internal metric documentation from Rust source #25446

Closed
thomasqueirozb wants to merge 46 commits into master from thomasqueirozb/25374-metric-name-docs

thomasqueirozb (Member) commented May 15, 2026

Summary

Implements #25374: internal metric documentation is now fully generated from Rust source rather than manually maintained in CUE. website/cue/reference/components/sources/internal_metrics.cue is now metadata-only and all metric output definitions come from the generated file.

  • CounterName, HistogramName, GaugeName in lib/vector-common/src/internal_event/metric_name.rs now carry /// doc comments (descriptions) and #[configurable(metadata(docs::tags = ...))] annotations encoding the full tag set for each metric — including deprecation via #[configurable(deprecated = "...")] on enum variants.

  • src/generate_schema.rs injects _metric_schemas: {counters, histograms, gauges} into the schema JSON produced by vector generate-schema.

  • vdev build component-docs reads _metric_schemas and generates website/cue/reference/generated/internal_metric_descriptions.cue with description, type, default_namespace, and tags for every metric.

  • All tag-set definitions live in lib/vector-common/src/internal_event/metric_tags.rs as composable pub static LazyLock<Value> values built with a merge_lazy() helper. Single-use tag shapes are inlined at the variant with merge_lazy(&BASE, json!({...})).

  • The #[configurable] macro now accepts path constants (SOME_CONST), macro invocations (json!(...)), reference expressions (&*LAZY), and function calls (merge_lazy(...)) as metadata values.

  • #[configurable(deprecated = "...")] is extended to work on enum variants (previously fields only).
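The PR description does not show merge_lazy() itself, so the following Python sketch only illustrates the general idea of composing a tag set from a shared base plus variant-specific additions. The names merge_tags, BASE_TAGS, and FILE_TAGS, the tag keys, and the recursive-overlay semantics are all assumptions made for illustration; they are not the Rust helper's confirmed behaviour.

```python
import copy


def merge_tags(base, extra):
    """Return a new dict: `base` recursively overlaid with `extra`.

    Hypothetical stand-in for the idea behind merge_lazy(); the real
    Rust helper may merge differently.
    """
    merged = copy.deepcopy(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are objects: merge them key by key.
            merged[key] = merge_tags(merged[key], value)
        else:
            # Otherwise the value from `extra` wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical shared tag shape, defined once and reused.
BASE_TAGS = {
    "component_id": {"required": True},
    "host": {"required": False},
}

# Hypothetical single-use shape, built inline from the base.
FILE_TAGS = merge_tags(BASE_TAGS, {"file": {"required": False}})

print(sorted(FILE_TAGS))
```

Because the merge returns a fresh structure, the shared base is left untouched and can be reused for other metrics.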

Vector configuration

NA

How did you test this PR?

make generate-component-docs passes. cargo check --workspace --no-default-features passes. make check-clippy passes.

The following script compares docs.json between master and this branch to verify metric fidelity (of the 124 shared metrics, 121 are fully equal after normalization and 3 differ intentionally, as noted below):

verify_metrics.py
#!/usr/bin/env python3
"""
Verifies that every metric present in BOTH master and branch has identical
fields (description, type, tags, and so on) after string normalization.
Metrics present in only one file are expected and reported separately.
"""
import json, sys

with open("/private/tmp/docs.json.master") as f:
    master = json.load(f)
with open("/private/tmp/docs.json.branch") as f:
    branch = json.load(f)

def get_metrics(data):
    return data["components"]["sources"]["internal_metrics"]["output"]["metrics"]

def normalize(v):
    """Normalize strings: collapse whitespace, strip trailing periods, normalize quotes."""
    if isinstance(v, str):
        v = " ".join(v.split()).rstrip(".")
        # Normalize typographic quotes to ASCII (master went through Hugo, branch is raw Rust)
        v = v.replace("’", "'").replace("‘", "'")
        v = v.replace("“", '"').replace("”", '"')
    return v

def flat_diff(a, b, path=""):
    """Yield (path, master_val, branch_val) for every leaf that differs."""
    if isinstance(a, dict) and isinstance(b, dict):
        for k in sorted(set(a) | set(b)):
            yield from flat_diff(a.get(k), b.get(k), f"{path}.{k}" if path else k)
    elif normalize(a) != normalize(b):
        yield (path, a, b)

master_metrics = get_metrics(master)
branch_metrics = get_metrics(branch)

common = set(master_metrics) & set(branch_metrics)
only_master = set(master_metrics) - set(branch_metrics)
only_branch = set(branch_metrics) - set(master_metrics)

mismatches = []
for name in sorted(common):
    m = master_metrics[name]
    b = branch_metrics[name]
    diffs = list(flat_diff(m, b))
    if diffs:
        mismatches.append((name, diffs))

print(f"Metrics in both:        {len(common)}")
print(f"Only in master:         {len(only_master)}")
print(f"Only in branch (new):   {len(only_branch)}")
print()

if mismatches:
    print(f"{len(mismatches)} metric(s) differ:\n")
    for name, diffs in mismatches:
        print(f"  {name}:")
        for path, mv, bv in diffs:
            print(f"    {path}")
            print(f"      master: {mv!r}")
            print(f"      branch: {bv!r}")
else:
    print("PASS — all shared metrics are fully equal.")

print()
print("── Removed (only in master) ─────────────────────────────────────────────────")
for name in sorted(only_master):
    print(f"  - {name}")

print()
print("── Added (only in branch) ───────────────────────────────────────────────────")
for name in sorted(only_branch):
    print(f"  + {name}")

print()
print("── Changed (shared but differ) ──────────────────────────────────────────────")
if mismatches:
    for name, diffs in mismatches:
        fields = ", ".join(p for p, _, _ in diffs)
        print(f"  ~ {name}  [{fields}]")
else:
    print("  (none)")

if mismatches:
    sys.exit(1)
Output
Metrics in both:        124
Only in master:         10
Only in branch (new):   37

3 metric(s) differ:

  collect_duration_seconds:
    description
      master: 'The duration spent collecting of metrics for this component.'
      branch: 'The duration spent collecting metrics for this component.'
  http_server_handler_duration_seconds:
    description
      master: 'The duration spent handling a HTTP request.'
      branch: 'The duration spent handling an HTTP request.'
  open_files:
    type
      master: 'counter'
      branch: 'gauge'

── Removed (only in master) ─────────────────────────────────────────────────
  - buffer_received_event_bytes_total
  - buffer_sent_event_bytes_total
  - config_reloaded
  - http_requests_total
  - invalid_record_total
  - protobuf_decode_errors_total
  - send_errors_total
  - stdin_reads_failed_total
  - streams_total
  - timestamp_parse_errors_total

── Added (only in branch) ───────────────────────────────────────────────────
  + active_endpoints
  + adaptive_concurrency_back_pressure
  + adaptive_concurrency_past_rtt_mean
  + adaptive_concurrency_reached_limit
  + buffer_discarded_bytes_total
  + buffer_errors_total
  + buffer_max_byte_size
  + buffer_max_event_size
  + buffer_max_size_bytes
  + buffer_max_size_events
  + buffer_received_bytes_total
  + buffer_sent_bytes_total
  + component_allocated_bytes
  + component_allocated_bytes_total
  + component_deallocated_bytes_total
  + decoder_bom_removals_total
  + decoder_malformed_replacement_warnings_total
  + encoder_unmappable_replacement_warnings_total
  + http_client_error_rtt_seconds
  + http_client_errors_total
  + k8s_event_namespace_annotation_failures_total
  + k8s_event_node_annotation_failures_total
  + memory_enrichment_table_byte_size
  + memory_enrichment_table_failed_insertions
  + memory_enrichment_table_failed_reads
  + memory_enrichment_table_flushes_total
  + memory_enrichment_table_insertions_total
  + memory_enrichment_table_objects_count
  + memory_enrichment_table_reads_total
  + memory_enrichment_table_ttl_expirations
  + parse_errors_total
  + rewritten_timestamp_events_total
  + sqs_message_defer_succeeded_total
  + tag_cardinality_tracked_keys
  + tag_cardinality_untracked_events_total
  + websocket_bytes_sent_total
  + websocket_messages_sent_total

── Changed (shared but differ) ──────────────────────────────────────────────
  ~ collect_duration_seconds  [description]
  ~ http_server_handler_duration_seconds  [description]
  ~ open_files  [type]

The 3 differences are intentional: two grammar fixes (collecting of → collecting, a HTTP → an HTTP) and one type correction (open_files: counter → gauge).
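To make the normalization step concrete, the sketch below reuses normalize and flat_diff from verify_metrics.py on two hand-written entries. The entries are hypothetical: the description text and the tags field are made up to exercise whitespace, trailing-period, and typographic-quote handling, and only the counter/gauge type mismatch mirrors the output above.

```python
def normalize(v):
    """Collapse whitespace, strip trailing periods, and use ASCII quotes."""
    if isinstance(v, str):
        v = " ".join(v.split()).rstrip(".")
        v = v.replace("’", "'").replace("‘", "'")
        v = v.replace("“", '"').replace("”", '"')
    return v


def flat_diff(a, b, path=""):
    """Yield (path, master_val, branch_val) for every leaf that differs."""
    if isinstance(a, dict) and isinstance(b, dict):
        for k in sorted(set(a) | set(b)):
            yield from flat_diff(a.get(k), b.get(k), f"{path}.{k}" if path else k)
    elif normalize(a) != normalize(b):
        yield (path, a, b)


# Hypothetical entries; only the "type" values are taken from the PR output.
master_entry = {
    "description": "The  number of files currently open.",
    "type": "counter",
    "tags": {"host": {"description": "The host’s name"}},
}
branch_entry = {
    "description": "The number of files currently open",
    "type": "gauge",
    "tags": {"host": {"description": "The host's name"}},
}

diffs = list(flat_diff(master_entry, branch_entry))
print(diffs)  # [('type', 'counter', 'gauge')]
```

The cosmetic differences (double space, trailing period, curly apostrophe) disappear after normalization, so only the genuine type change is reported.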

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

github-actions bot added the domain: external docs and docs review on hold labels May 15, 2026
github-actions bot added the domain: vdev label May 15, 2026
…rable macro to support deprecated = "..." on enum variants
github-actions bot added the domain: topology and domain: core labels May 15, 2026
…ags in Rust, eliminate enum metric blocks from internal_metrics.cue
thomasqueirozb changed the title from "chore(metrics): add rustdoc and Configurable derive to metric name enums" to "chore(metrics): generate internal metric documentation from Rust source" May 18, 2026
thomasqueirozb added the no-changelog label May 18, 2026
thomasqueirozb (Member, Author) commented May 18, 2026

@vectordotdev/documentation

These are the additions which should be reviewed:

  • active_endpoints
  • adaptive_concurrency_back_pressure
  • adaptive_concurrency_past_rtt_mean
  • adaptive_concurrency_reached_limit
  • buffer_discarded_bytes_total
  • buffer_errors_total
  • buffer_max_byte_size
  • buffer_max_event_size
  • buffer_max_size_bytes
  • buffer_max_size_events
  • buffer_received_bytes_total
  • buffer_sent_bytes_total
  • component_allocated_bytes
  • component_allocated_bytes_total
  • component_deallocated_bytes_total
  • decoder_bom_removals_total
  • decoder_malformed_replacement_warnings_total
  • encoder_unmappable_replacement_warnings_total
  • http_client_error_rtt_seconds
  • http_client_errors_total
  • k8s_event_namespace_annotation_failures_total
  • k8s_event_node_annotation_failures_total
  • memory_enrichment_table_byte_size
  • memory_enrichment_table_failed_insertions
  • memory_enrichment_table_failed_reads
  • memory_enrichment_table_flushes_total
  • memory_enrichment_table_insertions_total
  • memory_enrichment_table_objects_count
  • memory_enrichment_table_reads_total
  • memory_enrichment_table_ttl_expirations
  • parse_errors_total
  • rewritten_timestamp_events_total
  • sqs_message_defer_succeeded_total
  • tag_cardinality_tracked_keys
  • tag_cardinality_untracked_events_total
  • websocket_bytes_sent_total
  • websocket_messages_sent_total

All other metrics are pre-existing.

thomasqueirozb marked this pull request as ready for review May 18, 2026 22:52
thomasqueirozb requested review from a team as code owners May 18, 2026 22:52
github-actions bot removed the domain: topology and domain: core labels May 18, 2026
thomasqueirozb (Member, Author) commented

Replaced by #25460.

github-actions bot locked and limited conversation to collaborators May 19, 2026

Labels

  • docs review on hold (The documentation team reviews PRs only after a PR is approved by the COSE team.)
  • domain: external docs (Anything related to Vector's external, public documentation)
  • domain: vdev (Anything related to the vdev tooling)
  • no-changelog (Changes in this PR do not need user-facing explanations in the release changelog)

Development

Successfully merging this pull request may close these issues.

Generate internal metric documentation from source code

1 participant