Skip to content

Commit 78d2573

Browse files
nv-alichengclaude
andcommitted
test(metrics): rewrite skipped suites on registry/snapshot fixtures
The metrics pub/sub refactor (PR #N) module-level-skipped four test files plus their conftest because they hard-coupled to the deleted KVStore API. This commit reinstates them on the new fixtures, in scope with the design doc's "test impact" callout. - conftest.py: rewrites shared fixtures to construct MetricsRegistry and MetricsTable instances directly. Drops events_db (SQLite fixture deleted with the KVStore path). - test_metrics_table.py: 16 tests covering tracking-window lifecycle, trigger dispatch on field updates, tracked-block accounting, and the in-flight async-task drain path. - test_aggregator.py: 31 tests covering MetricsAggregatorService end to end (in-process, MagicMock publisher) — counter accounting, ISSUED/COMPLETE/error event handling, ENDED → publish_final sequence, and the LIVE → DRAINING state transition. Adds a new TestCounterAccounting class to cover the total_* vs tracked_* counter split that the legacy tests conflated. - test_aggregator_e2e.py: 3 tests round-tripping a real MetricsPublisher ↔ MetricsSnapshotSubscriber over IPC, covering COMPLETE-only delivery, LIVE-tick-then-COMPLETE lifecycle, and counter+series wire shape. - test_report_builder.py: 14 tests on Report.from_snapshot, including the complete=False derivation when state != COMPLETE or n_pending_tasks > 0. Net: 64 new tests across the 4 suites; full unit suite up from 660 to 724 passing. The 4 module-level skips are gone. Production-code surfaces flagged for follow-up coverage: - AsyncTokenTrigger exception path in metrics_table.py - SeriesSampler HDR clamp warn-once branch - MetricsAggregatorService._finalize shutdown_event signaling - Report.tps() OSL-empty-with-duration case Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 1a72e81 commit 78d2573

5 files changed

Lines changed: 1915 additions & 54 deletions

File tree

tests/unit/async_utils/services/metrics_aggregator/conftest.py

Lines changed: 105 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -15,18 +15,33 @@
1515

1616
"""Shared test doubles and factories for metrics aggregator tests.
1717
18-
NOTE: this conftest is in the process of being migrated to the
19-
registry-based aggregator (metrics_pubsub_design_v5.md). The legacy
20-
``InMemoryKVStore`` factories that previously lived here have been
21-
removed; tests that depended on them are skipped pending rewrite. New
22-
tests for ``snapshot.py``, ``registry.py``, and ``publisher.py`` are
23-
self-contained and do not need helpers from this module.
18+
Migrated for the registry/publisher refactor (metrics_pubsub_design_v5):
19+
no more ``InMemoryKVStore``. Tests that need to inspect emitted values
20+
build them directly off a ``MetricsRegistry`` and a ``MetricsSnapshot``.
21+
22+
The helpers here are intentionally small — most reused-across-tests
23+
construction lives in ``_make_aggregator`` style fixtures local to each
24+
test file (the aggregator's wire surface is small enough that a single
25+
shared fixture would mostly hide it).
2426
"""
2527

2628
from __future__ import annotations
2729

2830
import asyncio
31+
from unittest.mock import AsyncMock, MagicMock
2932

33+
from inference_endpoint.async_utils.services.metrics_aggregator.aggregator import (
34+
MetricsAggregatorService,
35+
)
36+
from inference_endpoint.async_utils.services.metrics_aggregator.registry import (
37+
MetricsRegistry,
38+
)
39+
from inference_endpoint.async_utils.services.metrics_aggregator.snapshot import (
40+
CounterStat,
41+
SeriesStat,
42+
SessionState,
43+
)
44+
from inference_endpoint.async_utils.transport.zmq.context import ManagedZMQContext
3045
from inference_endpoint.core.record import (
3146
EventRecord,
3247
SampleEventType,
@@ -35,8 +50,7 @@
3550
from inference_endpoint.core.types import TextModelOutput
3651

3752
# ---------------------------------------------------------------------------
38-
# Mock TokenizePool — still useful for tests that exercise async triggers
39-
# directly.
53+
# Mock TokenizePool — used by tests that exercise async triggers directly.
4054
# ---------------------------------------------------------------------------
4155

4256

@@ -86,3 +100,86 @@ def text_output(s: str) -> TextModelOutput:
86100

87101
def streaming_text(*chunks: str) -> TextModelOutput:
88102
return TextModelOutput(output=tuple(chunks))
103+
104+
105+
# ---------------------------------------------------------------------------
106+
# Registry / snapshot inspection helpers
107+
# ---------------------------------------------------------------------------
108+
109+
110+
def snapshot_counters(registry: MetricsRegistry) -> dict[str, int | float]:
111+
"""Return all counter values from a fresh snapshot.
112+
113+
State/n_pending values don't matter for counter inspection — they
114+
bypass the exact-vs-HDR fork. Tests that need series inspection
115+
should call ``snapshot_series_values`` instead.
116+
"""
117+
snap = registry.build_snapshot(state=SessionState.LIVE, n_pending_tasks=0)
118+
return {m.name: m.value for m in snap.metrics if isinstance(m, CounterStat)}
119+
120+
121+
def snapshot_series_count(registry: MetricsRegistry, name: str) -> int:
122+
"""Return ``count`` of a named series from a fresh snapshot.
123+
124+
Returns 0 if the series is unregistered or has no recordings.
125+
"""
126+
snap = registry.build_snapshot(state=SessionState.LIVE, n_pending_tasks=0)
127+
for m in snap.metrics:
128+
if isinstance(m, SeriesStat) and m.name == name:
129+
return m.count
130+
return 0
131+
132+
133+
def snapshot_series_total(registry: MetricsRegistry, name: str) -> int | float:
134+
"""Return ``total`` of a named series from a fresh snapshot."""
135+
snap = registry.build_snapshot(state=SessionState.LIVE, n_pending_tasks=0)
136+
for m in snap.metrics:
137+
if isinstance(m, SeriesStat) and m.name == name:
138+
return m.total
139+
return 0
140+
141+
142+
# ---------------------------------------------------------------------------
143+
# Aggregator factory
144+
# ---------------------------------------------------------------------------
145+
146+
147+
def make_aggregator(
148+
zmq_ctx: ManagedZMQContext,
149+
loop: asyncio.AbstractEventLoop,
150+
socket_name: str,
151+
*,
152+
tokenize_pool=None,
153+
streaming: bool = True,
154+
shutdown_event: asyncio.Event | None = None,
155+
) -> tuple[MetricsAggregatorService, MetricsRegistry, MagicMock]:
156+
"""Construct an aggregator wired to a real SUB socket and a mocked publisher.
157+
158+
The aggregator's ``start()`` is intentionally not called: tests inject
159+
events directly via ``await agg.process([...])``. The publisher is a
160+
``MagicMock`` so the aggregator's STARTED branch (which calls
161+
``publisher.start(...)``) and ENDED branch (which calls ``publish_final``
162+
+ ``close``) don't touch real I/O.
163+
164+
Returns ``(agg, registry, publisher_mock)``.
165+
"""
166+
registry = MetricsRegistry()
167+
# ``publish_final`` is awaited by the aggregator's ENDED handler, so it
168+
# must be an AsyncMock. The remaining surface (``start``, ``close``) is
169+
# synchronous and falls back to MagicMock's default attribute behavior.
170+
publisher = MagicMock()
171+
publisher.publish_final = AsyncMock()
172+
agg = MetricsAggregatorService(
173+
socket_name,
174+
zmq_ctx,
175+
loop,
176+
registry=registry,
177+
publisher=publisher,
178+
refresh_hz=4.0,
179+
sig_figs=3,
180+
n_histogram_buckets=10,
181+
tokenize_pool=tokenize_pool,
182+
streaming=streaming,
183+
shutdown_event=shutdown_event,
184+
)
185+
return agg, registry, publisher

0 commit comments

Comments
 (0)