docs-website/docs/development/tracing.mdx
+156 lines changed: 156 additions & 0 deletions
@@ -356,3 +356,159 @@ tracing.enable_tracing(

Here’s what the resulting log would look like when a pipeline is run:

<ClickableImage src="/img/55c3d5c84282d726c95fb3350ec36be49a354edca8a6164f5dffdab7121cec58-image_2.png" alt="Console output showing Haystack pipeline execution with DEBUG level tracing logs including component names, types, and input/output specifications" />

## Behavioral Drift Monitoring with a Custom Tracer

Haystack's `Tracer` interface can do more than route spans to a backend: it can also detect **behavioral drift** across pipeline runs. This is useful when your pipeline includes a retrieval-augmented or context-compression component and you want to know whether the agent's effective vocabulary shifts between sessions.

The example below implements a lightweight `DriftMonitorTracer` that tracks which domain-specific terms (the "ghost lexicon") appear in the first pipeline run but disappear in later runs. A Ghost Consistency Score (GCS) below 0.40 typically signals that the pipeline is losing context-critical vocabulary.

```python
import contextlib
import re
from collections import defaultdict
from typing import Any, Iterator, Optional

from haystack import tracing
from haystack.tracing import Span, Tracer


class InMemorySpan(Span):
    """Lightweight span that accumulates tag values for drift inspection."""

    def __init__(self) -> None:
        self._tags: dict[str, Any] = {}

    def set_tag(self, key: str, value: Any) -> None:
        self._tags[key] = value

    def get_tags(self) -> dict[str, Any]:
        return self._tags


class DriftMonitorTracer(Tracer):
    """Custom Haystack tracer that measures ghost-lexicon decay across pipeline runs.

    Use this when you want to detect silent behavioral drift caused by context
    compression or truncation in long-running pipelines.
    """

    ...  # the remaining lines of this class are not shown in this excerpt
```

This pattern requires no changes to Haystack internals. The `Tracer` interface is the only extension point needed. For production use, extend `_on_span_finished` to maintain a per-run rolling window and compare against a configurable baseline depth rather than only the first run.
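
The rolling-window idea can be sketched without any Haystack dependency. The block below is a minimal illustration and not the implementation from this change: `extract_terms`, `RollingBaseline`, `baseline_depth`, and the scoring rule (the fraction of baseline terms still present in the latest run) are all assumptions introduced here, and the example's own GCS computation may differ.

```python
import re
from collections import deque


def extract_terms(text: str, min_length: int = 6) -> set[str]:
    """Hypothetical term extractor: lowercase words with at least `min_length` letters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= min_length}


class RollingBaseline:
    """Hypothetical helper: scores each run against the union of the last `baseline_depth` runs."""

    def __init__(self, baseline_depth: int = 3) -> None:
        if baseline_depth < 1:
            raise ValueError("baseline_depth must be at least 1")
        # deque(maxlen=...) drops the oldest run automatically once the window is full.
        self._window: deque = deque(maxlen=baseline_depth)

    def score(self, run_text: str) -> float:
        """Return the fraction of baseline terms still present, then add this run to the window."""
        terms = extract_terms(run_text)
        # The baseline is every term seen in the runs currently held in the window.
        baseline = set().union(*self._window)
        # Assumption: with an empty baseline (for example, the first run) report full consistency.
        result = len(baseline & terms) / len(baseline) if baseline else 1.0
        self._window.append(terms)
        return result
```

A helper like this could be called from `_on_span_finished` with the text collected for the run, and the result compared against the 0.40 threshold mentioned earlier. With `baseline_depth=2`, a run that keeps two of three baseline terms scores about 0.67, and a run that shares no terms with the window scores 0.0.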

:::note
This addresses the behavioral-drift monitoring use case from [#10971](https://github.com/deepset-ai/haystack/issues/10971) using the existing `Tracer` interface; no new hooks are required.
:::