Releases: evalops/eval2otel

v0.3.0

08 Aug 17:26
7b84c00

What’s new

  • Deterministic content sampling based on EvalResult.id when using sampleContentRate.
  • New contentSampler hook for full control over sampling decisions.
  • Optional content truncation via contentMaxLength for messages and tool arguments.
  • Optional markTruncatedContent flag to emit gen_ai.message.content_truncated=true when truncation occurs.
  • Per-field redaction hooks: redactMessageContent and redactToolArguments with contextual info.
  • emitOperationalMetadata flag to suppress conversation/choice/agent/RAG events even when capture is enabled.
  • Standardized tool call event attributes: gen_ai.tool.name, gen_ai.tool.call.id, gen_ai.tool.arguments, plus gen_ai.response.choice.index.
  • Cache for custom evaluation histograms to avoid instrument churn.
  • SDK initialization: merge Resource.default() with custom attributes; support exporterProtocol and exporterHeaders.
  • CI: Add type-check step and npm pack --dry-run; enforce coverage thresholds.
  • Tests: coverage for validation events, converter events (attributes, truncation, sampling, redaction), metrics branches, and SDK init.
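
The sampling and truncation items above can be sketched as follows. This is an illustration under stated assumptions, not eval2otel's implementation: the notes do not say which hash the library uses, so the FNV-1a-style hash is an assumption, and the helper names hashToUnitInterval, shouldSampleContent, truncateContent, and messageAttributes are hypothetical. Only sampleContentRate, contentMaxLength, markTruncatedContent, and the gen_ai.* attribute names come from the notes.

```typescript
// Hypothetical helper: map an id to a number in [0, 1) with an
// FNV-1a-style hash over UTF-16 code units. The hash eval2otel
// actually uses is not stated in the release notes.
function hashToUnitInterval(id: string): number {
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < id.length; i++) {
    hash ^= id.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned 32-bit
  }
  return hash / 0x100000000;
}

// The same id always produces the same decision for a given rate.
function shouldSampleContent(id: string, sampleContentRate: number): boolean {
  if (sampleContentRate <= 0) return false;
  if (sampleContentRate >= 1) return true;
  return hashToUnitInterval(id) < sampleContentRate;
}

// Hypothetical helper: cut content to contentMaxLength and report whether it was cut.
function truncateContent(
  content: string,
  contentMaxLength?: number,
): { content: string; truncated: boolean } {
  if (contentMaxLength === undefined || content.length <= contentMaxLength) {
    return { content, truncated: false };
  }
  return { content: content.slice(0, contentMaxLength), truncated: true };
}

// Build message attributes, adding the truncation flag only when requested.
function messageAttributes(
  content: string,
  contentMaxLength: number | undefined,
  markTruncatedContent: boolean,
): Record<string, string | boolean> {
  const result = truncateContent(content, contentMaxLength);
  const attrs: Record<string, string | boolean> = {
    "gen_ai.message.content": result.content,
  };
  if (result.truncated && markTruncatedContent) {
    attrs["gen_ai.message.content_truncated"] = true;
  }
  return attrs;
}
```

Hashing the id rather than calling a random number generator is what makes the decision repeatable: re-running the same evaluation set captures content for the same results each time.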

Breaking/behavioral changes

  • Conversation and assistant event attribute names are normalized to gen_ai.*:
    • Conversation: gen_ai.message.role, gen_ai.message.index, gen_ai.message.content, gen_ai.tool.call.id.
    • Assistant: gen_ai.response.choice.index, gen_ai.response.finish_reason, gen_ai.message.role, gen_ai.message.content.
  • performance.duration is expected in seconds; make sure callers supply it in that unit (agent step durations remain in milliseconds).
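
Since most JavaScript timers report milliseconds, a conversion step is easy to miss. A minimal sketch, assuming a caller that measures wall-clock time itself; PerformanceInfo, msToSeconds, and timeCall are hypothetical names, and only the performance.duration field comes from the notes:

```typescript
// Hypothetical shape: only the performance.duration field name comes from the notes.
interface PerformanceInfo {
  duration: number; // seconds
}

// Convert a millisecond measurement to seconds.
function msToSeconds(ms: number): number {
  return ms / 1000;
}

// Hypothetical wrapper: run a function and report its duration in seconds.
function timeCall<T>(fn: () => T): { result: T; performance: PerformanceInfo } {
  const startMs = Date.now();
  const result = fn();
  const elapsedMs = Date.now() - startMs;
  return { result, performance: { duration: msToSeconds(elapsedMs) } };
}
```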

Upgrade notes

  • If you referenced previous event attribute keys (e.g., content, role, choice.index), update to the new gen_ai.* names.
  • Consider setting emitOperationalMetadata=false in sensitive environments to suppress conversation/choice/agent/RAG events.
  • To limit payload size, set contentMaxLength and optionally markTruncatedContent=true to flag truncated entries.
  • For deterministic sampling across runs, rely on sampleContentRate (now deterministic by EvalResult.id) or implement a custom contentSampler.
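
For code that still reads the old event attribute keys, a small rename step can ease the transition. The sketch below covers only the three old keys named in the notes (content, role, choice.index). Their pairing with the new names is inferred from the lists above, and LEGACY_KEY_RENAMES and renameLegacyKeys are hypothetical names, not part of eval2otel.

```typescript
type AttributeValue = string | number | boolean;

// Old-to-new pairs for the three example keys in the upgrade notes.
// The pairing is inferred from the attribute names, not confirmed by the notes.
const LEGACY_KEY_RENAMES = new Map<string, string>([
  ["content", "gen_ai.message.content"],
  ["role", "gen_ai.message.role"],
  ["choice.index", "gen_ai.response.choice.index"],
]);

// Return a copy of attrs with legacy keys renamed; unknown keys pass through unchanged.
function renameLegacyKeys(
  attrs: Record<string, AttributeValue>,
): Record<string, AttributeValue> {
  const out: Record<string, AttributeValue> = {};
  for (const [key, value] of Object.entries(attrs)) {
    out[LEGACY_KEY_RENAMES.get(key) ?? key] = value;
  }
  return out;
}
```

A helper like this is a stopgap for dashboards or tests that consume both old and new events; updating queries to the gen_ai.* names directly remains the cleaner fix.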

Thanks

Thanks to everyone trying out eval2otel and filing issues/feedback!