Commit a5fa3da
feat: BigQuery Agent Analytics reliability fixes
Three related reliability/observability fixes to the BigQuery Agent
Analytics plugin.
1. Dropped-event observability. BigQuery logging is best-effort: events
are dropped when the in-memory queue overflows or a write ultimately
fails, and until now only a log line recorded the loss. Track dropped rows in
BatchProcessor by reason (queue_full, arrow_prep_failed,
retry_exhausted, non_retryable, unexpected_error), include the
running total in each drop log line, and expose the counts via
BatchProcessor.get_drop_stats()/dropped_event_count and an
aggregating BigQueryAgentAnalyticsPlugin.get_drop_stats() so a host
can poll them and export to its own monitoring.
2. Cross-region Storage Write API routing. The AppendRows streaming RPC
does not auto-populate the request-routing header, so writes to a
dataset outside the US multiregion could fail with a "session not
found" / stream-not-found error and silently drop every row. Set
x-goog-request-params: write_stream=<stream> on the append_rows call
so the request reaches the region that owns the write stream.
US-multiregion behavior is unchanged.
3. Stop exporting plugin-owned OTel spans. When Agent Engine telemetry
is enabled (GOOGLE_CLOUD_AGENT_ENGINE_ENABLE_TELEMETRY=true) with
Cloud Trace export on the global tracer provider, the plugin's
ID-carrier spans were exported alongside the framework's real spans,
producing a duplicate span for every instrumented operation. The
plugin now tracks span_id / trace_id on its own contextvar stack
without creating OTel spans; trace_id is inherited from the ambient
span, so BigQuery rows still join to Cloud Trace by trace_id and the
LLM/tool span_id-sharing contracts are preserved.
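
The ID tracking described in item 3 can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the names push_span, pop_span, current_ids and _span_stack are hypothetical. The ambient trace_id is passed in as a plain argument here so the sketch runs without OpenTelemetry installed; in the plugin it would come from the ambient span.

```python
import contextvars
import secrets

# Hypothetical names throughout; the plugin's real internals may differ.
# The stack is an immutable tuple of (trace_id, span_id) pairs so each
# context sees its own consistent view.
_span_stack = contextvars.ContextVar("span_stack", default=())


def push_span(ambient_trace_id=None):
    """Push a (trace_id, span_id) record without creating an OTel span."""
    stack = _span_stack.get()
    if stack:
        # Nested records share the trace_id of their parent record.
        trace_id = stack[-1][0]
    elif ambient_trace_id is not None:
        # Assumption: the caller has read this from the ambient span.
        trace_id = ambient_trace_id
    else:
        # No ambient span: fall back to a fresh 128-bit trace id.
        trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)  # 64-bit id, hex encoded
    _span_stack.set(stack + ((trace_id, span_id),))
    return trace_id, span_id


def pop_span():
    """Pop and return the innermost record, or None if the stack is empty."""
    stack = _span_stack.get()
    if not stack:
        return None
    _span_stack.set(stack[:-1])
    return stack[-1]


def current_ids():
    """Return the innermost (trace_id, span_id), or None."""
    stack = _span_stack.get()
    return stack[-1] if stack else None
```

Because nothing is registered with the tracer provider, an exporter attached to the global provider never sees these records, while rows written with the returned trace_id can still be joined to the framework's spans.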
All paths covered by unit tests.
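
For item 1, the per-reason accounting might look roughly like the sketch below. The reason names and the get_drop_stats / dropped_event_count names are taken from this message; the DropTracker class, the record method and the aggregate helper are hypothetical stand-ins, and the real BatchProcessor may be structured differently.

```python
import threading
from collections import Counter

# Reason names as listed in the commit message.
DROP_REASONS = (
    "queue_full",
    "arrow_prep_failed",
    "retry_exhausted",
    "non_retryable",
    "unexpected_error",
)


class DropTracker:
    """Hypothetical stand-in for the drop accounting inside BatchProcessor."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counts = Counter()

    def record(self, reason, rows=1):
        """Count dropped rows and return the running total for the log line."""
        if reason not in DROP_REASONS:
            # Assumption of this sketch: unknown reasons are folded into
            # unexpected_error rather than raising on a best-effort path.
            reason = "unexpected_error"
        with self._lock:
            self._counts[reason] += rows
            return sum(self._counts.values())

    def get_drop_stats(self):
        """Return a snapshot copy of the per-reason counts."""
        with self._lock:
            return dict(self._counts)

    @property
    def dropped_event_count(self):
        with self._lock:
            return sum(self._counts.values())


def aggregate(trackers):
    """Sum per-reason counts across trackers, as a plugin-level view might."""
    total = Counter()
    for tracker in trackers:
        total.update(tracker.get_drop_stats())
    return dict(total)
```

A host could call the aggregating method on a timer and forward the per-reason counts to its own metrics system; returning a copy keeps the poller from mutating live counters.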
Change-Id: Ia7b73d816b14c574ef856a4c88c57243f6f38f7f
1 parent 77aeadf commit a5fa3da
2 files changed
Lines changed: 574 additions & 159 deletions
File tree
- src/google/adk/plugins
- tests/unittests/plugins