LCORE-2310: Added non-streaming agent query utilities by asimurka · Pull Request #1880 · lightspeed-core/lightspeed-stack

asimurka · 2026-06-09T07:26:04Z

Description

Adds utility functions for non-streaming query served by LLS pydantic-ai agent provider. For now, utilities are disconnected from query handler - that will be added in a separate PR.

Type of change

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

Assisted-by: Cursor

Related Tickets & Documents

Related Issue # https://redhat.atlassian.net/browse/LCORE-2310
Closes #

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

New Features
- Improved mapping of model/provider failures into user-facing HTTP errors
- Turn summaries now include LLM text, tool calls/results, provider response IDs, and token usage
- Token and call metrics recorded by provider, model, and endpoint
- Moderation refusals are persisted into conversations and return a minimal summary
Tests
- Unit tests for finish-reason mapping, token extraction, summary construction, moderation handling, and error-to-HTTP conversions

coderabbitai · 2026-06-09T07:26:17Z

Walkthrough

Adds non-streaming agent helpers to map inference errors to LCS error responses, extract token usage, build TurnSummary from agent runs (including tool calls), and orchestrate non-streaming agent execution with moderation handling. Unit tests exercise success, moderation-block, and error mappings.

Changes

Non-streaming agent query helpers

Layer / File(s)	Summary
Error handling and finish-reason resolution `src/utils/agents/query.py`, `tests/unit/utils/agents/test_query.py`	`AgentFinishReason` enum and mapping functions translate pydantic-ai `AgentRunError`, Llama Stack `APIStatusError`/`APIConnectionError`, and runtime errors into `AbstractErrorResponse` variants; finish-reason extraction handles `stop`, `length`, `content_filter`, missing reasons, and provider `cancelled`.
Token usage and turn summary aggregation `src/utils/agents/query.py`, `tests/unit/utils/agents/test_query.py`	`extract_agent_token_usage` converts `RunUsage` into `TokenCounter` and records provider/model/endpoint metrics. `build_turn_summary_from_agent_run` aggregates LLM text and native tool call/return parts into `TurnSummary`, attaches provider response id and token usage.
Agent response orchestration `src/utils/agents/query.py`, `tests/unit/utils/agents/test_query.py`	`retrieve_agent_response` persists moderation refusal when blocked and returns a minimal summary; otherwise runs the agent non-streaming, maps provider/runtime failures to `HTTPException`, derives vector-store/RAG ids from tools/config, and returns the complete `TurnSummary`.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

lightspeed-core/lightspeed-stack#1855: Related tool-processing helpers used to represent native/function tool calls and RAG chunk/document extraction.

Suggested reviewers

jrobertboos
tisnik

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly summarizes the main change: adding non-streaming agent query utility functions, which aligns with the changeset that introduces 328 lines of new agent utilities and supporting tests.
Docstring Coverage	✅ Passed	Docstring coverage is 93.10% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

✨ Simplify code

Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/utils/agents/query.py`:
- Around line 298-300: Update the Raises section of the docstring in
src/utils/agents/query.py to use clear, grammatical wording: replace the phrase
"On moderation is not applicable" with something like "HTTPException: If
moderation is not applicable for the input" and keep the second clause as
"HTTPException: on agent or provider failure" (or combine into a single line
that reads "HTTPException: If moderation is not applicable for the input, or on
agent/provider failure") so the meaning is explicit and grammatical; edit the
docstring where the Raises block mentions HTTPException to reflect this clearer
phrasing.
- Around line 83-98: Update the docstring to match the implementation: clarify
that non-context-length RuntimeError exceptions are converted to a generic
internal error response (InternalServerErrorResponse.generic()) rather than
re-raised. In the exception-handling function that checks AgentRunError,
APIStatusError, APIConnectionError and RuntimeError via
is_context_length_error(), change the docstring text to state that
non-context-length RuntimeError values return
InternalServerErrorResponse.generic() (or list the actual behavior), referencing
the RuntimeError branch and InternalServerErrorResponse.generic() so future
readers see the implemented behavior.
- Around line 149-152: In get_agent_finish_reason, avoid directly constructing
AgentFinishReason(response.finish_reason) because response.finish_reason can be
"tool_call" (or other unknown values) which will raise ValueError; update the
function to either explicitly check for "tool_call" on response.finish_reason
and return AgentFinishReason.ERROR, or wrap the enum conversion in a try/except
ValueError that returns AgentFinishReason.ERROR as a safe fallback, referencing
response.finish_reason and AgentFinishReason to locate the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 193e0be0-68c0-4c67-b19f-e11b7be241e0

📥 Commits

Reviewing files that changed from the base of the PR and between aa29526 and 0386bad.

📒 Files selected for processing (2)

src/utils/agents/query.py
tests/unit/utils/agents/test_query.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (16)

GitHub Check: integration_tests (3.12)
GitHub Check: mypy
GitHub Check: Pyright
GitHub Check: spectral
GitHub Check: unit_tests (3.13)
GitHub Check: build-pr
GitHub Check: unit_tests (3.12)
GitHub Check: Pylinter
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🧰 Additional context used

📓 Path-based instructions (2)

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/unit/utils/agents/test_query.py

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/utils/agents/query.py

🔇 Additional comments (12)

src/utils/agents/query.py (7)

1-57: LGTM!

60-68: LGTM!

101-135: LGTM!

154-182: LGTM!

184-219: LGTM!

222-276: LGTM!

315-315: Avoid unsafe cast: ensure responses_params.input is str before cast(str, responses_params.input)
ResponseInput is str | list[ResponseItem] (src/models/common/responses/types.py), so cast(str, ...) is only safe if this non-streaming path guarantees string input earlier; add an isinstance(..., str) guard (or normalize list input) before calling await agent.run(...).

tests/unit/utils/agents/test_query.py (5)

1-131: LGTM!

133-181: LGTM!

183-243: LGTM!

358-487: LGTM!

276-290: Ensure the test actually exercises ModelResponse.native_tool_calls: build_turn_summary_from_agent_run iterates message.native_tool_calls (src/utils/agents/query.py lines 262-264), but the test builds ModelResponse by placing NativeToolCallPart/NativeToolReturnPart into parts (tests/unit/utils/agents/test_query.py lines 276-290). This only tests the native-tools path if ModelResponse.native_tool_calls is populated from parts; otherwise, the test should construct/set native_tool_calls explicitly (or assert behavior that depends on native-tool processing).

coderabbitai · 2026-06-09T07:32:20Z

+    if response.finish_reason is None:
+        return AgentFinishReason.ERROR
+    return AgentFinishReason(response.finish_reason)
+


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check pydantic-ai FinishReason definition to confirm possible values rg -n "FinishReason" --type py -A 5

Repository: lightspeed-core/lightspeed-stack

Length of output: 7310

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show the relevant enum + get_agent_finish_reason implementation sed -n '40,180p' src/utils/agents/query.py # Show the unit tests for get_agent_finish_reason sed -n '120,220p' tests/unit/utils/agents/test_query.py

Repository: lightspeed-core/lightspeed-stack

Length of output: 8661

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate imports/definitions around ModelResponse / FinishReason usage rg -n "from .* import .*ModelResponse|ModelResponse" src/utils/agents/query.py src/utils -S # Find how finish_reason is typed/used elsewhere rg -n "\"tool_calls\"|\"tool_output\"|finish_reason\s*:" src -S # Show utils.pydantic_ai module (likely where ModelResponse/FinishReason are re-exported) fd -a -t f "pydantic_ai.py" src/utils -x echo fd -a -t f "pydantic_ai*" src/utils -x echo

Repository: lightspeed-core/lightspeed-stack

Length of output: 1254

🏁 Script executed:

#!/bin/bash set -euo pipefail # Inspect utils/pydantic_ai.py for re-exports/types that define ModelResponse / FinishReason sed -n '1,220p' src/utils/pydantic_ai.py # Find any references to "finish_reason" across the repo rg -n "finish_reason" src -S # Inspect pydantic-ai ModelResponse type hints only if vendored/aliased in this repo; # otherwise, we can at least see whether code ever expects tool_calls/tool_output. rg -n "\"tool_calls\"|\"tool_output\"" src -S

Repository: lightspeed-core/lightspeed-stack

Length of output: 5568

🏁 Script executed:

#!/bin/bash set -euo pipefail python3 - <<'PY' import inspect import typing try: import pydantic_ai from pydantic_ai.messages import ModelResponse except Exception as e: print("Failed to import pydantic_ai / ModelResponse:", e) raise print("pydantic_ai version:", getattr(pydantic_ai, "__version__", "unknown")) print("ModelResponse:", ModelResponse) # Print annotated fields for ModelResponse ann = getattr(ModelResponse, "__annotations__", None) print("ModelResponse.__annotations__ keys:", sorted(list(ann.keys())) if ann else None) # Try to find finish_reason field annotation finish_ann = None if ann and "finish_reason" in ann: finish_ann = ann["finish_reason"] print("finish_reason annotation:", finish_ann) # Try to resolve/inspect any enums/literals involved in finish_reason try: from pydantic_ai.messages import FinishReason print("FinishReason:", FinishReason) print("FinishReason members/values:") if hasattr(FinishReason, "__members__"): print(list(FinishReason.__members__.items())[:50]) else: print("Non-enum type:", FinishReason) except Exception as e: print("Could not import FinishReason from pydantic_ai.messages:", e) # If finish_reason is a Literal, show its args if finish_ann is not None: origin = typing.get_origin(finish_ann) args = typing.get_args(finish_ann) print("finish_reason origin:", origin) print("finish_reason args (first 30):", list(args)[:30]) PY

Repository: lightspeed-core/lightspeed-stack

Length of output: 294

🏁 Script executed:

#!/bin/bash set -euo pipefail # Find dependency pins for pydantic-ai / pydantic_ai ls -la fd -a -t f "pyproject.toml|requirements*.txt|uv.lock|poetry.lock|Pipfile.lock" . 2>/dev/null || true rg -n "pydantic[-_]ai|pydantic_ai" -S pyproject.toml requirements*.txt poetry.lock uv.lock Pipfile.lock 2>/dev/null || true # Also show the relevant dependency section if pyproject exists if [ -f pyproject.toml ]; then sed -n '1,200p' pyproject.toml fi

Repository: lightspeed-core/lightspeed-stack

Length of output: 9409

🌐 Web query:

pydantic-ai 1.105.0 FinishReason tool_calls tool_output

💡 Result:

In Pydantic AI, FinishReason, tool_calls, and tool_output are core concepts related to how an agent manages model responses and tool execution. FinishReason The FinishReason is an enumeration that describes why a model stopped generating a response, normalized to OpenTelemetry values [1]. It includes the following possible values: 'stop', 'length', 'content_filter', 'tool_call', and 'error' [1]. Developers can use the finish_reason attribute on a ModelResponse to inspect why a model ended its generation, for instance, to implement fallback logic if a response was truncated due to length limits or a content filter [2]. Tool Calls and Tool Output Pydantic AI distinguishes between function tools (which perform actions) and output tools (which produce the final result of a run) [3][4]. 1. Tool Calls: When a model decides to use a tool, it emits a tool call. If the model emits a tool call, the finish_reason for that response is typically 'tool_call' [1][5]. 2. Tool Output: This is the data returned after executing a tool. In the default "Tool Output" mode, the JSON schema for output types is provided to the model as a special output tool [6][7]. 3. Managing Execution: Recent updates, such as those in PR #5339 (which influenced behavior leading up to v1.105.0), refined how these are handled [8][9]. Specifically: - The end_strategy parameter (e.g., 'graceful', 'exhaustive') controls the execution flow when a model emits both function tools and an output tool in the same response [4][7]. - 'Graceful' ensures function tools are executed even if a final result is found, while 'exhaustive' ensures all tools—including output tools—are executed [4]. - ToolOutput can be used as a marker class to customize output tools [6][7]. Pydantic AI v1.105.0 was released on June 2, 2026, and focused on features like on-demand loading and Grok model support [8]. The concepts of FinishReason and tool management remain central to the framework's operational flow [1][6].

Citations:

1: https://pydantic.dev/docs/ai/api/pydantic-ai/messages/

2: https://pydantic.dev/docs/ai/models/overview/

3: https://github.com/pydantic/pydantic-ai/blob/8b9ac2bd/docs/tools.md

4: https://pydantic.dev/docs/ai/tools-toolsets/tools-advanced/

5: https://github.com/pydantic/pydantic-ai/blob/v1.70.0/pydantic_ai_slim/pydantic_ai/models/openai.py

6: https://pydantic.dev/docs/ai/core-concepts/output/

7: https://github.com/pydantic/pydantic-ai/blob/8b9ac2bd/docs/output.md

8: https://github.com/pydantic/pydantic-ai/releases/tag/v1.105.0

9: Run tools in emission order with graceful default; sequential=True is now a per-tool barrier; function-tool ModelRetry suppresses the output result pydantic/pydantic-ai#5339

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "get_agent_finish_reason\\(|finish_reason=.*tool" tests src -S

Repository: lightspeed-core/lightspeed-stack

Length of output: 882

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '200,280p' src/utils/agents/query.py

Repository: lightspeed-core/lightspeed-stack

Length of output: 2918

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "\"tool_call\"|\"tool_calls\"|\"tool_output\"|tool_output|tool_calls" src tests -S

Repository: lightspeed-core/lightspeed-stack

Length of output: 28367

🌐 Web query:

pydantic-ai FinishReason enum values stop length content_filter tool_call error ModelResponse.finish_reason

💡 Result:

In Pydantic AI, the FinishReason type is a TypeAlias representing the reason a model stopped generating a response [1]. It is defined as a Literal string type with the following valid values [1]: stop: The model reached a natural stop point or a provided stop sequence [1][2]. length: The maximum number of tokens specified in the request was reached [1][2]. content_filter: The generation was stopped due to a content filter flag [1][2]. tool_call: The model initiated a tool call [1]. error: The generation ended due to an error (e.g., failure or cancellation) [1][3][4]. The ModelResponse class includes a finish_reason attribute (of type FinishReason | None) that records this information [1]. This field is intended to normalize diverse provider-specific finish reasons into a standardized format aligned with OpenTelemetry conventions [1][5]. Individual model providers (such as OpenAI, Anthropic, and xAI) implement internal mapping logic to convert their native API response finish reasons into these standardized Pydantic AI FinishReason values [3][6][7][8].

Citations:

1: https://pydantic.dev/docs/ai/api/pydantic-ai/messages/

2: Add finish_reason to ModelResponse pydantic/pydantic-ai#1539

3: https://github.com/pydantic/pydantic-ai/blob/v1.70.0/pydantic_ai_slim/pydantic_ai/models/openai.py

4: https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_slim/pydantic_ai/models/openai.py

5: Add id and finish_reason to ModelResponse pydantic/pydantic-ai#886

6: https://github.com/pydantic/pydantic-ai/blob/531d6da5/pydantic_ai_slim/pydantic_ai/models/xai.py

7: https://github.com/pydantic/pydantic-ai/blob/7f5214c6/pydantic_ai_slim/pydantic_ai/models/anthropic.py

8: https://github.com/pydantic/pydantic-ai/blob/7f5214c6/pydantic_ai_slim/pydantic_ai/models/openai.py

Potential ValueError when finish_reason is tool_call (or otherwise unknown)

get_agent_finish_reason() constructs AgentFinishReason(response.finish_reason) without a guard. pydantic-ai’s FinishReason includes "tool_call", which is not a member of the local AgentFinishReason enum, so this will raise ValueError instead of mapping to an LCS error.

Add explicit handling for tool_call (or catch ValueError and map unknown finish reasons to a safe fallback like AgentFinishReason.ERROR) rather than directly converting to the enum.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/utils/agents/query.py` around lines 149 - 152, In get_agent_finish_reason, avoid directly constructing AgentFinishReason(response.finish_reason) because response.finish_reason can be "tool_call" (or other unknown values) which will raise ValueError; update the function to either explicitly check for "tool_call" on response.finish_reason and return AgentFinishReason.ERROR, or wrap the enum conversion in a try/except ValueError that returns AgentFinishReason.ERROR as a safe fallback, referencing response.finish_reason and AgentFinishReason to locate the change.

From pydantic-ai openai processor:

_CHAT_FINISH_REASON_MAP: dict[ Literal['stop', 'length', 'tool_calls', 'content_filter', 'function_call'], FinishReason ] = { 'stop': 'stop', 'length': 'length', 'tool_calls': 'tool_call', 'content_filter': 'content_filter', 'function_call': 'tool_call', } _RESPONSES_FINISH_REASON_MAP: dict[Literal['max_output_tokens', 'content_filter'] | ResponseStatus, FinishReason] = { 'max_output_tokens': 'length', 'content_filter': 'content_filter', 'completed': 'stop', 'cancelled': 'error', 'failed': 'error', }

OpenResponses cannot result in tool_call finish reason.

@asimurka, thanks for clarifying — you're right. Since this code targets the OpenAI Responses API (not Chat Completions), _RESPONSES_FINISH_REASON_MAP applies, and "tool_call" is not a possible finish reason from that path. All values it can emit — stop, length, content_filter, error — are covered by AgentFinishReason. My concern doesn't apply here.

✏️ Learnings added

Learnt from: asimurka Repo: lightspeed-core/lightspeed-stack PR: 1880 File: src/utils/agents/query.py:149-152 Timestamp: 2026-06-09T07:36:38.363Z Learning: In `src/utils/agents/query.py` (lightspeed-core/lightspeed-stack), `get_agent_finish_reason` and `AgentFinishReason` enum are designed specifically for the pydantic-ai OpenAI **Responses API** (not Chat Completions). The Responses API `_RESPONSES_FINISH_REASON_MAP` only produces `stop`, `length`, `content_filter`, and `error` as finish reasons — `tool_call` cannot occur from this API path. Therefore, a `ValueError` from `AgentFinishReason(response.finish_reason)` is not a concern in this context.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/utils/agents/query.py`:
- Around line 74-86: Update the function docstrings in src/utils/agents/query.py
to follow the repo's Google-style docstring headings: replace "Args:" with
"Parameters:", ensure each docstring includes "Returns:" and "Raises:" sections
(and "Attributes:" for any classes), and keep the same descriptions and
parameter names/typing; specifically update the docstring for the top helper
that "Map agent run failures..." and the other helpers at the reported ranges
(lines ~104-112, 136-143, 156-164, 187-196, 228-242, 283-298) so they
consistently use "Parameters:", "Returns:", and "Raises:" headings and conform
to the project's docstring conventions.
- Around line 254-257: The loop over run_result.new_messages() currently
overwrites state.turn_summary.llm_response for each ModelResponse with text;
instead, append each message.text to preserve all fragments: when encountering a
ModelResponse in the loop, check state.turn_summary.llm_response (which may be
None/empty) and concatenate the new message.text (with a separator like a space
or newline) so earlier assistant text is not lost; update the code paths that
set llm_response to perform accumulation rather than assignment.
- Around line 311-313: The code casts responses_params.input to str before
calling agent.run but prepare_input() always returns a str and
retrieve_agent_response() has no production callers, so tighten the contract:
update retrieve_agent_response()'s signature to expect input: str (instead of
str | list[ResponseItem]) or add a lightweight runtime assertion inside
retrieve_agent_response() (e.g., assert isinstance(responses_params.input, str)
or raise TypeError) and then remove the cast at the call site where
build_agent(...) and agent.run(...) are invoked; reference
prepare_responses_params, prepare_input, retrieve_agent_response, build_agent,
and agent.run when making this change so the type narrowing/guard is applied
consistently.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 14fcd2e8-5a1a-4dfc-9846-6c5a167f47ce

📥 Commits

Reviewing files that changed from the base of the PR and between 0386bad and dd950ae.

📒 Files selected for processing (2)

src/utils/agents/query.py
tests/unit/utils/agents/test_query.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🧰 Additional context used

📓 Path-based instructions (2)

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Use pytest.mark.asyncio marker for async tests

Files:

tests/unit/utils/agents/test_query.py

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/utils/agents/query.py

tisnik

LGTM

tisnik · 2026-06-10T10:15:57Z

+        RuntimeError: Re-raised when ``exc`` is a non-agent ``RuntimeError`` that is
+            not a recognized context-length failure.
+    """
+    if isinstance(exc, AgentRunError):


can be rewritten using pattern matching, but it's a nitpick

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/utils/agents/query.py (1)
211-218: ⚠️ Potential issue | 🟠 Major

Fix llm_calls_total undercount for multi-step agent runs
extract_agent_token_usage() sets TokenCounter.llm_calls = max(usage.requests, 1), but it calls recording.record_llm_call() only once; record_llm_call() increments metrics.llm_calls_total...inc() by +1 with no parameter for the actual count. If RunUsage.requests > 1, llm_calls_total will underreport vs the returned token_usage.llm_calls. Make record_llm_call accept a count (or loop) and pass token_counter.llm_calls.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/utils/agents/query.py` around lines 211 - 218, The code records token
usage once but always increments llm_calls_total by 1, causing undercount when
extract_agent_token_usage sets TokenCounter.llm_calls > 1; update
recording.record_llm_call to accept a count parameter (e.g., count:int) or
support being called multiple times, then call it with token_counter.llm_calls
instead of a single call; ensure record_llm_token_usage still records tokens
once and record_llm_call uses the passed count to increment
metrics.llm_calls_total by that amount (or loop internally), updating the
record_llm_call signature and all call sites accordingly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/utils/agents/query.py`:
- Around line 211-218: The code records token usage once but always increments
llm_calls_total by 1, causing undercount when extract_agent_token_usage sets
TokenCounter.llm_calls > 1; update recording.record_llm_call to accept a count
parameter (e.g., count:int) or support being called multiple times, then call it
with token_counter.llm_calls instead of a single call; ensure
record_llm_token_usage still records tokens once and record_llm_call uses the
passed count to increment metrics.llm_calls_total by that amount (or loop
internally), updating the record_llm_call signature and all call sites
accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a988433f-7e32-4643-82bc-297dc1e3d1ce

📥 Commits

Reviewing files that changed from the base of the PR and between dd950ae and c3d5f18.

📒 Files selected for processing (1)

src/utils/agents/query.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: radon
GitHub Check: list_outdated_dependencies
GitHub Check: integration_tests (3.12)
GitHub Check: Pylinter
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🧰 Additional context used

📓 Path-based instructions (1)

src/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Llama Stack imports: Use from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Use logger = get_logger(__name__) from log.py for module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Use async def for I/O operations and external API calls
Use standard log levels with clear purposes: debug() for diagnostic info, info() for program execution, warning() for unexpected events, error() for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes: Configuration, Error/Exception, Resolver, Interface
Abstract classes must use ABC with @abstractmethod decorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes

Files:

src/utils/agents/query.py

🔇 Additional comments (2)

src/utils/agents/query.py (2)

74-86: Still using Args: headings instead of the repo-required Parameters: sections.

This new module still hasn't been aligned with the required Google-style docstring headings. As per coding guidelines, "Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes."

Also applies to: 106-114, 138-145, 158-166, 189-198, 230-244, 285-300

Source: Coding guidelines

256-259: Assistant text is still overwritten instead of accumulated.

If run_result.new_messages() yields multiple text-bearing ModelResponses, earlier fragments are dropped and the summary only keeps the last one.

jrobertboos

LGTM but some of the e2e tests are failing, is that expected?

asimurka · 2026-06-10T16:17:40Z

LGTM but some of the e2e tests are failing, is that expected?

Retried job passed, failing test was unrelated anyway.

asimurka requested review from jrobertboos and tisnik June 9, 2026 07:26

asimurka mentioned this pull request Jun 9, 2026

LCORE-2310: Query agent run #1865

Draft

19 tasks

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

Added non-streaming agent query utilities

dd950ae

asimurka force-pushed the agent_query_utils branch from 0386bad to dd950ae Compare June 9, 2026 09:45

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

Comment thread src/utils/agents/query.py

Comment thread src/utils/agents/query.py

Comment thread src/utils/agents/query.py

tisnik approved these changes Jun 10, 2026

View reviewed changes

Refactor error mapping using pattern matching

c3d5f18

coderabbitai Bot reviewed Jun 10, 2026

View reviewed changes

jrobertboos approved these changes Jun 10, 2026

View reviewed changes

tisnik merged commit 2bfd657 into lightspeed-core:main Jun 10, 2026
36 of 37 checks passed

Conversation

asimurka commented Jun 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tools used to create PR

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

asimurka Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

tisnik Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

jrobertboos left a comment

Choose a reason for hiding this comment

Uh oh!

asimurka commented Jun 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

asimurka commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

coderabbitai Bot Jun 9, 2026 •

edited

Loading