docs/openapi.md (216 additions & 3 deletions)
@@ -4969,14 +4969,16 @@ activates the OKP provider; all other IDs refer to entries in ``byok_rag``.
 
 Backward compatibility:
 - ``inline`` defaults to ``[]`` (no inline RAG).
-- ``tool`` defaults to ``None`` which means all registered vector stores
-are used (identical to the previous ``tool.byok.enabled = True`` default).
+- ``tool`` defaults to ``[]`` (no tool RAG).
+
+If no RAG strategy is defined (inline and tool are empty),
+the RAG tool will register all stores available to llama-stack.
 
 
 | Field | Type | Description |
 |-------|------|-------------|
 | inline | array | RAG IDs whose sources are injected as context before the LLM call. Use 'okp' to enable OKP inline RAG. Empty by default (no inline RAG). |
-| tool || RAG IDs made available to the LLM as a file_search tool. Use 'okp' to include the OKP vector store. When omitted, all registered BYOK vector stores are used (backward compatibility). |
+| tool |array| RAG IDs made available to the LLM as a file_search tool. Use 'okp' to include the OKP vector store. When omitted, all registered BYOK vector stores are used (backward compatibility). |
 
 
 ## ReadinessResponse
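The fallback rule introduced in this hunk can be read as a small decision function. The sketch below is only an illustration of that reading, not the project's implementation: `resolve_tool_stores` and its parameters are hypothetical names, and it leaves out special handling of the `okp` ID and any validation of IDs against `byok_rag`.

```python
from typing import Sequence


def resolve_tool_stores(
    inline: Sequence[str],
    tool: Sequence[str],
    available: Sequence[str],
) -> list[str]:
    """Return the store IDs the RAG tool would expose (hypothetical helper)."""
    if not inline and not tool:
        # No RAG strategy configured at all: fall back to every
        # store known to the backend, as the new doc text describes.
        return list(available)
    # A strategy is configured: expose only what `tool` lists explicitly.
    # With `inline` set and `tool` empty, this yields no tool RAG.
    return list(tool)
```

Under this reading, setting only `inline` turns tool RAG off entirely, which differs from the removed ``None`` default that exposed all registered stores.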
@@ -5029,6 +5031,161 @@ Attributes:
 | source || Index name identifying the knowledge source from configuration |
 
 
+## ResponseInput
+
+
+
+
+
+## ResponseItem
+
+
+
+
+
+## ResponsesRequest
+
+
+Model representing a request for the Responses API following LCORE specification.
+
+Attributes:
+input: Input text or structured input items containing the query.
+model: Model identifier in format "provider/model". Auto-selected if not provided.
+conversation: Conversation ID linking to an existing conversation. Accepts both
+OpenAI and LCORE formats. Mutually exclusive with previous_response_id.
+include: Explicitly specify output item types that are excluded by default but
+should be included in the response.
+instructions: System instructions or guidelines provided to the model (acts as
+the system prompt).
+max_infer_iters: Maximum number of inference iterations the model can perform.
+max_output_tokens: Maximum number of tokens allowed in the response.
+max_tool_calls: Maximum number of tool calls allowed in a single response.
+metadata: Custom metadata dictionary with key-value pairs for tracking or logging.
+parallel_tool_calls: Whether the model can make multiple tool calls in parallel.
+previous_response_id: Identifier of the previous response in a multi-turn
+conversation. Mutually exclusive with conversation.
+prompt: Prompt object containing a template with variables for dynamic
+substitution.
+reasoning: Reasoning configuration for the response.
+safety_identifier: Safety identifier for the response.
+store: Whether to store the response in conversation history. Defaults to True.
+stream: Whether to stream the response as it is generated. Defaults to False.
+temperature: Sampling temperature controlling randomness (typically 0.0–2.0).
+text: Text response configuration specifying output format constraints (JSON
+schema, JSON object, or plain text).
+tool_choice: Tool selection strategy ("auto", "required", "none", or specific
+tool configuration).
+tools: List of tools available to the model (file search, web search, function
+calls, MCP tools). Defaults to all tools available to the model.
+generate_topic_summary: LCORE-specific flag indicating whether to generate a
+topic summary for new conversations. Defaults to True.
+shield_ids: LCORE-specific list of safety shield IDs to apply. If None, all
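The `ResponsesRequest` attributes above state two defaults (`store` is True, `stream` is False) and one constraint (`conversation` and `previous_response_id` are mutually exclusive). A minimal sketch of how such a constraint could be enforced follows. It assumes a plain dataclass rather than whatever model class the project actually uses, covers only a handful of the documented fields, and `ResponsesRequestSketch` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResponsesRequestSketch:
    """Hypothetical stand-in for a few of the documented request fields."""

    input: str
    model: Optional[str] = None  # "provider/model"; auto-selected when omitted
    conversation: Optional[str] = None
    previous_response_id: Optional[str] = None
    store: bool = True  # documented default
    stream: bool = False  # documented default

    def __post_init__(self) -> None:
        # The docs describe these two fields as mutually exclusive.
        if self.conversation is not None and self.previous_response_id is not None:
            raise ValueError(
                "conversation and previous_response_id are mutually exclusive"
            )
```

A request that continues an existing conversation would set exactly one of the two identifiers, for example `ResponsesRequestSketch(input="hello", conversation="conv-1")`, where `conv-1` is a made-up ID.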