-
Notifications
You must be signed in to change notification settings - Fork 94
LCORE-1037: update BYOK and RAG guides to use lightspeed-stack config #1842
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
are-ces
merged 1 commit into
lightspeed-core:main
from
are-ces:lcore-1037-update-byok-guide-stack-config
Jun 3, 2026
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
Does Ollama support tool calling or function calling as of 2026?💡 Result:
Yes. As of 2026-06-03, Ollama supports tool calling (also called function calling) in its chat API. Ollama’s official documentation states that “Ollama supports tool calling (also known as function calling)” and describes how to provide a tools list in the request, how the model returns tool calls, and how to send tool results back for the model to complete its reply [1]. The API documentation in Ollama’s repo likewise documents a
toolsparameter (with returnedtool_calls) in chat requests [2], and Ollama’s blog post explains enabling tool calling by providingtoolsand that supported models returntool_calls[3]. So, for an up-to-date 2026 answer: Ollama has built-in tool/function calling capability (for models that support it), via the standardtools/tool_callsmechanism in the chat endpoints [1][2][3].Citations:
🏁 Script executed:
Repository: lightspeed-core/lightspeed-stack
Length of output: 2896
🏁 Script executed:
Repository: lightspeed-core/lightspeed-stack
Length of output: 119
🏁 Script executed:
Repository: lightspeed-core/lightspeed-stack
Length of output: 50392
🏁 Script executed:
Repository: lightspeed-core/lightspeed-stack
Length of output: 6547
🌐 Web query:
llama-stack remote::ollama tool calling support tool_calls💡 Result:
Llama Stack does support tool calling when using the remote::ollama provider, via the OpenAI-compatible chat.completions interface’s tools/tool_calls fields—but support can depend on (a) the Ollama model’s chat template/tool-calling wiring and (b) Llama Stack version. 1) Whether tool calling is supported - Llama Stack’s Ollama integration is implemented as a remote provider spec with provider_type "remote::ollama" [1]. - Tool calling is expected to work through standard OpenAI-compatible chat completions (tools passed in request, model returns tool_calls in response) as shown in Llama Stack tests and related provider work (e.g., remote-vllm tool calling support landed, indicating the general tool-calling mechanism is wired end-to-end for remote providers) [2]. - For Ollama specifically, Ollama’s own docs confirm it supports tool calling using tool_calls in responses and a follow-up request that includes tool results [3]. 2) Known caveat: tools may not be passed through for non-“Llama” Ollama models / certain model template conversions - There is an explicit Llama Stack issue reporting that when using an Ollama model that is not detected as a Llama model (e.g., custom-converted models), the.Tools variable in the Ollama template is not filled with the tools passed to Llama Stack, i.e., tool calling can fail because the model never receives tool definitions [4]. 3) Practical implication for “tool_calls” support - If your Ollama model’s chat template expects a.Tools section and that section is correctly populated by the Llama Stack Ollama provider, you should see tool_calls come back in chat completion responses (matching Ollama’s documented tool_calls behavior) [3]. - If the model is not properly detected / its template path differs (the scenario described in the issue), tool definitions may not be injected, and tool_calls may not be produced (or may be unreliable) [4]. 4) What to check in your setup - Ensure you are using the Llama Stack API endpoint variant that carries OpenAI-style tools (chat.completions) and passes them to the remote::ollama provider [1]. - Verify your Ollama model supports tool calling in its template (Ollama documents the required tool_calls flow) [3]. - If you use a custom-converted or nonstandard Ollama model, be aware of the reported provider/template bug for non-Llama models where tools are not injected [4]. Bottom line: Yes—Llama Stack + remote::ollama is intended to support tool calling via tool_calls, and Ollama supports tool_calls. But tool calling can break for certain Ollama models/templates (not properly detected as Llama), so your results may vary depending on the model and Llama Stack version [4][3].
Citations:
🌐 Web query:
site:github.com/openai/llama-stack remote::ollama tools tool_calls💡 Result:
github.com/openai/llama-stack) doesn’t appear to be the active home anymore—“Llama Stack is now OGX” and the code lives atogx-ai/ogx(GitHub shows the rename/announcement in the README). [1]remote::ollamaprovider docs (OGX): configuration includesbase_urland the sample usesOLLAMA_URL(defaulting tohttp://localhost:11434/v1). [2]remote::ollamasits under Inference providers and links to it. [3]tool_calls) reference: Ollama documents tool calling and showsresponse.message.tool_calls[...]usage. [4]Sources
[1] (github.com)
[2] (ogx-ai.github.io)
[3] (ogx-ai.github.io)
[4] (docs.ollama.com)
Citations:
🌐 Web query:
site:docs.ollama.com tool calling tools parameter tool_calls💡 Result:
Tool calling is a capability in LLMs that allows the model to interact with external tools (such as functions, APIs, or databases) by requesting their execution during a conversation [1]. When a model determines that a tool is needed to answer a user's request, it generates a structured request—typically referred to as a tool call—instead of a standard text response [1]. In API implementations (such as the OpenAI-compatible API often used by tools like Ollama), this is facilitated through the tool_calls parameter [1][2]. How tool_calls works: 1. Tool Definition: The user provides a list of available tools (often using a JSON schema format) in the request to the model [1][3]. 2. Model Request: If the model decides to use a tool, it returns a response containing a tool_calls array [1]. Each object in this array includes details about the requested function, such as: - Name: The identifier of the function to execute [1]. - Arguments: The parameters the model has determined are necessary for the function [1]. 3. Execution and Response: The application code executes the specified function using the provided arguments [1]. After execution, the application sends the result back to the model by adding a new message to the conversation history with the role of tool, including the tool_name and the content (the function's output) [1]. 4. Final Answer: The model then processes this result and generates a final response for the user [1]. This process supports parallel tool calling, where multiple tool requests can be generated in a single turn, and multi-turn loops where the model continues to call tools until it has sufficient information to provide a final answer [1]. When using streaming, tool calls are emitted as partial chunks, which the client must accumulate to fully reconstruct the tool call before execution [1][4].
Citations:
Fix misleading
remote::ollama“no tool calling” documentationdocs/rag_guide.mdcurrently claimsremote::ollama“does not support tool calling,” but Ollama (and the OGX/Llama Stackremote::ollamaintegration) supports tool calling via OpenAI-compatibletools/tool_calls. Tool-calling/RAG-as-a-tool can still fail depending on the specific Ollama model’s chat-template/tool wiring (e.g., custom/nonstandard models), so the docs should describe this dependency instead of stating blanket non-support.🤖 Prompt for AI Agents