This sample starts the Foundry Local OpenAI-compatible web service, then calls the Responses API with the official OpenAI Python client.
It demonstrates:

- A non-streaming `/v1/responses` call
- A streaming `/v1/responses` call
- A function/tool-calling round trip using `previous_response_id`
Install the sample dependencies from requirements.txt:

```
pip install -r requirements.txt
```

That installs:

- `foundry-local-sdk` on non-Windows platforms
- `foundry-local-sdk-winml` on Windows
- `openai`
The first time it runs, the sample downloads and registers the Foundry Local execution providers and downloads the `qwen2.5-0.5b` model.
From this directory:

```
python -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt
python src\app.py
```

On macOS or Linux, activate the virtual environment with:

```
source .venv/bin/activate
```

The sample starts the local web service, sends Responses API requests to `http://localhost:<port>/v1`, prints the model output, and then unloads the model and stops the web service.
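One step of the tool-calling round trip can be sketched as a small helper: run the local function named in the model's `function_call` output item, then wrap the result as a `function_call_output` input item for the follow-up request chained with `previous_response_id`. The field names follow the public OpenAI Responses API; the helper itself is illustrative, not the sample's code:

```python
import json


def handle_function_call(item: dict, tools: dict) -> dict:
    """Execute the tool named in a Responses API function_call item and
    wrap its result as the follow-up function_call_output input item."""
    fn = tools[item["name"]]
    args = json.loads(item["arguments"])  # arguments arrive as a JSON string
    result = fn(**args)
    return {
        "type": "function_call_output",
        "call_id": item["call_id"],  # must echo the model's call_id
        "output": json.dumps(result),
    }


# The follow-up request then chains the turns, e.g. with the OpenAI client:
#   client.responses.create(
#       model=model_id,
#       previous_response_id=first.id,
#       input=[handle_function_call(item, tools)],
#       tools=tool_schemas,
#   )
```

Because the request carries `previous_response_id`, only the new `function_call_output` item needs to be sent; the service supplies the earlier conversation state.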