lightspeed-core · tisnik · Feb 19, 2026
diff --git a/README.md b/README.md
@@ -73,6 +73,7 @@ The service includes comprehensive user data collection capabilities for various
     * [OpenAPI specification](#openapi-specification)
     * [Readiness Endpoint](#readiness-endpoint)
     * [Liveness Endpoint](#liveness-endpoint)
+    * [Models endpoint](#models-endpoint)
 * [Database structure](#database-structure)
 * [Publish the service as Python package on PyPI](#publish-the-service-as-python-package-on-pypi)
     * [Generate distribution archives to be uploaded into Python registry](#generate-distribution-archives-to-be-uploaded-into-python-registry)
@@ -1054,7 +1055,9 @@ Stack service. It is possible to specify "model_type" query parameter that is
 used as a filter. For example, if model type is set to "llm", only LLM models
 will be returned:
 
+```bash
 curl "http://localhost:8080/v1/models?model_type=llm"
+```
 
 The "model_type" query parameter is optional. When not specified, all models
 will be returned.
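
Reviewer note: the optional `model_type` filter described above can also be exercised from Python. The following is a minimal sketch, not part of this change. It assumes the service listens on `http://localhost:8080` as in the curl example, that no authentication is required, and that the response body is JSON; `models_url` and `fetch_models` are hypothetical helper names.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: same host and port as the curl example in the README.
BASE_URL = "http://localhost:8080"


def models_url(base_url: str = BASE_URL, model_type: Optional[str] = None) -> str:
    """Build the /v1/models URL, adding model_type only when a filter is given."""
    url = f"{base_url.rstrip('/')}/v1/models"
    if model_type is not None:
        # urlencode escapes the value, so unusual type names stay URL-safe.
        url += "?" + urlencode({"model_type": model_type})
    return url


def fetch_models(model_type: Optional[str] = None) -> Any:
    """Hypothetical helper: GET the endpoint and decode the JSON body.

    Needs a running service; it is defined here but not called.
    """
    with urlopen(models_url(model_type=model_type)) as response:
        return json.load(response)
```

Calling `models_url()` gives the unfiltered URL, while `models_url(model_type="llm")` appends `?model_type=llm`, matching the two cases the README describes.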

diff --git a/docs/openapi.json b/docs/openapi.json
@@ -245,7 +245,7 @@
                     "models"
                 ],
                 "summary": "Models Endpoint Handler",
-                "description": "Handle requests to the /models endpoint.\n\nProcess GET requests to the /models endpoint, returning a list of available\nmodels from the Llama Stack service. It is possible to specify \"model_type\"\nquery parameter that is used as a filter. For example, if model type is set\nto \"llm\", only LLM models will be returned:\n\n    curl http://localhost:8080/v1/models?model_type=llm\n\nThe \"model_type\" query parameter is optional. When not specified, all models\nwill be returned.\n\n## Parameters:\n    request: The incoming HTTP request.\n    auth: Authentication tuple from the auth dependency.\n    model_type: Optional filter to return only models matching this type.\n\n## Raises:\n    HTTPException: If unable to connect to the Llama Stack server or if\n    model retrieval fails for any reason.\n\n## Returns:\n    ModelsResponse: An object containing the list of available models.",
+                "description": "Handle requests to the /models endpoint.\n\nProcess GET requests to the /models endpoint, returning a list of available\nmodels from the Llama Stack service. It is possible to specify \"model_type\"\nquery parameter that is used as a filter. For example, if model type is set\nto \"llm\", only LLM models will be returned:\n\n    curl http://localhost:8080/v1/models?model_type=llm\n\nThe \"model_type\" query parameter is optional. When not specified, all models\nwill be returned.\n\n### Parameters:\n    request: The incoming HTTP request.\n    auth: Authentication tuple from the auth dependency.\n    model_type: Optional filter to return only models matching this type.\n\n### Raises:\n    HTTPException: If unable to connect to the Llama Stack server or if\n    model retrieval fails for any reason.\n\n### Returns:\n    ModelsResponse: An object containing the list of available models.",
                 "operationId": "models_endpoint_handler_v1_models_get",
                 "parameters": [
                     {

diff --git a/docs/openapi.md b/docs/openapi.md
@@ -12,6 +12,46 @@ Lightspeed Core Service (LCS) service API specification.
 
 # 🛠️ APIs
 
+## List of REST API endpoints
+
+| Method | Path                                  | Description |
+|--------|---------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
+| GET    | `/`                                   | Returns the static HTML index page                                                                                                                   |
+| GET    | `/v1/info`                            | Returns the service name, version and Llama-stack version                                                                                            |
+| GET    | `/v1/models`                          | List of available models                                                                                                                             |
+| GET    | `/v1/tools`                           | Consolidated list of available tools from all configured MCP servers                                                                                 |
+| GET    | `/v1/mcp-auth/client-options`         | List of MCP servers configured to accept client-provided authorization tokens, along with the header names where clients should provide these tokens |
+| GET    | `/v1/shields`                         | List of available shields from the Llama Stack service                                                                                               |
+| GET    | `/v1/providers`                       | List all available providers grouped by API type                                                                                                     |
+| GET    | `/v1/providers/{provider_id}`         | Retrieve a single provider identified by its unique ID                                                                                               |
+| GET    | `/v1/rags`                            | List all available RAGs                                                                                                                              |
+| GET    | `/v1/rags/{rag_id}`                   | Retrieve a single RAG identified by its unique ID                                                                                                    |
+| POST   | `/v1/query`                           | Processes a POST request to a query endpoint, forwarding the user's query to a selected Llama Stack LLM and returning the generated response         |
+| POST   | `/v1/streaming_query`                 | Streaming response using Server-Sent Events (SSE) format with content type text/event-stream                                                         |
+| GET    | `/v1/config`                          | Returns the current service configuration                                                                                                            |
+| POST   | `/v1/feedback`                        | Processes a user feedback submission, storing the feedback and returning a confirmation response                                                     |
+| GET    | `/v1/feedback/status`                 | Return the current enabled status of the feedback functionality                                                                                      |
+| PUT    | `/v1/feedback/status`                 | Change the feedback status: enables or disables it                                                                                                   |
+| GET    | `/v1/conversations`                   | Retrieve all conversations for the authenticated user                                                                                                |
+| GET    | `/v1/conversations/{conversation_id}` | Retrieve a conversation by ID using Conversations API                                                                                                |
+| DELETE | `/v1/conversations/{conversation_id}` | Delete a conversation by ID using Conversations API                                                                                                  |
+| PUT    | `/v1/conversations/{conversation_id}` | Update conversation metadata using Conversations API                                                                                                 |
+| GET    | `/v2/conversations`                   | Retrieve all conversations for the authenticated user                                                                                                |
+| GET    | `/v2/conversations/{conversation_id}` | Retrieve a conversation identified by its ID                                                                                                         |
+| DELETE | `/v2/conversations/{conversation_id}` | Delete a conversation identified by its ID                                                                                                           |
+| PUT    | `/v2/conversations/{conversation_id}` | Update a conversation topic summary by ID                                                                                                            |
+| POST   | `/v1/infer`                           | Serves requests from the RHEL Lightspeed Command Line Assistant (CLA)                                                                                |
+| GET    | `/readiness`                          | Returns service readiness state                                                                                                                      |
+| GET    | `/liveness`                           | Returns liveness status of the service                                                                                                               |
+| POST   | `/authorized`                         | Returns the authenticated user's ID and username                                                                                                     |
+| GET    | `/metrics`                            | Returns the latest Prometheus metrics in the form of plain text                                                                                      |
+| GET    | `/.well-known/agent-card.json`        | Serve the A2A Agent Card at the well-known location                                                                                                  |
+| GET    | `/.well-known/agent.json`             | Handle A2A JSON-RPC requests following the A2A protocol specification                                                                                |
+| GET    | `/a2a`                                | Handle A2A JSON-RPC requests following the A2A protocol specification                                                                                |
+| POST   | `/a2a`                                | Handle A2A JSON-RPC requests following the A2A protocol specification                                                                                |
+| GET    | `/a2a/health`                         | Handle A2A JSON-RPC requests following the A2A protocol specification                                                                                |
+
+
 ## GET `/`
 
 > **Root Endpoint Handler**
@@ -70,8 +110,8 @@ Examples
 
 Handle request to the /info endpoint.
 
-Process GET requests to the /info endpoint, returning the
-service name, version and Llama-stack version.
+Process GET requests to the /info endpoint, returning the service name, version
+and Llama-stack version.
 
 Raises:
     HTTPException: with status 500 and a detail object
@@ -203,16 +243,16 @@ to "llm", only LLM models will be returned:
 The "model_type" query parameter is optional. When not specified, all models
 will be returned.
 
-## Parameters:
+### Parameters:
     request: The incoming HTTP request.
     auth: Authentication tuple from the auth dependency.
     model_type: Optional filter to return only models matching this type.
 
-## Raises:
+### Raises:
     HTTPException: If unable to connect to the Llama Stack server or if
     model retrieval fails for any reason.
 
-## Returns:
+### Returns:
     ModelsResponse: An object containing the list of available models.
 
 

diff --git a/src/app/endpoints/models.py b/src/app/endpoints/models.py
@@ -92,16 +92,16 @@ async def models_endpoint_handler(
     The "model_type" query parameter is optional. When not specified, all models
     will be returned.
 
-    ## Parameters:
+    ### Parameters:
         request: The incoming HTTP request.
         auth: Authentication tuple from the auth dependency.
         model_type: Optional filter to return only models matching this type.
 
-    ## Raises:
+    ### Raises:
         HTTPException: If unable to connect to the Llama Stack server or if
         model retrieval fails for any reason.
 
-    ## Returns:
+    ### Returns:
         ModelsResponse: An object containing the list of available models.
     """
     # Used only by the middleware