Commit d8c99d2 (parent: 196d69c)

added example for making an inference call with minimal client

File tree: 6 files changed, +52 -3 lines


CHANGELOG.rst

Lines changed: 3 additions & 0 deletions

@@ -1,6 +1,9 @@
 Changelog
 =========
 
+* Added example for calling the inference endpoint with a minimal client
+* Added missing doc generation for inference examples
+
 v1.10.0 (2025-04-17)
 -------------------

docs/source/examples/containers/index.rst

Lines changed: 5 additions & 2 deletions

@@ -7,10 +7,13 @@ This section contains examples demonstrating how to work with containers in Data
    :maxdepth: 1
    :caption: Contents:
 
-   compute_resources
    deployments
+   compute_resources
    environment_variables
    registry_credentials
    secrets
    sglang
-   scaling
+   scaling
+   inference_async
+   inference_sync
+   inference_minimal
Lines changed: 8 additions & 0 deletions

@@ -0,0 +1,8 @@
+Calling the inference endpoint in async mode
+============================================
+
+This example demonstrates how to call the inference endpoint in async mode.
+
+.. literalinclude:: ../../../../examples/containers/calling_the_inference_endpoint_in_async_mode.py
+   :language: python
+   :caption: Calling the inference endpoint in async mode
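An async-mode call generally reduces to submitting a request and then polling until a result is ready. As a hypothetical sketch of that polling pattern (the `fetch_status` callable and its `state`/`output` fields are stand-ins for illustration, not part of the DataCrunch SDK):

```python
import asyncio

async def poll_until_done(fetch_status, interval=0.01, timeout=5.0):
    """Poll a status callable until it reports completion, then return the output."""
    deadline = asyncio.get_running_loop().time() + timeout
    while True:
        status = fetch_status()
        if status["state"] == "done":
            return status["output"]
        if asyncio.get_running_loop().time() > deadline:
            raise TimeoutError("inference did not finish in time")
        await asyncio.sleep(interval)

# Simulated endpoint: reports "running" twice, then "done" with a result.
states = iter([
    {"state": "running"},
    {"state": "running"},
    {"state": "done", "output": "Hello from the model"},
])
result = asyncio.run(poll_until_done(lambda: next(states)))
print(result)  # Hello from the model
```

The actual SDK wraps this loop for you; the sketch only shows the shape of the async flow.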
Lines changed: 8 additions & 0 deletions

@@ -0,0 +1,8 @@
+Calling the inference endpoint using a minimal client
+=====================================================
+
+This example demonstrates how to call the inference endpoint using a minimal client that uses only an inference key (no client credentials).
+
+.. literalinclude:: ../../../../examples/containers/calling_the_endpoint_with_inference_key.py
+   :language: python
+   :caption: Calling the inference endpoint using a minimal client
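Conceptually, a key-only client boils down to a single authenticated HTTP POST against the endpoint base URL. The following sketch assembles such a request; the `Bearer` auth scheme, URL layout, and the `example-endpoint.invalid` host are assumptions for illustration, not documented DataCrunch behavior:

```python
def build_inference_request(base_url, inference_key, path, payload):
    """Assemble the URL, headers, and body for a key-only inference call."""
    url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    headers = {
        "Authorization": f"Bearer {inference_key}",  # assumed auth scheme
        "Content-Type": "application/json",
    }
    return url, headers, payload

url, headers, body = build_inference_request(
    "https://example-endpoint.invalid/", "my-inference-key",
    "v1/completions", {"prompt": "hi"},
)
print(url)  # https://example-endpoint.invalid/v1/completions
```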

docs/source/examples/containers/sglang.rst

Lines changed: 1 addition & 1 deletion

@@ -5,4 +5,4 @@ This example demonstrates how to deploy and manage SGLang applications in DataCr
 
 .. literalinclude:: ../../../../examples/containers/sglang_deployment_example.py
    :language: python
-   :caption: SGLang Deployment
+   :caption: SGLang Deployment Example
Lines changed: 27 additions & 0 deletions

@@ -0,0 +1,27 @@
+import os
+from datacrunch.InferenceClient import InferenceClient
+
+# Get the inference key and endpoint base URL from environment variables
+DATACRUNCH_INFERENCE_KEY = os.environ.get('DATACRUNCH_INFERENCE_KEY')
+DATACRUNCH_ENDPOINT_BASE_URL = os.environ.get('DATACRUNCH_ENDPOINT_BASE_URL')
+
+# Create an inference client that uses only the inference key, without client credentials
+inference_client = InferenceClient(
+    inference_key=DATACRUNCH_INFERENCE_KEY,
+    endpoint_base_url=DATACRUNCH_ENDPOINT_BASE_URL
+)
+
+# Make a synchronous request to the endpoint.
+# This example calls an SGLang deployment that serves LLMs through an OpenAI-compatible API.
+data = {
+    "model": "deepseek-ai/deepseek-llm-7b-chat",
+    "prompt": "Is consciousness fundamentally computational, or is there something more to subjective experience that cannot be reduced to information processing?",
+    "max_tokens": 128,
+    "temperature": 0.7,
+    "top_p": 0.9
+}
+
+response = inference_client.run_sync(data=data, path='v1/completions')
+
+# Print the response
+print(response.output())
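One subtlety in the example above: `os.environ.get` returns `None` when a variable is unset, so a missing key would be passed silently into the client. A small guard (a sketch, not part of the committed example) fails fast with a clear message instead:

```python
import os

def require_env(name):
    """Return the named environment variable, or raise a clear error if unset."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"environment variable {name} is not set")
    return value

os.environ["DATACRUNCH_INFERENCE_KEY"] = "demo-key"  # simulate configuration
print(require_env("DATACRUNCH_INFERENCE_KEY"))  # demo-key
```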
