
Commit ac7e9cc

apartsinclaude committed

docs: clarify provider registration, dot-notation, add non-OpenAI provider example

- Q1: Explain how setting env vars triggers auto-discovery and pool formation
- Q3: Explain how capability pools are formed from provider model tags
- Q5: Clarify relationship between shortcuts and dot-notation paths
- Q10: Add complete BaseProvider example for non-OpenAI-compatible APIs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1 parent d0b8fcc commit ac7e9cc

File tree

1 file changed: +68 −0 lines changed

docs/guides/FAQ.md

Lines changed: 68 additions & 0 deletions
@@ -25,6 +25,8 @@ response = client.chat.completions.create(

```python
print(response.choices[0].message.content)
```

**How does this work?** Setting `OPENAI_API_KEY` triggers auto-discovery: ModelMesh finds the OpenAI provider, registers its models, and groups them into **capability pools** by what each model can do. `create("chat-completion")` returns a client wired to the pool containing all chat-capable models. The shortcut `"chat-completion"` resolves to the full dot-notation path `generation.text-generation.chat-completion` automatically (see [Q5](#5-what-does-request-capabilities-not-model-names-mean)).
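The auto-discovery step described above can be sketched in plain Python. This is an illustrative mechanism only; the env-var-to-provider mapping and function name below are assumptions for the sketch, not ModelMesh's actual registry:

```python
import os

# Hypothetical mapping from env var to provider name; the real registry
# lives inside ModelMesh and is not shown in this FAQ.
KNOWN_PROVIDER_KEYS = {
    "OPENAI_API_KEY": "openai",
    "GROQ_API_KEY": "groq",
}

def discover_providers(environ=os.environ):
    """Return the providers whose API keys are present in the environment."""
    return [name for var, name in KNOWN_PROVIDER_KEYS.items() if environ.get(var)]

# Example: only OPENAI_API_KEY is set, so only that provider is registered.
print(discover_providers({"OPENAI_API_KEY": "sk-..."}))  # ['openai']
```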
When you need more control, add a YAML file or pass options programmatically. All three layers compose: env vars for secrets, YAML for topology, code for runtime overrides.
@@ -109,6 +111,8 @@ for i in range(100):

Your code makes the same call every time. The library handles detection, pooling, and rotation internally.

**How are pools formed?** Each provider registers its models with capability tags (e.g. `generation.text-generation.chat-completion`). ModelMesh groups all models sharing a capability into a single pool. When you call `create("chat-completion")`, you get a client backed by every chat-capable model across all discovered providers. Adding a new API key adds that provider's models to the existing pools automatically.
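The grouping rule above can be sketched in a few lines of plain Python. The model IDs are invented for the example; only the mechanism (one pool per capability tag, shared across providers) comes from the FAQ text:

```python
from collections import defaultdict

# Illustrative model records: (model_id, capability tags). Tag strings
# follow the dot-notation paths shown in this FAQ.
MODELS = [
    ("openai.gpt-4o", ["generation.text-generation.chat-completion"]),
    ("groq.llama-3.1", ["generation.text-generation.chat-completion"]),
    ("openai.tts-1", ["generation.audio-generation.text-to-speech"]),
]

def build_pools(models):
    """Group every model sharing a capability tag into one pool."""
    pools = defaultdict(list)
    for model_id, tags in models:
        for tag in tags:
            pools[tag].append(model_id)
    return dict(pools)

pools = build_pools(MODELS)
print(pools["generation.text-generation.chat-completion"])
# ['openai.gpt-4o', 'groq.llama-3.1']
```

Registering a new provider's models simply appends to the matching pools, which is why adding an API key extends existing pools rather than creating parallel ones.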
See the [Free-Tier Aggregation](QuickStart.md) guide.

---
@@ -175,6 +179,8 @@ matches = modelmesh.capabilities.search("text")

```python
client = modelmesh.create("chat-completion")
```

**Shortcuts vs dot-notation:** Every capability has a full dot-notation path reflecting its position in the hierarchy tree (e.g. `generation.text-generation.chat-completion`). Shortcuts like `"chat-completion"` are leaf-node aliases that resolve automatically. Both forms work everywhere: `create("chat-completion")` and `create("generation.text-generation.chat-completion")` are equivalent. Providers tag their models with full paths; you use whichever form is convenient.
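The resolution rule can be sketched as a leaf-name lookup over the registered paths. The capability list and `resolve` helper below are illustrative, not ModelMesh's internal API:

```python
# Illustrative registry of full dot-notation capability paths.
CAPABILITIES = [
    "generation.text-generation.chat-completion",
    "generation.text-generation.text-completion",
    "generation.audio-generation.text-to-speech",
]

def resolve(name, capabilities=CAPABILITIES):
    """Resolve either a full dot-notation path or a leaf-node shortcut."""
    if name in capabilities:
        return name  # already a full path
    matches = [c for c in capabilities if c.rsplit(".", 1)[-1] == name]
    if len(matches) == 1:
        return matches[0]
    raise ValueError(f"ambiguous or unknown capability: {name}")

# Both forms resolve to the same pool key.
assert resolve("chat-completion") == resolve("generation.text-generation.chat-completion")
```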
When a new model launches or an old one is deprecated, update your config. Your application code stays the same.

See the [Capability Discovery](Capabilities.md) guide.
@@ -391,6 +397,68 @@ policy = ThresholdRotationPolicy(ThresholdRotationConfig(

**Custom provider for a non-OpenAI API:**

When your API doesn't follow the OpenAI format, inherit from `BaseProvider` and override four hook methods. `BaseProvider` handles HTTP transport, retries, and error classification; you only translate the request and response formats.

```python
from modelmesh.cdk import BaseProvider, BaseProviderConfig
from modelmesh.interfaces.provider import (
    ModelInfo, CompletionRequest, CompletionResponse,
    CompletionChoice, ChatMessage, TokenUsage,
)

class CorpLLMProvider(BaseProvider):
    """Provider for a custom internal API."""

    def _get_completion_endpoint(self) -> str:
        return f"{self._config.base_url.rstrip('/')}/api/generate"

    def _build_headers(self) -> dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-Corp-Token": self._config.api_key,
        }

    def _build_request_payload(self, request: CompletionRequest) -> dict:
        return {
            "prompt": request.messages[-1]["content"],
            "model_name": request.model,
            "params": {"temperature": request.temperature or 0.7},
        }

    def _parse_response(self, data: dict) -> CompletionResponse:
        return CompletionResponse(
            id=data.get("request_id", ""),
            model=data.get("model", ""),
            choices=[CompletionChoice(
                index=0,
                message=ChatMessage(role="assistant", content=data["output"]),
                finish_reason="stop",
            )],
            usage=TokenUsage(
                prompt_tokens=data.get("tokens_in", 0),
                completion_tokens=data.get("tokens_out", 0),
                total_tokens=data.get("tokens_in", 0) + data.get("tokens_out", 0),
            ),
        )

provider = CorpLLMProvider(BaseProviderConfig(
    base_url="https://llm.corp.internal",
    api_key="corp-token-123",
    models=[
        ModelInfo(
            id="corp.internal-llm",
            name="Internal LLM",
            capabilities=["generation.text-generation.chat-completion"],
            context_window=32_000,
        ),
    ],
))
```

Override only what differs: `_get_completion_endpoint()` for the URL path, `_build_headers()` for authentication, `_build_request_payload()` to translate the request format, and `_parse_response()` to translate the response back. For streaming, also override `_parse_sse_chunk()`.
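The streaming hook mentioned above can be sketched as a standalone function. The `data:` framing follows the server-sent-events convention; the `delta` field name and `[DONE]` sentinel are assumptions about the hypothetical Corp API, not part of ModelMesh:

```python
import json

def parse_sse_chunk(raw_line: str):
    """Parse one SSE line into a text delta, or None if there is nothing to emit."""
    if not raw_line.startswith("data: "):
        return None  # ignore comments, keep-alives, and other SSE fields
    payload = raw_line[len("data: "):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    return json.loads(payload).get("delta", "")

print(parse_sse_chunk('data: {"delta": "Hel"}'))  # Hel
```

A `_parse_sse_chunk()` override would apply the same translation, mapping each raw chunk of the provider's stream into the library's streaming response type.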
Six connector types are extensible this way: providers, rotation policies, secret stores, storage backends, observability sinks, and discovery connectors.

See the [CDK](../ConnectorCatalogue.md) reference and [CDK Developer Guide](../cdk/DeveloperGuide.md).
