ApartsinProjects
diff --git a/docs/ConnectorCatalogue.md b/docs/ConnectorCatalogue.md
Lines changed: 8 additions & 7 deletions
diff --git a/docs/CoverageMatrix.md b/docs/CoverageMatrix.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/guides/Capabilities.md b/docs/guides/Capabilities.md
Lines changed: 115 additions & 0 deletions
diff --git a/docs/guides/ErrorHandling.md b/docs/guides/ErrorHandling.md
Lines changed: 137 additions & 0 deletions
@@ -301,13 +301,13 @@ Interface: [ConnectorInterfaces.md — Rotation Policy](ConnectorInterfaces.html
 | Policy | Description |
 | --- | --- |
 | **`rotation.modelmesh.stick-until-failure.v1`** | Use the current model until it fails, then rotate. Default policy. |
-| **`rotation.modelmesh.priority-selection.v1`** | *(Planned)* Follow an ordered model/provider preference list; fall back on exhaust. |
-| **`rotation.modelmesh.round-robin.v1`** | *(Planned)* Cycle through active models in sequence. |
-| **`rotation.modelmesh.cost-first.v1`** | *(Planned)* Select the cheapest active model for each request. |
-| **`rotation.modelmesh.latency-first.v1`** | *(Planned)* Select the model with the lowest observed latency. |
-| **`rotation.modelmesh.session-stickiness.v1`** | *(Planned)* Route all requests in a session to the same model. |
-| **`rotation.modelmesh.rate-limit-aware.v1`** | *(Planned)* Switch models preemptively before hitting rate limits. |
-| **`rotation.modelmesh.load-balanced.v1`** | *(Planned)* Distribute requests proportionally to each model's rate-limit headroom. |
+| **`rotation.modelmesh.priority-selection.v1`** | Follow an ordered model/provider preference list; fall back to the next entry when the current one is exhausted. |
+| **`rotation.modelmesh.round-robin.v1`** | Cycle through active models in sequence. |
+| **`rotation.modelmesh.cost-first.v1`** | Select the model with the lowest accumulated cost. |
+| **`rotation.modelmesh.latency-first.v1`** | Select the model with the lowest observed latency. |
+| **`rotation.modelmesh.session-stickiness.v1`** | Route all requests in a session to the same model via consistent hashing. |
+| **`rotation.modelmesh.rate-limit-aware.v1`** | Track per-model request/token quotas and switch before exhaustion. |
+| **`rotation.modelmesh.load-balanced.v1`** | Distribute requests proportionally using weighted round-robin. |
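
To make the difference between the default policy and a cycling policy concrete, here is a minimal sketch of stick-until-failure and round-robin selection. The class and method names (`StickUntilFailureSketch`, `RoundRobinSketch`, `select`, `report_failure`) are hypothetical and are not the ModelMesh connector interface; the sketch only illustrates the selection behavior the table describes.

```python
from typing import List, Sequence


class StickUntilFailureSketch:
    """Illustrative only: keep returning one model until a failure is reported."""

    def __init__(self, models: Sequence[str]) -> None:
        if not models:
            raise ValueError("at least one model is required")
        self._models: List[str] = list(models)
        self._index = 0

    def select(self) -> str:
        # No rotation on success: the same model is returned every time.
        return self._models[self._index]

    def report_failure(self) -> None:
        # Rotate to the next model, wrapping around at the end of the list.
        self._index = (self._index + 1) % len(self._models)


class RoundRobinSketch:
    """Illustrative only: advance to the next model on every selection."""

    def __init__(self, models: Sequence[str]) -> None:
        if not models:
            raise ValueError("at least one model is required")
        self._models: List[str] = list(models)
        self._index = 0

    def select(self) -> str:
        model = self._models[self._index]
        self._index = (self._index + 1) % len(self._models)
        return model


sticky = StickUntilFailureSketch(["model-a", "model-b"])
sticky_picks = [sticky.select(), sticky.select()]  # same model twice
sticky.report_failure()
sticky_picks.append(sticky.select())  # rotated after the failure

round_robin = RoundRobinSketch(["model-a", "model-b"])
rr_picks = [round_robin.select() for _ in range(3)]
```

The model names are placeholders. A real policy would also need to skip models that are in standby, which this sketch leaves out.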
 
 ---
 
@@ -560,6 +560,7 @@ All trace entries carry a severity level. The `min_severity` configuration optio
 | **`observability.modelmesh.webhook.v1`** | HTTP POST | Sends routing events and logs to a configurable URL. Use for alerting, dashboards, or external log aggregation. |
 | **`observability.modelmesh.json-log.v1`** | JSONL file | Appends JSON Lines records to a file. Each line is a self-contained JSON object with type, timestamp, and payload. Optimized for log aggregation pipelines. |
 | **`observability.modelmesh.callback.v1`** | Python callback | Invokes a user-supplied Python callable for each event. Useful for custom integrations, in-process dashboards, and testing. |
+| **`observability.modelmesh.prometheus.v1`** | Prometheus text | Exposes metrics in Prometheus text exposition format. Zero-dependency implementation with counters, gauges, and histograms. Call `render_metrics()` for scrape output. |
 
 ### Connector-Specific Configuration
 
 
@@ -5,7 +5,7 @@ title: "Test Coverage Matrix"
 
 # Test Coverage Matrix
 
-Correlates documented features with test coverage. The project includes 855 Python tests across 15 test files and 511 TypeScript tests across 13 test files, for a total of 1,366 tests.
+Correlates documented features with test coverage. The project includes 1,028 Python tests across 18 test files and 644 TypeScript tests across 13 test files, for a total of 1,672 tests.
 
 ---
 
 
@@ -0,0 +1,115 @@
+# Capability Discovery
+
+ModelMesh uses a hierarchical capability system to route requests. Instead of memorizing full dotted paths like `generation.text-generation.chat-completion`, you can use short aliases and the discovery API.
+
+## Capability Aliases
+
+| Alias | Full Path |
+|-------|-----------|
+| `chat-completion` | `generation.text-generation.chat-completion` |
+| `text-generation` | `generation.text-generation` |
+| `code-generation` | `generation.text-generation.code-generation` |
+| `text-embeddings` | `representation.embeddings.text-embeddings` |
+| `text-to-speech` | `generation.audio.text-to-speech` |
+| `speech-to-text` | `understanding.audio.speech-to-text` |
+| `text-to-image` | `generation.image.text-to-image` |
+| `image-to-text` | `representation.image.image-to-text` |
+
+## Discovery API
+
+### Python
+
+```python
+import modelmesh
+
+# List all aliases
+caps = modelmesh.capabilities.list_all()
+# ['chat-completion', 'code-generation', 'image-to-text', ...]
+
+# Resolve alias → full path
+path = modelmesh.capabilities.resolve("chat-completion")
+# 'generation.text-generation.chat-completion'
+
+# Dotted paths pass through unchanged
+modelmesh.capabilities.resolve("generation.text-generation")
+# 'generation.text-generation'
+
+# Unknown aliases return unchanged
+modelmesh.capabilities.resolve("custom-cap")
+# 'custom-cap'
+
+# Search by keyword (case-insensitive)
+modelmesh.capabilities.search("text")
+# ['text-embeddings', 'text-generation', 'text-to-image', 'text-to-speech']
+
+# View the hierarchy tree
+tree = modelmesh.capabilities.tree()
+# {
+#   'generation': {
+#     'text-generation': {
+#       'chat-completion': {},
+#       'code-generation': {},
+#     },
+#     'audio': {'text-to-speech': {}},
+#     'image': {'text-to-image': {}},
+#   },
+#   'representation': { ... },
+#   'understanding': { ... },
+# }
+```
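
The pass-through behavior shown above can be pictured as a plain dictionary lookup with a default. This is a sketch under that assumption, not the library's code: `ALIASES`, `resolve`, and `search` here are local stand-ins built from the alias table, and the substring matching used in `search` is a guess at the semantics, so its results may differ from `modelmesh.capabilities.search`.

```python
from typing import List

# Local copy of the alias table from this guide (stand-in, not the library's data).
ALIASES = {
    "chat-completion": "generation.text-generation.chat-completion",
    "text-generation": "generation.text-generation",
    "code-generation": "generation.text-generation.code-generation",
    "text-embeddings": "representation.embeddings.text-embeddings",
    "text-to-speech": "generation.audio.text-to-speech",
    "speech-to-text": "understanding.audio.speech-to-text",
    "text-to-image": "generation.image.text-to-image",
    "image-to-text": "representation.image.image-to-text",
}


def resolve(name: str) -> str:
    # Known aliases map to their full path; dotted paths and unknown
    # names fall through unchanged.
    return ALIASES.get(name, name)


def search(keyword: str) -> List[str]:
    # Assumed semantics: case-insensitive substring match on the alias only.
    needle = keyword.lower()
    return sorted(alias for alias in ALIASES if needle in alias.lower())
```

Because unknown names are returned as-is, a typo in an alias is not reported at resolution time; under this assumption it would only surface later, when no pool matches the name.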
+
+### TypeScript
+
+```typescript
+import * as capabilities from '@nistrapa/modelmesh-core/capabilities';
+
+const caps = capabilities.listAll();
+const path = capabilities.resolve('chat-completion');
+const matches = capabilities.search('text');
+const tree = capabilities.tree();
+```
+
+## Using Capabilities
+
+When creating a client, the capability name determines which pool handles your requests:
+
+```python
+# These are equivalent:
+client = modelmesh.create("chat-completion")
+client = modelmesh.create("generation.text-generation.chat-completion")
+```
+
+When calling `create()`, ModelMesh resolves the capability to find or create a pool containing all models that support it.
+
+## Hierarchy
+
+The capability tree follows a three-level hierarchy:
+
+```
+Category
+└── Domain
+    └── Specific Capability
+```
+
+Categories:
+- **generation** — Create new content (text, audio, images)
+- **representation** — Transform content into structured forms (embeddings, descriptions)
+- **understanding** — Analyze and interpret content (transcription, classification)
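
As an illustration of how dotted paths map onto this hierarchy, the sketch below folds a list of full paths into a nested dictionary of the same shape as the `tree()` output shown earlier. `build_tree` is a hypothetical helper written for this guide, not a ModelMesh function.

```python
from typing import Dict, Iterable


def build_tree(paths: Iterable[str]) -> Dict[str, dict]:
    """Fold dotted capability paths into nested dicts (category -> domain -> capability)."""
    tree: Dict[str, dict] = {}
    for path in paths:
        node = tree
        for part in path.split("."):
            # Create the level if it is missing, then descend into it.
            node = node.setdefault(part, {})
    return tree


example_tree = build_tree(
    [
        "generation.text-generation.chat-completion",
        "generation.text-generation.code-generation",
        "generation.audio.text-to-speech",
        "understanding.audio.speech-to-text",
    ]
)
```

Leaves are empty dictionaries, so a two-level path such as `generation.text-generation` and its three-level children can coexist in the same structure.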
+
+## Custom Capabilities
+
+You can register custom capabilities in your YAML configuration:
+
+```yaml
+models:
+  my-model:
+    provider: my-provider
+    capabilities:
+      - generation.custom.my-capability
+
+pools:
+  my-pool:
+    capability: generation.custom.my-capability
+```
+
+Custom capability paths don't need aliases — use the full dotted path directly.
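
To show how a pool's capability could select its models, here is a small sketch that works on a Python dictionary mirroring the YAML above. `models_for_pool` is a hypothetical helper, and it assumes exact string matching between the pool's capability and each model's capability list; whether ModelMesh also matches parent or child paths is not stated in this guide.

```python
from typing import Any, Dict, List

# Dictionary equivalent of the YAML configuration shown above.
config: Dict[str, Any] = {
    "models": {
        "my-model": {
            "provider": "my-provider",
            "capabilities": ["generation.custom.my-capability"],
        },
    },
    "pools": {
        "my-pool": {"capability": "generation.custom.my-capability"},
    },
}


def models_for_pool(cfg: Dict[str, Any], pool_name: str) -> List[str]:
    """Return the models whose capability list contains the pool's capability."""
    capability = cfg["pools"][pool_name]["capability"]
    return sorted(
        name
        for name, model in cfg["models"].items()
        if capability in model.get("capabilities", [])
    )


pool_members = models_for_pool(config, "my-pool")
```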
@@ -0,0 +1,137 @@
+# Error Handling
+
+ModelMesh provides a structured exception hierarchy so you can catch failures at the right level of specificity.
+
+## Exception Tree
+
+```
+ModelMeshError (base)
+├── RoutingError
+│   ├── NoActiveModelError        (retryable)
+│   └── AllProvidersExhaustedError
+├── ProviderError
+│   ├── AuthenticationError
+│   ├── RateLimitError            (retryable, has retry_after)
+│   └── ProviderTimeoutError      (retryable)
+├── ConfigurationError
+└── BudgetExceededError
+```
+
+All exceptions inherit from `ModelMeshError`, so you can use a single broad catch or handle specific failure modes individually.
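
The practical consequence of the tree is that `except` clauses should go from most specific to most general. The sketch below re-declares a few of the classes locally to demonstrate this; these are stand-ins that mirror the tree, not the classes shipped in `modelmesh.exceptions`, and the constructor signatures are assumptions.

```python
from typing import Any, Dict, Optional


class ModelMeshError(Exception):
    """Stand-in base class mirroring the tree above."""

    retryable = False

    def __init__(self, message: str, details: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.message = message
        self.details = details or {}


class RoutingError(ModelMeshError):
    pass


class NoActiveModelError(RoutingError):
    retryable = True


class ProviderError(ModelMeshError):
    pass


class RateLimitError(ProviderError):
    retryable = True

    def __init__(self, message: str, retry_after: Optional[float] = None) -> None:
        super().__init__(message)
        self.retry_after = retry_after


def classify(exc: Exception) -> str:
    """Show which handler wins when clauses go from specific to general."""
    try:
        raise exc
    except RateLimitError:
        return "rate-limit"
    except ProviderError:
        return "provider"
    except ModelMeshError:
        return "modelmesh"
```

If the `ModelMeshError` clause came first, it would also catch `RateLimitError`, and the more specific handlers would never run.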
+
+## Quick Start
+
+### Python
+
+```python
+from modelmesh.exceptions import (
+    ModelMeshError,
+    NoActiveModelError,
+    AllProvidersExhaustedError,
+    RateLimitError,
+    BudgetExceededError,
+)
+
+try:
+    response = client.chat.completions.create(
+        model="chat-completion",
+        messages=[{"role": "user", "content": "Hello"}],
+    )
+except RateLimitError as e:
+    print(f"Rate limited by {e.provider_id}, retry after {e.retry_after}s")
+except NoActiveModelError:
+    print("No models available — try again shortly")
+except AllProvidersExhaustedError as e:
+    print(f"All {e.attempts} attempts failed: {e.last_error}")
+except BudgetExceededError as e:
+    print(f"Budget exceeded: {e.limit_type} limit of {e.limit_value}")
+except ModelMeshError as e:
+    print(f"ModelMesh error: {e}")
+```
+
+### TypeScript
+
+```typescript
+import {
+  ModelMeshError,
+  NoActiveModelError,
+  AllProvidersExhaustedError,
+  RateLimitError,
+  BudgetExceededError,
+} from '@nistrapa/modelmesh-core';
+
+try {
+  const response = await client.chat.completions.create({
+    model: 'chat-completion',
+    messages: [{ role: 'user', content: 'Hello' }],
+  });
+} catch (e) {
+  if (e instanceof RateLimitError) {
+    console.log(`Rate limited, retry after ${e.retryAfter}s`);
+  } else if (e instanceof NoActiveModelError) {
+    console.log('No models available');
+  } else if (e instanceof AllProvidersExhaustedError) {
+    console.log(`All ${e.attempts} attempts failed`);
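+  } else if (e instanceof BudgetExceededError) {
+    console.log(`Budget exceeded: ${e.message}`);
+  } else if (e instanceof ModelMeshError) {
+    console.log(`ModelMesh error: ${e.message}`);
+  } else {
+    // Not a ModelMesh failure: rethrow instead of silently swallowing it.
+    throw e;
+  }
+}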
+```
+
+## Exception Details
+
+Every ModelMesh exception carries structured metadata:
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `message` | `str` | Human-readable error description |
+| `details` | `dict` | Arbitrary structured context |
+| `retryable` | `bool` | Hint: may succeed on retry |
+
+### Routing Exceptions
+
+| Exception | Extra Fields | When Raised |
+|-----------|-------------|-------------|
+| `NoActiveModelError` | `pool_name` | All models in a pool are in standby |
+| `AllProvidersExhaustedError` | `pool_name`, `attempts`, `last_error` | All retry/rotation attempts failed |
+
+### Provider Exceptions
+
+| Exception | Extra Fields | When Raised |
+|-----------|-------------|-------------|
+| `AuthenticationError` | `provider_id`, `model_id` | Invalid API key or credentials |
+| `RateLimitError` | `provider_id`, `model_id`, `retry_after` | Rate limit or quota exceeded |
+| `ProviderTimeoutError` | `provider_id`, `model_id`, `timeout_seconds` | Request timed out |
+
+### Other Exceptions
+
+| Exception | Extra Fields | When Raised |
+|-----------|-------------|-------------|
+| `ConfigurationError` | — | Invalid config, missing fields |
+| `BudgetExceededError` | `limit_type`, `limit_value`, `actual_value` | Cost limit breached |
+
+## Retry Guidance
+
+Use the `retryable` field to decide whether to retry:
+
+```python
+import time
+
+from modelmesh.exceptions import ModelMeshError
+
+try:
+    response = client.chat.completions.create(...)
+except ModelMeshError as e:
+    if e.retryable:
+        # Safe to retry: the model may become available or the rate limit may reset.
+        # Only RateLimitError carries retry_after; fall back to 5 s when it is missing or unset.
+        time.sleep(getattr(e, "retry_after", None) or 5)
+        response = client.chat.completions.create(...)
+    else:
+        # Permanent failure: fix config, check credentials, or increase budget.
+        raise
+```
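
The snippet above retries once. For repeated attempts, a bounded loop is usually safer than nesting `try` blocks. The helper below is a sketch, not part of ModelMesh: `call_with_retry` and its parameters are hypothetical, and it catches `Exception` and inspects `retryable` and `retry_after` with `getattr` only so that the example runs without the library installed. In application code you would catch `ModelMeshError` instead.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 3,
    default_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying only errors that mark themselves as retryable."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            retryable = getattr(exc, "retryable", False)
            if not retryable or attempt == max_attempts:
                # Permanent failure or out of attempts: surface the error.
                raise
            # Prefer the provider's hint; otherwise wait the default delay.
            delay = getattr(exc, "retry_after", None) or default_delay
            sleep(delay)
    raise AssertionError("unreachable")
```

A call site might look like `call_with_retry(lambda: client.chat.completions.create(...))`. The `sleep` parameter exists so tests can substitute a fake and avoid real delays.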
+
+## Backward Compatibility
+
+The new exceptions maintain backward compatibility:
+
+- `AllProvidersExhaustedError` inherits from `RoutingError` → `ModelMeshError` → `Error`
+- `BudgetExceededError` inherits from `ModelMeshError` → `Error`
+- Code catching base `Error` continues to work
+- All new fields are additive — no existing behavior changed