Commit bdd0954
apartsin and claude committed
feat: implement 7 major features — rotation strategies, observability, proxy, config, resilience, LangChain, auto-discovery
Add 7 new rotation strategies (cost-first, latency-first, round-robin, priority-selection, session-stickiness, rate-limit-aware, load-balanced), Prometheus metrics connector, enhanced proxy server (health/ready/metrics endpoints, rate limiting, request IDs, graceful shutdown), config validation with hot-reload and 5 templates, resilience patterns (circuit breaker, timeout, streaming checkpoint mixins), LangChain/LangGraph adapter, and provider auto-discovery with model registry.

47 connectors registered. 1,028 tests passing (91 new).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ab7b692 commit bdd0954

31 files changed, +5474 -91 lines changed

docs/cdk/Mixins.md

Lines changed: 86 additions & 0 deletions
@@ -1636,3 +1636,89 @@ class CachedHttpProvider {
   }
 }
```
+
+---
+
+## CircuitBreakerMixin
+
+Implements the circuit breaker pattern to prevent cascading failures when a provider becomes unhealthy. When a provider's consecutive failures reach the failure threshold, the circuit opens and requests fail fast without calling the provider. After the reset timeout, a limited number of probe requests (one by default) are allowed; once enough of them succeed, the circuit closes.
+
+### States
+
+| State | Description |
+| --- | --- |
+| `CLOSED` | Normal operation. Requests pass through. |
+| `OPEN` | Provider is unhealthy. Requests fail fast with `CircuitOpenError`. |
+| `HALF_OPEN` | Probe mode. A limited number of requests pass through to test recovery. |
+
+### Methods
+
+| Method | Signature | Description |
+| --- | --- | --- |
+| `configure_circuit_breaker` | `(failure_threshold, reset_timeout, half_open_max, success_threshold)` | Set circuit breaker parameters. |
+| `check_circuit` | `()` | Verify the circuit allows a request. Raises `CircuitOpenError` if open. |
+| `record_success` | `()` | Record a successful request. In HALF_OPEN, closes the circuit once `success_threshold` is met. |
+| `record_failure` | `()` | Record a failed request. Opens circuit when threshold is reached. |
+| `reset_circuit` | `()` | Manually reset the circuit to CLOSED. |
+| `circuit_breaker_stats` | `()` | Return current state, failure count, and config. |
+
+### Configuration
+
+| Parameter | Type | Default | Description |
+| --- | --- | --- | --- |
+| `failure_threshold` | `int` | `5` | Consecutive failures to trip the circuit. |
+| `reset_timeout` | `float` | `60.0` | Seconds before OPEN transitions to HALF_OPEN. |
+| `half_open_max` | `int` | `1` | Max probe requests in HALF_OPEN state. |
+| `success_threshold` | `int` | `1` | Successes in HALF_OPEN needed to close. |
+
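Review note: the state transitions described for `CircuitBreakerMixin` can be illustrated with a small standalone sketch. This is not the code added by the commit. The class name `SketchCircuitBreaker`, the injectable `clock` argument, and the reading of `half_open_max` as a cap on in-flight probes are assumptions made for illustration; only the method names, parameter names, and defaults come from the tables above.

```python
import time
from enum import Enum


class CircuitState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when the circuit rejects a request."""


class SketchCircuitBreaker:
    """Hypothetical stand-in, not the library's CircuitBreakerMixin."""

    def __init__(self, failure_threshold=5, reset_timeout=60.0,
                 half_open_max=1, success_threshold=1, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.half_open_max = half_open_max
        self.success_threshold = success_threshold
        self._clock = clock  # injectable clock (assumption) so tests need no sleep
        self.reset_circuit()

    def reset_circuit(self):
        """Return to CLOSED and clear all counters."""
        self.state = CircuitState.CLOSED
        self._failures = 0    # consecutive failures while CLOSED
        self._successes = 0   # successful probes while HALF_OPEN
        self._in_flight = 0   # probes currently admitted in HALF_OPEN
        self._opened_at = 0.0

    def check_circuit(self):
        """Raise CircuitOpenError unless a request may proceed."""
        if self.state is CircuitState.OPEN:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open")
            # Reset timeout elapsed: start admitting probes.
            self.state = CircuitState.HALF_OPEN
            self._successes = 0
            self._in_flight = 0
        if self.state is CircuitState.HALF_OPEN:
            if self._in_flight >= self.half_open_max:
                raise CircuitOpenError("probe limit reached")
            self._in_flight += 1

    def record_success(self):
        if self.state is CircuitState.HALF_OPEN:
            self._in_flight = max(0, self._in_flight - 1)
            self._successes += 1
            if self._successes >= self.success_threshold:
                self.reset_circuit()
        else:
            self._failures = 0  # a success breaks the consecutive-failure run

    def record_failure(self):
        if self.state is CircuitState.HALF_OPEN:
            self._trip()  # any failed probe reopens the circuit
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._trip()

    def _trip(self):
        self.state = CircuitState.OPEN
        self._opened_at = self._clock()
```

A caller would typically invoke `check_circuit()` before each provider call, then `record_success()` or `record_failure()` depending on the outcome.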
+---
+
+## TimeoutMixin
+
+Enforces per-request timeouts by wrapping async operations with `asyncio.wait_for`. Raises `RequestTimeoutError` when exceeded.
+
+### Methods
+
+| Method | Signature | Description |
+| --- | --- | --- |
+| `configure_timeout` | `(default, streaming, streaming_total, connect)` | Set timeout values in seconds. |
+| `with_timeout` | `(coro, timeout?, operation?)` | Execute a coroutine with a timeout. |
+| `with_stream_timeout` | `(aiter, first_chunk_timeout?, total_timeout?)` | Wrap a streaming async iterator with timeouts. |
+
+### Configuration
+
+| Parameter | Type | Default | Description |
+| --- | --- | --- | --- |
+| `default` | `float` | `30.0` | Non-streaming request timeout. |
+| `streaming` | `float` | `60.0` | First-chunk timeout for streams. |
+| `streaming_total` | `float` | `300.0` | Total stream duration timeout. |
+| `connect` | `float` | `10.0` | Connection establishment timeout. |
+
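Review note: a rough sketch of how the two wrappers could be built on `asyncio.wait_for`, written as free functions rather than mixin methods. The function bodies, the error message format, and the choice to cap every chunk wait by the time remaining in the total budget are assumptions, not the committed implementation. The `connect` timeout is not modeled here.

```python
import asyncio


class RequestTimeoutError(Exception):
    """Raised when an operation exceeds its time budget."""


async def with_timeout(coro, timeout=30.0, operation="request"):
    """Await `coro`, converting asyncio's timeout into RequestTimeoutError."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        raise RequestTimeoutError(
            f"{operation} timed out after {timeout}s"
        ) from None


async def with_stream_timeout(aiter, first_chunk_timeout=60.0, total_timeout=300.0):
    """Yield chunks from `aiter`, enforcing first-chunk and total deadlines."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + total_timeout
    iterator = aiter.__aiter__()
    first = True
    while True:
        remaining = deadline - loop.time()
        # The first chunk has its own limit; later chunks only share the total budget.
        budget = min(first_chunk_timeout, remaining) if first else remaining
        try:
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=budget)
        except StopAsyncIteration:
            return  # the stream ended normally
        except asyncio.TimeoutError:
            phase = "first chunk" if first else "stream"
            raise RequestTimeoutError(f"{phase} timed out") from None
        first = False
        yield chunk
```

Consumers iterate the wrapper with `async for chunk in with_stream_timeout(stream): ...` and handle `RequestTimeoutError` like any other provider failure.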
+---
+
+## StreamingCheckpointMixin
+
+Tracks streaming response progress so that interrupted streams can be detected and potentially resumed. Buffers content and token counts per active stream.
+
+### Methods
+
+| Method | Signature | Description |
+| --- | --- | --- |
+| `configure_checkpoints` | `(max_buffer_tokens, checkpoint_interval, max_checkpoints)` | Set checkpoint parameters. |
+| `create_checkpoint` | `(request_id, model_id?)` | Create and register a new stream checkpoint. |
+| `get_checkpoint` | `(request_id)` | Retrieve a checkpoint by request ID. |
+| `remove_checkpoint` | `(request_id)` | Remove a completed checkpoint. |
+| `active_checkpoints` | `()` | Return all incomplete checkpoints. |
+| `checkpoint_stats` | `()` | Return summary statistics. |
+
+### StreamCheckpoint
+
+| Field | Type | Description |
+| --- | --- | --- |
+| `request_id` | `str` | Unique request identifier. |
+| `model_id` | `str` | Model generating the stream. |
+| `tokens_received` | `int` | Total tokens received. |
+| `content_buffer` | `str` | Accumulated text. |
+| `is_complete` | `bool` | Whether the stream finished normally. |
+| `duration` | `float` | Elapsed time in seconds (property). |
+| `tokens_per_second` | `float` | Throughput metric (property). |
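
Review note: the checkpoint record and registry methods listed above could look roughly like the following. The field names mirror the `StreamCheckpoint` table; everything else is a guess for illustration: the `Sketch*` class names, the `started_at` field, the `record_chunk` helper, and the keys returned by `checkpoint_stats` do not come from the commit. The `configure_checkpoints` limits are not modeled.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SketchStreamCheckpoint:
    """Hypothetical mirror of the StreamCheckpoint fields."""
    request_id: str
    model_id: str = ""
    tokens_received: int = 0
    content_buffer: str = ""
    is_complete: bool = False
    started_at: float = field(default_factory=time.monotonic)  # assumed field

    def record_chunk(self, text, tokens):
        """Assumed helper: buffer a chunk and count its tokens."""
        self.content_buffer += text
        self.tokens_received += tokens

    @property
    def duration(self):
        return time.monotonic() - self.started_at

    @property
    def tokens_per_second(self):
        elapsed = self.duration
        return self.tokens_received / elapsed if elapsed > 0 else 0.0


class SketchCheckpointRegistry:
    """Hypothetical stand-in for the mixin's checkpoint bookkeeping."""

    def __init__(self):
        self._checkpoints = {}

    def create_checkpoint(self, request_id, model_id=""):
        checkpoint = SketchStreamCheckpoint(request_id, model_id)
        self._checkpoints[request_id] = checkpoint
        return checkpoint

    def get_checkpoint(self, request_id):
        return self._checkpoints.get(request_id)

    def remove_checkpoint(self, request_id):
        self._checkpoints.pop(request_id, None)

    def active_checkpoints(self):
        # An incomplete checkpoint is a stream that is running or was interrupted.
        return [c for c in self._checkpoints.values() if not c.is_complete]

    def checkpoint_stats(self):
        return {
            "total": len(self._checkpoints),
            "active": len(self.active_checkpoints()),
        }
```

After a dropped connection, a caller could read `content_buffer` from the incomplete checkpoint to decide how much of the response was already delivered.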

samples/connectors/typescript/customDiscovery.ts

Lines changed: 36 additions & 36 deletions
@@ -237,9 +237,9 @@ class YamlDiscoveryConnector implements DiscoveryConnector {
       errors.push(`Failed to load registry file: ${message}`);
       this.syncStatus.status = "error";
       return {
-        new_models: [],
-        deprecated_models: [],
-        updated_models: [],
+        newModels: [],
+        deprecatedModels: [],
+        updatedModels: [],
         errors,
       };
     }
@@ -291,16 +291,16 @@ class YamlDiscoveryConnector implements DiscoveryConnector {
     // Update sync status.
     const now = new Date();
     this.syncStatus = {
-      last_sync: now,
-      next_sync: new Date(now.getTime() + this.syncIntervalMs),
-      models_synced: modelsToSync.length,
+      lastSync: now,
+      nextSync: new Date(now.getTime() + this.syncIntervalMs),
+      modelsSynced: modelsToSync.length,
       status: errors.length > 0 ? "completed_with_errors" : "idle",
     };

     return {
-      new_models: newModels,
-      deprecated_models: [...new Set(deprecatedModels)], // deduplicate
-      updated_models: updatedModels,
+      newModels: newModels,
+      deprecatedModels: [...new Set(deprecatedModels)], // deduplicate
+      updatedModels: updatedModels,
       errors,
     };
   }
@@ -333,7 +333,7 @@ class YamlDiscoveryConnector implements DiscoveryConnector {

     if (!provider) {
       const result: ProbeResult = {
-        provider_id: providerId,
+        providerId: providerId,
         success: false,
         error: `Provider "${providerId}" not found in registry`,
       };
@@ -342,8 +342,8 @@ class YamlDiscoveryConnector implements DiscoveryConnector {
     }

     const healthUrl = this.buildHealthUrl(provider);
-    const method = (provider.health_method ?? "GET").toUpperCase();
-    const expectedStatus = provider.health_expected_status ?? 200;
+    const method = (provider.healthMethod ?? "GET").toUpperCase();
+    const expectedStatus = provider.healthExpectedStatus ?? 200;

     const startTime = performance.now();

@@ -369,10 +369,10 @@ class YamlDiscoveryConnector implements DiscoveryConnector {
       const success = response.status === expectedStatus;

       const result: ProbeResult = {
-        provider_id: providerId,
+        providerId: providerId,
         success,
-        latency_ms: Math.round(latencyMs * 100) / 100,
-        status_code: response.status,
+        latencyMs: Math.round(latencyMs * 100) / 100,
+        statusCode: response.status,
         error: success
           ? undefined
           : `Unexpected status ${response.status} (expected ${expectedStatus})`,
@@ -397,9 +397,9 @@ class YamlDiscoveryConnector implements DiscoveryConnector {
       }

       const result: ProbeResult = {
-        provider_id: providerId,
+        providerId: providerId,
         success: false,
-        latency_ms: Math.round(latencyMs * 100) / 100,
+        latencyMs: Math.round(latencyMs * 100) / 100,
         error: errorDescription,
       };

@@ -493,8 +493,8 @@ class YamlDiscoveryConnector implements DiscoveryConnector {
     current: YamlModelEntry,
   ): boolean {
     if (previous.status !== current.status) return true;
-    if (previous.context_window !== current.context_window) return true;
-    if (previous.max_output_tokens !== current.max_output_tokens) return true;
+    if (previous.contextWindow !== current.contextWindow) return true;
+    if (previous.maxOutputTokens !== current.maxOutputTokens) return true;

     // Compare capabilities arrays.
     if (previous.capabilities.length !== current.capabilities.length) return true;
@@ -506,8 +506,8 @@ class YamlDiscoveryConnector implements DiscoveryConnector {
     // Compare pricing if present.
     if (previous.pricing && current.pricing) {
       if (
-        previous.pricing.input_per_1k_tokens !== current.pricing.input_per_1k_tokens ||
-        previous.pricing.output_per_1k_tokens !== current.pricing.output_per_1k_tokens
+        previous.pricing.inputPer1kTokens !== current.pricing.inputPer1kTokens ||
+        previous.pricing.outputPer1kTokens !== current.pricing.outputPer1kTokens
       ) {
         return true;
       }
@@ -530,19 +530,19 @@ class YamlDiscoveryConnector implements DiscoveryConnector {

   /** Build the full health endpoint URL for a provider. */
   private buildHealthUrl(provider: YamlProviderEntry): string {
-    const base = provider.base_url.replace(/\/+$/, "");
-    const endpoint = provider.health_endpoint.startsWith("/")
-      ? provider.health_endpoint
-      : `/${provider.health_endpoint}`;
+    const base = provider.baseUrl.replace(/\/+$/, "");
+    const endpoint = provider.healthEndpoint.startsWith("/")
+      ? provider.healthEndpoint
+      : `/${provider.healthEndpoint}`;
     return `${base}${endpoint}`;
   }

   /** Record a probe result in the rolling history. */
   private recordProbeResult(result: ProbeResult): void {
-    let history = this.probeHistory.get(result.provider_id);
+    let history = this.probeHistory.get(result.providerId);
     if (!history) {
       history = [];
-      this.probeHistory.set(result.provider_id, history);
+      this.probeHistory.set(result.providerId, history);
     }

     // Add to the front (newest first).
@@ -570,9 +570,9 @@ class YamlDiscoveryConnector implements DiscoveryConnector {

     if (history.length === 0) {
       return {
-        provider_id: providerId,
+        providerId: providerId,
         available: true, // assume available until proven otherwise
-        availability_score: 1.0,
+        availabilityScore: 1.0,
         timestamp: now,
       };
     }
@@ -597,23 +597,23 @@ class YamlDiscoveryConnector implements DiscoveryConnector {

     // Compute average latency from successful probes.
     const successfulProbes = history.filter(
-      (r) => r.success && r.latency_ms !== undefined,
+      (r) => r.success && r.latencyMs !== undefined,
     );
     const avgLatency =
       successfulProbes.length > 0
-        ? successfulProbes.reduce((sum, r) => sum + (r.latency_ms ?? 0), 0) /
+        ? successfulProbes.reduce((sum, r) => sum + (r.latencyMs ?? 0), 0) /
           successfulProbes.length
         : undefined;

     return {
-      provider_id: providerId,
+      providerId: providerId,
       available,
-      latency_ms: avgLatency !== undefined
+      latencyMs: avgLatency !== undefined
         ? Math.round(avgLatency * 100) / 100
-        : latest.latency_ms,
-      status_code: latest.status_code,
+        : latest.latencyMs,
+      statusCode: latest.statusCode,
       error: latest.error,
-      availability_score: Math.round(availabilityScore * 1000) / 1000,
+      availabilityScore: Math.round(availabilityScore * 1000) / 1000,
       timestamp: now,
     };
   }
