Problem Statement
Many users need to supply an alternative, interface-compatible implementation of the OpenAI client (e.g., a GuardrailsAsyncOpenAI wrapper that implements OpenAI Guardrails). Today the SDK creates a new AsyncOpenAI client per request to avoid sharing HTTPX connections across event loops. This makes it impossible to:
- Inject a pre-configured, guardrails-enabled client.
- Reuse connection pools efficiently within a single event loop/worker.
- Centralise observability, retries, timeouts and networking policy on one client.
Separately, Strands currently has very limited guardrails support outside of Bedrock Guardrails, so users commonly reach for OpenAI-side guardrails via a wrapper client. Without support for injecting a fixed client, this is not possible.
Proposed Solution
Allow OpenAIModel to accept a fixed, injected AsyncOpenAI-compatible client, created once per worker/event loop at application startup and closed at shutdown. Continue to support current behaviour when no client is provided (backwards compatible).
Key changes (additive, non-breaking):
- Constructor injection: `OpenAIModel(client: Optional[Client] = None, client_args: Optional[dict] = None, …)`
  - If `client` is provided, reuse it and do not create/close a new client internally.
  - If `client` is `None`, retain current behaviour (construct an ephemeral client).
- Lifecycle guidance in docs
  - Recommend creating one client per worker/event loop (e.g., FastAPI lifespan startup/shutdown).
  - Emphasise that clients should not be shared across event loops, but can be safely reused across tasks within a loop.
- Acceptance criteria
  - Works with any `AsyncOpenAI`-compatible interface (e.g., `GuardrailsAsyncOpenAI`, custom proxies, instrumentation wrappers).
  - Streaming and structured output paths both reuse the injected client.
  - Clear examples for FastAPI and generic asyncio.
Code sketch (constructor + reuse):
```python
from typing import Any, Optional, Protocol

from openai import AsyncOpenAI


class Client(Protocol):
    @property
    def chat(self) -> Any: ...


class OpenAIModel(Model):
    def __init__(self, client: Optional[Client] = None, client_args: Optional[dict] = None, **config):
        self.client = client
        self._owns_client = client is None
        self.client_args = client_args or {}
        self.config = dict(config)

    async def stream(...):
        request = self.format_request(...)
        if self.client is not None:
            # Reuse the injected client
            response = await self.client.chat.completions.create(**request)
            ...
        else:
            # Back-compat: ephemeral client per request
            async with AsyncOpenAI(**self.client_args) as c:
                response = await c.chat.completions.create(**request)
                ...
```
Example usage (FastAPI lifespan, per-worker client):
```python
from contextlib import asynccontextmanager

from fastapi import FastAPI
from openai import AsyncOpenAI
# from my_guardrails import GuardrailsAsyncOpenAI


@asynccontextmanager
async def lifespan(app: FastAPI):
    # One client per worker/event loop, created at startup.
    base = AsyncOpenAI()
    app.state.oai = base  # or GuardrailsAsyncOpenAI(base)
    app.state.model = OpenAIModel(client=app.state.oai, model_id="gpt-4o")
    yield
    # Closed once at shutdown.
    await app.state.oai.close()


app = FastAPI(lifespan=lifespan)
```
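The acceptance criteria also call for a generic asyncio example. A minimal sketch of the same per-loop lifecycle follows; `StubClient` is a stand-in so the snippet runs without credentials — in real code, construct `AsyncOpenAI()` (or a guardrails wrapper) in its place and pass it to `OpenAIModel`:

```python
import asyncio


class StubClient:
    """Stand-in for AsyncOpenAI so the sketch runs without credentials."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def main() -> StubClient:
    client = StubClient()  # one client per event loop, created at startup
    try:
        # model = OpenAIModel(client=client, model_id="gpt-4o")
        # The same client can be shared by concurrent tasks within this loop:
        await asyncio.gather(asyncio.sleep(0), asyncio.sleep(0))
    finally:
        await client.close()  # closed exactly once at shutdown
    return client


print(asyncio.run(main()).closed)  # → True
```

The pattern is the same as the FastAPI lifespan above: create once per loop, share across tasks, close at shutdown.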
Use Case
- Guardrails: Wrap the OpenAI client with `GuardrailsAsyncOpenAI` to enforce content filters, schema validation, and redaction before responses reach application code.
- Observability & policy: Centralise timeouts, retries, logging, tracing, and network egress policy (e.g., a custom `httpx.AsyncClient`).
- Performance: Reuse keep-alive connections and connection pools within a worker/event loop for lower latency and higher throughput.
- Multi-model routing: Swap the injected client to target proxies or gateways without touching model code (e.g., toggling `base_url`, auth, or headers).
This would help with:
- Meeting compliance requirements where guardrails must run before responses are consumed.
- Reducing tail latency by avoiding per-request client construction.
- Simplifying integration with enterprise networking and telemetry.
Alternative Solutions
- Create a new client per request
  - Pros: Safe with respect to event-loop boundaries; current behaviour.
  - Cons: Loses pooling; higher latency and allocation overhead; hard to apply cross-cutting concerns (guardrails, tracing) consistently.
- Global client shared across event loops
  - Pros: Simple in theory.
  - Cons: Unsafe; HTTPX pools cannot be shared across loops, leading to intermittent runtime errors.
- Disable pooling (force `Connection: close`)
  - Pros: Avoids cross-loop sharing issues.
  - Cons: Sacrifices performance; still doesn’t enable easy injection of guardrails wrappers.
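To make the cross-loop hazard concrete: asyncio awaitables are bound to the loop that created them, which is the underlying reason an HTTPX pool built in one loop fails in another. A stdlib-only illustration:

```python
import asyncio

fut = None


async def create() -> None:
    global fut
    # The future is attached to this event loop.
    fut = asyncio.get_running_loop().create_future()


async def use() -> None:
    await fut  # awaited from a *different* loop


asyncio.run(create())
try:
    asyncio.run(use())
except RuntimeError as e:
    print("RuntimeError:", e)  # e.g. "... attached to a different loop"
```

An HTTPX connection pool holds similar loop-bound primitives internally, which is why reuse is safe within a loop but not across loops.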
Additional Context
- Rationale: HTTPX connection pools are not shareable across asyncio event loops; reuse is safe within a loop.
- Need: Strands’ current guardrails support focuses on Bedrock; many users need OpenAI-side guardrails today.
- The OpenAI Python SDK supports async client reuse and custom HTTP clients (`http_client=`), making injection straightforward.
If useful, I’m happy to contribute a PR with the constructor change, a small `_stream_with_client` helper, tests, and docs.