Commit 32e4928

feat(agent): add WebSocket connection layer
Adds the Agent sub-client, AgentHandle, AgentEventStream, and AgentEvent types that connect to wss://agent.deepgram.com/v1/agent/converse and tie together the message and response types from prior commits.

Surface:
- Deepgram::agent() returns Agent<'_>; Agent::start() opens a session and returns (AgentHandle, AgentEventStream).
- AgentHandle exposes send_settings/send_update_speak/send_update_think/send_update_prompt/send_inject_user_message/send_inject_agent_message/send_function_call_response for JSON messages, send_data for binary audio frames, and keep_alive/close for control. Method naming follows the existing FluxHandle convention (send_data for audio, plain verbs for control), with send_<message_type> for the typed JSON sends.
- AgentEvent::Json(AgentResponse) | Audio(Bytes) is the unified stream item type: JSON events and audio frames are interleaved in their natural ordering. Matches the Python and JS SDKs.

Connection details:
- URL is a hardcoded constant (wss://agent.deepgram.com/v1/agent/converse). Self-hosted agent support is on the 0.10.0 roadmap; for now this is fixed.
- Auth uses the Deepgram client's auth method (Token X / Bearer X).
- dg-request-id from the upgrade response is parsed when present; falls back to the server's Welcome event when absent.

Plumbing:
- The agent feature now pulls in tungstenite + tokio-tungstenite (shared with listen).
- WsError, TungsteniteError, and the From<TungsteniteError> impl are extended from cfg(feature = "listen") to cfg(any(feature = "listen", feature = "agent")) so agent works standalone.
- auth field's allow-unused extended to cover agent; base_url and client stay listen-gated since agent doesn't use them today.

128 tests pass (3 new), doctest in the module example compiles. Examples and an integration test against a mock server follow in subsequent commits.
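
To illustrate the unified stream item and the request-id fallback described in the commit message, here is a minimal std-only sketch. It does not use the crate's real types: AgentResponse and AgentEvent below are simplified stand-ins, Vec<u8> replaces Bytes, and SessionSummary, drain_events, and the request_id field on Welcome are hypothetical names assumed for the example.

```rust
/// Stand-in for the crate's `AgentResponse`. Only two variants are modeled;
/// the `request_id` field on `Welcome` is an assumption for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentResponse {
    Welcome { request_id: String },
    Unknown(String),
}

/// Stand-in for `AgentEvent`. The commit uses `Bytes` for audio; `Vec<u8>`
/// keeps this sketch free of external dependencies.
#[derive(Debug, Clone, PartialEq)]
pub enum AgentEvent {
    Json(AgentResponse),
    Audio(Vec<u8>),
}

/// Hypothetical bookkeeping a consumer might do while draining the stream.
#[derive(Debug, Default, PartialEq)]
pub struct SessionSummary {
    pub request_id: Option<String>,
    pub json_events: usize,
    pub audio_bytes: usize,
}

/// Consumes events in arrival order. `header_request_id` models the value of
/// the `dg-request-id` upgrade-response header, if the server sent one.
pub fn drain_events<I>(header_request_id: Option<String>, events: I) -> SessionSummary
where
    I: IntoIterator<Item = AgentEvent>,
{
    let mut summary = SessionSummary {
        request_id: header_request_id,
        ..SessionSummary::default()
    };
    for event in events {
        match event {
            AgentEvent::Json(response) => {
                summary.json_events += 1;
                // Use the Welcome event's id only when the header was absent.
                if let AgentResponse::Welcome { request_id } = response {
                    if summary.request_id.is_none() {
                        summary.request_id = Some(request_id);
                    }
                }
            }
            // Audio frames arrive interleaved with JSON events.
            AgentEvent::Audio(frame) => summary.audio_bytes += frame.len(),
        }
    }
    summary
}
```

Because both kinds of item travel through one enum, a single match handles them in the order the server sent them, and a consumer does not need to merge two separate channels.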
1 parent d76efd7 commit 32e4928

4 files changed

Lines changed: 549 additions & 9 deletions

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -59,7 +59,7 @@ default = ["manage", "listen", "speak"]
 manage = []
 listen = ["dep:tungstenite", "dep:tokio-tungstenite"]
 speak = []
-agent = []
+agent = ["dep:tungstenite", "dep:tokio-tungstenite"]
 
 [[example]]
 name = "grant_token"
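
The feature change above pairs with the cfg(any(feature = "listen", feature = "agent")) gating mentioned in the commit message. A minimal sketch of that pattern follows; WsError here is a simplified stand-in rather than the crate's actual type, and websocket_support_compiled is a hypothetical helper added only to make the gate observable.

```rust
/// Simplified stand-in for an error type that should exist whenever either
/// WebSocket-backed feature is compiled in, not only `listen`.
#[cfg(any(feature = "listen", feature = "agent"))]
#[derive(Debug)]
pub struct WsError(pub String);

/// Hypothetical helper: reports at runtime whether either WebSocket-backed
/// feature was enabled at compile time.
pub fn websocket_support_compiled() -> bool {
    cfg!(any(feature = "listen", feature = "agent"))
}
```

With a gate on listen alone, building with only the agent feature would compile the item out; any(...) keeps it available for either feature.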

src/agent/mod.rs

Lines changed: 8 additions & 2 deletions
@@ -27,10 +27,14 @@
 //! `SpeakUpdated`, `ThinkUpdated`, `InjectionRefused`,
 //! `FunctionCallResponse`) plus an `Unknown` catch-all for forward
 //! compatibility, all unified under the [`response::AgentResponse`] enum.
+//! - [`websocket`] — the [`Agent`] sub-client and live-session
+//! primitives ([`AgentHandle`], [`AgentEventStream`], [`AgentEvent`])
+//! that connect to `wss://agent.deepgram.com/v1/agent/converse`.
 //!
 //! Wire format matches the AsyncAPI schemas in `deepgram-docs` under
-//! `api/specs/asyncapi/schemas/agent/`. The WebSocket connection helpers
-//! (builder, handle, stream) land in subsequent commits.
+//! `api/specs/asyncapi/schemas/agent/`. Examples and additional
+//! convenience layers (e.g. file-based audio input) follow in
+//! subsequent commits.
 
 pub mod audio;
 pub mod aws_credentials;
@@ -42,6 +46,7 @@ pub mod response;
 pub mod settings;
 pub mod speak;
 pub mod think;
+pub mod websocket;
 
 pub use audio::{
     AudioConfig, AudioContainer, AudioInput, AudioInputEncoding, AudioOutput, AudioOutputEncoding,
@@ -77,3 +82,4 @@ pub use settings::{
 };
 pub use speak::{SpeakProvider, SpeakSettings};
 pub use think::{ContextLength, FunctionEndpoint, ThinkFunction, ThinkProvider, ThinkSettings};
+pub use websocket::{Agent, AgentEvent, AgentEventStream, AgentHandle};
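
The module docs in this diff mention an `Unknown` catch-all for forward compatibility. The idea can be sketched with a std-only classifier. ResponseKind and classify are hypothetical names for this example, and the assumption that each wire tag equals its variant name is not confirmed by the diff; how the crate actually deserializes messages is not shown here.

```rust
/// Stand-in for a subset of the server message kinds named in the commit.
#[derive(Debug, Clone, PartialEq)]
pub enum ResponseKind {
    Welcome,
    SpeakUpdated,
    ThinkUpdated,
    InjectionRefused,
    FunctionCallResponse,
    /// Catch-all that keeps the unrecognized tag so callers can log it.
    Unknown(String),
}

/// Maps a message type tag to a known kind, falling back to `Unknown`
/// instead of failing when the server sends a tag this client predates.
pub fn classify(type_tag: &str) -> ResponseKind {
    match type_tag {
        "Welcome" => ResponseKind::Welcome,
        "SpeakUpdated" => ResponseKind::SpeakUpdated,
        "ThinkUpdated" => ResponseKind::ThinkUpdated,
        "InjectionRefused" => ResponseKind::InjectionRefused,
        "FunctionCallResponse" => ResponseKind::FunctionCallResponse,
        other => ResponseKind::Unknown(other.to_string()),
    }
}
```

Returning a value rather than an error for unrecognized tags means a session keeps running when the server adds a message type that an older client does not know about.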
