feat: mock non-speech audio generation (ElevenLabs, fal.ai, Gemini)#140
Open
commit: |
AudioResponse.audio now accepts string (base64) or { b64Json, contentType }
object form. Adds "audio-gen" and "fal-audio" endpoint types to FixtureMatch,
RecordProviderKey, and router compatibility checks. Exports FORMAT_TO_CONTENT_TYPE
from helpers. Guards speech handler against object-form audio.
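A minimal sketch of how a handler might normalize the two accepted forms of the audio field into one shape. The type names, the `normalizeAudio` helper, and the contents of the lookup table are assumptions for illustration; the PR's actual `FORMAT_TO_CONTENT_TYPE` export is not shown in this conversation and may differ.

```typescript
// Hypothetical shapes modeled on the commit message; the real types may differ.
type AudioPayload = string | { b64Json: string; contentType: string };

interface NormalizedAudio {
  b64: string;
  contentType: string;
}

// Assumed stand-in for the FORMAT_TO_CONTENT_TYPE table exported from helpers.
const ASSUMED_FORMAT_TO_CONTENT_TYPE: Record<string, string> = {
  mp3: "audio/mpeg",
  wav: "audio/wav",
  opus: "audio/opus",
};

function normalizeAudio(audio: AudioPayload, format: string = "mp3"): NormalizedAudio {
  if (typeof audio === "string") {
    // String form: bare base64, so the content type comes from the requested format.
    const contentType = ASSUMED_FORMAT_TO_CONTENT_TYPE[format] ?? "application/octet-stream";
    return { b64: audio, contentType };
  }
  // Object form: the fixture carries its own content type.
  return { b64: audio.b64Json, contentType: audio.contentType };
}
```

A handler that only understands the string form, such as the speech handler mentioned above, could check `typeof audio === "string"` before using the value and reject or convert the object form otherwise.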
Handles AudioResponse fixtures in both non-streaming and streaming Gemini paths by emitting inlineData parts with base64 audio. Adds reasoning/thought accumulation and audio inlineData extraction to collapseGeminiSSE. Fixes Cohere SSE collapse to preserve content alongside tool calls.
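For readers unfamiliar with the Gemini wire format, the following sketch shows an audio `inlineData` part and how one might pull the first audio part back out of a list of parts. The `inlineData` object with `mimeType` and `data` fields follows the public Gemini REST format; the helper names `audioPart` and `extractAudio` are hypothetical and are not taken from this PR.

```typescript
// Part shapes limited to what this sketch needs.
interface InlineDataPart {
  inlineData: { mimeType: string; data: string };
}
interface TextPart {
  text: string;
  thought?: boolean;
}
type Part = InlineDataPart | TextPart;

// Hypothetical helper: wrap base64 audio as an inlineData part.
function audioPart(b64: string, mimeType: string): InlineDataPart {
  return { inlineData: { mimeType, data: b64 } };
}

// Hypothetical helper: return the first part whose MIME type is audio/*,
// in the { b64Json, contentType } form described in this PR.
function extractAudio(parts: Part[]): { b64Json: string; contentType: string } | undefined {
  for (const part of parts) {
    if ("inlineData" in part && part.inlineData.mimeType.startsWith("audio/")) {
      return { b64Json: part.inlineData.data, contentType: part.inlineData.mimeType };
    }
  }
  return undefined;
}
```

The sketch deliberately ignores text and thought parts and does not attempt to merge audio split across several streamed chunks.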
Handles AudioResponse fixtures by sending inlineData frames with turnComplete. Adds ContentWithToolCallsResponse handler that streams text then sends tool calls. Filters thought parts from text extraction. Pre-computes tool call IDs for wire/history consistency.
Handles /v1/sound-generation and /v1/music/* endpoints. Supports sound-generation (text field), music compose/stream/plan (prompt field with composition_plan fallback). Returns binary audio, chunked stream, or JSON plan based on subType. Full journal integration with fixture matching and proxy-and-record fallback.
Handles /fal/queue/submit, /fal/queue/requests (status/result/cancel), and /fal/run endpoints. FalJobMap singleton with TTL and size-bound eviction. Translates AudioResponse to fal.ai file objects with computed sizes. Full journal integration, proxy-and-record fallback, test-ID scoped job isolation.
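The TTL and size-bound eviction described for the job map could look roughly like the sketch below. This is not the PR's `FalJobMap`; the class name `BoundedTtlMap`, its constructor parameters, and the injectable clock are assumptions chosen to keep the example testable.

```typescript
interface Entry<T> {
  value: T;
  expiresAt: number;
}

// Hypothetical job store: entries expire after ttlMs, and the oldest entries
// are evicted once more than maxSize are stored.
class BoundedTtlMap<T> {
  private readonly entries = new Map<string, Entry<T>>();
  private readonly ttlMs: number;
  private readonly maxSize: number;
  private readonly now: () => number;

  constructor(ttlMs: number, maxSize: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxSize = maxSize;
    this.now = now;
  }

  set(key: string, value: T): void {
    this.sweep();
    // Delete first so a re-inserted key moves to the end of insertion order.
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    // Map iterates in insertion order, so the first key is the oldest.
    while (this.entries.size > this.maxSize) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  clear(): void {
    this.entries.clear();
  }

  get size(): number {
    return this.entries.size;
  }

  // Remove every entry whose TTL has elapsed.
  private sweep(): void {
    const t = this.now();
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= t) this.entries.delete(key);
    }
  }
}
```

A `clear()` method like the one here is what a reset path would call between tests so that queued jobs from one test do not leak into the next.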
…thods Adds server routes for ElevenLabs (/v1/sound-generation, /v1/music/*), fal.ai (/fal/queue/*, /fal/run/*) with CORS, chaos, and journal. Adds falJobs.clear() to reset handler and LLMock.reset(). Adds onAudio, onSoundEffect, onMusic, onFalAudio convenience methods. Fixes nextRequestError splice race.
Detects audio inlineData in both non-streaming JSON and streaming SSE Gemini responses, producing AudioResponse fixtures with b64Json and contentType. Widens recorder EndpointType to include audio-gen and fal-audio. Adds Float32Array alignment guard for embeddings.
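As background on why an alignment guard is needed: constructing a `Float32Array` view over a buffer throws a `RangeError` when the byte offset is not a multiple of 4, which can happen when decoded bytes are a slice of a larger buffer. The sketch below shows one way to guard against that; the function name `toFloat32` is hypothetical and the PR's actual guard may work differently.

```typescript
// Hypothetical guard: reinterpret raw bytes as 32-bit floats, copying only
// when the view is not 4-byte aligned within its underlying buffer.
function toFloat32(bytes: Uint8Array): Float32Array {
  if (bytes.byteLength % 4 !== 0) {
    throw new RangeError(`byte length ${bytes.byteLength} is not a multiple of 4`);
  }
  if (bytes.byteOffset % 4 === 0) {
    // Aligned: create a view without copying.
    return new Float32Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 4);
  }
  // Misaligned: copy into a fresh buffer, which always starts at offset 0.
  const copy = new Uint8Array(bytes.byteLength);
  copy.set(bytes);
  return new Float32Array(copy.buffer);
}
```

The copy path costs one extra allocation but avoids the exception; the aligned path keeps the zero-copy behavior for the common case.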
6 new/modified test files covering ElevenLabs sound/music, fal.ai queue lifecycle, Gemini HTTP audio (streaming + non-streaming), Gemini WebSocket audio, audio recording/replay, and multimedia type guard + endpoint filtering.
70ace03 to 0fe22fb
Just created tests with this for the ElevenLabs adapter. It's looking good!
Summary
Adds mock support for non-speech audio generation endpoints, closing #118:
- ElevenLabs: sound generation (/v1/sound-generation) and music (/v1/music/*) endpoints with support for generation, streaming, and composition plans
- fal.ai: queue and run endpoints (/fal/queue/submit/*, /fal/queue/requests/*, /fal/run/*) with full lifecycle (submit → status → result → cancel)
- Gemini: generateContent/streamGenerateContent with inlineData parts containing audio MIME types
- Convenience methods: onAudio(), onSoundEffect(), onMusic(), onFalAudio()
- AudioResponse: the audio field now supports both string (base64) and { b64Json, contentType } object form
- New audio-gen and fal-audio endpoint types

Files changed (23 files, +3159/-50)
- elevenlabs-audio.ts, fal-audio.ts
- gemini.ts, ws-gemini-live.ts, speech.ts
- server.ts, router.ts, types.ts, helpers.ts, recorder.ts, stream-collapse.ts, llmock.ts, index.ts, fixture-loader.ts
- elevenlabs-audio.test.ts, fal-audio.test.ts, gemini-audio.test.ts, gemini-audio-record.test.ts, ws-gemini-live-audio.test.ts, multimedia-types.test.ts
- index.html, mcp/index.html, sidebar.js

Test plan
- tsc --noEmit clean
- prettier --check clean
- eslint clean

Closes #118