feat(audio): add AudioProcessingOptions by hiroshihorie · Pull Request #1048 · livekit/client-sdk-swift

hiroshihorie · 2026-06-22T10:26:32Z

Summary

Adds an explicit, per-effect audio processing API so apps can choose how each effect runs, either Apple's platform Voice Processing I/O on the device or WebRTC's software processing, instead of relying on implicit toggles. This also adopts the webrtc-sdk audio processing state v2 contract for accurate, engine-wide diagnostics.

Motivation

Until now, software voice processing was reached indirectly by disabling platform voice processing, and the exact behavior changed across releases. Some apps deliberately avoid Apple VPIO (for consistent hardware volume, faster call start, screen recording with audio, and no system mute sounds) while still wanting echo cancellation and noise suppression in software. This PR makes that an explicit, stable choice.

New API

`AudioProcessingMode`

Per-effect selection of the implementation:

.automatic (default): prefer platform processing when available, fall back to WebRTC software otherwise.
.platform: use platform processing only. Rejected if the platform implementation is unavailable.
.software: force WebRTC software processing and disable the matching platform effect when possible.
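
The three selection rules above can be read as a small decision function. The following is a stand-alone sketch, not the SDK's code: `Mode`, `Implementation`, `Resolution`, and `resolve(_:platformAvailable:)` are hypothetical names introduced only to illustrate the described behavior.

```swift
// Illustrative model of the per-effect selection rules.
// All names here are assumptions; they are not the SDK's declarations.
enum Mode { case automatic, platform, software }
enum Implementation: Equatable { case platform, software }

enum Resolution: Equatable {
    case resolved(Implementation)
    case rejected
}

func resolve(_ mode: Mode, platformAvailable: Bool) -> Resolution {
    switch mode {
    case .automatic:
        // Prefer platform processing, fall back to software.
        return .resolved(platformAvailable ? Implementation.platform : Implementation.software)
    case .platform:
        // Platform only: rejected when no platform implementation exists.
        return platformAvailable ? .resolved(Implementation.platform) : .rejected
    case .software:
        // Always software, regardless of platform availability.
        return .resolved(Implementation.software)
    }
}
```

Under this model, an effect with no platform implementation (the Notes mention the high-pass filter) corresponds to `platformAvailable: false`, so `.platform` is rejected while `.automatic` falls back to software.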

`AudioProcessingOptions`

A value type describing all four effects (echo cancellation, noise suppression, auto gain control, high-pass filter) with an enabled flag and a mode each. Includes AudioProcessingOptions.communication and AudioProcessingOptions.noProcessing presets.
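
As a rough picture of what "a value type with an enabled flag and a mode per effect" might look like, here is a self-contained sketch. The type and property names (`ProcessingOptionsSketch`, `EffectSetting`, `isEnabled`) are hypothetical, and the contents of the `noProcessing` preset are an assumption (every effect disabled). The `communication` preset is left out because the description does not state its values.

```swift
// Illustrative model only; names and preset contents are assumptions.
enum Mode { case automatic, platform, software }

struct EffectSetting: Equatable {
    var isEnabled: Bool
    var mode: Mode = .automatic
}

struct ProcessingOptionsSketch: Equatable {
    var echoCancellation: EffectSetting
    var noiseSuppression: EffectSetting
    var autoGainControl: EffectSetting
    var highPassFilter: EffectSetting

    // Assumption: "no processing" means every effect is disabled.
    static let noProcessing = ProcessingOptionsSketch(
        echoCancellation: EffectSetting(isEnabled: false),
        noiseSuppression: EffectSetting(isEnabled: false),
        autoGainControl: EffectSetting(isEnabled: false),
        highPassFilter: EffectSetting(isEnabled: false)
    )
}

// Value semantics: copy a preset, then adjust a single effect.
var custom = ProcessingOptionsSketch.noProcessing
custom.echoCancellation = EffectSetting(isEnabled: true, mode: .software)
```

Because the type is a struct, mutating `custom` leaves the preset untouched, which is what makes presets safe to share as static values.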

`AudioCaptureOptions`

Gains matching *Mode fields (echoCancellationMode, autoGainControlMode, noiseSuppressionMode, highPassFilterMode), plus interop with AudioProcessingOptions (a convenience init and an audioProcessingOptions accessor). Existing call sites keep working since modes default to .automatic.

Runtime control

LocalAudioTrack.setAudioProcessingOptions(_:) applies options to an already-published track and returns an AudioProcessingOptionsResult (for example .applied, .stored, or a rejection reason).
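
A caller would typically switch over the returned result. The sketch below models that with a hypothetical `ResultSketch` enum: the `.applied` and `.stored` cases come from the description above, while the single `.rejected(reason:)` case and its `String` payload are assumptions standing in for whatever rejection reasons the SDK actually defines.

```swift
// Illustrative model; the rejected case and its payload are assumptions.
enum ResultSketch {
    case applied
    case stored
    case rejected(reason: String)
}

// Turns a result into a log-friendly message.
func describe(_ result: ResultSketch) -> String {
    switch result {
    case .applied:
        return "options applied"
    case .stored:
        return "options stored"
    case .rejected(let reason):
        return "options rejected: \(reason)"
    }
}
```

An exhaustive `switch` without a `default` case means the compiler flags any call site that forgets to handle a newly added result case.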

Diagnostics

AudioManager.audioProcessingState and AudioManager.platformAudioProcessingState expose the v2 state, following the requested -> resolved -> active -> effective vocabulary, read from the factory-owned, engine-wide module rather than a single peer connection.

Renamed API

AudioManager.setVoiceProcessingEnabled(_:) and isVoiceProcessingEnabled are renamed to setPlatformVoiceProcessingAllowed(_:) and isPlatformVoiceProcessingAllowed, matching the underlying ADM accessors and the "allowed" policy meaning. The old names remain as deprecated, renamed forwarders, so existing code keeps compiling with a fix-it.
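
The "deprecated, renamed forwarder" pattern uses Swift's `@available(*, deprecated, renamed:)` attribute, which is what produces the fix-it. The sketch below shows the pattern on a hypothetical `AudioManagerSketch` class. The stored property and method bodies are assumptions for illustration; only the member names are taken from the description.

```swift
// Illustrative stand-in for the rename; not the SDK's implementation.
final class AudioManagerSketch {
    private(set) var isPlatformVoiceProcessingAllowed = true

    func setPlatformVoiceProcessingAllowed(_ allowed: Bool) throws {
        isPlatformVoiceProcessingAllowed = allowed
    }

    // Old names forward to the new ones and emit a deprecation
    // warning with a rename fix-it at each call site.
    @available(*, deprecated, renamed: "isPlatformVoiceProcessingAllowed")
    var isVoiceProcessingEnabled: Bool { isPlatformVoiceProcessingAllowed }

    @available(*, deprecated, renamed: "setPlatformVoiceProcessingAllowed(_:)")
    func setVoiceProcessingEnabled(_ enabled: Bool) throws {
        try setPlatformVoiceProcessingAllowed(enabled)
    }
}
```

Existing callers of `setVoiceProcessingEnabled(_:)` keep compiling and get a warning, while both names stay in sync because the old ones hold no state of their own.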

Usage

Publish the microphone with all effects forced to software:

let options = AudioCaptureOptions(
    echoCancellation: true,
    autoGainControl: true,
    noiseSuppression: true,
    highpassFilter: true,
    echoCancellationMode: .software,
    autoGainControlMode: .software,
    noiseSuppressionMode: .software,
    highPassFilterMode: .software
)
try await room.localParticipant.setMicrophone(enabled: true, captureOptions: options)

Set it as the room default so it applies whenever the mic track is published:

try await room.connect(url: url, token: token,
                       roomOptions: RoomOptions(defaultAudioCaptureOptions: options))

Guarantee Apple VPIO is never used, then rely on software processing:

try AudioManager.shared.setPlatformVoiceProcessingAllowed(false)

Verify what each effect resolved to:

let state = AudioManager.shared.audioProcessingState
print(state.echoCancellation.effective) // Software
print(state.noiseSuppression.effective)  // Software

Documentation

Docs/audio.md gains a new section, "Audio Processing Modes (software, platform, automatic)", covering publish-time, room-default, and runtime configuration, plus how to verify the resolved implementation. The "Disallowing Platform Voice Processing" section is updated to the renamed API.

Example app

A companion example demonstrates the full surface, including a live "Voice Processing" and "Runtime Audio Processing" panel with effective-state diagnostics:

Branch: https://github.com/livekit-examples/swift-example/tree/hiroshi/runtime-vp
Audio controls UI: Multiplatform/Views/AudioControlsPanel.swift
Wiring: Multiplatform/Controllers/AppContext.swift

That branch pins its SDK dependency to this branch (hiroshi/runtime-vp).

Notes

Requires LiveKitWebRTC 144.7559.10 (already on main).
There is no platform high-pass filter, so .platform is rejected for highPassFilterMode. Use .software.
Room defaults apply at track creation. An already-published mic track is updated through setAudioProcessingOptions(_:), not by toggling the mic.

Test plan

swift build passes
Example app builds and runs on macOS against this branch
swift test

github-actions · 2026-06-22T10:26:44Z

⚠️ This PR does not contain any files in the .changes directory.

… contract

Introduce AudioProcessingOptions and wire it through AudioCaptureOptions, AudioManager, and LocalAudioTrack. The audio processing state read-back moved from PeerConnection to the factory upstream, so the per-publisher source registry is gone: AudioManager reads the factory-owned engine-wide state directly. State types follow the v2 vocabulary (requested -> resolved -> active -> effective) with collapsed booleans, and the device-level BuiltIn* types are renamed Platform*. ADM voice-processing calls target the renamed isPlatformVoiceProcessingAllowed accessors. Docs/audio.md updated to match.

Rename AudioManager.setVoiceProcessingEnabled(_:)/isVoiceProcessingEnabled to setPlatformVoiceProcessingAllowed(_:)/isPlatformVoiceProcessingAllowed to match the underlying ADM accessors and the v2 'allowed' vocabulary. Keep the old names as deprecated, renamed forwarders. Update tests and Docs/audio.md to the new API.

Add a Docs/audio.md section covering the per-effect AudioProcessingMode (software / platform / automatic) and AudioProcessingOptions: how to set modes at publish time, as a room default, and at runtime via setAudioProcessingOptions, plus how to verify the resolved implementation through AudioManager.audioProcessingState.

hiroshihorie force-pushed the hiroshi/runtime-vp branch from 443d6c2 to d9f632f Compare June 22, 2026 10:28

hiroshihorie force-pushed the hiroshi/runtime-vp branch from d9f632f to 70d0e9d Compare June 22, 2026 10:30

hiroshihorie changed the title ~~feat(audio): add AudioProcessingOptions and adopt webrtc-sdk state v2 contract~~ feat(audio): add AudioProcessingOptions Jun 22, 2026

hiroshihorie added 2 commits June 22, 2026 19:48

hiroshihorie mentioned this pull request Jun 24, 2026

Hiroshi/vp mode #1007

Closed

hiroshihorie marked this pull request as ready for review June 24, 2026 09:54

hiroshihorie requested review from pblazej and xianshijing-lk as code owners June 24, 2026 09:54

hiroshihorie wants to merge 3 commits into main from hiroshi/runtime-vp