Commit 70d0e9d

feat(audio): add AudioProcessingOptions and adopt webrtc-sdk state v2 contract
Introduce AudioProcessingOptions and wire it through AudioCaptureOptions, AudioManager, and LocalAudioTrack. The audio processing state read-back moved from PeerConnection to the factory upstream, so the per-publisher source registry is gone: AudioManager reads the factory-owned engine-wide state directly. State types follow the v2 vocabulary (requested -> resolved -> active -> effective) with collapsed booleans, and the device-level BuiltIn* types are renamed Platform*. ADM voice-processing calls target the renamed isPlatformVoiceProcessingAllowed accessors. Docs/audio.md updated to match.
1 parent 7f3af14 commit 70d0e9d
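
A reader's note on the requested -> resolved step named in the commit message: the doc comment this commit adds to `isVoiceProcessingEnabled` says that, with platform voice processing disallowed, `automatic` falls back to WebRTC software processing and `platform` is rejected. A minimal sketch of that rule follows. `ProcessingMode`, `ResolvedPath`, `PlatformNotAllowed`, and `resolve` are hypothetical names, not SDK declarations, and the choice that `automatic` picks the platform path when it is allowed is an assumption the diff does not confirm.

```swift
// Hypothetical model types for illustration; not the SDK's declarations.
enum ProcessingMode { case automatic, platform, software }

enum ResolvedPath: Equatable { case platform, software }

struct PlatformNotAllowed: Error {}

/// Resolves a requested mode against the "platform voice processing allowed" policy.
func resolve(_ requested: ProcessingMode, platformAllowed: Bool) throws -> ResolvedPath {
    switch requested {
    case .software:
        // Software processing does not depend on the platform policy.
        return .software
    case .platform:
        // Per the doc comment: `platform` is rejected when the platform path is disallowed.
        guard platformAllowed else { throw PlatformNotAllowed() }
        return .platform
    case .automatic:
        // Per the doc comment: `automatic` falls back to software when disallowed.
        // Assumption: it prefers the platform path when allowed.
        return platformAllowed ? .platform : .software
    }
}
```

Under this model, a caller that wants a guaranteed outcome requests `.software` or `.platform` explicitly and handles the thrown error, while `.automatic` never fails and simply follows the policy.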

7 files changed

Lines changed: 519 additions & 41 deletions

.changes/audio-processing-options

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+patch type="added" "Add AudioProcessingOptions"

Docs/audio.md

Lines changed: 6 additions & 6 deletions
@@ -46,25 +46,25 @@ AudioManager.shared.audioSession.isAutomaticDeactivationEnabled = false

 When set to `false`, the audio session remains active after the LiveKit call ends, preserving your app's audio state.

-## Disabling Voice Processing
+## Disallowing Platform Voice Processing

-Apple's voice processing is enabled by default, such as echo cancellation and auto-gain control.
+Apple's platform voice processing is allowed by default, such as echo cancellation and auto-gain control.

-If your app doesn't require voice processing at all, you can disable it entirely:
+If your app must not use Apple Voice Processing I/O, disable voice processing:

 ```swift
 try AudioManager.shared.setVoiceProcessingEnabled(false)
 ```

-This restarts the internal `AVAudioEngine` to apply the change. It can cause a short audio glitch, so it is recommended to set it once before connecting to a Room. Disabling voice processing also disables muted speaker detection.
+This restarts the internal `AVAudioEngine` when an Apple VPIO path is active. It is recommended to set it once before connecting to a Room. Runtime `AudioProcessingOptions` with `automatic` mode will fall back to WebRTC software processing while platform voice processing is disallowed.

-If your app requires toggling voice processing at run-time, it is recommended to use:
+For per-track or per-capture software processing, use `AudioProcessingOptions` with `.software` modes. The lower-level bypass API remains available when you need to directly control Apple VPIO:

 ```swift
 AudioManager.shared.isVoiceProcessingBypassed = true
 ```

-Set it back to `false` to re-enable processing. This uses `AVAudioEngine`'s [isVoiceProcessingBypassed](https://developer.apple.com/documentation/avfaudio/avaudioinputnode/isvoiceprocessingbypassed) and works seamlessly at run-time.
+Set it back to `false` to re-enable the Apple path. This uses `AVAudioEngine`'s [isVoiceProcessingBypassed](https://developer.apple.com/documentation/avfaudio/avaudioinputnode/isvoiceprocessingbypassed). Runtime `AudioProcessingOptions` can overwrite this Apple-specific state when capture starts or options are reapplied.

 ## Other audio ducking

Sources/LiveKit/Audio/Manager/AudioManager.swift

Lines changed: 39 additions & 8 deletions
@@ -320,10 +320,15 @@ public class AudioManager: Loggable {
         set { RTC.audioDeviceModule.duckingLevel = newValue.toRTCType() }
     }

-    /// The main flag that determines whether to enable Voice-Processing I/O of the internal AVAudioEngine. Toggling this requires restarting the AudioEngine.
-    /// Setting this to `false` prevents any voice-processing-related initialization, and muted talker detection will not work.
-    /// Typically, it is recommended to keep this set to `true` and toggle ``isVoiceProcessingBypassed`` when possible.
-    /// Defaults to `true`.
+    /// Whether Apple's platform voice processing is allowed.
+    ///
+    /// Defaults to `true`. When set to `false`, runtime ``AudioProcessingOptions``
+    /// treat Apple Voice Processing I/O as unavailable. `automatic` mode falls
+    /// back to WebRTC software processing and `platform` mode is rejected.
+    ///
+    /// Use ``AudioProcessingOptions`` with `.software` modes for per-track or
+    /// per-capture software voice processing. Use this policy when the app must
+    /// guarantee Apple Voice Processing I/O is not used.
     public var isVoiceProcessingEnabled: Bool { RTC.audioDeviceModule.isPlatformVoiceProcessingAllowed }

     public func setVoiceProcessingEnabled(_ enabled: Bool) throws {
@@ -333,6 +338,8 @@ public class AudioManager: Loggable {

     /// Bypass Voice-Processing I/O of internal AVAudioEngine.
     /// It is valid to toggle this at runtime and AudioEngine doesn't require restart.
+    /// Runtime ``AudioProcessingOptions`` may overwrite this Apple-specific state
+    /// when capture starts or when local audio track options are reapplied.
     /// Defaults to `false`.
     public var isVoiceProcessingBypassed: Bool {
         get {
@@ -359,6 +366,21 @@ public class AudioManager: Loggable {
         set { RTC.audioDeviceModule.isVoiceProcessingAGCEnabled = newValue }
     }

+    /// Device-level platform voice-processing capability and requested/active state.
+    public var platformAudioProcessingState: PlatformAudioProcessingState {
+        RTC.audioDeviceModule.platformAudioProcessingState.toLKType()
+    }
+
+    /// Diagnostic snapshot of the resolved audio processing state.
+    ///
+    /// The audio processing module is owned by the peer connection factory and
+    /// shared engine-wide, so this reflects what is actually applied across the
+    /// engine rather than any single track or connection — use it to verify what
+    /// a ``LocalAudioTrack/setAudioProcessingOptions(_:)`` request resolved to.
+    public var audioProcessingState: AudioProcessingState {
+        RTC.audioProcessingState().toLKType()
+    }
+
     /// Enables manual rendering (no-device) mode of AVAudioEngine.
     /// In this mode, you can provide audio buffers by calling `AudioManager.shared.mixer.capture(appAudio:)` continuously.
     /// Remote audio will not play out automatically. Get remote mixed audio buffers with `AudioManager.shared.add(localAudioRenderer:)` or individual tracks with ``RemoteAudioTrack/add(audioRenderer:)``.
@@ -383,22 +405,31 @@ public class AudioManager: Loggable {
     /// which keeps recording initialized and pre-warms voice processing.
     ///
     /// - Parameter enabled: Pass `true` to enable always-prepared recording, or `false` to disable it.
+    /// - Parameter audioProcessingOptions: Optional voice-processing options used when prewarming mic input.
     /// - Note: If `audioSession.isAutomaticConfigurationEnabled` is `true`, the session category is configured to `.playAndRecord`.
     /// - Note: Microphone permission is required. iOS may prompt if not already granted.
     /// - Note: This persists across ``Room`` lifecycles and connections until disabled.
     /// - Throws: An error if the underlying audio device module fails to apply the setting.
-    public func setRecordingAlwaysPreparedMode(_ enabled: Bool) async throws {
-        let result = RTC.audioDeviceModule.setRecordingAlwaysPreparedMode(enabled)
+    public func setRecordingAlwaysPreparedMode(
+        _ enabled: Bool,
+        audioProcessingOptions: AudioProcessingOptions? = nil,
+    ) async throws {
+        let result = RTC.audioDeviceModule.setRecordingAlwaysPreparedMode(
+            enabled,
+            audioProcessingOptions: audioProcessingOptions?.toRTCType(),
+        )
         try checkAdmResult(code: result)
     }

     /// Starts mic input to the SDK even without any ``Room`` or a connection.
     /// Audio buffers will flow into ``LocalAudioTrack/add(audioRenderer:)`` and ``capturePostProcessingDelegate``.
-    public func startLocalRecording() throws {
+    public func startLocalRecording(audioProcessingOptions: AudioProcessingOptions? = nil) throws {
         // Always unmute APM if muted by last session.
         RTC.audioProcessingModule.isMuted = false // TODO: Possibly not required anymore with new libs
         // Start recording on the ADM.
-        let result = RTC.audioDeviceModule.initAndStartRecording()
+        let result = RTC.audioDeviceModule.initAndStartRecording(
+            audioProcessingOptions: audioProcessingOptions?.toRTCType(),
+        )
         try checkAdmResult(code: result)
     }

Sources/LiveKit/Core/RTC.swift

Lines changed: 4 additions & 0 deletions
@@ -83,6 +83,10 @@ actor RTC {
             delegate: nil) }
     }

+    static func audioProcessingState() -> LKRTCAudioProcessingState {
+        DispatchQueue.liveKitWebRTC.sync { peerConnectionFactory.audioProcessingState }
+    }
+
     static func createVideoSource(forScreenShare: Bool) -> LKRTCVideoSource {
         DispatchQueue.liveKitWebRTC.sync { peerConnectionFactory.videoSource(forScreenCast: forScreenShare) }
     }

Sources/LiveKit/Track/Local/LocalAudioTrack.swift

Lines changed: 25 additions & 1 deletion
@@ -66,6 +66,10 @@ public class LocalAudioTrack: Track, LocalTrackProtocol, AudioTrackProtocol, @un
             "googNoiseSuppression": options.noiseSuppression.toString(),
             "googTypingNoiseDetection": options.typingNoiseDetection.toString(),
             "googHighpassFilter": options.highpassFilter.toString(),
+            "echoCancellationMode": options.echoCancellationMode.toConstraintValue(),
+            "autoGainControlMode": options.autoGainControlMode.toConstraintValue(),
+            "noiseSuppressionMode": options.noiseSuppressionMode.toConstraintValue(),
+            "highPassFilterMode": options.highPassFilterMode.toConstraintValue(),
         ]

         let audioConstraints = DispatchQueue.liveKitWebRTC.sync { LKRTCMediaConstraints(mandatoryConstraints: nil,
@@ -90,12 +94,32 @@ public class LocalAudioTrack: Track, LocalTrackProtocol, AudioTrackProtocol, @un
         try await super._unmute()
     }

+    /// Updates this local track's voice processing options without restarting capture.
+    ///
+    /// If this track is already published, WebRTC reapplies the updated options through
+    /// the active sender. Effective APM configuration is shared by the WebRTC voice engine,
+    /// so conflicting updates from multiple local audio tracks are last-writer-wins.
+    @discardableResult
+    public func setAudioProcessingOptions(_ options: AudioProcessingOptions) throws -> AudioProcessingOptionsResult {
+        guard let audioTrack = mediaTrack as? LKRTCAudioTrack else {
+            throw LiveKitError(.invalidState, message: "Media track is not an audio track")
+        }
+        let result = audioTrack.setAudioProcessingOptions(options.toRTCType()).toLKType()
+        guard result.isSuccess else {
+            let reason = result.message.isEmpty ? "\(result.code)" : "\(result.code): \(result.message)"
+            throw LiveKitError(.webRTC, message: "Failed to set audio processing options: \(reason)")
+        }
+        return result
+    }
+
     // MARK: - Internal

     override func startCapture() async throws {
         // AudioDeviceModule's InitRecording() and StartRecording() automatically get called by WebRTC, but
         // explicitly init & start it early to detect audio engine failures (mic not accessible for some reason, etc.).
-        try AudioManager.shared.startLocalRecording()
+        try AudioManager.shared.startLocalRecording(
+            audioProcessingOptions: captureOptions.audioProcessingOptions,
+        )
     }

     override func stopCapture() async throws {
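
A reader's note on the last-writer-wins behavior described in the `setAudioProcessingOptions(_:)` doc comment: because the effective APM configuration is shared engine-wide, the most recent request from any local audio track replaces earlier ones. The sketch below models only that semantic. `Options`, `SharedEngine`, and `Track` are hypothetical stand-ins, not the SDK's types, and the two Boolean fields are placeholders, since the fields of `AudioProcessingOptions` are not shown in this diff.

```swift
// Hypothetical stand-ins for illustration; the SDK's real types are not shown in this diff.
struct Options: Equatable {
    var echoCancellation: Bool
    var noiseSuppression: Bool
}

/// Models a single, engine-wide processing configuration shared by every track.
final class SharedEngine {
    private(set) var effective: Options?

    func apply(_ options: Options) {
        // The latest request overwrites whatever was applied before.
        effective = options
    }
}

/// Models a local track that forwards its options to the shared engine.
struct Track {
    let name: String
    let engine: SharedEngine

    func setOptions(_ options: Options) {
        engine.apply(options)
    }
}

let engine = SharedEngine()
let first = Track(name: "mic-a", engine: engine)
let second = Track(name: "mic-b", engine: engine)

first.setOptions(Options(echoCancellation: true, noiseSuppression: true))
second.setOptions(Options(echoCancellation: false, noiseSuppression: true))
// `engine.effective` now holds the request made through `second`.
```

In the SDK itself, the `audioProcessingState` snapshot added to `AudioManager` in this commit is the documented way to check what a request resolved to, so an app with several local audio tracks would read that state after each update instead of assuming its own request is still in effect.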
