feat(audio): add AudioProcessingOptions #1048
Open
hiroshihorie wants to merge 3 commits into
… contract Introduce AudioProcessingOptions and wire it through AudioCaptureOptions, AudioManager, and LocalAudioTrack. The audio processing state read-back moved from PeerConnection to the factory upstream, so the per-publisher source registry is gone: AudioManager reads the factory-owned engine-wide state directly. State types follow the v2 vocabulary (requested -> resolved -> active -> effective) with collapsed booleans, and the device-level BuiltIn* types are renamed Platform*. ADM voice-processing calls target the renamed isPlatformVoiceProcessingAllowed accessors. Docs/audio.md updated to match.
Rename AudioManager.setVoiceProcessingEnabled(_:)/isVoiceProcessingEnabled to setPlatformVoiceProcessingAllowed(_:)/isPlatformVoiceProcessingAllowed to match the underlying ADM accessors and the v2 'allowed' vocabulary. Keep the old names as deprecated, renamed forwarders. Update tests and Docs/audio.md to the new API.
Add a Docs/audio.md section covering the per-effect AudioProcessingMode (software / platform / automatic) and AudioProcessingOptions: how to set modes at publish time, as a room default, and at runtime via setAudioProcessingOptions, plus how to verify the resolved implementation through AudioManager.audioProcessingState.
Closed
Summary
Adds an explicit, per-effect audio processing API so apps can choose how each effect runs, either Apple's platform Voice Processing I/O on the device or WebRTC's software processing, instead of relying on implicit toggles. This also adopts the webrtc-sdk audio processing state v2 contract for accurate, engine-wide diagnostics.
Motivation
Until now, software voice processing was reached indirectly by disabling platform voice processing, and the exact behavior changed across releases. Some apps deliberately avoid Apple VPIO (for consistent hardware volume, faster call start, screen recording with audio, and no system mute sounds) while still wanting echo cancellation and noise suppression in software. This PR makes that an explicit, stable choice.
New API
AudioProcessingMode
Per-effect selection of the implementation:
.automatic (default): prefer platform processing when available, fall back to WebRTC software otherwise.
.platform: use platform processing only. Rejected if the platform implementation is unavailable.
.software: force WebRTC software processing and disable the matching platform effect when possible.
AudioProcessingOptions
A value type describing all four effects (echo cancellation, noise suppression, auto gain control, high-pass filter) with an enabled flag and a mode each. Includes AudioProcessingOptions.communication and AudioProcessingOptions.noProcessing presets.
AudioCaptureOptions
Gains matching *Mode fields (echoCancellationMode, autoGainControlMode, noiseSuppressionMode, highPassFilterMode), plus interop with AudioProcessingOptions (a convenience init and an audioProcessingOptions accessor). Existing call sites keep working since modes default to .automatic.
Runtime control
LocalAudioTrack.setAudioProcessingOptions(_:) applies options to an already-published track and returns an AudioProcessingOptionsResult (for example .applied, .stored, or a rejection reason).
Diagnostics
AudioManager.audioProcessingState and AudioManager.platformAudioProcessingState expose the v2 state, following the requested -> resolved -> active -> effective vocabulary, read from the factory-owned, engine-wide module rather than a single peer connection.
Renamed API
AudioManager.setVoiceProcessingEnabled(_:) and isVoiceProcessingEnabled are renamed to setPlatformVoiceProcessingAllowed(_:) and isPlatformVoiceProcessingAllowed, matching the underlying ADM accessors and the "allowed" policy meaning. The old names remain as deprecated, renamed forwarders, so existing code keeps compiling with a fix-it.
Usage
Publish the microphone with all effects forced to software:
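A minimal sketch of what such a configuration could look like. The types below are local stand-ins that only mirror names from this PR (AudioProcessingMode, AudioProcessingOptions); EffectOptions and withAllModes(_:) are hypothetical, and the SDK's real initializers and property names may differ.

```swift
// Local stand-ins mirroring names from this PR; NOT the SDK's declarations.
enum AudioProcessingMode { case automatic, platform, software }

// Hypothetical per-effect pair: an enabled flag plus a mode.
struct EffectOptions: Equatable {
    var enabled: Bool = true
    var mode: AudioProcessingMode = .automatic
}

// Stand-in value type covering the four effects named in the description.
struct AudioProcessingOptions: Equatable {
    var echoCancellation = EffectOptions()
    var noiseSuppression = EffectOptions()
    var autoGainControl = EffectOptions()
    var highPassFilter = EffectOptions()

    // Hypothetical helper: a copy with every effect set to the same mode.
    func withAllModes(_ mode: AudioProcessingMode) -> AudioProcessingOptions {
        var copy = self
        copy.echoCancellation.mode = mode
        copy.noiseSuppression.mode = mode
        copy.autoGainControl.mode = mode
        copy.highPassFilter.mode = mode
        return copy
    }
}

// All four effects stay enabled, each forced to the software implementation.
let softwareOnly = AudioProcessingOptions().withAllModes(.software)
print(softwareOnly.echoCancellation.mode)
```

In the SDK itself, a value like this would be handed over when the microphone is published; the exact call is whatever this PR's diff defines.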
Set it as the room default so it applies whenever the mic track is published:
Guarantee Apple VPIO is never used, then rely on software processing:
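A rough sketch of the policy switch, using a stand-in class rather than the SDK's AudioManager. The accessor names come from this PR; AudioManagerSketch and requestedModes are hypothetical, and the real methods may throw or differ in signature.

```swift
// Local stand-in for the per-effect mode named in this PR.
enum AudioProcessingMode { case automatic, platform, software }

// Stand-in for the policy accessors named in this PR; not the SDK's AudioManager.
final class AudioManagerSketch {
    private(set) var isPlatformVoiceProcessingAllowed = true

    func setPlatformVoiceProcessingAllowed(_ allowed: Bool) {
        isPlatformVoiceProcessingAllowed = allowed
    }

    // Old spelling kept as a deprecated, renamed forwarder so callers get a fix-it.
    @available(*, deprecated, renamed: "setPlatformVoiceProcessingAllowed(_:)")
    func setVoiceProcessingEnabled(_ enabled: Bool) {
        setPlatformVoiceProcessingAllowed(enabled)
    }
}

let manager = AudioManagerSketch()

// Rule out the platform path first.
manager.setPlatformVoiceProcessingAllowed(false)

// With the platform path ruled out, each effect is requested in software.
let requestedModes: [String: AudioProcessingMode] = [
    "echoCancellation": .software,
    "noiseSuppression": .software,
]
print(manager.isPlatformVoiceProcessingAllowed, requestedModes.count)
```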
Verify what each effect resolved to:
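The mode rules from the New API section can be sketched as a small resolver, which shows what a resolved value per effect might look like. This is an illustration with local stand-in types, not the SDK's AudioManager.audioProcessingState; Resolved and resolve(_:platformAvailable:) are hypothetical names, and the sketch assumes a single platformAvailable flag per effect.

```swift
// Local stand-ins; the SDK's state types are not reproduced here.
enum AudioProcessingMode { case automatic, platform, software }
enum Resolved: Equatable { case platform, software, rejected }

// Resolve one effect's requested mode, given whether a platform implementation is usable.
func resolve(_ mode: AudioProcessingMode, platformAvailable: Bool) -> Resolved {
    switch mode {
    case .software:
        return .software                                   // always WebRTC software
    case .platform:
        return platformAvailable ? .platform : .rejected   // no silent fallback
    case .automatic:
        return platformAvailable ? .platform : .software   // prefer platform, else software
    }
}

let requested: [(effect: String, mode: AudioProcessingMode, platformAvailable: Bool)] = [
    ("echoCancellation", .automatic, true),
    ("noiseSuppression", .software, true),
    // Assumption: no platform implementation for this effect, so .platform is rejected.
    ("highPassFilter", .platform, false),
]

for entry in requested {
    print(entry.effect, "->", resolve(entry.mode, platformAvailable: entry.platformAvailable))
}
```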
Documentation
Docs/audio.md gains a new section, "Audio Processing Modes (software, platform, automatic)", covering publish-time, room-default, and runtime configuration, plus how to verify the resolved implementation. The "Disallowing Platform Voice Processing" section is updated to the renamed API.
Example app
A companion example demonstrates the full surface, including a live "Voice Processing" and "Runtime Audio Processing" panel with effective-state diagnostics:
Multiplatform/Views/AudioControlsPanel.swift
Multiplatform/Controllers/AppContext.swift
That branch pins its SDK dependency to this branch (hiroshi/runtime-vp).
Notes
LiveKitWebRTC 144.7559.10 (already on main).
.platform is rejected for highPassFilterMode. Use .software.
setAudioProcessingOptions(_:), not by toggling the mic.
Test plan
swift build passes
swift test