Audio Processing Node
The audio processing node calculates the horizontal angle (yaw) of an audio source in space, together with the probability that the localisation is correct.
To detect the presence of an audio source, a noise threshold is calculated as the median RMS value of the previous noise_floor_buffer_length windows. The threshold is updated as new messages arrive on the input topic.
The RMS value of each audio data window from mic1 is compared to the noise threshold, and the source localisation algorithm runs only when the RMS value is above that threshold.
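The noise gate described above can be sketched roughly as follows. This is a minimal illustration, not the node's actual code: the `NoiseFloor` class and the `rms` helper are hypothetical names, and the order in which the buffer is updated and compared is an assumption.

```python
from collections import deque
from math import sqrt
from statistics import median


def rms(window):
    """Root mean square of one window of samples."""
    return sqrt(sum(s * s for s in window) / len(window))


class NoiseFloor:
    """Median RMS of the last `buffer_length` windows (hypothetical helper)."""

    def __init__(self, buffer_length=64):
        # deque with maxlen drops the oldest value automatically.
        self.history = deque(maxlen=buffer_length)

    def update(self, window_rms):
        """Record the RMS of a new window."""
        self.history.append(window_rms)

    def threshold(self):
        """Current noise threshold: median of the buffered RMS values."""
        return median(self.history)

    def is_source_present(self, window_rms):
        # Assumption: with an empty buffer nothing is reported as a source.
        return bool(self.history) and window_rms > self.threshold()
```

A caller would compute `rms(...)` on each mic1 window, check `is_source_present`, and then call `update` so the threshold follows the ambient noise level.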
The source localisation algorithm is based on the difference between the RMS values of the two microphones on each axis. Given the two differences, the angle is calculated using arctan.
i.e.
angle = arctan((rms_mic1 - rms_mic3) / (rms_mic2 - rms_mic4))
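As a rough sketch of the formula above, the function below computes the yaw from four RMS values. The name `yaw_from_rms` is hypothetical. Note that it uses `math.atan2` in place of a plain arctan of the ratio: this is a substitution made here, not something the node is confirmed to do, chosen because it avoids a division by zero when `rms_mic2 == rms_mic4`. It agrees with the formula whenever the denominator is positive.

```python
import math


def yaw_from_rms(rms_mic1, rms_mic2, rms_mic3, rms_mic4):
    """Estimate the yaw (radians) from per-microphone RMS values."""
    # Numerator and denominator of the formula in the text.
    diff_13 = rms_mic1 - rms_mic3
    diff_24 = rms_mic2 - rms_mic4
    # atan2 is an assumption of this sketch; the text only says arctan.
    return math.atan2(diff_13, diff_24)
```

For example, `yaw_from_rms(2.0, 2.0, 1.0, 1.0)` has both differences equal to 1, so the result is arctan(1), which is π/4.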
A buffer of the previous source_loc_buffer_length angles is used to calculate the probability. Angles that lie within similarity_distance (in radians) of each other are grouped together, and the number of occurrences in the largest group is divided by the buffer length to get the probability of the localisation.
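One way to read the probability step is sketched below. The grouping rule is an assumption: each buffered angle is treated as a candidate, and every angle within `similarity_distance` of it counts as an occurrence. The function name `localisation_probability` is hypothetical, and the node's real grouping may differ.

```python
def localisation_probability(angles, buffer_length, similarity_distance=0.523):
    """Return (angle, probability) for the most frequent group of angles."""
    if not angles:
        return None, 0.0
    best_angle, best_count = None, 0
    for candidate in angles:
        # Count the buffered angles that fall close enough to the candidate.
        count = sum(
            1 for a in angles if abs(a - candidate) <= similarity_distance
        )
        if count > best_count:
            best_angle, best_count = candidate, count
    # Divide by the configured buffer length, as described in the text.
    return best_angle, best_count / buffer_length
```

With `angles = [0.1, 0.2, 0.15, 1.5]` and `buffer_length = 4`, three angles fall in the same group, so the probability is 0.75.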
| Type | Topic | Message |
|---|---|---|
| input | /pandora_audio/kinect_audio_capture_stream | AudioData.msg |
| output | /data_fusion_subscriber | GeneralAlertMsg.msg |
processing_params.yaml
window_size: 1024
noise_floor_buffer_length: 64
source_loc_buffer_length: 16
similarity_distance: 0.523
roslaunch pandora_kinect_audio_processing process.launch
Note that the audio capture node (process.launch) should already be running.