
Audio Processing Node

Nikolaos Tsipas edited this page Aug 28, 2014 · 2 revisions

The audio processing node is used to calculate the horizontal angle (yaw) and probability of an audio source in space.

Audio Detection

To detect the presence of an audio source, the node calculates a noise threshold: the median RMS value of the previous noise_floor_buffer_length windows. The threshold is updated as new messages arrive on the channel.

The RMS value of each audio data window from mic1 is compared against the noise threshold, and the source localisation algorithm runs only when the value is above that threshold.
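The detection step described above can be sketched as follows. This is a minimal illustration, not the node's actual code: the `rms` helper, the `NoiseGate` class and its method name are hypothetical, and it assumes every window (quiet or loud) is added to the noise-floor history.

```python
import math
from collections import deque
from statistics import median


def rms(window):
    """Root-mean-square level of one window of audio samples."""
    return math.sqrt(sum(s * s for s in window) / len(window))


class NoiseGate:
    """Hypothetical helper: keeps a rolling noise floor and gates localisation."""

    def __init__(self, noise_floor_buffer_length=64):
        # Only the most recent RMS values are kept; older ones drop out.
        self.history = deque(maxlen=noise_floor_buffer_length)

    def is_source_present(self, window):
        level = rms(window)
        # Threshold is the median RMS of the previous windows.
        # With no history yet there is no threshold, so report "no source".
        present = bool(self.history) and level > median(self.history)
        # Assumption: every window updates the noise floor, including loud ones.
        self.history.append(level)
        return present
```

Using the median rather than the mean keeps a few loud windows from raising the threshold much, which is presumably why it was chosen here.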

Audio Source Localisation

The audio source localisation algorithm is based on the difference between the RMS values of the two microphones on each axis. Given these two differences, the angle is the arctangent of their ratio.

i.e.

angle = arctan((rms_mic1 - rms_mic3) / (rms_mic2 - rms_mic4)) 
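As a rough sketch of the formula above, the function below computes the yaw from four RMS values. The function name is hypothetical. Note one deliberate substitution: it uses `math.atan2` instead of a plain arctangent of the ratio, so that a zero denominator (mic2 and mic4 equally loud) does not raise an error. When the denominator is positive the two give the same result; whether the node itself handles the other cases this way is an assumption.

```python
import math


def estimate_yaw(rms_mic1, rms_mic2, rms_mic3, rms_mic4):
    """Hypothetical helper: yaw in radians from per-microphone RMS levels."""
    diff_13 = rms_mic1 - rms_mic3  # difference along the mic1/mic3 axis
    diff_24 = rms_mic2 - rms_mic4  # difference along the mic2/mic4 axis
    # atan2(y, x) equals atan(y / x) for x > 0 and is still defined for x == 0.
    return math.atan2(diff_13, diff_24)
```

For example, `estimate_yaw(2.0, 2.0, 1.0, 1.0)` has equal differences on both axes and yields an angle of pi/4.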

A buffer of the previous source_loc_buffer_length angles is used to calculate the probability. The number of occurrences of the most frequent angle (angles are grouped using similarity_distance, in radians) is divided by the buffer length to give the probability of the localisation.
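One possible reading of this step is sketched below. The grouping rule is an assumption: each buffered angle is treated as a candidate, and every angle within similarity_distance of it counts as an occurrence. The function name is hypothetical, and wrap-around at ±pi is ignored for simplicity.

```python
def most_likely_angle(angles, source_loc_buffer_length=16, similarity_distance=0.523):
    """Hypothetical helper: return (dominant angle, probability) for a buffer of angles."""
    best_angle, best_count = None, 0
    for candidate in angles:
        # Assumed grouping: angles within similarity_distance count as the same angle.
        count = sum(1 for other in angles if abs(other - candidate) <= similarity_distance)
        if count > best_count:
            best_angle, best_count = candidate, count
    # Probability is the size of the largest group divided by the buffer length.
    return best_angle, best_count / source_loc_buffer_length
```

In a running node the angles would likely live in a fixed-size buffer such as `collections.deque(maxlen=source_loc_buffer_length)`, so the probability stays low until the buffer has filled with consistent estimates.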

Comms

| Type | Topic | Message |
| --- | --- | --- |
| input | /pandora_audio/kinect_audio_capture_stream | AudioData.msg |
| output | /data_fusion_subscriber | GeneralAlertMsg.msg |

Configuration

processing_params.yaml

window_size: 1024
noise_floor_buffer_length: 64
source_loc_buffer_length: 16
similarity_distance: 0.523

How to run

roslaunch pandora_kinect_audio_processing process.launch

Note that the audio capture node (process.launch) should already be running before this node is launched.

