Audio: MFCC: Add Voice Activity Detection based on Mel spectrum
This patch adds a new mfcc_vad module that implements VAD operating
on the Mel log spectrum values produced by the MFCC component. The
VAD is very simple and is not very selective for voice vs. other
signals, but the continuously updated background noise estimate
prevents stationary noise from triggering the VAD.
The algorithm tracks a per-bin noise floor (instant-down, slow-rise)
and computes an A-weighted energy delta. The weighting emphasizes
speech frequencies. Speech is declared when the delta exceeds a
threshold (0.30 in Q9.23) with a 20-frame hangover to prevent rapid
toggling.
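The decision logic described above can be sketched roughly as follows.
This is an illustration only, not the code from the patch: the names
(vad_state, vad_update, vad_weight), the bin count, the weight values,
the Q1.15 weight format, and the slow-rise rate are all assumptions.
Only the 0.30 threshold in Q9.23 and the 20-frame hangover come from
the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VAD_BINS       4        /* hypothetical Mel bin count, small for the example */
#define VAD_THRESHOLD  2516582  /* 0.30 in Q9.23, i.e. round(0.30 * 2^23) */
#define VAD_HANGOVER   20       /* frames the flag is held after the last trigger */
#define VAD_RISE_SHIFT 6        /* assumed slow-rise rate: 1/64 of the gap per frame */

struct vad_state {
	int32_t noise[VAD_BINS]; /* per-bin noise floor, Q9.23 */
	int hangover;            /* remaining hold frames */
	bool initialized;
};

/* Placeholder weights in Q1.15 that sum to 1.0; not real A-weighting values. */
static const int16_t vad_weight[VAD_BINS] = { 4096, 12288, 12288, 4096 };

/* Process one frame of Mel log spectrum values (Q9.23), return the VAD flag. */
static bool vad_update(struct vad_state *s, const int32_t *mel)
{
	int64_t delta = 0;
	int i;

	assert(s && mel);

	if (!s->initialized) {
		/* Seed the noise floor from the first frame. */
		for (i = 0; i < VAD_BINS; i++)
			s->noise[i] = mel[i];
		s->initialized = true;
	}

	for (i = 0; i < VAD_BINS; i++) {
		int64_t diff = (int64_t)mel[i] - s->noise[i];

		if (diff < 0) {
			/* Instant-down: follow a falling level immediately. */
			s->noise[i] = mel[i];
		} else {
			/* Accumulate weighted excess over the floor, then slow-rise. */
			delta += vad_weight[i] * diff;
			s->noise[i] += (int32_t)(diff >> VAD_RISE_SHIFT);
		}
	}
	delta >>= 15; /* remove the Q1.15 weight scaling, back to Q9.23 */

	if (delta > VAD_THRESHOLD) {
		s->hangover = VAD_HANGOVER;
		return true;
	}
	if (s->hangover > 0) {
		s->hangover--;
		return true;
	}
	return false;
}
```

With these assumed constants, a steady input stops triggering once the
floor has risen to within the threshold of the signal level and the
hangover has run out, which is the stationary-noise behavior mentioned
above.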
The VAD flag is inserted into the output stream as the first value
after the magic header word in all format paths (S16, S24, S32).
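As a rough illustration of the resulting frame layout for the S16
path, the sketch below writes a header word, the VAD flag, and then
the coefficients. The function name pack_frame_s16, the EXAMPLE_MAGIC
value, and the single 16-bit header word are hypothetical; the actual
magic value and word widths are not given in this description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAGIC 0x6D66 /* hypothetical header value, not the real one */

/*
 * Pack one output frame as: [magic][vad flag][coef 0]...[coef n-1].
 * Returns the number of words written, or 0 if the buffer is too small.
 */
static size_t pack_frame_s16(int16_t *out, size_t out_len, int vad_flag,
			     const int16_t *coef, size_t n_coef)
{
	size_t needed = n_coef + 2;
	size_t i;

	assert(out && coef);

	if (out_len < needed)
		return 0;

	out[0] = EXAMPLE_MAGIC;
	out[1] = vad_flag ? 1 : 0; /* first value after the header word */
	for (i = 0; i < n_coef; i++)
		out[2 + i] = coef[i];

	return needed;
}
```

A consumer of such a stream would need to skip one extra word after
the header before reading coefficients when the VAD option is enabled.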
A new Kconfig option CONFIG_COMP_MFCC_VAD (depends on COMP_MFCC,
default n) gates compilation of the VAD code and the stream format
change.
Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>