Voice Activity Detection

About Voice Activity Detection

Voice activity detection (VAD) identifies which segments of an audio signal contain speech and which do not. It is a common building block in speech recognition, speaker identification, and audio coding. VAD algorithms analyze characteristics of the signal, such as energy, spectral content, and pitch, to decide whether a given segment contains speech or silence. With accurate detection, downstream stages process only the speech segments, which reduces computational cost and keeps non-speech audio out of later analysis.
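To make the energy-based idea concrete, here is a minimal Python sketch of a frame-level detector. It is an illustration only and is not how the Switchboard VoiceActivityDetectorNode works internally, which this page does not describe. The function names, the 20 ms frame size, the -40 dB threshold, and the hangover count are all assumptions chosen for the example.

```python
import math

def frame_energy_db(frame):
    """Return the mean-square energy of a frame in dB, floored at -120 dB."""
    if not frame:
        return -120.0
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(mean_square, 1e-12))

def detect_voice_activity(samples, sample_rate, frame_ms=20,
                          threshold_db=-40.0, hangover_frames=5):
    """Label each non-overlapping frame as active (True) or inactive (False).

    Hypothetical helper: a frame is active when its energy exceeds
    threshold_db. The hangover keeps the decision active for a few frames
    after the energy drops, so short pauses do not toggle the state.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    decisions = []
    hangover = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        if frame_energy_db(frame) > threshold_db:
            hangover = hangover_frames
            decisions.append(True)
        elif hangover > 0:
            hangover -= 1
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions

def make_test_signal(sample_rate=16000):
    """Build 0.5 s of silence, 0.5 s of a 200 Hz tone, then 0.5 s of silence."""
    half_second = sample_rate // 2
    silence = [0.0] * half_second
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate)
            for n in range(half_second)]
    return silence + tone + silence

if __name__ == "__main__":
    decisions = detect_voice_activity(make_test_signal(), 16000)
    print("".join("#" if active else "." for active in decisions))
```

A pure energy threshold like this one also fires on loud non-speech sounds, which is why practical detectors usually combine energy with spectral or pitch features as described above.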

Switchboard Editor example

This example plays a recording of a conversation. The Switchboard VoiceActivityDetectorNode detects the speech segments in the recording, and the example displays the current detection state.

Code Example

{
  "nodes": [
    { "id": "vadNode", "type": "VoiceActivityDetectorNode" },
    { "id": "splitterNode", "type": "BusSplitterNode" }
  ],
  "connections": [
    { "sourceNode": "inputNode", "destinationNode": "splitterNode" },
    { "sourceNode": "splitterNode", "destinationNode": "vadNode" },
    { "sourceNode": "splitterNode", "destinationNode": "outputNode" }
  ]
}