SileroVAD Extension

Silero Voice Activity Detection (VAD) is a machine learning-based tool designed to identify segments of audio that contain human speech. It is highly efficient, lightweight, and capable of operating in real-time, making it ideal for applications such as speech recognition, telecommunication systems, and audio processing pipelines.

Dependencies

The SileroVAD extension requires the ONNX extension to be loaded.

SileroVAD Node

The voice activity detector (VAD) node uses SileroVAD to detect the presence of voice activity in the audio signal.

Configuration

| Name | Type | Description |
| --- | --- | --- |
| frameSize | int | The size of the audio buffer (in number of frames) used for processing. |
| threshold | float | The sensitivity threshold for detecting voice activity. Higher values make detection stricter. |
| minSilenceDurationMs | int | The minimum duration of silence (in milliseconds) required to consider speech as ended. |
| speechPadMs | int | The amount of padding (in milliseconds) added before and after detected speech segments. |
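
To make the interaction between threshold, minSilenceDurationMs, and speechPadMs more concrete, here is a minimal Python sketch of how such parameters are typically applied to a stream of per-frame speech probabilities. It is an illustration only and not the extension's actual implementation: the function `detect_segments`, its arguments, and the assumption that each frame yields one probability in the range 0 to 1 are all hypothetical.

```python
from typing import List, Optional, Tuple


def detect_segments(
    probs: List[float],
    frame_ms: int,
    threshold: float,
    min_silence_ms: int,
    speech_pad_ms: int,
) -> List[Tuple[int, int]]:
    """Turn per-frame speech probabilities into padded (start_ms, end_ms) segments.

    Hypothetical helper: names and behavior are assumptions for illustration.
    """
    segments: List[Tuple[int, int]] = []
    total_ms = len(probs) * frame_ms
    start_ms: Optional[int] = None  # start of the open speech segment, if any
    silence_ms = 0                  # length of the current silent run inside a segment

    def close(end_ms: int) -> None:
        # Apply padding on both sides, clamped to the bounds of the audio.
        segments.append(
            (max(0, start_ms - speech_pad_ms), min(total_ms, end_ms + speech_pad_ms))
        )

    for i, p in enumerate(probs):
        t = i * frame_ms
        if p >= threshold:
            # Speech frame: open a segment if needed and reset the silence counter.
            if start_ms is None:
                start_ms = t
            silence_ms = 0
        elif start_ms is not None:
            # Silent frame inside a segment: end speech only after enough silence.
            silence_ms += frame_ms
            if silence_ms >= min_silence_ms:
                close(t + frame_ms - silence_ms)  # the point where the silence began
                start_ms = None
                silence_ms = 0

    if start_ms is not None:
        # The audio ended while a segment was still open.
        close(total_ms - silence_ms)
    return segments


# Nine frames of 32 ms each, with one short dip in the middle of the speech.
probs = [0.1, 0.9, 0.9, 0.2, 0.9, 0.1, 0.1, 0.1, 0.1]
print(detect_segments(probs, 32, threshold=0.5, min_silence_ms=64, speech_pad_ms=10))
print(detect_segments(probs, 32, threshold=0.5, min_silence_ms=32, speech_pad_ms=10))
```

In this sketch, the 32 ms dip is bridged when min_silence_ms is 64, so the first call yields a single segment, while lowering it to 32 splits the speech into two segments. Raising threshold above the highest probability yields no segments at all. Padding is applied independently to each segment, so adjacent padded segments may overlap.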

Values

This object does not provide any values.

Actions

This object does not provide any actions.

Events

| Name | Data | Description |
| --- | --- | --- |
| start | Double | Indicates when voice activity starts. |
| end | Double | Indicates when voice activity ends. |

Download

You can find the download links for this extension on our Downloads page.