SileroVAD Extension
Silero Voice Activity Detection (VAD) is a machine-learning model that identifies the segments of an audio signal that contain human speech. It is lightweight and fast enough to run in real time, which makes it well suited to applications such as speech recognition, telecommunication systems, and audio processing pipelines.
Dependencies
The SileroVAD extension requires the ONNX extension to be loaded.
SileroVAD Node
The voice activity detector (VAD) node uses SileroVAD to detect the presence of voice activity in the audio signal.
Configuration
Name | Type | Description |
---|---|---|
frameSize | int | The size of the audio buffer (in number of frames) used for processing. |
threshold | float | The sensitivity threshold for detecting voice activity. Higher values make detection stricter. |
minSilenceDurationMs | int | The minimum duration of silence (in milliseconds) required to consider speech as ended. |
speechPadMs | int | The amount of padding (in milliseconds) added before and after detected speech segments. |
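To illustrate how threshold, minSilenceDurationMs, and speechPadMs typically interact, here is a minimal Python sketch of a segmentation loop over per-frame speech probabilities. It is not the extension's implementation: the function name `segment_speech`, the `frame_ms` parameter, and the list-of-probabilities input are assumptions made for illustration only.

```python
from typing import List, Tuple


def segment_speech(
    probs: List[float],
    frame_ms: int,
    threshold: float = 0.5,
    min_silence_duration_ms: int = 100,
    speech_pad_ms: int = 30,
) -> List[Tuple[int, int]]:
    """Hypothetical helper: turn per-frame speech probabilities into
    (start_ms, end_ms) segments. Not the extension's actual code."""
    segments = []
    start_ms = None       # start of the open speech segment, None while silent
    silence_start = None  # start of the current silent run inside a segment

    for i, p in enumerate(probs):
        t = i * frame_ms
        if p >= threshold:
            # Speech frame: open a segment if needed, cancel pending silence.
            if start_ms is None:
                start_ms = t
            silence_start = None
        elif start_ms is not None:
            # Silent frame inside a segment: close the segment only once the
            # silent run is at least min_silence_duration_ms long.
            if silence_start is None:
                silence_start = t
            if t + frame_ms - silence_start >= min_silence_duration_ms:
                segments.append((start_ms, silence_start))
                start_ms = None
                silence_start = None

    total_ms = len(probs) * frame_ms
    if start_ms is not None:
        # Audio ended while a segment was still open.
        end_ms = silence_start if silence_start is not None else total_ms
        segments.append((start_ms, end_ms))

    # Pad each segment on both sides, clamped to the audio bounds.
    return [
        (max(0, s - speech_pad_ms), min(total_ms, e + speech_pad_ms))
        for s, e in segments
    ]
```

In this sketch, raising `threshold` rejects more frames, raising `min_silence_duration_ms` merges speech separated by short pauses into one segment, and `speech_pad_ms` widens every segment so that word onsets and endings are not clipped.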
Values
This object does not provide any values.
Actions
This object does not provide any actions.
Events
Name | Data | Description |
---|---|---|
start | Double | Indicates when voice activity starts. |
end | Double | Indicates when voice activity ends. |
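As a rough illustration of consuming these events, the Python sketch below pairs each start event with the following end event to build a list of speech segments. It assumes the Double payload is a timestamp, which the table above does not confirm, and the class and method names (`SpeechSegmentCollector`, `on_start`, `on_end`) are hypothetical rather than part of the extension's API.

```python
from typing import List, Optional, Tuple


class SpeechSegmentCollector:
    """Hypothetical event consumer: pairs start/end events into segments.
    Assumes each event carries a timestamp as its Double payload."""

    def __init__(self) -> None:
        self.segments: List[Tuple[float, float]] = []
        self._open_start: Optional[float] = None

    def on_start(self, value: float) -> None:
        # Remember when the current speech segment began.
        self._open_start = value

    def on_end(self, value: float) -> None:
        # Ignore an end event that arrives without a matching start.
        if self._open_start is None:
            return
        self.segments.append((self._open_start, value))
        self._open_start = None
```

A caller would route the node's start and end events to `on_start` and `on_end` and read `segments` afterwards.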
Download
You can find the download links for this extension on our Downloads page.