Voice | Switchboard Documentation

📄️ Acoustic Echo Cancellation (AEC)

Acoustic echo cancellation (AEC) eliminates the echo that occurs during audio communication. Echo is a common problem that occurs

📄️ Noise Filter

A noise filter node reduces unwanted background noise in an audio stream, improving clarity in recordings and live communications. It is especially useful in environments with consistent ambient sounds like hums, hisses, or background chatter. Common features include adjustable noise reduction levels, real-time processing, and optional voice activity detection (VAD) for further enhancement.
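
To illustrate what an adjustable noise reduction level means, here is a minimal sketch of a frame-based noise gate. It is a deliberately crude stand-in for the spectral or model-based methods a real noise filter would likely use, and it is not the Switchboard API: the function name `noise_gate` and the parameters `frame_size`, `threshold`, and `reduction_db` are hypothetical.

```python
import math


def noise_gate(samples, frame_size=160, threshold=0.02, reduction_db=20.0):
    """Attenuate quiet frames by `reduction_db` decibels.

    Hypothetical helper for illustration only. `samples` is a list of
    floats in [-1.0, 1.0]; frames whose RMS level falls below
    `threshold` are treated as background noise and scaled down.
    """
    # Convert the reduction level from decibels to a linear gain factor.
    gain = 10 ** (-reduction_db / 20.0)
    out = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        # Quiet frames are attenuated; louder frames pass through unchanged.
        scale = gain if rms < threshold else 1.0
        out.extend(s * scale for s in frame)
    return out
```

Raising `reduction_db` suppresses quiet frames more aggressively, at the cost of more audible gating when speech starts and stops.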

📄️ Real-Time Communication (RTC)

An RTC (Real-Time Communication) node enables low-latency, two-way audio transmission over a network, supporting use cases like voice chat, conferencing, and live collaboration. It typically handles audio encoding/decoding, jitter buffering, packet loss concealment, and synchronization. Many RTC nodes also support integration with signaling protocols and NAT traversal techniques (e.g., STUN, TURN).
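
As a rough illustration of jitter buffering, the sketch below reorders packets by sequence number and reports gaps so that a caller could apply packet loss concealment. It assumes integer sequence numbers without wraparound and a fixed buffering depth. The `JitterBuffer` class and its methods are hypothetical and do not reflect the Switchboard API.

```python
class JitterBuffer:
    """Minimal fixed-depth jitter buffer (hypothetical, for illustration)."""

    def __init__(self, depth=3):
        self.depth = depth      # packets to collect before playout starts
        self._packets = {}      # sequence number -> payload
        self._next_seq = None   # next sequence number due for playout

    def push(self, seq, payload):
        """Store an incoming packet; it may arrive out of order."""
        if self._next_seq is not None and seq < self._next_seq:
            return  # arrived after its playout slot, so drop it
        self._packets.setdefault(seq, payload)  # ignore duplicates

    def pop(self):
        """Return the next (seq, payload) in order.

        Returns None while the buffer is still filling, and
        (seq, None) when the expected packet never arrived.
        """
        if self._next_seq is None:
            if len(self._packets) < self.depth:
                return None
            self._next_seq = min(self._packets)
        seq = self._next_seq
        self._next_seq += 1
        return seq, self._packets.pop(seq, None)
```

A payload of `None` marks a lost packet; a real pipeline would typically fill that slot with concealment audio instead of silence. A larger `depth` tolerates more network jitter but adds latency.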

📄️ Speech-to-Text (STT)

A speech-to-text (STT) node transcribes spoken audio into written text, enabling voice control, transcription services, and accessibility features. It typically supports real-time and batch processing, multiple languages, speaker diarization, and punctuation handling. Some implementations also offer confidence scoring and formatting options.

📄️ Text-to-Speech (TTS)

A text-to-speech (TTS) node converts written text into spoken audio, enabling voice-based interaction in applications like virtual assistants, screen readers, and automated announcements. Typical features include support for multiple languages, voice styles (e.g., male, female, neural), adjustable speaking rate, and pitch control.

📄️ Voice Activity Detector (VAD)

A VAD node analyzes incoming audio streams to determine the presence or absence of human speech. It's commonly used to optimize bandwidth, trigger recordings, or control processing pipelines in voice-driven applications. Features often include configurable sensitivity, noise robustness, and low-latency detection.
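
The idea of configurable sensitivity can be sketched with a simple energy-based detector. This is only an illustrative assumption: production VADs often use spectral features or trained models, and the names `frame_rms` and `detect_speech`, along with their parameters, are hypothetical and not part of the Switchboard API.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(samples, frame_size=160, threshold=0.02, hangover_frames=2):
    """Return one boolean per full frame: True where speech is assumed.

    `threshold` sets the sensitivity (lower means more sensitive).
    `hangover_frames` keeps the detector active briefly after the
    level drops, so that short pauses do not chop words.
    """
    decisions = []
    hangover = 0
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        if frame_rms(frame) >= threshold:
            hangover = hangover_frames
            decisions.append(True)
        elif hangover > 0:
            hangover -= 1
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions
```

With 16 kHz audio, a `frame_size` of 160 samples corresponds to 10 ms per decision, which keeps detection latency low.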