ONNX Extension

ONNX (Open Neural Network Exchange) is an open standard for representing machine learning models. It serves as a bridge between deep learning frameworks: a model trained in one framework can be exported to ONNX and then run by a different runtime. This is useful for audio processing use cases such as speech recognition, audio classification, music analysis, and sound synthesis, because a model converted to ONNX can be deployed across frameworks, tools, and hardware platforms without being reimplemented for each one.

The ONNX extension integrates pre-trained machine learning models into the Switchboard SDK. By importing models in the ONNX format, developers can add capabilities such as speech recognition, noise suppression, or audio classification to their audio processing while keeping the models portable across platforms.

Nodes

The ONNX extension is intended for developers creating custom extensions that utilize ONNX machine learning models. It does not directly provide nodes for end-users but serves as a foundation for building AI-powered audio solutions.

Running ONNX models in Switchboard

Switchboard SDK integrates with your existing machine learning infrastructure, so models can move from the development and research phase into a production environment. It supports real-time inference, so your machine learning models process audio data as it streams through the audio graph.
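
To make "real-time" concrete: an audio buffer of N frames at a given sample rate must be fully processed before the next buffer arrives, which gives a fixed time budget per inference call. The sketch below works through that arithmetic. The function names and the 50% headroom default are illustrative assumptions, not part of the Switchboard SDK.

```python
def buffer_deadline_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Time available to process one audio buffer before the next one arrives."""
    return 1000.0 * buffer_frames / sample_rate_hz


def fits_real_time(inference_ms: float, buffer_frames: int, sample_rate_hz: int,
                   headroom: float = 0.5) -> bool:
    """True if inference uses at most `headroom` of the per-buffer deadline.

    The headroom fraction is an assumed safety margin for the rest of the
    audio graph; tune it for your own application.
    """
    return inference_ms <= headroom * buffer_deadline_ms(buffer_frames, sample_rate_hz)
```

For example, a 512-frame buffer at 48 kHz leaves roughly 10.7 ms per buffer, so under the assumed 50% headroom a model would need to finish inference in about 5.3 ms.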

Switchboard SDK provides several features that make using machine learning models for audio processing easier. Some of these features are:

  • Model Integration: seamless integration with ONNX, allowing developers to easily load and use pre-trained models for audio processing tasks.
  • Preprocessing Functions: built-in preprocessing functions designed for audio data. These include spectrogram computation, MFCC extraction, audio normalization, and resampling, which simplify preparing the data before it is fed into the models.
  • Inference and Prediction: optimized inference capabilities, allowing developers to run audio data through the machine learning models and obtain predictions or feature representations in real time.
  • Visualization and Debugging Tools: visualization tools to facilitate the inspection of intermediate model outputs or audio features, aiding in model debugging and understanding the internal workings of the audio graph.
  • Integration with Audio I/O: interfaces to handle audio input/output, allowing seamless integration with audio devices, file formats, or streaming services for real-time audio processing.
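
As an illustration of the kind of preprocessing listed above, here is a minimal, dependency-free sketch of peak normalization and magnitude-spectrogram computation. It is not the SDK's implementation: all function names, the Hann window, and the frame and hop sizes are assumptions chosen for the example, and the naive DFT is only suitable for small frames.

```python
import cmath
import math


def peak_normalize(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    return [s * target_peak / peak for s in samples]


def frame_signal(samples, frame_size, hop_size):
    """Split samples into overlapping frames, dropping the incomplete tail."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, hop_size)]


def magnitude_spectrum(frame):
    """Apply a Hann window to one frame and return DFT magnitudes for bins 0..N/2."""
    n = len(frame)
    windowed = [x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def spectrogram(samples, frame_size=64, hop_size=32):
    """Magnitude spectrogram: one spectrum per frame (naive O(N^2) DFT)."""
    return [magnitude_spectrum(f)
            for f in frame_signal(samples, frame_size, hop_size)]
```

The result is a list of frames, each holding `frame_size // 2 + 1` magnitudes, which is the typical shape of a spectrogram input tensor before any log scaling or mel filtering.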

Node types

The ONNX Extension provides the following audio nodes for a Switchboard SDK audio graph:

| Node | Description |
| --- | --- |
| ONNX.MLSource | A source node that runs the ONNX model and generates audio data with the specified post-processing. Ideal for generative audio use cases. |
| ONNX.MLSink | A sink node that runs the ONNX model and receives audio data with the specified pre-processing. Ideal for classifier-type models. |
| ONNX.MLProcessor | A processor node that runs the ONNX model and receives audio data with the specified pre-processing, then generates audio data with the specified post-processing. Ideal for audio transformation applications such as noise reduction, source separation, and voice conversion. |
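
The three roles differ only in where pre-processing and post-processing sit around the model call. The following conceptual sketch shows that data flow with plain Python callables standing in for the ONNX model. The class names, method names, and the `clip` helper are hypothetical and chosen for illustration; they are not the Switchboard SDK API.

```python
from typing import Callable, List, Optional, Sequence

Buffer = List[float]


def clip(buf: Buffer) -> Buffer:
    """Example post-processing step: limit samples to the range [-1, 1]."""
    return [max(-1.0, min(1.0, x)) for x in buf]


class MLSourceSketch:
    """Source role: no audio input; model output is post-processed into audio."""

    def __init__(self, model: Callable[[], Buffer],
                 postprocess: Callable[[Buffer], Buffer]):
        self.model = model              # stand-in for an ONNX inference call
        self.postprocess = postprocess

    def produce(self) -> Buffer:
        return self.postprocess(self.model())


class MLSinkSketch:
    """Sink role: audio in, pre-processed, model result stored (e.g. class scores)."""

    def __init__(self, model: Callable[[Buffer], Buffer],
                 preprocess: Callable[[Buffer], Buffer]):
        self.model = model
        self.preprocess = preprocess
        self.last_result: Optional[Buffer] = None

    def consume(self, audio: Sequence[float]) -> None:
        self.last_result = self.model(self.preprocess(list(audio)))


class MLProcessorSketch:
    """Processor role: pre-process, run the model, post-process back to audio."""

    def __init__(self, model: Callable[[Buffer], Buffer],
                 preprocess: Callable[[Buffer], Buffer],
                 postprocess: Callable[[Buffer], Buffer]):
        self.model = model
        self.preprocess = preprocess
        self.postprocess = postprocess

    def process(self, audio: Sequence[float]) -> Buffer:
        return self.postprocess(self.model(self.preprocess(list(audio))))
```

In this sketch a classifier would fit the sink shape (audio in, scores out), while a noise-reduction model would fit the processor shape (audio in, audio out).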

Download

You can find the download links for this extension on our Downloads page.