ONNX Runtime for Inferencing
ONNX Runtime provides a performant solution for running inference on models from a variety of source frameworks (PyTorch, Hugging Face, TensorFlow) across different software and hardware stacks. ONNX Runtime Inference takes advantage of hardware accelerators, supports APIs in multiple languages (Python, C++, C#, C, Java, and more), and runs on cloud servers, edge and mobile devices, and in web browsers.
Learn how to install ONNX Runtime for inferencing →
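As a minimal sketch of the inference API, the Python snippet below loads an exported model and runs a single batch through it. The file name model.onnx, the input shape, and the choice of CPU execution provider are placeholder assumptions for illustration.

```python
import numpy as np
import onnxruntime as ort

# Create an inference session; execution providers can be swapped for
# device-specific accelerators (e.g. "CUDAExecutionProvider").
session = ort.InferenceSession(
    "model.onnx",  # placeholder path to an exported ONNX model
    providers=["CPUExecutionProvider"],
)

# Build a dummy input that matches the model's declared input name;
# the shape assumes a typical image model and is illustrative.
input_name = session.get_inputs()[0].name
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Passing None for the output names returns all model outputs.
outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)
```

The same session/run pattern carries over to the other language bindings, so the model file, rather than the application code, is what moves between platforms.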
Benefits
Improve inference latency, throughput, and memory utilization, and reduce binary size
Run on different hardware using device-specific accelerators
Use a common interface to run models trained in different frameworks
Deploy a classic ML Python model in a C#/C++/Java app
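To illustrate the last benefit: a classic ML model trained in Python can be exported to ONNX once and then served from a C#, C++, or Java app through ONNX Runtime's bindings. The sketch below uses the separate skl2onnx converter package; the model choice, feature count, and file names are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Train a classic ML model on toy data (shapes are illustrative).
X = np.random.rand(100, 4).astype(np.float32)
y = (X.sum(axis=1) > 2.0).astype(np.int64)
model = RandomForestClassifier(n_estimators=10).fit(X, y)

# Convert to ONNX. The resulting file can be loaded by ONNX Runtime's
# C#, C++, or Java APIs with no Python dependency at inference time.
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)
with open("classifier.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
```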
ONNX Runtime Mobile
ONNX Runtime Mobile runs models on mobile devices using the same API used for cloud-based inferencing. Developers can use their mobile language and development environment of choice to add AI to Android, iOS, React Native, and MAUI/Xamarin applications in Swift, Objective-C, Java, Kotlin, JavaScript, C, and C++.
Examples
Image Classification
This example app uses image classification to continuously classify the objects detected by the device's camera in real time, displaying the most probable inference results on the screen.
Speech Recognition
This example app uses speech recognition to transcribe speech from audio recorded by the device.
Object Detection
This example app uses object detection to continuously detect the objects in the frames seen by the iOS device's back camera, displaying each detected object's bounding box, class, and inference confidence.
Question Answering
This example app showcases the use of question answering models with pre- and post-processing.
ONNX Runtime Web
ONNX Runtime Web allows JavaScript developers to run and deploy machine learning models in browsers, which provides cross-platform portability with a common implementation. This can simplify the distribution experience as it avoids additional libraries and driver installations.
Video Tutorial: Inference in JavaScript with ONNX Runtime Web →
Examples
ONNX Runtime Web Demo is an interactive demo portal that showcases live use of ONNX Runtime Web in VueJS. View these examples to experience the power of ONNX Runtime Web.
Image Classification
This example demonstrates how to use a GitHub repository template to build an image classification web app with ONNX Runtime Web.
Speech Recognition
This example demonstrates how to run whisper tiny.en in your browser using ONNX Runtime Web and the browser's audio interfaces.
Natural Language Processing (NLP)
This example demonstrates how to create custom Excel functions that run BERT NLP models with ONNX Runtime Web, bringing deep learning to spreadsheet tasks.
On-Device Training
ONNX Runtime on-device training extends the inference ecosystem to leverage on-device data for training models.
Learn more about on-device training →
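As a rough sketch of what a training step looks like in Python, the loop below assumes the training artifacts (training model, eval model, optimizer model, and checkpoint) were generated ahead of time with the onnxruntime.training.artifacts utility. All file names, shapes, and data are placeholders, and the API shown comes from the separately installed onnxruntime-training package.

```python
import numpy as np
from onnxruntime.training.api import CheckpointState, Module, Optimizer

# Load the trainable parameters captured in the checkpoint artifact.
state = CheckpointState.load_checkpoint("checkpoint")

# The training, eval, and optimizer models come from the offline
# artifact-generation step; the paths below are placeholders.
module = Module("training_model.onnx", state, "eval_model.onnx", device="cpu")
optimizer = Optimizer("optimizer_model.onnx", module)

module.train()  # switch the module to training mode
inputs = np.random.rand(8, 784).astype(np.float32)   # dummy batch
labels = np.random.randint(0, 10, 8).astype(np.int64)

loss = module(inputs, labels)  # forward pass + loss from the training graph
optimizer.step()               # apply the accumulated gradients
module.lazy_reset_grad()       # clear gradients before the next step
print("training loss:", loss)
```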