Install ONNX Runtime generate() API
Pre-requisites
CUDA
If you are installing the CUDA variant of onnxruntime-genai, the CUDA toolkit must be installed.
The CUDA toolkit can be downloaded from the CUDA Toolkit Archive.
Ensure that the CUDA_PATH environment variable is set to the location of your CUDA installation.
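A quick way to confirm that the variable is visible in the environment you will run in is a short Python check. This is a minimal sketch; it only reads the variable and prints it:

import os

cuda_path = os.environ.get("CUDA_PATH")
print(cuda_path if cuda_path else "CUDA_PATH is not set")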
Python packages
Note: only one of these packages should be installed in your application.
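Once you have installed one of the packages below, you can confirm that only a single onnxruntime-genai variant is present. A minimal sketch using importlib.metadata from the standard library (Python 3.8+):

from importlib import metadata

# List every installed distribution whose name starts with onnxruntime-genai;
# the result should contain exactly one entry.
variants = [dist.metadata["Name"] for dist in metadata.distributions()
            if (dist.metadata["Name"] or "").lower().startswith("onnxruntime-genai")]
print(variants)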
CPU
pip install numpy
pip install onnxruntime-genai --pre
DirectML
Append -directml for the library that is optimized for DirectML on Windows.
pip install numpy
pip install onnxruntime-genai-directml --pre
CUDA
Append -cuda for the library that is optimized for CUDA environments.
CUDA 11
pip install numpy
pip install onnxruntime-genai-cuda --pre --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-genai/pypi/simple/
CUDA 12
pip install numpy
pip install onnxruntime-genai-cuda --pre --index-url=https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
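After installing one of the variants above, a short smoke test confirms that the package imports and can load a model. This is a minimal sketch: the model path is a placeholder and should point to a folder containing a model exported for the generate() API, and the class names below reflect the onnxruntime_genai Python API as commonly documented.

import onnxruntime_genai as og

# Placeholder path: replace with the folder that holds your exported ONNX model and its genai config
model = og.Model("path/to/model")
tokenizer = og.Tokenizer(model)
print("Model and tokenizer loaded")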
NuGet packages
Note: only one of these packages should be installed in your application.
For the CPU package:
dotnet add package Microsoft.ML.OnnxRuntimeGenAI --prerelease
For the package that has been optimized for CUDA:
dotnet add package Microsoft.ML.OnnxRuntimeGenAI.Cuda --prerelease
For the package that has been optimized for DirectML:
dotnet add package Microsoft.ML.OnnxRuntimeGenAI.DirectML --prerelease