📦 Installation Guide¶
This guide covers all installation options for the inference-models package.
🧩 Composable Extras¶
The inference-models package uses composable extras to give you fine-grained control over dependencies. Instead of installing everything at once, you can mix and match components based on your needs:
- Backend extras (torch-cu128, onnx-cpu, trt10) - Choose your inference runtime
- Model extras (mediapipe) - Add support for specific model families
This modular approach keeps installations lightweight and avoids dependency conflicts. For example, you can combine torch-cu128 (PyTorch with CUDA 12.8) + onnx-cu12 (ONNX Runtime) + trt10 (TensorRT) in a single installation for maximum flexibility.
✅ Prerequisites¶
- Python 3.10 - 3.12
- pip or uv package manager
- For GPU support: CUDA-compatible GPU with appropriate drivers
🚀 Recommended: Using uv¶
We recommend using uv for faster and more reliable installations:
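If uv is not already installed, the snippet below sketches one common setup: uv's standalone installer, followed by an install of the base package (the installer command follows uv's own documentation; adjust for your platform):
# Install uv with its standalone installer (Linux/macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install the base inference-models package
uv pip install inference-models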
Learn more about uv at docs.astral.sh/uv.
📋 What Gets Installed¶
Base Installation¶
The base inference-models package includes:
- PyTorch (CPU) - Deep learning framework
- Hugging Face Transformers - Transformer models support
- OpenCV - Computer vision utilities
- Supervision - Vision utilities and annotations
For information about which extras are required for specific model architectures, see the Supported Models documentation.
Optional Extras¶
Install additional backends and specialized models using extras:
| Extra | What It Provides | When to Use |
|---|---|---|
| Backend Extras | | |
| torch-cpu | PyTorch CPU-only | CPU-only environments, development |
| torch-cu118 | PyTorch + CUDA 11.8 | NVIDIA GPUs with CUDA 11.8 (legacy) |
| torch-cu124 | PyTorch + CUDA 12.4 | NVIDIA GPUs with CUDA 12.4 |
| torch-cu126 | PyTorch + CUDA 12.6 | NVIDIA GPUs with CUDA 12.6 |
| torch-cu128 | PyTorch + CUDA 12.8 | NVIDIA GPUs with CUDA 12.8 |
| torch-jp6-cu126 | PyTorch for Jetson JetPack 6 | NVIDIA Jetson devices (see Hardware Compatibility) |
| onnx-cpu | ONNX Runtime CPU | CPU inference, Roboflow models |
| onnx-cu118 | ONNX Runtime + CUDA 11.8 | GPU inference with CUDA 11.8 |
| onnx-cu12 | ONNX Runtime + CUDA 12.x | GPU inference with CUDA 12.x |
| onnx-jp6-cu126 | ONNX Runtime for Jetson | NVIDIA Jetson devices (see Hardware Compatibility) |
| trt10 | TensorRT 10 | Maximum GPU performance, production |
| Model Extras | | |
| mediapipe | MediaPipe models | Face detection, pose estimation |
💻 Basic Installation¶
CPU Installation¶
For CPU-only environments:
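A minimal sketch of the command, assuming the plain base package is what's intended here (it already ships with PyTorch CPU, per the list above):
# Using uv
uv pip install inference-models
# Using pip
pip install inference-models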
This installs the base package with PyTorch CPU support.
CPU with ONNX Backend¶
For running models trained on the Roboflow platform (the recommended backend for CPU):
# Using uv
uv pip install "inference-models[onnx-cpu]"
# Using pip
pip install "inference-models[onnx-cpu]"
🎮 GPU Installation¶
TensorRT Version Compatibility
TensorRT engines are sensitive to version compatibility. A TensorRT engine compiled with a specific TensorRT version may not work with a different runtime version.
- The Roboflow platform provides TensorRT packages compiled with TensorRT 10.12.0.36 and maintains forward compatibility within the 10.x series
- Custom-compiled engines are not guaranteed to be forward compatible; match the exact TensorRT version used during compilation
- Best practice: match your TensorRT version with the other dependencies in your environment
When installing the trt10 extra, we recommend pinning to tensorrt==10.12.0.36 for compatibility with Roboflow-provided engines.
CUDA 12.8¶
# Using uv (recommended)
uv pip install "inference-models[torch-cu128,onnx-cu12,trt10]" "tensorrt==10.12.0.36"
# Using pip
pip install "inference-models[torch-cu128,onnx-cu12,trt10]" "tensorrt==10.12.0.36"
CUDA 12.6¶
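A sketch following the CUDA 12.8 pattern above, swapping in the torch-cu126 extra (onnx-cu12, trt10, and the TensorRT pin are assumptions carried over from that example):
# Using uv (recommended)
uv pip install "inference-models[torch-cu126,onnx-cu12,trt10]" "tensorrt==10.12.0.36"
# Using pip
pip install "inference-models[torch-cu126,onnx-cu12,trt10]" "tensorrt==10.12.0.36"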
CUDA 12.4¶
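Likewise for CUDA 12.4, a sketch with the torch-cu124 extra (the same ONNX Runtime and TensorRT assumptions apply):
# Using uv (recommended)
uv pip install "inference-models[torch-cu124,onnx-cu12,trt10]" "tensorrt==10.12.0.36"
# Using pip
pip install "inference-models[torch-cu124,onnx-cu12,trt10]" "tensorrt==10.12.0.36"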
CUDA 11.8 (Legacy)¶
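For CUDA 11.8, a sketch pairing the two CUDA 11.8 extras from the table above (this mirrors the pairing shown in the troubleshooting section below; TensorRT is omitted here, since pinning trt10 against CUDA 11.8 is not covered by this guide):
# Using uv (recommended)
uv pip install "inference-models[torch-cu118,onnx-cu118]"
# Using pip
pip install "inference-models[torch-cu118,onnx-cu118]"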
🤖 Jetson Installation¶
For NVIDIA Jetson devices, see the Hardware Compatibility guide for detailed installation instructions and platform-specific requirements.
Jetson with JetPack 6 (CUDA 12.6)¶
Jetson TensorRT
Jetson installations should use the pre-compiled TensorRT package shipped with JetPack.
Do not install the trt10 extra on Jetson devices.
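Based on the Jetson extras in the table above and the note about skipping trt10, a JetPack 6 installation might look like:
# Using uv
uv pip install "inference-models[torch-jp6-cu126,onnx-jp6-cu126]"
# Using pip
pip install "inference-models[torch-jp6-cu126,onnx-jp6-cu126]"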
🔧 Additional Features¶
MediaPipe Models¶
Enables MediaPipe-based models including Face Detection:
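A sketch using the mediapipe extra from the table above:
# Using uv
uv pip install "inference-models[mediapipe]"
# Using pip
pip install "inference-models[mediapipe]"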
SAM2 Real-Time¶
SAM2 Real-Time requires manual installation from GitHub:
# First install inference-models with any CUDA backend
pip install "inference-models[torch-cu128]" # or torch-cu126, torch-cu124, etc.
# Then install SAM2 Real-Time
pip install git+https://github.com/Gy920/segment-anything-2-real-time.git
PyPI Restriction
Due to PyPI restrictions on Git dependencies, SAM2 Real-Time must be installed separately.
🔗 Combining Extras¶
You can combine multiple extras in a single installation:
# GPU with multiple backends and additional models
uv pip install "inference-models[torch-cu128,onnx-cu12,trt10,mediapipe]" "tensorrt==10.12.0.36"
Conflicting Extras
Some extras cannot be installed together:
- Only one torch-* extra at a time
- Only one onnx-* extra at a time
The library will prevent conflicting installations when using uv.
🔒 Reproducible Installations¶
For production deployments requiring strict dependency control, use the uv.lock file:
# Clone the repository
git clone https://github.com/roboflow/inference.git
cd inference/inference_models
# Install from lock file
uv sync --frozen
See the official Docker builds for examples.
✅ Verifying Installation¶
Test your installation:
from inference_models import AutoModel
# This will show available backends
AutoModel.describe_compute_environment()
# Try loading a model
model = AutoModel.from_pretrained("rfdetr-base")
print("Installation successful!")
🔧 Troubleshooting¶
Missing Dependencies Error¶
If you see an error about missing dependencies when loading a model:
- Check which backend the model requires
- Install the appropriate extra (e.g., onnx-cpu or trt10), as shown below
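For example, if a model requires the ONNX Runtime CPU backend (used by Roboflow models, per the table above), installing the onnx-cpu extra covers it:
# Using uv
uv pip install "inference-models[onnx-cpu]"
# Using pip
pip install "inference-models[onnx-cpu]"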
CUDA Version Mismatch¶
Rule of thumb: Match the major CUDA version between your system and the installed extras. Do not install packages built for a newer CUDA version than what's installed on your system, as they may require CUDA symbols from *.so libraries that aren't available in older installations.
Check your CUDA version:
# Check CUDA compiler version (most reliable)
nvcc --version
# Check where CUDA is installed
ls -la /usr/local/cuda
nvidia-smi Can Be Misleading
nvidia-smi shows the driver version and maximum supported CUDA version, not the actual CUDA toolkit version installed. Always verify with nvcc --version or check the /usr/local/cuda symlink.
Install matching extras:
# For CUDA 12.x
uv pip install "inference-models[torch-cu128,onnx-cu12]"
# For CUDA 11.8
uv pip install "inference-models[torch-cu118,onnx-cu118]"
🚀 Next Steps¶
- Quick Overview - Learn basic usage and concepts
- Core Concepts - Understand the design
- Models Overview - Explore available models