set_onnx_execution_provider_defaults
inference_models.models.common.onnx.set_onnx_execution_provider_defaults
set_onnx_execution_provider_defaults(providers, model_package_path, device, enable_fp16=True, default_onnx_trt_options=True)
Configure ONNX Runtime execution providers with default options.
Applies default configuration options to ONNX Runtime execution providers, particularly for TensorRT and CUDA providers. This includes setting up TensorRT engine caching, FP16 precision, and device selection.
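ONNX Runtime accepts each execution provider either as a bare name or as a (name, options) tuple, and this helper fills in the options side for the TensorRT and CUDA providers. For context, a hand-written equivalent of what the helper produces, using standard ONNX Runtime option keys, might look like:

>>> providers = [
...     ("TensorrtExecutionProvider", {"trt_fp16_enable": True, "device_id": 0}),
...     ("CUDAExecutionProvider", {"device_id": 0}),
...     "CPUExecutionProvider",
... ]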
Parameters:
- providers (List[Union[str, tuple]]) – List of execution provider names or (name, options) tuples. Example: ["CUDAExecutionProvider", "CPUExecutionProvider"].
- model_package_path (str) – Path to the model package directory, used for TensorRT engine cache storage.
- device (torch.device) – PyTorch device specifying which GPU to use. The device index is used to configure the execution provider.
- enable_fp16 (bool, default: True) – Enable FP16 (half precision) for the TensorRT execution provider.
- default_onnx_trt_options (bool, default: True) – Apply default TensorRT options (engine caching, FP16). If False, the TensorRT provider is passed through without modification.
Returns:
- List[Union[str, tuple[str, dict[str, Any]]]] – List of execution providers with configured options. Each element is either a string (provider name) or a tuple of (provider_name, options_dict).
Examples:
Configure providers for CUDA inference:
>>> from inference_models.developer_tools import set_onnx_execution_provider_defaults
>>> import torch
>>>
>>> providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
>>> configured = set_onnx_execution_provider_defaults(
... providers=providers,
... model_package_path="/path/to/model",
... device=torch.device("cuda:0"),
... enable_fp16=True
... )
>>> # Returns: [("CUDAExecutionProvider", {"device_id": 0}), "CPUExecutionProvider"]
Configure TensorRT with custom options:
>>> providers = ["TensorrtExecutionProvider"]
>>> configured = set_onnx_execution_provider_defaults(
... providers=providers,
... model_package_path="/cache/models/yolov8n",
... device=torch.device("cuda:1"),
... enable_fp16=False,
... default_onnx_trt_options=True
... )
>>> # TensorRT provider will cache engines in /cache/models/yolov8n
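The configured list can be passed directly to an ONNX Runtime session; the ONNX file path below is a placeholder:

>>> import onnxruntime as ort
>>>
>>> session = ort.InferenceSession(
...     "/cache/models/yolov8n/model.onnx",  # hypothetical path to the ONNX file
...     providers=configured,
... )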
Note
- TensorRT provider gets: engine caching, cache path, FP16 setting, device ID
- CUDA provider gets: device ID
- Other providers are passed through unchanged
- Engine caching significantly speeds up subsequent model loads
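For illustration, a minimal sketch of the defaulting logic these notes describe (not the library's actual source; the default_onnx_trt_options flag is omitted for brevity, and the option keys shown are the standard ONNX Runtime TensorRT/CUDA execution provider options):

from typing import Any, Dict, List, Tuple, Union

import torch

def apply_provider_defaults_sketch(
    providers: List[Union[str, Tuple[str, Dict[str, Any]]]],
    model_package_path: str,
    device: torch.device,
    enable_fp16: bool = True,
) -> List[Union[str, Tuple[str, Dict[str, Any]]]]:
    """Illustrative sketch of the documented behavior, not the library source."""
    device_id = device.index if device.index is not None else 0
    configured: List[Union[str, Tuple[str, Dict[str, Any]]]] = []
    for provider in providers:
        if isinstance(provider, tuple):
            # Caller-supplied (name, options) tuples pass through unchanged.
            configured.append(provider)
        elif provider == "TensorrtExecutionProvider":
            configured.append((provider, {
                "device_id": device_id,
                "trt_engine_cache_enable": True,              # cache built engines
                "trt_engine_cache_path": model_package_path,  # reuse across loads
                "trt_fp16_enable": enable_fp16,               # half precision
            }))
        elif provider == "CUDAExecutionProvider":
            configured.append((provider, {"device_id": device_id}))
        else:
            # Other providers (e.g. CPUExecutionProvider) pass through unchanged.
            configured.append(provider)
    return configured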
See Also
run_onnx_session_via_iobinding(): Run ONNX sessions with configured providers