Developer Tools¶
Advanced utilities for custom model development.
Overview¶
The inference_models.developer_tools module provides utilities for developers creating custom models that integrate with the inference_models package.
Base functions¶
- get_model_package_contents - Load files from model packages
- x_ray_runtime_environment - Inspect runtime environment
- download_files_to_directory - Download files to a directory
- get_selected_onnx_execution_providers - Get ONNX execution providers
- get_model_from_provider - Get model metadata from provider
- register_model_provider - Register a custom model provider
Backend-Specific Utilities¶
CUDA Utilities¶
Low-level CUDA context management for custom models that use CUDA or TensorRT.
- use_primary_cuda_context - Use primary CUDA context for operations
- use_cuda_context - Context manager for CUDA operations
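As a hedged illustration of how these fit into a custom model, the sketch below assumes the helper is a context manager that accepts a device index; the argument name and the body of the block are assumptions, not documented signatures.

```python
# Hypothetical sketch: the device_id argument is an assumption, not the documented signature.
from inference_models.developer_tools import use_primary_cuda_context

def run_custom_cuda_inference(preprocessed_batch, device_id=0):
    # Activate the device's primary CUDA context for the duration of the block,
    # so allocations and kernel launches made inside target the intended device.
    with use_primary_cuda_context(device_id=device_id):
        return _run_engine(preprocessed_batch)

def _run_engine(batch):
    # Placeholder for model-specific inference code (e.g., a TensorRT execution path).
    raise NotImplementedError
```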
ONNX Utilities¶
Utilities for working with ONNX Runtime in custom models.
- set_onnx_execution_provider_defaults - Configure ONNX execution provider defaults
- run_onnx_session_with_batch_size_limit - Run ONNX session with batch size constraints
- run_onnx_session_via_iobinding - Run ONNX session using IO binding for performance
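The sketch below shows where these helpers sit in a typical ONNX flow. The `onnxruntime.InferenceSession` call is standard ONNX Runtime; the keyword arguments passed to the package's helpers (and the assumption that the provider helper returns a list usable as `providers=`) are illustrative guesses, not confirmed signatures.

```python
# Sketch only: helper keyword arguments are assumptions; check the API reference.
import numpy as np
import onnxruntime

from inference_models.developer_tools import (
    get_selected_onnx_execution_providers,
    run_onnx_session_with_batch_size_limit,
)

# Build an ONNX Runtime session on the execution providers selected for this environment.
providers = get_selected_onnx_execution_providers()
session = onnxruntime.InferenceSession("model.onnx", providers=providers)

# Run inference while keeping each forward pass under a maximum batch size;
# larger inputs are assumed to be split and the partial outputs re-assembled.
batch = np.zeros((32, 3, 640, 640), dtype=np.float32)
outputs = run_onnx_session_with_batch_size_limit(
    session=session,
    inputs={"images": batch},
    max_batch_size=8,
)
```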
PyTorch Utilities¶
Utilities for PyTorch-based custom models.
- generate_batch_chunks - Split batches into chunks for memory management
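A minimal sketch of chunked batch processing follows; the `chunk_size` keyword and the assumption that the function yields tensor slices are hypothetical, and the convolution stands in for a real model.

```python
# Sketch: chunk_size and the yielded slice type are assumptions, not documented behavior.
import torch

from inference_models.developer_tools import generate_batch_chunks

model = torch.nn.Conv2d(3, 8, kernel_size=3)   # stand-in for a real model
batch = torch.zeros((50, 3, 224, 224))

# Process the batch in smaller slices so peak memory stays bounded.
results = []
for chunk in generate_batch_chunks(batch, chunk_size=16):
    with torch.no_grad():
        results.append(model(chunk))
predictions = torch.cat(results, dim=0)
```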
TensorRT Utilities¶
Utilities for TensorRT-based custom models.
- get_trt_engine_inputs_and_outputs - Inspect TensorRT engine inputs and outputs
- infer_from_trt_engine - Run inference using TensorRT engine
- load_trt_model - Load TensorRT engine from file
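The three TensorRT helpers are typically used together, roughly as sketched below. All three signatures, including the assumption that the inspection helper returns input and output tensor names, are guesses for illustration only.

```python
# Sketch: every signature here is an assumption; consult the API reference.
import numpy as np

from inference_models.developer_tools import (
    get_trt_engine_inputs_and_outputs,
    infer_from_trt_engine,
    load_trt_model,
)

# Deserialize the engine from disk (typically a .engine / .plan file).
engine = load_trt_model("model.engine")

# Inspect tensor names so inputs can be bound to the right bindings.
input_names, output_names = get_trt_engine_inputs_and_outputs(engine)

# Execute the engine on a dummy batch.
batch = np.zeros((1, 3, 640, 640), dtype=np.float32)
outputs = infer_from_trt_engine(engine, inputs={input_names[0]: batch})
```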
Entities¶
- RuntimeXRayResult - Runtime environment inspection result
- ModelMetadata - Model metadata structure
- ModelDependency - Model dependency specification
- ModelPackageMetadata - Model package metadata
- TorchScriptPackageDetails - TorchScript package details
- ONNXPackageDetails - ONNX package details
- TRTPackageDetails - TensorRT package details
- JetsonEnvironmentRequirements - Jetson environment requirements
- ServerEnvironmentRequirements - Server environment requirements
- FileDownloadSpecs - File download specifications
Usage¶
Basic Usage¶
```python
from inference_models.developer_tools import (
    get_model_package_contents,
    x_ray_runtime_environment,
    register_model_provider,
)
```
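For example, inspecting the runtime environment is a common first step when deciding which backend a custom model should use. The sketch assumes `x_ray_runtime_environment` takes no arguments and returns a `RuntimeXRayResult`; treat that as an assumption rather than documented behavior.

```python
# Sketch: a no-argument call returning RuntimeXRayResult is an assumption.
from inference_models.developer_tools import x_ray_runtime_environment

runtime = x_ray_runtime_environment()
print(runtime)  # inspect which hardware and libraries the current environment offers
```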
Backend-Specific Usage¶
Backend-specific utilities are available as lazy imports:
```python
from inference_models.developer_tools import (
    use_primary_cuda_context,              # CUDA utilities
    set_onnx_execution_provider_defaults,  # ONNX utilities
    generate_batch_chunks,                 # PyTorch utilities
    load_trt_model,                        # TensorRT utilities
)
```
Lazy Loading
Backend-specific utilities are loaded lazily, only when first accessed. As long as you never use them, they will not raise import errors even if their required dependencies (e.g., tensorrt, onnxruntime) are not installed.
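If a custom model only optionally needs one of these backends, a defensive import keeps it usable in environments where that backend is absent. This is a general Python pattern, not something the package requires; the helper's signature is assumed as in the earlier sketches.

```python
# Optional TensorRT path: the backend-specific helper is imported only where it is
# actually needed, so environments without tensorrt degrade gracefully.
def maybe_load_trt_engine(engine_path: str):
    try:
        from inference_models.developer_tools import load_trt_model
    except ImportError:
        return None  # tensorrt is not installed in this environment
    return load_trt_model(engine_path)  # signature assumed
```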