load_trt_model

inference_models.models.common.trt.load_trt_model
Load a TensorRT engine from a serialized engine file.
Deserializes a TensorRT engine from a .plan file and returns the engine object ready for inference. Handles errors during deserialization and provides detailed error messages.
Parameters:
- model_path (str) – Path to the serialized TensorRT engine file (.plan).
- engine_host_code_allowed (bool, default: False) – Allow the engine to execute host code. Security risk: only enable if you trust the engine source.
Returns:
- ICudaEngine – TensorRT CUDA engine ready for creating execution contexts and running inference.
Raises:
- CorruptedModelPackageError – If the engine file cannot be loaded due to: file not found, incompatible TensorRT version, incompatible CUDA version, a corrupted engine file, or runtime deserialization errors.
- MissingDependencyError – If TensorRT is not installed.
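Several of the failure causes above (missing file, wrong or empty file) can be surfaced before the TensorRT runtime is even invoked. The helper below is a hypothetical sketch, not part of the library: it mirrors the documented CorruptedModelPackageError causes that are detectable from the filesystem alone.

```python
from pathlib import Path


def preflight_engine_path(model_path: str) -> list[str]:
    # Hypothetical pre-flight helper (not part of inference_models):
    # catches the filesystem-level causes of CorruptedModelPackageError
    # before load_trt_model() is called. Returns a list of warnings;
    # raises immediately on a missing file.
    warnings = []
    p = Path(model_path)
    if not p.is_file():
        raise FileNotFoundError(f"TensorRT engine file not found: {model_path}")
    if p.suffix not in (".plan", ".engine"):
        warnings.append(
            f"Unexpected extension {p.suffix!r}; engines typically use .plan or .engine"
        )
    if p.stat().st_size == 0:
        warnings.append("Engine file is empty; deserialization will fail")
    return warnings
```

Version or architecture mismatches, by contrast, only surface when the runtime actually deserializes the engine, so they still need the exception handling described above.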
Examples:
Load TensorRT engine:
>>> from inference_models.developer_tools import load_trt_model
>>>
>>> engine = load_trt_model("model.plan")
>>> print(f"Engine loaded: {engine.name}")
>>>
>>> # Create execution context
>>> context = engine.create_execution_context()
Load engine with host code allowed:
>>> # Only if you trust the engine source!
>>> engine = load_trt_model(
... "custom_model.plan",
... engine_host_code_allowed=True
... )
Complete inference setup:
>>> from inference_models.developer_tools import (
... load_trt_model,
... get_trt_engine_inputs_and_outputs
... )
>>>
>>> # Load engine
>>> engine = load_trt_model("yolov8n.plan")
>>>
>>> # Get input/output info
>>> inputs, outputs = get_trt_engine_inputs_and_outputs(engine)
>>> print(f"Inputs: {inputs}")
>>> print(f"Outputs: {outputs}")
>>>
>>> # Create context for inference
>>> context = engine.create_execution_context()
Note
- Requires TensorRT to be installed (TensorRT 10.x recommended)
- Engine files are platform and TensorRT version specific
- Engines built on one GPU architecture may not work on another
- Engine files typically have .plan or .engine extension
- Provides detailed error messages from TensorRT runtime
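Since engine files are tied to the TensorRT version that built them, it can help to log the installed runtime version before loading. A minimal sketch with a guarded import (the absent-TensorRT case is exactly when load_trt_model raises MissingDependencyError):

```python
def trt_runtime_version():
    """Return the installed TensorRT version string, or None when TensorRT
    is not installed (the case in which load_trt_model raises
    MissingDependencyError)."""
    try:
        import tensorrt as trt  # optional dependency; may not be present
        return trt.__version__
    except ImportError:
        return None
```

Comparing this string against the version recorded when the .plan file was built is one way to diagnose "incompatible TensorRT version" deserialization failures.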
See Also
get_trt_engine_inputs_and_outputs(): Inspect engine inputs/outputs
infer_from_trt_engine(): Run inference with a loaded engine