Model Runtime Errors¶
Base Class: ModelRuntimeError
Model runtime errors occur during model execution when the model encounters issues processing input data or when runtime constraints are violated. These errors happen during the inference phase and are typically caused by incompatible input formats, device mismatches, or model execution failures.
ModelRuntimeError¶
Error during model execution or inference.
Overview¶
This error occurs when a model fails during runtime execution. This can happen due to invalid input formats, device incompatibilities, batch size mismatches, or internal model execution failures.
When It Occurs¶
Scenario 1: Invalid input type or format

- Unsupported input type (not list, np.ndarray, or torch.Tensor)
- Empty input list provided
- Wrong tensor dimensions (expected 3 or 4, got a different count)
- Wrong number of color channels (expected 3)
- Unknown batch element type
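The type and shape checks above can be reproduced before calling the model. A minimal pre-flight validator (illustrative sketch only, not part of the inference_models API; torch tensor handling omitted for brevity):

```python
import numpy as np

def validate_input(images):
    """Reject inputs early, mirroring the checks described above.
    Hypothetical helper - not part of the inference_models API."""
    if isinstance(images, list):
        if not images:
            raise ValueError("Empty input list provided")
    elif isinstance(images, np.ndarray):
        if images.ndim not in (3, 4):
            raise ValueError(f"Expected 3 or 4 dimensions, got {images.ndim}")
        if images.shape[-1] != 3:
            raise ValueError(f"Expected 3 color channels, got {images.shape[-1]}")
    else:
        raise ValueError(f"Unsupported input type: {type(images).__name__}")
```

Calling this on a string path raises a clear ValueError immediately instead of failing deeper inside inference.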
Scenario 2: Device mismatch

- TensorRT model loaded on CPU device (requires CUDA)
- Model and input on different devices
- CUDA not available when required
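One way to avoid these device-mismatch failures is to pick the device defensively before loading the model. A sketch (the backend and device strings follow the examples later on this page):

```python
import torch

# Fall back to CPU (and a CPU-capable backend) when no GPU is present;
# TensorRT requires device="cuda", so only select it when CUDA is available.
device = "cuda" if torch.cuda.is_available() else "cpu"
backend = "trt" if device == "cuda" else "onnx"
```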
Scenario 3: Batch size inconsistency

- Different batch sizes across input tensors
- Incompatible batch dimensions
- Batch size exceeds model limits
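The consistency check itself is straightforward to reproduce ahead of time. A sketch using a hypothetical helper that compares leading (batch) dimensions:

```python
import numpy as np

def check_batch_sizes(tensors):
    """Raise if batched inputs disagree on the leading (batch) dimension.
    Hypothetical helper - not part of the inference_models API."""
    sizes = {t.shape[0] for t in tensors}
    if len(sizes) > 1:
        raise ValueError(f"different batch sizes across input tensors: {sorted(sizes)}")
    return sizes.pop()
```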
Scenario 4: Input shape mismatch

- Numpy array with wrong number of dimensions
- Torch tensor with incorrect shape
- Batched tensors not allowed but provided
- Missing required dimensions
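A common cause of these shape errors is a missing (or unexpected) batch dimension; adding one to a single HWC image is a one-liner:

```python
import numpy as np

hwc = np.zeros((640, 640, 3), dtype=np.uint8)  # single image, HWC
nhwc = hwc[np.newaxis, ...]                    # add a leading batch dimension
assert nhwc.shape == (1, 640, 640, 3)
```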
What To Check¶
- Check input type and format:

import numpy as np
import torch

# Check input type
print(f"Input type: {type(images)}")

# For numpy arrays
if isinstance(images, np.ndarray):
    print(f"Shape: {images.shape}")
    print(f"Dimensions: {len(images.shape)}")
    print(f"Channels: {images.shape[-1] if len(images.shape) == 3 else 'N/A'}")

# For torch tensors
if isinstance(images, torch.Tensor):
    print(f"Shape: {images.shape}")
    print(f"Device: {images.device}")

# For lists
if isinstance(images, list):
    print(f"List length: {len(images)}")
    if images:
        print(f"First element type: {type(images[0])}")

- Review error message:
  - "Unsupported input type" → Wrong data type
  - "TRT engine only runs on CUDA" → Device mismatch
  - "different batch sizes" → Batch inconsistency
  - "incorrect number of dimensions" → Shape mismatch
How To Fix¶
Fix input type issues:

from inference_models import AutoModel
import numpy as np
import torch

model = AutoModel.from_pretrained("yolov8n-640")

# ❌ Wrong - unsupported type
images = "path/to/image.jpg"  # String not supported
result = model.predict(images)  # ModelRuntimeError!

# ❌ Wrong - unsupported type
images = ["path/to/image.jpg"]  # List of strings not supported
result = model.predict(images)  # ModelRuntimeError!

# ✅ Correct - numpy array (HWC format, 3 channels)
image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
result = model.predict([image])

# ✅ Correct - torch tensor (CHW format for single image)
image = torch.rand(3, 640, 640)
result = model.predict(image)

# ✅ Correct - torch tensor (BCHW format for batch)
images = torch.rand(2, 3, 640, 640)
result = model.predict(images)
Common Input Format Requirements¶
Numpy Arrays¶
Expected format:
- Shape: (height, width, channels) - HWC format
- Channels: 3 (RGB or BGR, default BGR)
- Dtype: uint8
- Value range: 0-255 for uint8
import numpy as np
# ✅ Correct numpy array format
image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
# Or from file
from PIL import Image
image = np.array(Image.open("image.jpg")) # Automatically HWC format
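Note that grayscale files load as 2-D arrays and RGBA files as 4-channel arrays, both of which fail the 3-channel check. Converting explicitly guards against this (a sketch; `Image.new` stands in for opening a real grayscale file):

```python
import numpy as np
from PIL import Image

# A grayscale image would load as a 2-D array; .convert("RGB")
# guarantees the 3-channel HWC layout the model expects.
gray = Image.new("L", (64, 64))        # stand-in for a grayscale file
image = np.array(gray.convert("RGB"))
assert image.shape == (64, 64, 3)
```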
Torch Tensors¶
Expected format:
- Shape (single image): (channels, height, width) - CHW format
- Shape (batch): (batch, channels, height, width) - BCHW format
- Channels: 3 (RGB or BGR, default RGB)
- Dtype: float32
- Value range: 0.0-255.0
import torch
# ✅ Correct torch tensor format (single image)
image = torch.rand(3, 640, 640)
# ✅ Correct torch tensor format (batch)
images = torch.rand(4, 3, 640, 640)
# Convert numpy (HWC) to torch (CHW)
import numpy as np
np_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
torch_image = torch.from_numpy(np_image).permute(2, 0, 1).float() / 255.0
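The reverse conversion (a CHW tensor back to an HWC uint8 array, e.g. for display) follows the same pattern, assuming the tensor holds values in [0, 1] as produced by the snippet above:

```python
import torch

chw = torch.rand(3, 640, 640)  # CHW float in [0, 1]
# Reorder to HWC, rescale to 0-255, and drop to uint8 for image libraries
hwc = (chw.permute(1, 2, 0) * 255).to(torch.uint8).numpy()
assert hwc.shape == (640, 640, 3)
```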
Lists¶
Expected format:
- List of numpy arrays: Each array in HWC format
- List of torch tensors: Each tensor in CHW format
import numpy as np
import torch

# ✅ List of numpy arrays
images = [
    np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8),
    np.random.randint(0, 255, (480, 480, 3), dtype=np.uint8),
]

# ✅ List of torch tensors
images = [
    torch.rand(3, 640, 640),
    torch.rand(3, 480, 480),
]
Backend-Specific Requirements¶
TensorRT Backend¶
Requirements:
- Device: Must be CUDA (GPU)
import torch
from inference_models import AutoModel
# ✅ Correct TensorRT usage
model = AutoModel.from_pretrained(
    "yolov8n-640",
    backend="trt",
    device="cuda"
)
ONNX Backend¶
Requirements:
- Device: CPU or CUDA
- Execution providers: Must be configured
- Flexibility: Works on most platforms
from inference_models import AutoModel
# ✅ ONNX on CPU
model = AutoModel.from_pretrained(
    "yolov8n-640",
    backend="onnx",
    device="cpu",
    onnx_execution_providers=["CPUExecutionProvider"]
)
# ✅ ONNX on GPU
model = AutoModel.from_pretrained(
    "yolov8n-640",
    backend="onnx",
    device="cuda",
    onnx_execution_providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
)
PyTorch Backend¶
Requirements:
- Device: CPU or CUDA
- PyTorch installation: Required
- Model support: Not all models available
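A loading sketch mirroring the other backends above (the backend string `"torch"` is an assumption, not confirmed by this page; check the AutoModel documentation for the exact value):

```python
from inference_models import AutoModel

# ✅ PyTorch backend on CPU (backend string "torch" assumed)
model = AutoModel.from_pretrained(
    "yolov8n-640",
    backend="torch",
    device="cpu"
)
```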