Model Runtime Errors¶

Base Class: ModelRuntimeError

Model runtime errors are raised during inference, when the model cannot process its input data or a runtime constraint is violated. Typical causes are incompatible input formats, device mismatches, and model execution failures.

ModelRuntimeError¶

Error during model execution or inference.

Overview¶

This error is raised when a model fails at run time. Common causes are invalid input formats, device incompatibilities, batch size mismatches, and internal model execution failures.

When It Occurs¶

Scenario 1: Invalid input type or format

Unsupported input type (not list, np.ndarray, or torch.Tensor)
Empty input list provided
Wrong number of tensor dimensions (expected 3 or 4)
Wrong number of color channels (expected 3)
Unknown batch element type

Scenario 2: Device mismatch

TensorRT model loaded on CPU device (requires CUDA)
Model and input on different devices
CUDA not available when required

Scenario 3: Batch size inconsistency

Different batch sizes across input tensors
Incompatible batch dimensions
Batch size exceeds model limits

Scenario 4: Input shape mismatch

Numpy array with wrong number of dimensions
Torch tensor with incorrect shape
Batched tensors not allowed but provided
Missing required dimensions
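
The input-related conditions listed above can be checked before calling the model. The following is a minimal sketch, not the library's own validation code: `find_input_problem` is a hypothetical helper, and it only covers numpy arrays and lists of numpy arrays (torch tensors are deliberately left out so the example stays self-contained).

```python
import numpy as np


def find_input_problem(images):
    """Return a short description of the first problem found, or None.

    Hypothetical pre-flight check for numpy inputs only; it mirrors the
    conditions listed in this section, not the library's internal logic.
    """
    if isinstance(images, np.ndarray):
        if images.ndim != 3:
            return f"expected 3 dimensions (HWC), got {images.ndim}"
        if images.shape[-1] != 3:
            return f"expected 3 color channels, got {images.shape[-1]}"
        return None
    if isinstance(images, list):
        if not images:
            return "empty input list"
        for index, item in enumerate(images):
            if not isinstance(item, np.ndarray):
                return f"element {index} has unexpected type {type(item).__name__}"
            problem = find_input_problem(item)
            if problem is not None:
                return f"element {index}: {problem}"
        return None
    # Torch tensors would also end up here; they are outside this sketch.
    return f"type {type(images).__name__} is not a numpy array or a list"


print(find_input_problem([np.zeros((640, 640, 4), dtype=np.uint8)]))
```

A return value of `None` means none of the checked conditions applies; the model may still reject the input for reasons this sketch does not cover.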

What To Check¶

Check input type and format:

import numpy as np
import torch

# Check input type
print(f"Input type: {type(images)}")

# For numpy arrays
if isinstance(images, np.ndarray):
    print(f"Shape: {images.shape}")
    print(f"Dimensions: {images.ndim}")
    print(f"Channels: {images.shape[-1] if images.ndim == 3 else 'N/A'}")

# For torch tensors
if isinstance(images, torch.Tensor):
    print(f"Shape: {images.shape}")
    print(f"Device: {images.device}")

# For lists
if isinstance(images, list):
    print(f"List length: {len(images)}")
    if images:
        print(f"First element type: {type(images[0])}")

Review error message:
- "Unsupported input type" → Wrong data type
- "TRT engine only runs on CUDA" → Device mismatch
- "different batch sizes" → Batch inconsistency
- "incorrect number of dimensions" → Shape mismatch
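
The mapping above can be applied programmatically when logging failures. This is a rough sketch: `diagnose` and `MESSAGE_HINTS` are hypothetical names, only the four message fragments come from the list above, and the full message passed in the example is made up for illustration.

```python
# Message fragments from the list above, paired with the cause they point to.
MESSAGE_HINTS = [
    ("Unsupported input type", "wrong data type"),
    ("TRT engine only runs on CUDA", "device mismatch"),
    ("different batch sizes", "batch inconsistency"),
    ("incorrect number of dimensions", "shape mismatch"),
]


def diagnose(message):
    """Map an error message to a likely cause, or 'unknown' if nothing matches."""
    lowered = message.lower()
    for fragment, cause in MESSAGE_HINTS:
        if fragment.lower() in lowered:
            return cause
    return "unknown"


# Intended use (not executed here, since it needs a loaded model):
# try:
#     result = model.predict(images)
# except Exception as error:  # narrow this to ModelRuntimeError in real code
#     print(diagnose(str(error)))
print(diagnose("TRT engine only runs on CUDA, got device cpu"))  # device mismatch
```

Matching is case-insensitive and stops at the first hit, so a message that contains several fragments is attributed to the earliest entry in the table.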

How To Fix¶

Fix input type issues:

from inference_models import AutoModel
import numpy as np
import torch

model = AutoModel.from_pretrained("yolov8n-640")

# ❌ Wrong - unsupported type
images = "path/to/image.jpg"  # String not supported
result = model.predict(images)  # ModelRuntimeError!

# ❌ Wrong - a list of path strings is also unsupported
images = ["path/to/image.jpg"]
result = model.predict(images)  # ModelRuntimeError!

# ✅ Correct - numpy array (HWC format, 3 channels)
image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
result = model.predict([image])

# ✅ Or torch tensor (CHW format for single image)
image = torch.rand(3, 640, 640)
result = model.predict(image)

# ✅ Or torch tensor (BCHW format for batch)
images = torch.rand(2, 3, 640, 640)
result = model.predict(images)

Common Input Format Requirements¶

Numpy Arrays¶

Expected format:
- Shape: (height, width, channels) - HWC format
- Channels: 3 (RGB or BGR, default BGR)
- Dtype: uint8
- Value range: 0-255

import numpy as np

# ✅ Correct numpy array format
image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

# Or from file (PIL returns RGB channel order; convert() guarantees 3 channels)
from PIL import Image
image = np.array(Image.open("image.jpg").convert("RGB"))  # HWC format, uint8
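
Arrays that come from other sources are often grayscale, RGBA, or floating point, and may use the opposite channel order. The sketch below shows one way to coerce them into the 3-channel uint8 HWC layout described above. Both helpers are hypothetical, and the code assumes floating-point images are scaled to the range 0 to 1.

```python
import numpy as np


def to_hwc_uint8(array):
    """Coerce a grayscale, RGBA, or float image into a 3-channel uint8 HWC array."""
    array = np.asarray(array)
    if array.ndim == 2:
        # Grayscale (H, W): repeat the single channel three times.
        array = np.stack([array] * 3, axis=-1)
    elif array.ndim == 3 and array.shape[-1] == 4:
        # RGBA: drop the alpha channel.
        array = array[..., :3]
    if array.ndim != 3 or array.shape[-1] != 3:
        raise ValueError(f"cannot convert shape {array.shape} to HWC with 3 channels")
    if array.dtype != np.uint8:
        # Assumption: floating-point images are scaled to [0, 1].
        if np.issubdtype(array.dtype, np.floating):
            array = array * 255.0
        array = np.clip(np.rint(array), 0, 255).astype(np.uint8)
    return array


def swap_rgb_bgr(image):
    """Reverse the channel order (RGB <-> BGR) of an HWC image."""
    return np.ascontiguousarray(image[..., ::-1])


gray = np.full((480, 640), 0.5, dtype=np.float32)
print(to_hwc_uint8(gray).shape, to_hwc_uint8(gray).dtype)  # (480, 640, 3) uint8
```

`swap_rgb_bgr` is useful when an image loaded through PIL (RGB) has to be passed where BGR order is expected, or the other way round.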

Torch Tensors¶

Expected format:
- Shape (single image): (channels, height, width) - CHW format
- Shape (batch): (batch, channels, height, width) - BCHW format
- Channels: 3 (RGB or BGR, default RGB)
- Dtype: float32
- Value range: 0.0-255.0

import torch

# ✅ Correct torch tensor format (single image)
image = torch.rand(3, 640, 640)

# ✅ Correct torch tensor format (batch)
images = torch.rand(4, 3, 640, 640)

# Convert numpy (HWC) to torch (CHW)
import numpy as np
np_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
torch_image = torch.from_numpy(np_image).permute(2, 0, 1).float() / 255.0

Lists¶

Expected format:
- List of numpy arrays: Each array in HWC format
- List of torch tensors: Each tensor in CHW format

# ✅ List of numpy arrays
images = [
    np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8),
    np.random.randint(0, 255, (480, 480, 3), dtype=np.uint8),
]

# ✅ List of torch tensors
images = [
    torch.rand(3, 640, 640),
    torch.rand(3, 480, 480),
]
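
Frames sometimes arrive stacked in a single (B, H, W, C) numpy array, which does not match the HWC shape listed for numpy inputs. One option, sketched below, is to split the stack into a list of HWC arrays. `split_batch` is a hypothetical helper, and whether a 4-dimensional numpy array is rejected by a particular model is an assumption based on the shape requirements above.

```python
import numpy as np


def split_batch(batch):
    """Turn a (B, H, W, C) array into a list of B arrays shaped (H, W, C)."""
    batch = np.asarray(batch)
    if batch.ndim == 3:
        # Already a single HWC image: wrap it so callers always get a list.
        return [batch]
    if batch.ndim != 4:
        raise ValueError(f"expected 3 or 4 dimensions, got {batch.ndim}")
    # Iterating over the first axis yields one HWC view per image (no copy).
    return [image for image in batch]


frames = np.zeros((4, 480, 640, 3), dtype=np.uint8)  # e.g. four stacked video frames
images = split_batch(frames)
print(len(images), images[0].shape)  # 4 (480, 640, 3)
```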

Backend-Specific Requirements¶

TensorRT Backend¶

Requirements:
- Device: Must be CUDA (GPU)

import torch
from inference_models import AutoModel

# ✅ Correct TensorRT usage
model = AutoModel.from_pretrained(
    "yolov8n-640",
    backend="trt",
    device="cuda"
)
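
Because TensorRT requires CUDA, code that must also run on CPU-only machines needs a fallback. The sketch below chooses the keyword arguments up front. `model_kwargs` is a hypothetical helper, and falling back to the ONNX backend on CPU assumes an ONNX package is available for the model in question.

```python
def model_kwargs(cuda_available):
    """Pick backend settings: TensorRT on GPU, otherwise ONNX on CPU."""
    if cuda_available:
        return {"backend": "trt", "device": "cuda"}
    return {
        "backend": "onnx",
        "device": "cpu",
        "onnx_execution_providers": ["CPUExecutionProvider"],
    }


# Intended use (not executed here, since it needs torch and inference_models):
# import torch
# from inference_models import AutoModel
# kwargs = model_kwargs(torch.cuda.is_available())
# model = AutoModel.from_pretrained("yolov8n-640", **kwargs)
print(model_kwargs(False)["backend"])  # onnx
```

Keeping the decision in a plain function makes it easy to test without a GPU.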

ONNX Backend¶

Requirements:
- Device: CPU or CUDA
- Execution providers: Must be configured
- Flexibility: Works on most platforms

from inference_models import AutoModel

# ✅ ONNX on CPU
model = AutoModel.from_pretrained(
    "yolov8n-640",
    backend="onnx",
    device="cpu",
    onnx_execution_providers=["CPUExecutionProvider"]
)

# ✅ ONNX on GPU
model = AutoModel.from_pretrained(
    "yolov8n-640",
    backend="onnx",
    device="cuda",
    onnx_execution_providers=["CUDAExecutionProvider", "CPUExecutionProvider"]
)
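
Rather than hard-coding the provider list, it can be derived from what the installed ONNX Runtime build reports. This is a sketch under assumptions: `choose_providers` and `device_for` are hypothetical helpers, and the device string is assumed to follow the first provider, as in the two examples above.

```python
def choose_providers(available, prefer_gpu=True):
    """Order execution providers: CUDA first when present, CPU always last."""
    ordered = []
    if prefer_gpu and "CUDAExecutionProvider" in available:
        ordered.append("CUDAExecutionProvider")
    ordered.append("CPUExecutionProvider")
    return ordered


def device_for(providers):
    """Return the device string that matches the first provider in the list."""
    return "cuda" if providers[0] == "CUDAExecutionProvider" else "cpu"


# Intended use (not executed here, since it needs onnxruntime installed):
# import onnxruntime
# providers = choose_providers(onnxruntime.get_available_providers())
providers = choose_providers(["CPUExecutionProvider"])
print(providers, device_for(providers))  # ['CPUExecutionProvider'] cpu
```

The resulting list and device can then be passed as `onnx_execution_providers` and `device` in the same way as the examples above.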

PyTorch Backend¶

Requirements:
- Device: CPU or CUDA
- PyTorch installation: Required
- Model support: Not all models available

import torch
from inference_models import AutoModel

# ✅ PyTorch on CPU
model = AutoModel.from_pretrained(
    "sam-vit-b",
    backend="torch",
    device="cpu"
)

# ✅ PyTorch on GPU
if torch.cuda.is_available():
    model = AutoModel.from_pretrained(
        "sam-vit-b",
        backend="torch",
        device="cuda"
    )