generate_batch_chunks

inference_models.models.common.torch.generate_batch_chunks

generate_batch_chunks(input_batch, chunk_size)

Generate fixed-size chunks from a batch tensor with automatic padding.

Splits a batch tensor into fixed-size chunks along the batch dimension (dim 0). If the last chunk is smaller than chunk_size, it is automatically padded with zeros to maintain consistent chunk sizes.

This is useful for processing large batches through models with fixed batch-size requirements, or for keeping peak GPU memory usage bounded.
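The documented behavior can be approximated with a short sketch. This is illustrative only, not the library's actual implementation, and `generate_batch_chunks_sketch` is a hypothetical name:

```python
import torch
from typing import Iterator, Tuple


def generate_batch_chunks_sketch(
    input_batch: torch.Tensor, chunk_size: int
) -> Iterator[Tuple[torch.Tensor, int]]:
    """Yield (chunk, padding_size) pairs of fixed size along dim 0 (sketch)."""
    batch_size = input_batch.shape[0]
    for start in range(0, batch_size, chunk_size):
        # Slicing returns a view when no padding is needed
        chunk = input_batch[start:start + chunk_size]
        padding = chunk_size - chunk.shape[0]
        if padding > 0:
            # Zero-pad the last chunk, matching the input dtype and device
            pad = torch.zeros(
                (padding, *input_batch.shape[1:]),
                dtype=input_batch.dtype,
                device=input_batch.device,
            )
            chunk = torch.cat([chunk, pad], dim=0)
        yield chunk, padding
```

Every yielded chunk has `chunk_size` rows, so downstream code never has to special-case a short final batch.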

Parameters:

  • input_batch

    (Tensor) –

    Input tensor with batch dimension as the first dimension. Shape: (batch_size, ...).

  • chunk_size

    (int) –

    Size of each chunk. All chunks will have this size in the batch dimension, with the last chunk padded if necessary.

Yields:

  • Tuple[Tensor, int]

    Tuples of (chunk, padding_size) where:

      • chunk: Tensor of shape (chunk_size, ...) containing the batch chunk
      • padding_size: Number of padding elements added (0 for full chunks)

Examples:

Process large batch in chunks:

>>> from inference_models.developer_tools import generate_batch_chunks
>>> import torch
>>>
>>> # Large batch of images
>>> batch = torch.randn(100, 3, 640, 640)
>>>
>>> results = []
>>> for chunk, padding in generate_batch_chunks(batch, chunk_size=16):
...     # Process chunk through model
...     output = model(chunk)
...
...     # Remove padding from results
...     if padding > 0:
...         output = output[:-padding]
...
...     results.append(output)
>>>
>>> # Concatenate all results
>>> final_output = torch.cat(results, dim=0)

Handle models with static batch size:

>>> # Model requires exactly batch size of 8
>>> batch = torch.randn(20, 3, 224, 224, device="cuda")
>>>
>>> for chunk, padding in generate_batch_chunks(batch, chunk_size=8):
...     # chunk.shape[0] is always 8 (padded if needed)
...     output = model(chunk)
...
...     # The last chunk has padding=4 (20 = 2*8 + 4); strip it
...     if padding > 0:
...         output = output[:-padding]

Note
  • Chunks are created as views when possible (no padding needed)
  • Padding is added with zeros matching the input dtype and device
  • The last chunk is always padded to chunk_size if needed
  • Padding size is 0 for all chunks except potentially the last one
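From the notes above, the chunk count and the final chunk's padding follow directly from the batch size. A minimal check (plain Python, no torch; `chunk_layout` is a hypothetical helper, not part of the library):

```python
import math


def chunk_layout(batch_size: int, chunk_size: int):
    """Return (num_chunks, last_padding) implied by fixed-size chunking."""
    num_chunks = math.ceil(batch_size / chunk_size)
    # Padding is 0 when batch_size divides evenly, hence the outer modulo
    last_padding = (chunk_size - batch_size % chunk_size) % chunk_size
    return num_chunks, last_padding
```

For the examples above: a batch of 100 with chunk_size=16 gives 7 chunks with 12 padding elements in the last one, and a batch of 20 with chunk_size=8 gives 3 chunks with padding=4.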

See Also
  • run_onnx_session_with_batch_size_limit(): Uses this internally for ONNX models