Environment Variables Configuration¶
This guide covers all environment variables available in inference-models for configuring model loading, caching, API access, and runtime behavior.
Quick Start¶
Set environment variables before importing inference-models:
# Set API key
export ROBOFLOW_API_KEY="your_api_key_here"
# Set model cache directory
export INFERENCE_HOME="/path/to/cache"
# Set device
export DEFAULT_DEVICE="cuda:0"
Alternatively, put the same variables in a .env file in your project root.
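As a minimal sketch, the block below shows what such a file might contain (the three variables from the export example) together with a small standard-library loader for setups where the file is not read automatically. The names `ENV_TEXT`, `parse_env`, and `load_env` are illustrative and not part of inference-models.

```python
import os

# Hypothetical .env contents, reusing the variables from the export example.
ENV_TEXT = """
# Roboflow credentials and runtime settings
ROBOFLOW_API_KEY="your_api_key_here"
INFERENCE_HOME="/path/to/cache"
DEFAULT_DEVICE="cuda:0"
"""


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(text: str, environ=os.environ) -> None:
    """Copy parsed values into environ without overwriting existing entries."""
    for key, value in parse_env(text).items():
        environ.setdefault(key, value)


# Load the settings first; import inference-models only afterwards.
load_env(ENV_TEXT)
```

Using `setdefault` means values already exported in the shell take precedence over the file, which is a design choice of this sketch rather than documented library behavior.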
Core Configuration¶
API Authentication¶
ROBOFLOW_API_KEY (or API_KEY)
Your Roboflow API key for accessing models.
Get your API key from: https://docs.roboflow.com/api-reference/authentication
ROBOFLOW_ENVIRONMENT
Environment to use: prod (default) or staging.
ROBOFLOW_API_HOST
Overrides the API host URL. When unset, the host is chosen automatically based on the environment.
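The two accepted key names can be read as a simple fallback. The sketch below assumes that `ROBOFLOW_API_KEY` takes precedence when both are set; that ordering and the helper name `resolve_api_key` are assumptions, not confirmed library behavior.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first non-empty value among ROBOFLOW_API_KEY and API_KEY.

    Assumption: ROBOFLOW_API_KEY wins when both variables are set.
    """
    for name in ("ROBOFLOW_API_KEY", "API_KEY"):
        value = environ.get(name, "").strip()
        if value:
            return value
    return None
```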
Model Cache¶
INFERENCE_HOME
Directory where downloaded models are cached. Default: /tmp/cache
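To check which cache directory a process would use, a script can mirror the documented default. This is only a sketch: the helper name `cache_dir` is hypothetical, and expanding `~` is an added convenience that the library may or may not perform.

```python
import os
from pathlib import Path
from typing import Mapping


def cache_dir(environ: Mapping[str, str] = os.environ) -> Path:
    """Return INFERENCE_HOME as a path, falling back to the documented default."""
    return Path(environ.get("INFERENCE_HOME", "/tmp/cache")).expanduser()
```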
Device Selection¶
DEFAULT_DEVICE
Default device for model inference: cpu, cuda, cuda:0, etc.
API Configuration¶
Request Settings¶
API_CALLS_TIMEOUT
Timeout for API calls in seconds. Default: 5
API_CALLS_MAX_TRIES
Maximum retry attempts for API calls. Default: 3
IDEMPOTENT_API_REQUEST_CODES_TO_RETRY
HTTP status codes to retry (comma-separated). Default: 408,429,502,503,504
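The three request settings interact roughly as in the sketch below. It assumes `API_CALLS_MAX_TRIES` counts total attempts (including the first) and that only the listed status codes trigger another attempt; `retry_settings`, `call_with_retries`, and the `send` callback are hypothetical names used for illustration.

```python
import os
from typing import Callable, Mapping, Set, Tuple


def retry_settings(environ: Mapping[str, str] = os.environ) -> Tuple[float, int, Set[int]]:
    """Read timeout, max tries, and retryable status codes with documented defaults."""
    timeout = float(environ.get("API_CALLS_TIMEOUT", "5"))
    max_tries = int(environ.get("API_CALLS_MAX_TRIES", "3"))
    raw_codes = environ.get(
        "IDEMPOTENT_API_REQUEST_CODES_TO_RETRY", "408,429,502,503,504"
    )
    codes = {int(code) for code in raw_codes.split(",") if code.strip()}
    return timeout, max_tries, codes


def call_with_retries(
    send: Callable[[float], int],
    environ: Mapping[str, str] = os.environ,
) -> int:
    """Call send(timeout) until the status is not retryable or tries run out.

    send is a hypothetical callback that performs one request and returns
    its HTTP status code. It is always called at least once.
    """
    timeout, max_tries, codes = retry_settings(environ)
    status = send(timeout)
    for _ in range(max_tries - 1):
        if status not in codes:
            break
        status = send(timeout)
    return status
```

A production client would usually also wait between attempts (for example with exponential backoff); that is omitted here to keep the focus on the three variables.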
Backend Configuration¶
ONNX Runtime¶
ONNXRUNTIME_EXECUTION_PROVIDERS
Overrides the ONNX Runtime execution providers. Provide a comma-separated list with no spaces.
Default: CUDAExecutionProvider,OpenVINOExecutionProvider,CoreMLExecutionProvider,CPUExecutionProvider
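The value is an ordered list: ONNX Runtime tries providers in the order given. A minimal sketch of turning the variable into the list form that `onnxruntime.InferenceSession(..., providers=[...])` accepts is shown below; the helper name `execution_providers` is hypothetical, and the library's own parsing may differ.

```python
import os
from typing import List, Mapping

# Documented default, kept as one comma-separated string with no spaces.
DEFAULT_PROVIDERS = (
    "CUDAExecutionProvider,OpenVINOExecutionProvider,"
    "CoreMLExecutionProvider,CPUExecutionProvider"
)


def execution_providers(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Split ONNXRUNTIME_EXECUTION_PROVIDERS into an ordered list of names."""
    raw = environ.get("ONNXRUNTIME_EXECUTION_PROVIDERS", DEFAULT_PROVIDERS)
    return [name for name in raw.split(",") if name]
```

For example, setting `ONNXRUNTIME_EXECUTION_PROVIDERS=CPUExecutionProvider` would force CPU execution even on a machine with a GPU.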
Prediction Parameter Defaults¶
These environment variables control the default values for prediction parameters across all models. Individual models may override these defaults. See Prediction Parameters for detailed information about each parameter.
General Detection Parameters¶
INFERENCE_MODELS_DEFAULT_CONFIDENCE
Default confidence threshold for filtering predictions. Default: 0.4
INFERENCE_MODELS_DEFAULT_IOU_THRESHOLD
Default IoU threshold for Non-Maximum Suppression. Default: 0.3
INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
Default maximum number of detections to return. Default: 300
INFERENCE_MODELS_DEFAULT_CLASS_AGNOSTIC_NMS
Whether NMS is class-agnostic by default. Default: false
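To illustrate what these four parameters control, here is a minimal sketch of greedy Non-Maximum Suppression in plain Python. It is a textbook formulation, not the library's implementation; the `Box` layout and the function names are assumptions made for the example, and the keyword defaults simply mirror the values documented above.

```python
from typing import List, Tuple

# Hypothetical box layout: x1, y1, x2, y2, confidence, class_id
Box = Tuple[float, float, float, float, float, int]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    boxes: List[Box],
    confidence: float = 0.4,
    iou_threshold: float = 0.3,
    max_detections: int = 300,
    class_agnostic_nms: bool = False,
) -> List[Box]:
    """Drop low-confidence boxes, suppress overlaps, and cap the result size."""
    candidates = sorted(
        (box for box in boxes if box[4] >= confidence),
        key=lambda box: box[4],
        reverse=True,
    )
    kept: List[Box] = []
    for box in candidates:
        # A box is suppressed when it overlaps an already kept box too much.
        # Without class-agnostic NMS, only boxes of the same class compete.
        suppressed = any(
            (class_agnostic_nms or box[5] == other[5])
            and iou(box, other) > iou_threshold
            for other in kept
        )
        if not suppressed:
            kept.append(box)
    return kept[:max_detections]
```

With class-agnostic NMS enabled, two heavily overlapping boxes of different classes collapse into the higher-confidence one; with it disabled, both survive.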
General Vision-Language Model Parameters¶
INFERENCE_MODELS_DEFAULT_MAX_NEW_TOKENS
Default maximum number of tokens to generate. Default: 4096
INFERENCE_MODELS_DEFAULT_NUM_BEAMS
Default number of beams for beam search. Default: 3
INFERENCE_MODELS_DEFAULT_DO_SAMPLE
Whether to use sampling during generation by default. Default: false
INFERENCE_MODELS_DEFAULT_SKIP_SPECIAL_TOKENS
Whether to skip special tokens in the output by default. Default: false
Model-Specific Overrides¶
Individual models can override the general defaults. Below are model-specific environment variables:
DeepLabV3+¶
INFERENCE_MODELS_DEEP_LAB_V3_PLUS_DEFAULT_CONFIDENCE
Default: 0.5
DINOv3¶
INFERENCE_MODELS_DINOV3_DEFAULT_CONFIDENCE
Default: 0.5
EasyOCR¶
INFERENCE_MODELS_EASYOCR_DEFAULT_CONFIDENCE
Default: 0.3
Florence-2¶
INFERENCE_MODELS_FLORENCE2_DEFAULT_MAX_NEW_TOKENS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_NEW_TOKENS
INFERENCE_MODELS_FLORENCE2_DEFAULT_NUM_BEAMS
Default: Inherits from INFERENCE_MODELS_DEFAULT_NUM_BEAMS
INFERENCE_MODELS_FLORENCE2_DEFAULT_DO_SAMPLE
Default: Inherits from INFERENCE_MODELS_DEFAULT_DO_SAMPLE
Grounding DINO¶
INFERENCE_MODELS_GROUNDING_DINO_DEFAULT_BOX_CONFIDENCE
Default: 0.5
INFERENCE_MODELS_GROUNDING_DINO_DEFAULT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
INFERENCE_MODELS_GROUNDING_DINO_DEFAULT_IOU_THRESHOLD
Default: 0.5
MediaPipe Face Detector¶
INFERENCE_MODELS_MEDIAPIPE_FACE_DETECTOR_DEFAULT_CONFIDENCE
Default: 0.25
Moondream2¶
INFERENCE_MODELS_MOONDREAM2_DEFAULT_MAX_NEW_TOKENS
Default: 700
OWLv2¶
INFERENCE_MODELS_OWLV2_DEFAULT_CONFIDENCE
Default: 0.99
INFERENCE_MODELS_OWLV2_DEFAULT_IOU_THRESHOLD
Default: Inherits from INFERENCE_MODELS_DEFAULT_IOU_THRESHOLD
INFERENCE_MODELS_OWLV2_DEFAULT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
INFERENCE_MODELS_OWLV2_DEFAULT_CLASS_AGNOSTIC_NMS
Default: Inherits from INFERENCE_MODELS_DEFAULT_CLASS_AGNOSTIC_NMS
PaliGemma¶
INFERENCE_MODELS_PALIGEMMA_DEFAULT_MAX_NEW_TOKENS
Default: 400
INFERENCE_MODELS_PALIGEMMA_DEFAULT_DO_SAMPLE
Default: Inherits from INFERENCE_MODELS_DEFAULT_DO_SAMPLE
INFERENCE_MODELS_PALIGEMMA_DEFAULT_SKIP_SPECIAL_TOKENS
Default: true
Qwen2.5-VL¶
INFERENCE_MODELS_QWEN25_VL_DEFAULT_MAX_NEW_TOKENS
Default: 512
INFERENCE_MODELS_QWEN25_VL_DEFAULT_DO_SAMPLE
Default: Inherits from INFERENCE_MODELS_DEFAULT_DO_SAMPLE
INFERENCE_MODELS_QWEN25_VL_DEFAULT_SKIP_SPECIAL_TOKENS
Default: true
Qwen3-VL¶
INFERENCE_MODELS_QWEN3_VL_DEFAULT_MAX_NEW_TOKENS
Default: 512
INFERENCE_MODELS_QWEN3_VL_DEFAULT_DO_SAMPLE
Default: Inherits from INFERENCE_MODELS_DEFAULT_DO_SAMPLE
ResNet¶
INFERENCE_MODELS_RESNET_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
RF-DETR¶
INFERENCE_MODELS_RFDETR_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
Roboflow Instant¶
INFERENCE_MODELS_ROBOFLOW_INSTANT_DEFAULT_CONFIDENCE
Default: 0.99
INFERENCE_MODELS_ROBOFLOW_INSTANT_DEFAULT_IOU_THRESHOLD
Default: 0.3
INFERENCE_MODELS_ROBOFLOW_INSTANT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
SmolVLM¶
INFERENCE_MODELS_SMOL_VLM_DEFAULT_MAX_NEW_TOKENS
Default: 400
INFERENCE_MODELS_SMOL_VLM_DEFAULT_DO_SAMPLE
Default: Inherits from INFERENCE_MODELS_DEFAULT_DO_SAMPLE
INFERENCE_MODELS_SMOL_VLM_DEFAULT_SKIP_SPECIAL_TOKENS
Default: true
ViT¶
INFERENCE_MODELS_VIT_CLASSIFIER_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
YOLACT¶
INFERENCE_MODELS_YOLACT_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
INFERENCE_MODELS_YOLACT_DEFAULT_IOU_THRESHOLD
Default: Inherits from INFERENCE_MODELS_DEFAULT_IOU_THRESHOLD
INFERENCE_MODELS_YOLACT_DEFAULT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
INFERENCE_MODELS_YOLACT_DEFAULT_CLASS_AGNOSTIC_NMS
Default: Inherits from INFERENCE_MODELS_DEFAULT_CLASS_AGNOSTIC_NMS
YOLO-NAS¶
INFERENCE_MODELS_YOLONAS_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
INFERENCE_MODELS_YOLONAS_DEFAULT_IOU_THRESHOLD
Default: Inherits from INFERENCE_MODELS_DEFAULT_IOU_THRESHOLD
INFERENCE_MODELS_YOLONAS_DEFAULT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
INFERENCE_MODELS_YOLONAS_DEFAULT_CLASS_AGNOSTIC_NMS
Default: Inherits from INFERENCE_MODELS_DEFAULT_CLASS_AGNOSTIC_NMS
YOLOv5¶
INFERENCE_MODELS_YOLOV5_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
INFERENCE_MODELS_YOLOV5_DEFAULT_IOU_THRESHOLD
Default: Inherits from INFERENCE_MODELS_DEFAULT_IOU_THRESHOLD
INFERENCE_MODELS_YOLOV5_DEFAULT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
INFERENCE_MODELS_YOLOV5_DEFAULT_CLASS_AGNOSTIC_NMS
Default: Inherits from INFERENCE_MODELS_DEFAULT_CLASS_AGNOSTIC_NMS
YOLOv7¶
INFERENCE_MODELS_YOLOV7_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
INFERENCE_MODELS_YOLOV7_DEFAULT_IOU_THRESHOLD
Default: Inherits from INFERENCE_MODELS_DEFAULT_IOU_THRESHOLD
INFERENCE_MODELS_YOLOV7_DEFAULT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
INFERENCE_MODELS_YOLOV7_DEFAULT_CLASS_AGNOSTIC_NMS
Default: Inherits from INFERENCE_MODELS_DEFAULT_CLASS_AGNOSTIC_NMS
YOLOv8/v9/v11/v12 (Ultralytics)¶
INFERENCE_MODELS_YOLO_ULTRALYTICS_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
INFERENCE_MODELS_YOLO_ULTRALYTICS_DEFAULT_IOU_THRESHOLD
Default: Inherits from INFERENCE_MODELS_DEFAULT_IOU_THRESHOLD
INFERENCE_MODELS_YOLO_ULTRALYTICS_DEFAULT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
INFERENCE_MODELS_YOLO_ULTRALYTICS_DEFAULT_CLASS_AGNOSTIC_NMS
Default: Inherits from INFERENCE_MODELS_DEFAULT_CLASS_AGNOSTIC_NMS
INFERENCE_MODELS_YOLO_ULTRALYTICS_DEFAULT_KEY_POINTS_THRESHOLD
Default: 0.0
YOLOv10¶
INFERENCE_MODELS_YOLOV10_DEFAULT_CONFIDENCE
Default: Inherits from INFERENCE_MODELS_DEFAULT_CONFIDENCE
INFERENCE_MODELS_YOLOV10_DEFAULT_MAX_DETECTIONS
Default: Inherits from INFERENCE_MODELS_DEFAULT_MAX_DETECTIONS
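The "Inherits from" entries above can be read as a fallback chain: the model-specific variable is consulted first, then the general variable, then the built-in value. The sketch below illustrates that lookup order; `resolve_default` and `parse_bool` are hypothetical helpers, and the set of strings treated as true is an assumption.

```python
import os
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")


def parse_bool(raw: str) -> bool:
    """Assumption: 'true', '1', and 'yes' (in any case) count as true."""
    return raw.strip().lower() in {"true", "1", "yes"}


def resolve_default(
    model_var: str,
    general_var: str,
    builtin: T,
    cast: Callable[[str], T],
    environ: Mapping[str, str] = os.environ,
) -> T:
    """Use the model-specific variable, then the general one, then builtin."""
    for name in (model_var, general_var):
        if name in environ:
            return cast(environ[name])
    return builtin


# Example with an explicit mapping instead of the real process environment:
# only the general variable is set, so YOLOv5 inherits its value.
example_env = {"INFERENCE_MODELS_DEFAULT_CONFIDENCE": "0.6"}
yolov5_confidence = resolve_default(
    "INFERENCE_MODELS_YOLOV5_DEFAULT_CONFIDENCE",
    "INFERENCE_MODELS_DEFAULT_CONFIDENCE",
    0.4,
    float,
    example_env,
)
```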
Logging¶
LOG_LEVEL
Set the log level for the library. Default: WARNING
VERBOSE_LOG_LEVEL
Set the log level for verbose logging. Default: INFO
DISABLE_VERBOSE_LOGGER
Disable verbose logging. Default: false
DISABLE_INTERACTIVE_PROGRESS_BARS
Disable interactive progress bars. Default: false
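The level names follow Python's standard `logging` module. A minimal sketch of translating the variables into numeric levels is shown below; the helper `configured_level` and the logger name are hypothetical, and falling back to the default on an unrecognized name is an assumption about sensible behavior rather than documented library behavior.

```python
import logging
import os
from typing import Mapping


def configured_level(
    environ: Mapping[str, str] = os.environ,
    variable: str = "LOG_LEVEL",
    default: str = "WARNING",
) -> int:
    """Map a level name from the environment to its numeric logging level."""
    name = environ.get(variable, default).strip().upper()
    level = getattr(logging, name, None)
    # Unknown names fall back to the documented default level.
    return level if isinstance(level, int) else getattr(logging, default)


# Hypothetical logger name, used only for this example.
logger = logging.getLogger("inference_models_example")
logger.setLevel(configured_level())
```

The same helper covers the verbose logger by passing `variable="VERBOSE_LOG_LEVEL"` and `default="INFO"`.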
Advanced Configuration¶
Input Validation¶
ALLOW_URL_INPUT
Allow URLs as image input. Used by models such as OWLv2 when they need access to larger datasets provided as references. Default: true
ALLOW_NON_HTTPS_URL_INPUT
Allow non-HTTPS URLs. Used by models such as OWLv2 when they need access to larger datasets provided as references. Default: false
ALLOW_URL_INPUT_WITHOUT_FQDN
Allow URLs without a fully qualified domain name (FQDN). Used by models such as OWLv2 when they need access to larger datasets provided as references. Default: false
WHITELISTED_DESTINATIONS_FOR_URL_INPUT
Comma-separated list of allowed destinations for URL input. Used by models such as OWLv2 when they need access to larger datasets provided as references. Default: None
BLACKLISTED_DESTINATIONS_FOR_URL_INPUT
Comma-separated list of blocked destinations for URL input. Used by models such as OWLv2 when they need access to larger datasets provided as references. Default: None
ALLOW_LOCAL_STORAGE_ACCESS_FOR_REFERENCE_DATA
Allow local storage access for reference data. Used by models such as OWLv2 when they need access to larger datasets provided as references. Default: true
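The URL-related switches combine into a single allow/deny decision. The sketch below shows one plausible way they could interact, with keyword defaults mirroring the documented ones. It is not the library's implementation: the function names are hypothetical, destinations are matched by exact host name, and "has an FQDN" is approximated as "dotted name that is not an IP literal". All three are assumptions.

```python
import ipaddress
from typing import Collection, Optional
from urllib.parse import urlparse


def looks_like_fqdn(host: str) -> bool:
    """Treat dotted host names as FQDNs; bare names and IP literals are not."""
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        return "." in host


def is_url_allowed(
    url: str,
    allow_url_input: bool = True,
    allow_non_https: bool = False,
    allow_without_fqdn: bool = False,
    whitelist: Optional[Collection[str]] = None,
    blacklist: Optional[Collection[str]] = None,
) -> bool:
    """Apply the URL input switches in order and return the final decision."""
    if not allow_url_input:
        return False
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme not in {"http", "https"} or not host:
        return False
    if parsed.scheme == "http" and not allow_non_https:
        return False
    if not allow_without_fqdn and not looks_like_fqdn(host):
        return False
    # Assumption: destinations are compared by exact host name.
    if whitelist is not None and host not in whitelist:
        return False
    if blacklist is not None and host in blacklist:
        return False
    return True
```

Under these assumptions, `https://example.com/image.jpg` passes with the defaults, while `http://example.com/image.jpg` and `https://localhost/image.jpg` are rejected until the corresponding switch is enabled.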