
MediaPipe Face Detection

MediaPipe Face Detection is a lightweight face detection model developed by Google. It detects faces and predicts 6 facial keypoints (eyes, nose, mouth, ears), and is optimized for mobile and edge devices.

Overview

Resources: MediaPipe Documentation | GitHub Repository

MediaPipe Face Detection provides fast and efficient face detection with keypoint localization. Key features include:

  • Lightweight architecture - Optimized for mobile and edge deployment
  • 6 facial keypoints - Eyes, nose, mouth, and ears
  • Fast inference - Efficient processing on CPU
  • Single-stage detection - Detects faces and keypoints simultaneously
  • TensorFlow Lite backend - Efficient mobile-optimized runtime

License

Apache 2.0

Open Source License

MediaPipe Face Detection is licensed under Apache 2.0, making it free for both commercial and non-commercial use, subject to the license's standard conditions (such as retaining the license and notice files).

Learn more: Apache 2.0 License

Pre-trained Model IDs

MediaPipe Face Detection has one pre-trained model available via the Roboflow API and requires a Roboflow API key.

Getting a Roboflow API Key

To use MediaPipe models, you'll need a Roboflow account (free) and API key.
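Rather than hard-coding the key in scripts, a common pattern (an illustrative suggestion, not a requirement of the library) is to read it from an environment variable:

```python
import os

def get_roboflow_api_key() -> str:
    """Read the Roboflow API key from the ROBOFLOW_API_KEY environment
    variable, failing loudly instead of sending an empty key to the API.

    The variable name here is a convention chosen for this example.
    """
    key = os.environ.get("ROBOFLOW_API_KEY", "")
    if not key:
        raise RuntimeError(
            "Set the ROBOFLOW_API_KEY environment variable "
            "(find your key in the Roboflow dashboard)."
        )
    return key
```

The returned string can then be passed as the `api_key` argument when loading the model.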

Model          Model ID                  Use Case
Face Detector  mediapipe/face-detector   Face detection with 6 keypoints

Detected keypoints: right-eye, left-eye, nose, mouth, right-ear, left-ear

Supported Backends

Backend     Extras Required
mediapipe   mediapipe
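
Based on the extras name in the table above and the usual Python packaging convention (an assumption, not taken from the inference-models docs), installation would look like:

```shell
# Install inference-models with the MediaPipe backend extra
# (extras name assumed from the table above).
pip install "inference-models[mediapipe]"
```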

Roboflow Platform Compatibility

Feature                   Supported
Training                  ❌ Pre-trained model only
Upload Weights            ❌ Pre-trained model only
Serverless API (v2)       ✅ Deploy via hosted API
Workflows                 ✅ Use in Workflows via Keypoint Detection block
Edge Deployment (Jetson)  ✅ Deploy on NVIDIA Jetson devices
Self-Hosting              ✅ Deploy with inference-models

Usage Example

import cv2
import supervision as sv
from inference_models import AutoModel

# Load model (requires Roboflow API key)
model = AutoModel.from_pretrained(
    "mediapipe/face-detector",
    api_key="your_roboflow_api_key"
)
image = cv2.imread("path/to/image.jpg")

# Run inference - returns tuple of (List[KeyPoints], List[Detections])
results = model(image)
key_points_list, detections_list = results

# Convert to supervision format for visualization
key_points = key_points_list[0].to_supervision()
detections = detections_list[0].to_supervision()

# Annotate image with bounding boxes and keypoints
box_annotator = sv.BoxAnnotator()
vertex_annotator = sv.VertexAnnotator()

# EdgeAnnotator can use skeleton from model or auto-detect based on keypoint count
edge_annotator = sv.EdgeAnnotator(edges=model.skeletons[0])

annotated_image = box_annotator.annotate(image.copy(), detections)
annotated_image = edge_annotator.annotate(annotated_image, key_points)
annotated_image = vertex_annotator.annotate(annotated_image, key_points)

# Save or display
cv2.imwrite("annotated.jpg", annotated_image)

Output Format

Returns: Tuple[List[KeyPoints], List[Detections]] (one element per image in batch)

Keypoint names: right-eye, left-eye, nose, mouth, right-ear, left-ear (6 keypoints)

Skeleton connections: Access via model.skeletons[0] which returns [(5, 1), (1, 2), (4, 0), (0, 2), (2, 3)]. The model.skeletons list has one element per detection class; since this model has only one class (face), always use index 0.

Pass to EdgeAnnotator: sv.EdgeAnnotator(edges=model.skeletons[0]) or omit for auto-detection.
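
Putting the documented keypoint order and skeleton together, the index pairs can be translated into human-readable edges (assuming the keypoint names above are listed in index order):

```python
# Keypoint names in index order, as documented in the output format above.
KEYPOINT_NAMES = ["right-eye", "left-eye", "nose", "mouth", "right-ear", "left-ear"]

# Skeleton edges as returned by model.skeletons[0].
SKELETON = [(5, 1), (1, 2), (4, 0), (0, 2), (2, 3)]

def edge_names(skeleton, names):
    """Translate (index, index) edges into (name, name) pairs."""
    return [(names[a], names[b]) for a, b in skeleton]

print(edge_names(SKELETON, KEYPOINT_NAMES))
# [('left-ear', 'left-eye'), ('left-eye', 'nose'),
#  ('right-ear', 'right-eye'), ('right-eye', 'nose'), ('nose', 'mouth')]
```

Reading the edges this way confirms the skeleton traces ear → eye → nose on each side, plus nose → mouth.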

Use Cases

MediaPipe Face Detection is ideal for:

  • Face detection - Locate faces in images
  • Facial landmark detection - Get key facial feature positions
  • Mobile/edge deployment - Lightweight model for resource-constrained devices
  • Fast processing - Efficient CPU inference
  • Preprocessing for face analysis - Detect faces before running recognition or other downstream analysis
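
For the preprocessing use case, a minimal sketch of cropping detected faces out of an image is shown below. The `xyxy` box format mirrors what supervision's `Detections.xyxy` provides; the `pad` parameter is an illustrative addition, not part of any API:

```python
import numpy as np

def crop_faces(image: np.ndarray, xyxy: np.ndarray, pad: int = 0) -> list:
    """Crop each face box (x1, y1, x2, y2) out of the image.

    Coordinates are clamped to the image bounds so padded crops
    never index outside the array.
    """
    h, w = image.shape[:2]
    crops = []
    for x1, y1, x2, y2 in xyxy.astype(int):
        x1, y1 = max(x1 - pad, 0), max(y1 - pad, 0)
        x2, y2 = min(x2 + pad, w), min(y2 + pad, h)
        crops.append(image[y1:y2, x1:x2])
    return crops
```

In the usage example above, `detections.xyxy` from the converted supervision object could be passed directly as the `xyxy` argument, and each crop then fed to a face recognition model.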