MediaPipe Face Detection
MediaPipe Face Detection is a lightweight face detection model developed by Google. It detects faces and predicts 6 facial keypoints per face (eyes, nose, mouth, ears), and it is optimized for mobile and edge devices.
Overview
Resources: MediaPipe Documentation | GitHub Repository
MediaPipe Face Detection provides fast and efficient face detection with keypoint localization. Key features include:
- Lightweight architecture - Optimized for mobile and edge deployment
- 6 facial keypoints - Eyes, nose, mouth, and ears
- Fast inference - Efficient processing on CPU
- Single-stage detection - Detects faces and keypoints simultaneously
- TensorFlow Lite backend - Efficient mobile-optimized runtime
License
Apache 2.0
Open Source License
MediaPipe Face Detection is licensed under Apache 2.0, which permits both commercial and non-commercial use, provided the license's attribution and notice requirements are met.
Learn more: Apache 2.0 License
Pre-trained Model IDs
One pre-trained MediaPipe Face Detection model is available via the Roboflow API. Loading it requires a Roboflow API key.
Getting a Roboflow API Key
To use MediaPipe models, you'll need a Roboflow account (free) and API key.
| Model | Model ID | Use Case |
|---|---|---|
| Face Detector | mediapipe/face-detector | Face detection with 6 keypoints |
Detected keypoints: right-eye, left-eye, nose, mouth, right-ear, left-ear
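Because the model requires an API key, it is usually safer to read the key from the environment than to hardcode it in source. The following is a minimal sketch, not part of the library: the helper name `get_roboflow_api_key` is hypothetical, and the variable name `ROBOFLOW_API_KEY` is an assumption that you can change to match your own setup.

```python
import os


def get_roboflow_api_key(env_var: str = "ROBOFLOW_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise a clear error.

    Both this helper and the default variable name are illustrative
    assumptions, not part of any Roboflow package.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Roboflow API key."
        )
    return api_key
```

The returned string can then be passed as the `api_key` argument when loading the model.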
Supported Backends
| Backend | Extras Required |
|---|---|
| mediapipe | mediapipe |
Roboflow Platform Compatibility
| Feature | Supported |
|---|---|
| Training | ❌ Pre-trained model only |
| Upload Weights | ❌ Pre-trained model only |
| Serverless API (v2) | ✅ Deploy via hosted API |
| Workflows | ✅ Use in Workflows via Keypoint Detection block |
| Edge Deployment (Jetson) | ✅ Deploy on NVIDIA Jetson devices |
| Self-Hosting | ✅ Deploy with inference-models |
Usage Example
import cv2
import supervision as sv
from inference_models import AutoModel
# Load model (requires Roboflow API key)
model = AutoModel.from_pretrained(
    "mediapipe/face-detector",
    api_key="your_roboflow_api_key"
)
image = cv2.imread("path/to/image.jpg")
# Run inference - returns tuple of (List[KeyPoints], List[Detections])
results = model(image)
key_points_list, detections_list = results
# Convert to supervision format for visualization
key_points = key_points_list[0].to_supervision()
detections = detections_list[0].to_supervision()
# Annotate image with bounding boxes and keypoints
box_annotator = sv.BoxAnnotator()
vertex_annotator = sv.VertexAnnotator()
# EdgeAnnotator can use skeleton from model or auto-detect based on keypoint count
edge_annotator = sv.EdgeAnnotator(edges=model.skeletons[0])
annotated_image = box_annotator.annotate(image.copy(), detections)
annotated_image = edge_annotator.annotate(annotated_image, key_points)
annotated_image = vertex_annotator.annotate(annotated_image, key_points)
# Save or display
cv2.imwrite("annotated.jpg", annotated_image)
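Once keypoints are available as (x, y) pixel coordinates, simple geometry can be derived from them. As an illustration, the sketch below estimates in-plane head tilt from the two eye keypoints. It assumes the keypoint index order matches the names listed in this document (right-eye at index 0, left-eye at index 1) and that left-eye lies at a larger x than right-eye for an upright face; if your labels are mirrored, the result will be near ±180 degrees instead of 0. The function name `eye_roll_degrees` is hypothetical.

```python
import math

# Assumed index order, following the keypoint names listed in this document.
RIGHT_EYE, LEFT_EYE = 0, 1


def eye_roll_degrees(keypoints):
    """Angle of the line from right-eye to left-eye, in degrees.

    `keypoints` is a sequence of (x, y) pairs for one face.
    0 means the eyes are level in the image. Because image y grows
    downward, a positive angle means the left-eye point is lower in
    the image than the right-eye point.
    """
    rx, ry = keypoints[RIGHT_EYE]
    lx, ly = keypoints[LEFT_EYE]
    return math.degrees(math.atan2(ly - ry, lx - rx))


# Synthetic example: eyes level, then left-eye 20 px lower.
level = [(40, 50), (60, 50), (50, 60), (50, 70), (30, 55), (70, 55)]
tilted = [(40, 50), (60, 70), (50, 65), (45, 80), (30, 50), (70, 80)]
print(round(eye_roll_degrees(level), 1))   # 0.0
print(round(eye_roll_degrees(tilted), 1))  # 45.0
```

With supervision, the per-face coordinates would typically come from `key_points.xy[i]`, an array of shape (number of keypoints, 2).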
Output Format
Returns: Tuple[List[KeyPoints], List[Detections]] (one element per image in batch)
Keypoint names: right-eye, left-eye, nose, mouth, right-ear, left-ear (6 keypoints)
Skeleton connections: Access via model.skeletons[0] which returns [(5, 1), (1, 2), (4, 0), (0, 2), (2, 3)]. The model.skeletons list has one element per detection class; since this model has only one class (face), always use index 0.
Pass to EdgeAnnotator: sv.EdgeAnnotator(edges=model.skeletons[0]), or omit the edges argument to let EdgeAnnotator auto-detect a skeleton from the keypoint count.
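To make the skeleton indices easier to read, the short sketch below maps each index pair to keypoint names. It assumes the keypoint indices follow the order in which the names are listed above (right-eye is 0, left-ear is 5); that ordering is consistent with the skeleton but is not stated explicitly here. The helper `name_edges` is hypothetical.

```python
# Keypoint names in the order listed in this document (assumed to be index order).
KEYPOINT_NAMES = ["right-eye", "left-eye", "nose", "mouth", "right-ear", "left-ear"]

# Skeleton as reported for model.skeletons[0].
SKELETON = [(5, 1), (1, 2), (4, 0), (0, 2), (2, 3)]


def name_edges(skeleton, names):
    """Translate (start_index, end_index) pairs into (start_name, end_name) pairs."""
    return [(names[start], names[end]) for start, end in skeleton]


for start, end in name_edges(SKELETON, KEYPOINT_NAMES):
    print(f"{start} -> {end}")
# left-ear -> left-eye
# left-eye -> nose
# right-ear -> right-eye
# right-eye -> nose
# nose -> mouth
```

Under that assumption, the skeleton links each ear to the eye on the same side, both eyes to the nose, and the nose to the mouth.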
Use Cases
MediaPipe Face Detection is ideal for:
- ✅ Face detection - Locate faces in images
- ✅ Facial landmark detection - Get key facial feature positions
- ✅ Mobile/edge deployment - Lightweight model for resource-constrained devices
- ✅ Fast processing - Efficient CPU inference
- ✅ Preprocessing for face analysis - Detect faces before running face recognition or analysis
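For the preprocessing use case, detected boxes are usually turned into crops before being passed to a downstream model. The sketch below shows one way to pad a box and clamp it to the image bounds. It assumes boxes are in (x1, y1, x2, y2) pixel format, as in supervision's `Detections.xyxy`; the function name `clamp_box` and the `margin` parameter are hypothetical.

```python
import math


def clamp_box(box, image_width, image_height, margin=0.0):
    """Expand an (x1, y1, x2, y2) box by `margin` and clamp it to the image.

    `margin` is a fraction of the box width/height added on each side.
    Returns integer (left, top, right, bottom) suitable for slicing.
    """
    x1, y1, x2, y2 = box
    pad_x = (x2 - x1) * margin
    pad_y = (y2 - y1) * margin
    left = max(0, math.floor(x1 - pad_x))
    top = max(0, math.floor(y1 - pad_y))
    right = min(image_width, math.ceil(x2 + pad_x))
    bottom = min(image_height, math.ceil(y2 + pad_y))
    return left, top, right, bottom


print(clamp_box((10.2, 20.7, 50.5, 80.1), 100, 100))        # (10, 20, 51, 81)
print(clamp_box((10, 10, 50, 50), 100, 100, margin=0.25))   # (0, 0, 60, 60)
print(clamp_box((90, 90, 120, 130), 100, 100))              # (90, 90, 100, 100)
```

With an image loaded by OpenCV as a NumPy array, a crop can then be taken as `image[top:bottom, left:right]`.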