|
| 1 | +# Face Detection |
| 2 | + |
| 3 | +`Image.FaceDetection` answers "where are the faces in this image?". It returns a list of bounding boxes, confidence scores, and five facial landmarks (right eye, left eye, nose tip, right mouth corner, left mouth corner) per detected face. |
| 4 | + |
| 5 | +## Basic detection |
| 6 | + |
| 7 | +```elixir |
| 8 | +iex> image = Image.open!("group.jpg") |
| 9 | +iex> faces = Image.FaceDetection.detect(image) |
| 10 | +iex> hd(faces) |
| 11 | +%{ |
| 12 | + box: {412, 88, 96, 124}, |
| 13 | + score: 0.94, |
| 14 | + landmarks: [{438.2, 130.1}, {478.7, 129.6}, {458.0, 152.3}, {442.1, 178.5}, {475.0, 178.2}] |
| 15 | +} |
| 16 | +``` |
| 17 | + |
| 18 | +Each detection is a map with: |
| 19 | +- `:box` — `{x, y, width, height}` in pixel coordinates of the original image |
| 20 | +- `:score` — confidence score in `[0.0, 1.0]` |
| 21 | +- `:landmarks` — a list of five `{x, y}` tuples: right eye, left eye, nose tip, right mouth corner, left mouth corner — in that order |
| 22 | + |
| 23 | +Results are sorted by descending confidence. |
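
As a minimal sketch of working with this shape, the snippet below uses plain Elixir on the sample detection from the basic example. The `FaceExample` module and its function names are hypothetical helpers for illustration, not part of `Image.FaceDetection`.

```elixir
defmodule FaceExample do
  # Hypothetical helper module, not part of Image.FaceDetection.

  # Landmark order as documented: right eye, left eye, nose tip,
  # right mouth corner, left mouth corner.
  @landmark_names [:right_eye, :left_eye, :nose, :right_mouth, :left_mouth]

  # Area in pixels of a detection's {x, y, width, height} box.
  def area(%{box: {_x, _y, w, h}}), do: w * h

  # Turn the positional landmark list into a map keyed by name.
  def named_landmarks(%{landmarks: landmarks}) do
    @landmark_names |> Enum.zip(landmarks) |> Map.new()
  end

  # Keep detections at or above a score, preserving the input order.
  def confident(faces, min_score), do: Enum.filter(faces, &(&1.score >= min_score))
end

face = %{
  box: {412, 88, 96, 124},
  score: 0.94,
  landmarks: [{438.2, 130.1}, {478.7, 129.6}, {458.0, 152.3}, {442.1, 178.5}, {475.0, 178.2}]
}

FaceExample.area(face)                  # 96 * 124 = 11904
FaceExample.named_landmarks(face).nose  # {458.0, 152.3}
FaceExample.confident([face], 0.95)     # [] because 0.94 < 0.95
```

Naming the landmarks once avoids scattering positional indexes such as `Enum.at(landmarks, 2)` through calling code.
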
| 24 | + |
| 25 | +## Filtering by confidence |
| 26 | + |
| 27 | +The default minimum score is `0.6`. Raise it for stricter detections: |
| 28 | + |
| 29 | +```elixir |
| 30 | +iex> Image.FaceDetection.detect(image, min_score: 0.8) |
| 31 | +``` |
| 32 | + |
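| 33 | +`:nms_iou` (default `0.3`) is the intersection-over-union (IoU) threshold used by non-maximum suppression: a box that overlaps a higher-scoring box by more than this value is collapsed into it. Lower values suppress more aggressively and keep fewer overlapping faces. |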
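
To make the effect of the threshold concrete, here is a sketch of textbook greedy non-maximum suppression over `{x, y, width, height}` boxes. It illustrates the general technique only; the `NmsSketch` module is hypothetical and the library's own suppression step may differ in detail.

```elixir
defmodule NmsSketch do
  # Illustrative only; not the library's implementation.

  # Intersection-over-union of two {x, y, width, height} boxes.
  def iou({x1, y1, w1, h1}, {x2, y2, w2, h2}) do
    iw = max(0, min(x1 + w1, x2 + w2) - max(x1, x2))
    ih = max(0, min(y1 + h1, y2 + h2) - max(y1, y2))
    inter = iw * ih
    union = w1 * h1 + w2 * h2 - inter
    if union == 0, do: 0.0, else: inter / union
  end

  # Greedy NMS: visit detections from highest to lowest score and keep one
  # only if its IoU with every already-kept box is at most `threshold`.
  def suppress(faces, threshold) do
    faces
    |> Enum.sort_by(& &1.score, :desc)
    |> Enum.reduce([], fn face, kept ->
      if Enum.any?(kept, &(iou(&1.box, face.box) > threshold)),
        do: kept,
        else: [face | kept]
    end)
    |> Enum.reverse()
  end
end

faces = [
  %{box: {100, 100, 100, 100}, score: 0.9},
  %{box: {120, 100, 100, 100}, score: 0.8},
  %{box: {400, 100, 100, 100}, score: 0.7}
]

# The first two boxes overlap with IoU 8000 / 12000, about 0.67.
length(NmsSketch.suppress(faces, 0.3))  # 2: the 0.8 box is suppressed
length(NmsSketch.suppress(faces, 0.7))  # 3: the overlap is tolerated
```
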
| 34 | + |
| 35 | +## Boxes only |
| 36 | + |
| 37 | +When landmarks aren't needed, `boxes/2` skips them: |
| 38 | + |
| 39 | +```elixir |
| 40 | +iex> Image.FaceDetection.boxes(image) |
| 41 | +[{412, 88, 96, 124}, {612, 102, 84, 110}] |
| 42 | +``` |
| 43 | + |
| 44 | +## Drawing detections |
| 45 | + |
| 46 | +`draw_boxes/3` overlays bounding boxes, the score as a percentage label, and the five landmark dots: |
| 47 | + |
| 48 | +```elixir |
| 49 | +iex> faces = Image.FaceDetection.detect(image) |
| 50 | +iex> annotated = Image.FaceDetection.draw_boxes(faces, image) |
| 51 | +iex> Image.write!(annotated, "annotated.jpg") |
| 52 | +``` |
| 53 | + |
| 54 | +The same steps in pipeline form: |
| 55 | + |
| 56 | +```elixir |
| 57 | +iex> image |
| 58 | +...> |> Image.FaceDetection.detect() |
| 59 | +...> |> Image.FaceDetection.draw_boxes(image) |
| 60 | +...> |> Image.write!("annotated.jpg") |
| 61 | +``` |
| 62 | + |
| 63 | +Drawing options include `:color`, `:stroke_width`, `:landmark_radius`, `:font_size`, and `:show_landmarks?` (set to `false` to skip the dots). |
| 64 | + |
| 65 | +## Face-aware crop |
| 66 | + |
| 67 | +`crop_largest/2` is a convenience for the common "crop to the most prominent face" case. It is the wire-in point for the face-aware crop bias behind `gravity: :face` in `image_plug`, ImageKit `z-`, and Cloudflare `face-zoom`: |
| 68 | + |
| 69 | +```elixir |
| 70 | +iex> {:ok, portrait} = Image.FaceDetection.crop_largest(image, padding: 0.2) |
| 71 | +``` |
| 72 | + |
| 73 | +The largest face is chosen by bounding-box area. `:padding` is a fraction of each face dimension added on each side: `0.0` is a tight crop, `0.5` adds 50% on each side, and `1.0` adds a full face width and height on each side. The expanded crop is clipped to the image bounds. |
| 74 | + |
| 75 | +When no face meets the score threshold, `crop_largest/2` returns `{:error, :no_face_detected}`. |
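
The selection and padding arithmetic can be sketched in plain Elixir. This assumes `:padding` is applied to each side of the box and that padded values are rounded to whole pixels; the `CropSketch` module and the 1024×768 image size are hypothetical, and the library's rounding may differ.

```elixir
defmodule CropSketch do
  # Illustrative only; not the library's implementation.

  # Pick the detection whose box covers the most pixels.
  # Raises Enum.EmptyError on an empty list.
  def largest(faces), do: Enum.max_by(faces, fn %{box: {_, _, w, h}} -> w * h end)

  # Expand an {x, y, width, height} box by `padding` times its width on the
  # left and right and `padding` times its height on the top and bottom
  # (an assumed reading of :padding), then clip to the image bounds.
  def padded_box({x, y, w, h}, padding, {image_w, image_h}) do
    pad_x = round(w * padding)
    pad_y = round(h * padding)
    left = max(x - pad_x, 0)
    top = max(y - pad_y, 0)
    right = min(x + w + pad_x, image_w)
    bottom = min(y + h + pad_y, image_h)
    {left, top, right - left, bottom - top}
  end
end

faces = [%{box: {412, 88, 96, 124}}, %{box: {612, 102, 84, 110}}]

CropSketch.largest(faces).box
# {412, 88, 96, 124}

CropSketch.padded_box({412, 88, 96, 124}, 0.2, {1024, 768})
# {393, 63, 134, 174}

# Near an edge the padded box is clipped rather than shifted.
CropSketch.padded_box({10, 10, 100, 100}, 0.5, {120, 200})
# {0, 0, 120, 160}
```
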
| 76 | + |
| 77 | +## Default model |
| 78 | + |
| 79 | +[YuNet](https://github.com/opencv/opencv_zoo/tree/main/models/face_detection_yunet) (`opencv/face_detection_yunet`) — the OpenCV team's production face detector. Roughly **340 KB on disk**, MIT licensed, real-time on CPU. The 2023-March export produces decoded boxes, keypoints, and scores directly. |
| 80 | + |
| 81 | +Model weights are downloaded on first call and cached. Configure the cache directory with: |
| 82 | + |
| 83 | +```elixir |
| 84 | +config :image_vision, :cache_dir, "/path/to/cache" |
| 85 | +``` |
| 86 | + |
| 87 | +## Using a different model |
| 88 | + |
| 89 | +`detect/2` accepts `:repo` and `:model_file` to swap in a different YuNet ONNX export: |
| 90 | + |
| 91 | +```elixir |
| 92 | +iex> Image.FaceDetection.detect(image, |
| 93 | +...> repo: "opencv/face_detection_yunet", |
| 94 | +...> model_file: "face_detection_yunet_2023mar.onnx" |
| 95 | +...> ) |
| 96 | +``` |
| 97 | + |
| 98 | +### Caveat: post-processor is YuNet 2023-March specific |
| 99 | + |
| 100 | +The output decoder assumes YuNet's 2023-March 12-tensor convention (`cls_*`, `obj_*`, `bbox_*`, `kps_*` at strides 8/16/32, fixed 640×640 input). `SCRFD`, `BlazeFace`, and other face-detector exports produce different output shapes and need a different post-processor — they will not work as a drop-in replacement. |
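
The numbers in that convention follow from simple arithmetic, sketched below. The variable names are illustrative, and the snippet only counts tensors and feature-map cells; it does not decode any model output.

```elixir
input_size = 640
strides = [8, 16, 32]
kinds = [:cls, :obj, :bbox, :kps]

# One output tensor per (kind, stride) pair: 4 kinds * 3 strides = 12.
pairs = for kind <- kinds, stride <- strides, do: {kind, stride}
length(pairs)  # 12

# Feature-map cells per stride for a 640x640 input.
cells = Map.new(strides, fn s -> {s, div(input_size, s) * div(input_size, s)} end)
# %{8 => 6400, 16 => 1600, 32 => 400}
```

A model with a different input size, a different set of strides, or a different grouping of outputs breaks these assumptions, which is why other detectors need their own post-processor.
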
| 101 | + |
| 102 | +## Dependencies |
| 103 | + |
| 104 | +Face detection requires `:ortex`. Add it to your dependencies in `mix.exs`: |
| 105 | + |
| 106 | +```elixir |
| 107 | +{:ortex, "~> 0.1"} |
| 108 | +``` |