Commit 1412acc

Loosen default face detect crop
1 parent 7f12d1a commit 1412acc

6 files changed

Lines changed: 95 additions & 26 deletions

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,5 +1,17 @@
 # Changelog
 
+## [0.4.0] 2026-05-23
+
+### Updated
+
+* Loosen the default crop in face detection.
+
+## [0.3.0] 2026-05-21
+
+### Updated
+
+* Update `nx` and `exla` to `~> 0.12`.
+
 ## [0.2.0] 2026-05-02
 
 ### Added

lib/face_detection.ex

Lines changed: 21 additions & 2 deletions
@@ -126,6 +126,17 @@ if ImageVision.ortex_configured?() do
 repo = Keyword.get(options, :repo, @default_repo)
 model_file = Keyword.get(options, :model_file, @default_model_file)
 
+# `vips_thumbnail` (used inside `preprocess/1`) auto-applies
+# the EXIF `Orientation` tag, so the model receives the
+# upright image. The rest of this function operates on the
+# same upright frame so `scale_x` / `scale_y` and the box-
+# clamping `max_width` / `max_height` all agree. Without
+# this step, iPhone photos (stored as landscape pixels +
+# `Orientation = 6/8`) detect the face correctly but report
+# box coordinates that, when applied to the un-rotated
+# buffer, land elsewhere in the frame.
+image = Image.autorotate!(image)
+
 model = load_model(repo, model_file)
 
 {tensor, scale_x, scale_y} = preprocess(image)
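The frame mismatch described in the comment above can be illustrated with a small sketch. It assumes EXIF `Orientation = 6` (stored pixels must be rotated 90 degrees clockwise for display) and uses a hypothetical helper, `OrientationSketch.upright_to_stored_o6/2`, that is not part of this library. It shows where a box reported in the upright frame actually sits in the un-rotated buffer.

```elixir
defmodule OrientationSketch do
  # Hypothetical helper for illustration only; not part of image_vision.
  #
  # Maps a box {x, y, w, h} given in the upright (autorotated) frame back
  # to the stored buffer, for EXIF Orientation = 6. Under a 90-degree
  # clockwise rotation, stored pixel (xs, ys) is shown at upright position
  # (stored_height - 1 - ys, xs), so the inverse swaps the axes and flips
  # one of them.
  def upright_to_stored_o6({x, y, w, h}, stored_height) do
    {y, stored_height - x - w, h, w}
  end
end

# A 4032x3024 stored buffer displays as a 3024x4032 upright image.
upright_box = {1000, 500, 800, 900}
stored_box = OrientationSketch.upright_to_stored_o6(upright_box, 3024)

IO.inspect(stored_box)
# {500, 1224, 900, 800}
```

Applying `{1000, 500, 800, 900}` directly to the stored buffer would select a different region than `{500, 1224, 900, 800}`, which is the kind of offset the commit avoids by autorotating first.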
@@ -190,7 +201,9 @@ if ImageVision.ortex_configured?() do
 * `:padding` is a float in `[0.0, 5.0]` controlling how
 much room is kept around the face. `0.0` is a tight crop
 to the bounding box; `0.5` adds 50% on each side; `1.0`
-doubles the bounding box. Default `0.2`.
+doubles the bounding box. Default `0.5` — matches the
+Cloudflare Images `face-zoom=0.5` default and tends to
+include shoulders for portrait-style photos.
 
 ### Returns
 
@@ -203,7 +216,13 @@ if ImageVision.ortex_configured?() do
 @spec crop_largest(image :: Vimage.t(), options :: Keyword.t()) ::
 {:ok, Vimage.t()} | {:error, :no_face_detected}
 def crop_largest(%Vimage{} = image, options \\ []) do
-padding = Keyword.get(options, :padding, 0.2)
+padding = Keyword.get(options, :padding, 0.5)
+
+# Detection runs in the EXIF-rotated frame (see comment in
+# `detect/2`). The crop has to run in the same frame, so we
+# autorotate here too. Cheap no-op when the image has no
+# orientation tag or `Orientation = 1`.
+image = Image.autorotate!(image)
 faces = detect(image, options)
 
 case largest_face(faces) do
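To make the effect of the new default concrete, here is a rough sketch of how a padding value might expand a face box. It assumes the "adds 50% on each side" reading of the docstring and clamps to the image bounds. `PaddingSketch.pad_box/3` is a hypothetical helper, and the library's actual arithmetic may differ.

```elixir
defmodule PaddingSketch do
  # Hypothetical helper for illustration only; not the library's code.
  #
  # Expands {x, y, w, h} by `padding * w` on the left and right and by
  # `padding * h` on the top and bottom, then clamps the result to an
  # image of size {max_width, max_height}.
  def pad_box({x, y, w, h}, padding, {max_width, max_height}) do
    dx = round(w * padding)
    dy = round(h * padding)

    left = max(x - dx, 0)
    top = max(y - dy, 0)
    right = min(x + w + dx, max_width)
    bottom = min(y + h + dy, max_height)

    {left, top, right - left, bottom - top}
  end
end

face = {100, 100, 200, 200}

IO.inspect(PaddingSketch.pad_box(face, 0.2, {1000, 1000}))
# {60, 60, 280, 280}

IO.inspect(PaddingSketch.pad_box(face, 0.5, {1000, 1000}))
# {0, 0, 400, 400}
```

Under this assumption, moving the default from `0.2` to `0.5` grows the crop from 280 to 400 pixels per side for a 200-pixel face box, as long as the image has room for it.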

mix.exs

Lines changed: 4 additions & 4 deletions
@@ -1,7 +1,7 @@
 defmodule ImageVision.MixProject do
 use Mix.Project
 
-@version "0.2.0"
+@version "0.4.0"
 @app_name "image_vision"
 
 def project do
@@ -63,11 +63,11 @@ defmodule ImageVision.MixProject do
 {:ortex, "~> 0.1", optional: true},
 #
 # Classification and embedding use Bumblebee servings.
-{:bumblebee, "~> 0.6", optional: true},
+{:bumblebee, "~> 0.7", optional: true},
 #
 # Nx and EXLA are required for inference.
-{:nx, "~> 0.10.0"},
-{:exla, "~> 0.10"},
+{:nx, "~> 0.12"},
+{:exla, "~> 0.12"},
 
 # --- Tooling ---
 {:ex_doc, "~> 0.18", only: [:release, :dev, :docs]},
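The requirement changes above differ in how many version segments they pin, which affects what each one allows. The following sketch checks a few sample version strings (chosen for illustration only) against the old and new requirements using the standard library's `Version.match?/2`.

```elixir
# `~> 0.12` allows any version >= 0.12.0 and < 1.0.0.
IO.inspect(Version.match?("0.12.1", "~> 0.12"))
# true
IO.inspect(Version.match?("0.13.0", "~> 0.12"))
# true
IO.inspect(Version.match?("1.0.0", "~> 0.12"))
# false

# `~> 0.10.0` (the old `nx` requirement) allows only >= 0.10.0 and < 0.11.0,
# so it could never resolve to a 0.12 release.
IO.inspect(Version.match?("0.10.2", "~> 0.10.0"))
# true
IO.inspect(Version.match?("0.12.0", "~> 0.10.0"))
# false
```

The old `nx` pin was therefore stricter than the old `exla` pin (`~> 0.10`); after this change both use the same two-segment form.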
