Commit f217246

docs: update inference time documentation
1 parent 6556b74 commit f217246

1 file changed

Lines changed: 92 additions & 42 deletions

docs/docs/02-benchmarks/inference-time.md

@@ -3,46 +3,84 @@ title: Inference Time
 ---

 :::warning
-Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization.
+Times presented in the tables are measured as consecutive runs of the model.
+Initial run times may be up to 2x longer due to model loading and
+initialization.
 :::

 ## Classification

 :::info
-Times presented below are _model inference times only_ and do not include time taken for pre-processing (e.g. image resizing, normalization) or post-processing (e.g. image resizing) which are dependent on input size.
+Inference times are measured directly from native C++ code, wrapping only the
+model's forward pass, excluding input-dependent pre- and post-processing (e.g.
+image resizing, normalization) and any overhead from React Native runtime.
 :::

-| Model | iPhone 17 Pro (Core ML) [ms] | Google Pixel 10 (XNNPACK) [ms] |
-| --------------------------- | :--------------------------: | :----------------------------: |
-| EFFICIENTNET_V2_S | 12 | 100 |
-| EFFICIENTNET_V2_S_QUANTIZED | 5 | 38 |
+:::info
+For this model all input images, whether larger or smaller, are resized before
+processing. Resizing is typically fast for small images but may be noticeably
+slower for very large images, which can increase total time.
+:::
+
+| Model / Device | iPhone 17 Pro [ms] | Google Pixel 10 [ms] |
+| :------------------------------- | :----------------: | :------------------: |
+| EFFICIENTNET_V2_S (XNNPACK FP32) | 70 | 100 |
+| EFFICIENTNET_V2_S (XNNPACK INT8) | 22 | 38 |
+| EFFICIENTNET_V2_S (Core ML FP32) | 12 | - |
+| EFFICIENTNET_V2_S (Core ML FP16) | 5 | - |

 ## Object Detection

 :::info
-Times presented below are _model inference times only_ and do not include time taken for pre-processing (e.g. image resizing, normalization) or post-processing (e.g. image resizing) which are dependent on input size.
+Inference times are measured directly from native C++ code, wrapping only the
+model's forward pass, excluding input-dependent pre- and post-processing (e.g.
+image resizing, normalization) and any overhead from React Native runtime.
+:::
+
+:::info
+For this model all input images, whether larger or smaller, are resized before
+processing. Resizing is typically fast for small images but may be noticeably
+slower for very large images, which can increase total time.
 :::

-| Model | iPhone 17 Pro (Core ML) [ms] | Google Pixel 10 (XNNPACK) [ms] |
-| ------------------------------ | :--------------------------: | :----------------------------: |
-| SSDLITE_320_MOBILENET_V3_LARGE | 8 | 18 |
+| Model / Device | iPhone 17 Pro [ms] | Google Pixel 10 [ms] |
+| :-------------------------------------------- | :----------------: | :------------------: |
+| SSDLITE_320_MOBILENET_V3_LARGE (XNNPACK FP32) | 20 | 18 |
+| SSDLITE_320_MOBILENET_V3_LARGE (Core ML FP32) | 18 | - |
+| SSDLITE_320_MOBILENET_V3_LARGE (Core ML FP16) | 8 | - |

 ## Style Transfer

 :::info
-Times presented below are _model inference times only_ and do not include time taken for pre-processing (e.g. image resizing, normalization) or post-processing (e.g. image resizing) which are dependent on input size.
+Inference times are measured directly from native C++ code, wrapping only the
+model's forward pass, excluding input-dependent pre- and post-processing (e.g.
+image resizing, normalization) and any overhead from React Native runtime.
 :::

-| Model | iPhone 17 Pro (Core ML) [ms] | Google Pixel 10 (XNNPACK) [ms] |
-| -------------------------------------- | :--------------------------: | :----------------------------: |
-| STYLE_TRANSFER_CANDY | 100 | 1025 |
-| STYLE_TRANSFER_MOSAIC | 100 | 1025 |
-| STYLE_TRANSFER_UDNIE | 100 | 1025 |
-| STYLE_TRANSFER_RAIN_PRINCESS | 100 | 1025 |
-| STYLE_TRANSFER_CANDY_QUANTIZED | 150 | 430 |
-| STYLE_TRANSFER_MOSAIC_QUANTIZED | 150 | 430 |
-| STYLE_TRANSFER_UDNIE_QUANTIZED | 150 | 430 |
-| STYLE_TRANSFER_RAIN_PRINCESS_QUANTIZED | 150 | 430 |
+:::info
+For this model all input images, whether larger or smaller, are resized before
+processing. Resizing is typically fast for small images but may be noticeably
+slower for very large images, which can increase total time.
+:::
+
+| Model / Device | iPhone 17 Pro [ms] | Google Pixel 10 [ms] |
+| :------------------------------------------ | :----------------: | :------------------: |
+| STYLE_TRANSFER_CANDY (XNNPACK FP32) | 1192 | 1025 |
+| STYLE_TRANSFER_CANDY (XNNPACK INT8) | 272 | 430 |
+| STYLE_TRANSFER_CANDY (Core ML FP32) | 100 | - |
+| STYLE_TRANSFER_CANDY (Core ML FP16) | 150 | - |
+| STYLE_TRANSFER_MOSAIC (XNNPACK FP32) | 1192 | 1025 |
+| STYLE_TRANSFER_MOSAIC (XNNPACK INT8) | 272 | 430 |
+| STYLE_TRANSFER_MOSAIC (Core ML FP32) | 100 | - |
+| STYLE_TRANSFER_MOSAIC (Core ML FP16) | 150 | - |
+| STYLE_TRANSFER_UDNIE (XNNPACK FP32) | 1192 | 1025 |
+| STYLE_TRANSFER_UDNIE (XNNPACK INT8) | 272 | 430 |
+| STYLE_TRANSFER_UDNIE (Core ML FP32) | 100 | - |
+| STYLE_TRANSFER_UDNIE (Core ML FP16) | 150 | - |
+| STYLE_TRANSFER_RAIN_PRINCESS (XNNPACK FP32) | 1192 | 1025 |
+| STYLE_TRANSFER_RAIN_PRINCESS (XNNPACK INT8) | 272 | 430 |
+| STYLE_TRANSFER_RAIN_PRINCESS (Core ML FP32) | 100 | - |
+| STYLE_TRANSFER_RAIN_PRINCESS (Core ML FP16) | 150 | - |

 ## OCR

@@ -127,38 +165,50 @@ Benchmark times for text embeddings are highly dependent on the sentence length.
 ## Image Embeddings

 :::info
-Times presented below are _model inference times only_ and do not include time taken for pre-processing (e.g. image resizing, normalization) or post-processing (e.g. image resizing) which are dependent on input size.
+Inference times are measured directly from native C++ code, wrapping only the
+model's forward pass, excluding input-dependent pre- and post-processing (e.g.
+image resizing, normalization) and any overhead from React Native runtime.
 :::

 :::info
-Image embedding benchmark times are measured using 224×224 pixel images, as required by the model. All input images, whether larger or smaller, are resized to 224×224 before processing. Resizing is typically fast for small images but may be noticeably slower for very large images, which can increase total time.
+For this model all input images, whether larger or smaller, are resized before
+processing. Resizing is typically fast for small images but may be noticeably
+slower for very large images, which can increase total time.
 :::

-| Model | iPhone 17 Pro (XNNPACK) [ms] | Google Pixel 10 (XNNPACK) [ms] |
-| ------------------------------------- | :--------------------------: | :----------------------------: |
-| CLIP_VIT_BASE_PATCH32_IMAGE | 14 | 68 |
-| CLIP_VIT_BASE_PATCH32_IMAGE_QUANTIZED | 11 | 31 |
+| Model / Device | iPhone 17 Pro [ms] | Google Pixel 10 [ms] |
+| :----------------------------------------- | :----------------: | :------------------: |
+| CLIP_VIT_BASE_PATCH32_IMAGE (XNNPACK FP32) | 14 | 68 |
+| CLIP_VIT_BASE_PATCH32_IMAGE (XNNPACK INT8) | 11 | 31 |

 ## Semantic Segmentation

 :::info
-Times presented below are _model inference times only_ and do not include time taken for pre-processing (e.g. image resizing, normalization) or post-processing (e.g. image resizing) which are dependent on input size.
+Inference times are measured directly from native C++ code, wrapping only the
+model's forward pass, excluding input-dependent pre- and post-processing (e.g.
+image resizing, normalization) and any overhead from React Native runtime.
+:::
+
+:::info
+For this model all input images, whether larger or smaller, are resized before
+processing. Resizing is typically fast for small images but may be noticeably
+slower for very large images, which can increase total time.
 :::

-| Model | iPhone 17 Pro (XNNPACK) [ms] | Google Pixel 10 (XNNPACK) [ms] |
-| --------------------------------------- | :--------------------------: | :----------------------------: |
-| DEEPLAB_V3_RESNET50 | 2000 | 2200 |
-| DEEPLAB_V3_RESNET50_QUANTIZED | 118 | 380 |
-| DEEPLAB_V3_RESNET101 | 2900 | 3300 |
-| DEEPLAB_V3_RESNET101_QUANTIZED | 174 | 660 |
-| DEEPLAB_V3_MOBILENET_V3_LARGE | 131 | 153 |
-| DEEPLAB_V3_MOBILENET_V3_LARGE_QUANTIZED | 17 | 40 |
-| LRASPP_MOBILENET_V3_LARGE | 13 | 36 |
-| LRASPP_MOBILENET_V3_LARGE_QUANTIZED | 12 | 20 |
-| FCN_RESNET50 | 1800 | 2160 |
-| FCN_RESNET50_QUANTIZED | 100 | 320 |
-| FCN_RESNET101 | 2600 | 3160 |
-| FCN_RESNET101_QUANTIZED | 160 | 620 |
+| Model / Device | iPhone 17 Pro [ms] | Google Pixel 10 [ms] |
+| :------------------------------------------- | :----------------: | :------------------: |
+| DEEPLAB_V3_RESNET50 (XNNPACK FP32) | 2000 | 2200 |
+| DEEPLAB_V3_RESNET50 (XNNPACK INT8) | 118 | 380 |
+| DEEPLAB_V3_RESNET101 (XNNPACK FP32) | 2900 | 3300 |
+| DEEPLAB_V3_RESNET101 (XNNPACK INT8) | 174 | 660 |
+| DEEPLAB_V3_MOBILENET_V3_LARGE (XNNPACK FP32) | 131 | 153 |
+| DEEPLAB_V3_MOBILENET_V3_LARGE (XNNPACK INT8) | 17 | 40 |
+| LRASPP_MOBILENET_V3_LARGE (XNNPACK FP32) | 13 | 36 |
+| LRASPP_MOBILENET_V3_LARGE (XNNPACK INT8) | 12 | 20 |
+| FCN_RESNET50 (XNNPACK FP32) | 1800 | 2160 |
+| FCN_RESNET50 (XNNPACK INT8) | 100 | 320 |
+| FCN_RESNET101 (XNNPACK FP32) | 2600 | 3160 |
+| FCN_RESNET101 (XNNPACK INT8) | 160 | 620 |

 ## Text to image

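The updated notes in this diff describe timing only the model's forward pass from native C++, and reporting consecutive runs separately from the slower initial run. A minimal sketch of that idea is shown below. It is not the project's actual benchmarking code: `BenchmarkResult`, `time_once_ms`, `benchmark_forward`, and the `forward` callable are hypothetical names introduced here for illustration, and the real harness may differ.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <stdexcept>

// Hypothetical result type: the first (cold) run and the mean of later runs.
struct BenchmarkResult {
  double first_run_ms;
  double mean_warm_ms;
};

// Times a single call of `fn` in milliseconds using a monotonic clock.
inline double time_once_ms(const std::function<void()>& fn) {
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}

// `forward` stands in for the model's forward pass only (assumption):
// pre-processing, post-processing, and any runtime bridging are expected
// to happen outside the callable so they are not included in the timing.
inline BenchmarkResult benchmark_forward(const std::function<void()>& forward,
                                         std::size_t warm_runs) {
  if (warm_runs == 0) {
    throw std::invalid_argument("warm_runs must be positive");
  }
  BenchmarkResult result{};
  // The first run may include model loading and initialization,
  // so it is recorded separately and excluded from the mean.
  result.first_run_ms = time_once_ms(forward);
  double total_ms = 0.0;
  for (std::size_t i = 0; i < warm_runs; ++i) {
    total_ms += time_once_ms(forward);
  }
  result.mean_warm_ms = total_ms / static_cast<double>(warm_runs);
  return result;
}
```

Under these assumptions, a caller would pass a lambda that invokes the already-prepared model on an already-resized input tensor, then compare `mean_warm_ms` with the table values and `first_run_ms` with the "up to 2x longer" warning.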