Commit 12b2500

chmjkb, msluszniak, and NorbertKlockiewicz authored
feat!: (object detection) RF-detr support, generic model support (#826)
## Description

<!-- Provide a concise and descriptive summary of the changes implemented in this PR. -->

### Introduces a breaking change?

- [x] Yes
- [ ] No

### Type of change

- [ ] Bug fix (change which fixes an issue)
- [x] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing documentation)
- [ ] Other (chores, tests, code style improvements etc.)

### Tested on

- [x] iOS
- [ ] Android

### Testing instructions

Run demo app with RF-DETR model

### Screenshots

<!-- Add screenshots here, if applicable -->

### Related issues

<!-- Link related issues here using #issue-number -->

### Checklist

- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings

### Additional notes

<!-- Include any additional information, assumptions, or context that reviewers might need to understand this PR. -->

---------

Co-authored-by: Mateusz Słuszniak <mateusz.sluszniak@swmansion.com>
Co-authored-by: Mateusz Sluszniak <56299341+msluszniak@users.noreply.github.com>
Co-authored-by: Norbert Klockiewicz <Nklockiewicz12@gmail.com>
1 parent 10e0a80 commit 12b2500

79 files changed: +1268 -753 lines changed


.cspell-wordlist.txt

Lines changed: 5 additions & 0 deletions
@@ -122,3 +122,8 @@ worklet
 worklets
 BGRA
 RGBA
+DETR
+detr
+metaprogramming
+ktlint
+lefthook

apps/computer-vision/app/object_detection/index.tsx

Lines changed: 8 additions & 8 deletions
@@ -4,7 +4,7 @@ import { getImage } from '../../utils';
 import {
   Detection,
   useObjectDetection,
-  SSDLITE_320_MOBILENET_V3_LARGE,
+  RF_DETR_NANO,
 } from 'react-native-executorch';
 import { View, StyleSheet, Image } from 'react-native';
 import ImageWithBboxes from '../../components/ImageWithBboxes';
@@ -20,11 +20,11 @@ export default function ObjectDetectionScreen() {
     height: number;
   }>();

-  const ssdLite = useObjectDetection({ model: SSDLITE_320_MOBILENET_V3_LARGE });
+  const rfDetr = useObjectDetection({ model: RF_DETR_NANO });
   const { setGlobalGenerating } = useContext(GeneratingContext);
   useEffect(() => {
-    setGlobalGenerating(ssdLite.isGenerating);
-  }, [ssdLite.isGenerating, setGlobalGenerating]);
+    setGlobalGenerating(rfDetr.isGenerating);
+  }, [rfDetr.isGenerating, setGlobalGenerating]);

   const handleCameraPress = async (isCamera: boolean) => {
     const image = await getImage(isCamera);
@@ -42,19 +42,19 @@ export default function ObjectDetectionScreen() {
   const runForward = async () => {
     if (imageUri) {
       try {
-        const output = await ssdLite.forward(imageUri);
+        const output = await rfDetr.forward(imageUri);
         setResults(output);
       } catch (e) {
         console.error(e);
       }
     }
   };

-  if (!ssdLite.isReady) {
+  if (!rfDetr.isReady) {
     return (
       <Spinner
-        visible={!ssdLite.isReady}
-        textContent={`Loading the model ${(ssdLite.downloadProgress * 100).toFixed(0)} %`}
+        visible={!rfDetr.isReady}
+        textContent={`Loading the model ${(rfDetr.downloadProgress * 100).toFixed(0)} %`}
       />
     );
   }
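
The screen appears to keep the selected image's dimensions next to the detection results, and the documentation changed in this commit describes `bbox` coordinates as being in the original image's pixel space. A minimal sketch of how such boxes could be mapped onto a rendered view follows. It assumes the image is stretched to fill the view exactly; `Size` and `scaleBbox` are hypothetical names used for illustration and are not part of `react-native-executorch` or the demo app's `ImageWithBboxes` component.

```typescript
// Local mirror of the four bbox fields named in the docs.
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// Hypothetical type: a width/height pair in pixels.
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: maps a box from image pixel space to view space,
// assuming the image is stretched to fill the view (no letterboxing).
function scaleBbox(bbox: Bbox, image: Size, view: Size): Bbox {
  if (image.width <= 0 || image.height <= 0) {
    throw new RangeError('image dimensions must be positive');
  }
  const scaleX = view.width / image.width;
  const scaleY = view.height / image.height;
  return {
    x1: bbox.x1 * scaleX,
    y1: bbox.y1 * scaleY,
    x2: bbox.x2 * scaleX,
    y2: bbox.y2 * scaleY,
  };
}

const scaled = scaleBbox(
  { x1: 100, y1: 50, x2: 300, y2: 250 },
  { width: 1000, height: 500 },
  { width: 500, height: 250 }
);
console.log(scaled); // { x1: 50, y1: 25, x2: 150, y2: 125 }
```

If the image is rendered with its aspect ratio preserved inside a differently shaped view, a single uniform scale plus an offset would be needed instead of independent X and Y scales.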

docs/docs/03-hooks/02-computer-vision/useObjectDetection.md

Lines changed: 53 additions & 46 deletions
@@ -2,8 +2,7 @@
 title: useObjectDetection
 ---

-Object detection is a computer vision technique that identifies and locates objects within images or video. It’s commonly used in applications like image recognition, video surveillance or autonomous driving.
-`useObjectDetection` is a hook that allows you to seamlessly integrate object detection into your React Native applications.
+Object detection is a computer vision technique that identifies and locates objects within images. Unlike image classification, which assigns a single label to the whole image, object detection returns a list of detected objects — each with a bounding box, a class label, and a confidence score. React Native ExecuTorch offers a dedicated hook `useObjectDetection` for this task.

 :::warning
 It is recommended to use models provided by us, which are available at our [Hugging Face repository](https://huggingface.co/collections/software-mansion/object-detection-68d0ea936cd0906843cbba7d). You can also use [constants](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/constants/modelUrls.ts) shipped with our library.
@@ -16,32 +15,37 @@ It is recommended to use models provided by us, which are available at our [Hugg

 ## High Level Overview

-```tsx
+```typescript
 import {
   useObjectDetection,
   SSDLITE_320_MOBILENET_V3_LARGE,
 } from 'react-native-executorch';

-function App() {
-  const ssdlite = useObjectDetection({ model: SSDLITE_320_MOBILENET_V3_LARGE });
+const model = useObjectDetection({
+  model: SSDLITE_320_MOBILENET_V3_LARGE,
+});

-  // ...
-  for (const detection of await ssdlite.forward('https://url-to-image.jpg')) {
-    console.log('Bounding box: ', detection.bbox);
-    console.log('Bounding label: ', detection.label);
-    console.log('Bounding score: ', detection.score);
-  }
-  // ...
+const imageUri = 'file:///Users/.../photo.jpg';
+
+try {
+  const detections = await model.forward(imageUri);
+  // detections is an array of Detection objects
+} catch (error) {
+  console.error(error);
 }
 ```

 ### Arguments

 `useObjectDetection` takes [`ObjectDetectionProps`](../../06-api-reference/interfaces/ObjectDetectionProps.md) that consists of:

-- `model` containing [`modelSource`](../../06-api-reference/interfaces/ObjectDetectionProps.md#modelsource).
+- `model` - An object containing:
+  - `modelName` - The name of a built-in model. See [`ObjectDetectionModelSources`](../../06-api-reference/interfaces/ObjectDetectionProps.md) for the list of supported models.
+  - `modelSource` - The location of the model binary (a URL or a bundled resource).
 - An optional flag [`preventLoad`](../../06-api-reference/interfaces/ObjectDetectionProps.md#preventload) which prevents auto-loading of the model.

+The hook is generic over the model config — TypeScript automatically infers the correct label type based on the `modelName` you provide. No explicit generic parameter is needed.
+
 You need more details? Check the following resources:

 - For detailed information about `useObjectDetection` arguments check this section: [`useObjectDetection` arguments](../../06-api-reference/functions/useObjectDetection.md#parameters).
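
The added paragraph in this hunk says the hook infers the label type from `modelName`. One way such inference can work in TypeScript is a lookup from a literal model name to its label map. The sketch below is illustrative only: `LABEL_MAPS`, the model names, `ModelConfig`, and `labelFor` are all hypothetical and are not taken from the library's actual types.

```typescript
// Hypothetical label maps keyed by model name; real label sets are much longer.
const LABEL_MAPS = {
  'model-a': { 0: 'PERSON', 1: 'BICYCLE' },
  'model-b': { 0: 'CAT', 1: 'DOG' },
} as const;

type ModelName = keyof typeof LABEL_MAPS;

// Union of the label strings that belong to one model.
type LabelOf<N extends ModelName> =
  (typeof LABEL_MAPS)[N][keyof (typeof LABEL_MAPS)[N]];

// Hypothetical config shape with the two fields named in the docs.
interface ModelConfig<N extends ModelName> {
  modelName: N;
  modelSource: string;
}

// N is inferred from the literal `modelName`, so the return type narrows
// to that model's labels without an explicit generic argument.
function labelFor<N extends ModelName>(
  config: ModelConfig<N>,
  classIndex: number
): LabelOf<N> | undefined {
  const map = LABEL_MAPS[config.modelName] as unknown as Record<
    number,
    LabelOf<N>
  >;
  return map[classIndex];
}

// `label` is typed as 'PERSON' | 'BICYCLE' | undefined.
const label = labelFor({ modelName: 'model-a', modelSource: 'model-a.pte' }, 1);
console.log(label); // prints "BICYCLE"
```

The point of the pattern is that callers never write `labelFor<'model-a'>(...)`; the compiler picks `N` from the object literal, which matches the "no explicit generic parameter" behaviour the docs describe.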
@@ -50,54 +54,56 @@ You need more details? Check the following resources:

 ### Returns

-`useObjectDetection` returns an object called `ObjectDetectionType` containing bunch of functions to interact with object detection models. To get more details please read: [`ObjectDetectionType` API Reference](../../06-api-reference/interfaces/ObjectDetectionType.md).
+`useObjectDetection` returns an [`ObjectDetectionType`](../../06-api-reference/interfaces/ObjectDetectionType.md) object containing:

-## Running the model
-
-To run the model, you can use the [`forward`](../../06-api-reference/interfaces/ObjectDetectionType.md#forward) method. It accepts one argument, which is the image. The image can be a remote URL, a local file URI, or a base64-encoded image (whole URI or only raw base64). The function returns an array of [`Detection`](../../06-api-reference/interfaces/Detection.md) objects. Each object contains coordinates of the bounding box, the label of the detected object, and the confidence score. For more information, please refer to the reference or type definitions.
+- `isReady` - Whether the model is loaded and ready to process images.
+- `isGenerating` - Whether the model is currently processing an image.
+- `error` - An error object if the model failed to load or encountered a runtime error.
+- `downloadProgress` - A value between 0 and 1 representing the download progress of the model binary.
+- `forward` - A function to run inference on an image.

-## Detection object
+## Running the model

-The detection object is specified as follows:
+To run the model, use the [`forward`](../../06-api-reference/interfaces/ObjectDetectionType.md#forward) method. It accepts two arguments:

-```typescript
-interface Bbox {
-  x1: number;
-  y1: number;
-  x2: number;
-  y2: number;
-}
+- `imageSource` (required) - The image to process. Can be a remote URL, a local file URI, or a base64-encoded image (whole URI or only raw base64).
+- `detectionThreshold` (optional) - A number between 0 and 1 representing the minimum confidence score for a detection to be included in the results. Defaults to `0.7`.

-interface Detection {
-  bbox: Bbox;
-  label: keyof typeof CocoLabels;
-  score: number;
-}
-```
+`forward` returns a promise resolving to an array of [`Detection`](../../06-api-reference/interfaces/Detection.md) objects, each containing:

-The `bbox` property contains information about the bounding box of detected objects. It is represented as two points: one at the bottom-left corner of the bounding box (`x1`, `y1`) and the other at the top-right corner (`x2`, `y2`).
-The `label` property contains the name of the detected object, which corresponds to one of the [`CocoLabels`](../../06-api-reference/enumerations/CocoLabel.md). The `score` represents the confidence score of the detected object.
+- `bbox` - A [`Bbox`](../../06-api-reference/interfaces/Bbox.md) object with `x1`, `y1` (top-left corner) and `x2`, `y2` (bottom-right corner) coordinates in the original image's pixel space.
+- `label` - The class name of the detected object, typed to the label map of the chosen model.
+- `score` - The confidence score of the detection, between 0 and 1.

 ## Example

-```tsx
-import {
-  useObjectDetection,
-  SSDLITE_320_MOBILENET_V3_LARGE,
-} from 'react-native-executorch';
+```typescript
+import { useObjectDetection, RF_DETR_NANO } from 'react-native-executorch';

 function App() {
-  const ssdlite = useObjectDetection({ model: SSDLITE_320_MOBILENET_V3_LARGE });
+  const model = useObjectDetection({
+    model: RF_DETR_NANO,
+  });
+
+  const handleDetect = async () => {
+    if (!model.isReady) return;
+
+    const imageUri = 'file:///Users/.../photo.jpg';

-  const runModel = async () => {
-    const detections = await ssdlite.forward('https://url-to-image.jpg');
+    try {
+      const detections = await model.forward(imageUri, 0.5);

-    for (const detection of detections) {
-      console.log('Bounding box: ', detection.bbox);
-      console.log('Bounding label: ', detection.label);
-      console.log('Bounding score: ', detection.score);
+      for (const detection of detections) {
+        console.log('Label:', detection.label);
+        console.log('Score:', detection.score);
+        console.log('Bounding box:', detection.bbox);
+      }
+    } catch (error) {
+      console.error(error);
     }
   };
+
+  // ...
 }
 ```

@@ -106,3 +112,4 @@ function App() {
 | Model                                                                                                                         | Number of classes | Class list                                               |
 | ----------------------------------------------------------------------------------------------------------------------------- | ----------------- | -------------------------------------------------------- |
 | [SSDLite320 MobileNetV3 Large](https://huggingface.co/software-mansion/react-native-executorch-ssdlite320-mobilenet-v3-large) | 91                | [COCO](../../06-api-reference/enumerations/CocoLabel.md) |
+| [RF-DETR Nano](https://huggingface.co/software-mansion/react-native-executorch-rf-detr-nano)                                  | 80                | [COCO](../../06-api-reference/enumerations/CocoLabel.md) |
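
The documentation change in this file gives `forward` an optional `detectionThreshold` that defaults to `0.7`. To illustrate what a score threshold does to a result list, here is a standalone sketch that does not call the library. `filterByScore` is a hypothetical helper, the local `Bbox` and `Detection` interfaces only mirror the fields named in the docs, and treating the threshold as inclusive is an assumption rather than documented behaviour.

```typescript
// Local mirrors of the fields named in the docs; not the library's own types.
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

interface Detection<L extends string = string> {
  bbox: Bbox;
  label: L;
  score: number;
}

// Hypothetical helper: keeps detections scoring at or above the threshold
// (inclusive comparison is an assumption) and orders them best-first.
function filterByScore<L extends string>(
  detections: readonly Detection<L>[],
  threshold = 0.7
): Detection<L>[] {
  if (threshold < 0 || threshold > 1) {
    throw new RangeError('threshold must be between 0 and 1');
  }
  // filter() returns a new array, so sorting does not mutate the input.
  return detections
    .filter((d) => d.score >= threshold)
    .sort((a, b) => b.score - a.score);
}

const box: Bbox = { x1: 0, y1: 0, x2: 10, y2: 10 };
const sample: Detection[] = [
  { bbox: box, label: 'PERSON', score: 0.9 },
  { bbox: box, label: 'CAT', score: 0.6 },
  { bbox: box, label: 'DOG', score: 0.75 },
];

console.log(filterByScore(sample).map((d) => d.label)); // [ 'PERSON', 'DOG' ]
console.log(filterByScore(sample, 0.5).length); // 3
```

Lowering the threshold, as the updated example does with `0.5`, trades fewer missed objects for more low-confidence boxes.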
