You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
feat!: (object detection) RF-detr support, generic model support (#826)
## Description
<!-- Provide a concise and descriptive summary of the changes
implemented in this PR. -->
### Introduces a breaking change?
- [x] Yes
- [ ] No
### Type of change
- [ ] Bug fix (change which fixes an issue)
- [x] New feature (change which adds functionality)
- [ ] Documentation update (improves or adds clarity to existing
documentation)
- [ ] Other (chores, tests, code style improvements etc.)
### Tested on
- [x] iOS
- [ ] Android
### Testing instructions
Run demo app with RF-DETR model
### Screenshots
<!-- Add screenshots here, if applicable -->
### Related issues
<!-- Link related issues here using #issue-number -->
### Checklist
- [x] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings
### Additional notes
<!-- Include any additional information, assumptions, or context that
reviewers might need to understand this PR. -->
---------
Co-authored-by: Mateusz Słuszniak <mateusz.sluszniak@swmansion.com>
Co-authored-by: Mateusz Sluszniak <56299341+msluszniak@users.noreply.github.com>
Co-authored-by: Norbert Klockiewicz <Nklockiewicz12@gmail.com>
Copy file name to clipboardExpand all lines: docs/docs/03-hooks/02-computer-vision/useObjectDetection.md
+53-46Lines changed: 53 additions & 46 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -2,8 +2,7 @@
2
2
title: useObjectDetection
3
3
---
4
4
5
-
Object detection is a computer vision technique that identifies and locates objects within images or video. It’s commonly used in applications like image recognition, video surveillance or autonomous driving.
6
-
`useObjectDetection` is a hook that allows you to seamlessly integrate object detection into your React Native applications.
5
+
Object detection is a computer vision technique that identifies and locates objects within images. Unlike image classification, which assigns a single label to the whole image, object detection returns a list of detected objects — each with a bounding box, a class label, and a confidence score. React Native ExecuTorch offers a dedicated hook `useObjectDetection` for this task.
7
6
8
7
:::warning
9
8
It is recommended to use models provided by us, which are available at our [Hugging Face repository](https://huggingface.co/collections/software-mansion/object-detection-68d0ea936cd0906843cbba7d). You can also use [constants](https://github.com/software-mansion/react-native-executorch/blob/main/packages/react-native-executorch/src/constants/modelUrls.ts) shipped with our library.
@@ -16,32 +15,37 @@ It is recommended to use models provided by us, which are available at our [Hugg
-`modelName` - The name of a built-in model. See [`ObjectDetectionModelSources`](../../06-api-reference/interfaces/ObjectDetectionProps.md) for the list of supported models.
44
+
-`modelSource` - The location of the model binary (a URL or a bundled resource).
43
45
- An optional flag [`preventLoad`](../../06-api-reference/interfaces/ObjectDetectionProps.md#preventload) which prevents auto-loading of the model.
44
46
47
+
The hook is generic over the model config — TypeScript automatically infers the correct label type based on the `modelName` you provide. No explicit generic parameter is needed.
48
+
45
49
You need more details? Check the following resources:
46
50
47
51
- For detailed information about `useObjectDetection` arguments check this section: [`useObjectDetection` arguments](../../06-api-reference/functions/useObjectDetection.md#parameters).
@@ -50,54 +54,56 @@ You need more details? Check the following resources:
50
54
51
55
### Returns
52
56
53
-
`useObjectDetection` returns an object called `ObjectDetectionType` containing bunch of functions to interact with object detection models. To get more details please read: [`ObjectDetectionType` API Reference](../../06-api-reference/interfaces/ObjectDetectionType.md).
57
+
`useObjectDetection` returns an [`ObjectDetectionType`](../../06-api-reference/interfaces/ObjectDetectionType.md) object containing:
54
58
55
-
## Running the model
56
-
57
-
To run the model, you can use the [`forward`](../../06-api-reference/interfaces/ObjectDetectionType.md#forward) method. It accepts one argument, which is the image. The image can be a remote URL, a local file URI, or a base64-encoded image (whole URI or only raw base64). The function returns an array of [`Detection`](../../06-api-reference/interfaces/Detection.md) objects. Each object contains coordinates of the bounding box, the label of the detected object, and the confidence score. For more information, please refer to the reference or type definitions.
59
+
-`isReady` - Whether the model is loaded and ready to process images.
60
+
-`isGenerating` - Whether the model is currently processing an image.
61
+
-`error` - An error object if the model failed to load or encountered a runtime error.
62
+
-`downloadProgress` - A value between 0 and 1 representing the download progress of the model binary.
63
+
-`forward` - A function to run inference on an image.
58
64
59
-
## Detection object
65
+
## Running the model
60
66
61
-
The detection object is specified as follows:
67
+
To run the model, use the [`forward`](../../06-api-reference/interfaces/ObjectDetectionType.md#forward) method. It accepts two arguments:
62
68
63
-
```typescript
64
-
interfaceBbox {
65
-
x1:number;
66
-
y1:number;
67
-
x2:number;
68
-
y2:number;
69
-
}
69
+
-`imageSource` (required) - The image to process. Can be a remote URL, a local file URI, or a base64-encoded image (whole URI or only raw base64).
70
+
-`detectionThreshold` (optional) - A number between 0 and 1 representing the minimum confidence score for a detection to be included in the results. Defaults to `0.7`.
70
71
71
-
interfaceDetection {
72
-
bbox:Bbox;
73
-
label:keyoftypeofCocoLabels;
74
-
score:number;
75
-
}
76
-
```
72
+
`forward` returns a promise resolving to an array of [`Detection`](../../06-api-reference/interfaces/Detection.md) objects, each containing:
77
73
78
-
The `bbox` property contains information about the bounding box of detected objects. It is represented as two points: one at the bottom-left corner of the bounding box (`x1`, `y1`) and the other at the top-right corner (`x2`, `y2`).
79
-
The `label` property contains the name of the detected object, which corresponds to one of the [`CocoLabels`](../../06-api-reference/enumerations/CocoLabel.md). The `score` represents the confidence score of the detected object.
74
+
-`bbox` - A [`Bbox`](../../06-api-reference/interfaces/Bbox.md) object with `x1`, `y1` (top-left corner) and `x2`, `y2` (bottom-right corner) coordinates in the original image's pixel space.
75
+
-`label` - The class name of the detected object, typed to the label map of the chosen model.
76
+
-`score` - The confidence score of the detection, between 0 and 1.
0 commit comments