---
title: useObjectDetection
---

Object detection is a computer vision technique that identifies and locates objects within images. Unlike image classification, which assigns a single label to the whole image, object detection returns a list of detected objects — each with a bounding box, a class label, and a confidence score. React Native ExecuTorch offers a dedicated hook useObjectDetection for this task.

:::warning
It is recommended to use models provided by us, which are available at our Hugging Face repository. You can also use constants shipped with our library.
:::

## API Reference

### High Level Overview

```typescript
import {
  useObjectDetection,
  SSDLITE_320_MOBILENET_V3_LARGE,
} from 'react-native-executorch';

const model = useObjectDetection({
  model: SSDLITE_320_MOBILENET_V3_LARGE,
});

const imageUri = 'file:///Users/.../photo.jpg';

try {
  const detections = await model.forward(imageUri);
  // detections is an array of Detection objects
} catch (error) {
  console.error(error);
}
```

## Arguments

`useObjectDetection` accepts an `ObjectDetectionProps` object that consists of:

- `model` - An object containing:
  - `modelName` - The name of a built-in model. See `ObjectDetectionModelSources` for the list of supported models.
  - `modelSource` - The location of the model binary (a URL or a bundled resource).
- `preventLoad` (optional) - A flag that prevents auto-loading of the model.

The hook is generic over the model config: TypeScript automatically infers the correct label type based on the `modelName` you provide. No explicit generic parameter is needed.
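To illustrate what this inference buys you, the sketch below uses a hypothetical, drastically simplified label union and `Detection` shape (the real label maps and types ship with the library); with a narrowed label type, an invalid label is a compile-time error rather than a silent runtime mismatch:

```typescript
// Hypothetical sketch: 'person' | 'car' | 'dog' is an illustrative subset,
// not the library's real label map.
type CocoLabel = 'person' | 'car' | 'dog';

interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// Simplified stand-in for the library's Detection type, generic over labels.
interface Detection<L extends string = string> {
  bbox: Bbox;
  label: L;
  score: number;
}

// With an inferred label type, only known class names type-check:
const det: Detection<CocoLabel> = {
  bbox: { x1: 10, y1: 20, x2: 110, y2: 220 },
  label: 'dog', // a label outside CocoLabel would be a compile-time error
  score: 0.92,
};
```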

Need more details? Check the following resources:

## Returns

`useObjectDetection` returns an `ObjectDetectionType` object containing:

- `isReady` - Whether the model is loaded and ready to process images.
- `isGenerating` - Whether the model is currently processing an image.
- `error` - An error object if the model failed to load or encountered a runtime error.
- `downloadProgress` - A value between 0 and 1 representing the download progress of the model binary.
- `forward` - A function to run inference on an image.
- `getAvailableInputSizes` - A function that returns available input sizes for multi-method models (YOLO). Returns `undefined` for single-method models.
- `runOnFrame` - A synchronous worklet function for real-time VisionCamera frame processing. See VisionCamera Integration for usage.
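These state fields are typically combined when rendering loading and inference status. The helper below is purely illustrative (it is not part of the library); it only assumes the field shapes listed above:

```typescript
// Illustrative: mirror of the hook's state fields listed above.
interface ModelState {
  isReady: boolean;
  isGenerating: boolean;
  error: Error | null;
  downloadProgress: number; // 0..1
}

// Derive a human-readable status string from the hook's state.
function statusLabel(state: ModelState): string {
  if (state.error) return `Error: ${state.error.message}`;
  if (!state.isReady) {
    // Model binary is still downloading or loading.
    return `Downloading... ${Math.round(state.downloadProgress * 100)}%`;
  }
  return state.isGenerating ? 'Detecting...' : 'Ready';
}
```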

## Running the model

To run the model, use the `forward` method. It accepts two arguments:

- `input` (required) - The image to process. Can be a remote URL, a local file URI, a base64-encoded image (a whole data URI or only the raw base64 string), or a `PixelData` object (raw RGB pixel buffer).
- `options` (optional) - An `ObjectDetectionOptions` object with the following properties:
  - `detectionThreshold` (optional) - A number between 0 and 1 representing the minimum confidence score. Defaults to a model-specific value (typically 0.7).
  - `iouThreshold` (optional) - The IoU threshold for non-maximum suppression (0-1). Defaults to a model-specific value (typically 0.55).
  - `inputSize` (optional) - For multi-method models like YOLO, the input resolution (384, 512, or 640). Defaults to 384 for YOLO models.
  - `classesOfInterest` (optional) - An array of class labels to filter detections. Only detections matching these classes will be returned.
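`forward` applies these options natively; the sketch below is only meant to make the math behind them concrete (intersection-over-union, which `iouThreshold` compares against, and the kind of confidence/class filtering that `detectionThreshold` and `classesOfInterest` perform). It is not how the library implements them:

```typescript
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

interface Detection {
  bbox: Bbox;
  label: string;
  score: number;
}

// Intersection-over-Union of two boxes: the metric behind iouThreshold.
// 1 means identical boxes, 0 means no overlap.
function iou(a: Bbox, b: Bbox): number {
  const ix = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
  const iy = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
  const inter = ix * iy;
  const areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
  const areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
  return inter / (areaA + areaB - inter);
}

// Conceptual post-filter matching detectionThreshold and classesOfInterest.
function filterDetections(
  dets: Detection[],
  threshold: number,
  classes?: string[]
): Detection[] {
  return dets.filter(
    (d) => d.score >= threshold && (!classes || classes.includes(d.label))
  );
}
```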

`forward` returns a promise resolving to an array of `Detection` objects, each containing:

- `bbox` - A `Bbox` object with `x1`, `y1` (top-left corner) and `x2`, `y2` (bottom-right corner) coordinates in the original image's pixel space.
- `label` - The class name of the detected object, typed to the label map of the chosen model.
- `score` - The confidence score of the detection, between 0 and 1.
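Because `bbox` is expressed in the original image's pixel space, common post-processing needs no rescaling. A small illustrative sketch (the types here are simplified stand-ins for the library's, and the helpers are hypothetical):

```typescript
interface Bbox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

interface Detection {
  bbox: Bbox;
  label: string;
  score: number;
}

// Width and height of a detection in source-image pixels.
function boxSize(d: Detection): { width: number; height: number } {
  return { width: d.bbox.x2 - d.bbox.x1, height: d.bbox.y2 - d.bbox.y1 };
}

// Highest-confidence detection, or undefined for an empty result.
function bestDetection(dets: Detection[]): Detection | undefined {
  return dets.reduce<Detection | undefined>(
    (best, d) => (best && best.score >= d.score ? best : d),
    undefined
  );
}
```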

## Example

```typescript
import { useObjectDetection, YOLO26N } from 'react-native-executorch';

function App() {
  const model = useObjectDetection({
    model: YOLO26N,
  });

  const handleDetect = async () => {
    if (!model.isReady) return;

    const imageUri = 'file:///Users/.../photo.jpg';

    try {
      const detections = await model.forward(imageUri, {
        detectionThreshold: 0.5,
        inputSize: 640,
      });

      console.log('Detected:', detections.length, 'objects');
    } catch (error) {
      console.error(error);
    }
  };

  // ...
}
```

## VisionCamera integration

See the full guide: VisionCamera Integration.

## Supported models

| Model                        | Number of classes | Class list | Multi-size support  |
| ---------------------------- | ----------------- | ---------- | ------------------- |
| SSDLite320 MobileNetV3 Large | 91                | COCO       | No (fixed: 320×320) |
| RF-DETR Nano                 | 80                | COCO       | No (fixed: 384×384) |
| YOLO26N                      | 80                | COCO YOLO  | Yes (384/512/640)   |
| YOLO26S                      | 80                | COCO YOLO  | Yes (384/512/640)   |
| YOLO26M                      | 80                | COCO YOLO  | Yes (384/512/640)   |
| YOLO26L                      | 80                | COCO YOLO  | Yes (384/512/640)   |
| YOLO26X                      | 80                | COCO YOLO  | Yes (384/512/640)   |

:::tip
YOLO models support multiple input sizes (384px, 512px, 640px). Smaller sizes are faster but less accurate, while larger sizes are more accurate but slower. Choose based on your speed/accuracy requirements.
:::