
react-native-vision-camera-ocr-plus


On-device OCR and text translation for React Native, powered by VisionCamera and Nitro Modules. Uses Google ML Kit under the hood for both text recognition and on-device translation.

Features

  • Live OCR — read text from every camera frame via a VisionCamera frame processor
  • Live translation — recognize and translate camera frame text in real time
  • Photo OCR — recognize text asynchronously from a still image URI
  • Model management — remove downloaded translation language models to free storage
  • Supports Latin, Chinese, Devanagari, Japanese, and Korean scripts
  • Optional scan-region cropping to focus recognition on a sub-area of the frame
  • Configurable frame skipping for performance tuning

Requirements

| Requirement | Minimum version |
| --- | --- |
| React Native | 0.81 |
| iOS | 15.1 |
| Android Minimum SDK | 26 |
| Android Target SDK | 36 |
| react-native-vision-camera | 5.0.0 |
| react-native-worklets | 0.8.x |
| Expo (if used) | 54 |

Migration from v1.x

Upgrading from v1? See the Migration Guide for a full list of breaking changes and step-by-step instructions.

Installation

npm install react-native-vision-camera-ocr-plus react-native-nitro-modules react-native-vision-camera-worklets react-native-worklets

Peer dependencies

| Package | Version |
| --- | --- |
| react-native-vision-camera | >=5.0.0 |
| react-native-nitro-modules | * |
| react-native-vision-camera-worklets | * |
| react-native-worklets | >=0.8.0 |

🔥 Firebase Compatibility

If you have Firebase in your project, you will need to set your iOS Deployment Target to at least 16.0.

⚠️ iOS Simulator (Apple Silicon) – Heads-up

On Apple Silicon Macs, building for the iOS Simulator (arm64) may fail after installing this package.

This is a known limitation of Google ML Kit, which does not currently ship an arm64-simulator slice for some iOS frameworks.
The library works correctly on physical iOS devices and on the iOS Simulator when running under Rosetta.

👉 Full context and discussion

iOS

cd ios && pod install

Android

No additional steps — the library is auto-linked.


Usage

👉 See the example app for a working demo.

<Camera /> — live OCR

A drop-in replacement for VisionCamera's <Camera> that automatically runs OCR on incoming frames (subject to frameSkipThreshold) and passes the recognized text to callback.

import { Camera, type Text } from 'react-native-vision-camera-ocr-plus'
import { useCameraDevice } from 'react-native-vision-camera'

export default function App() {
  const device = useCameraDevice('back')

  return (
    <Camera
      style={{ flex: 1 }}
      device={device}
      isActive
      mode="recognize"
      options={{
        language: 'latin',         // 'latin' | 'chinese' | 'devanagari' | 'japanese' | 'korean'
        frameSkipThreshold: 10,    // process every Nth frame (default: 10)
        useLightweightMode: false, // Android only — skip corner points, languages, element data
        // scanRegion: { left: '10%', top: '20%', width: '80%', height: '30%' },
      }}
      callback={(data) => {
        const text = data as Text
        console.log(text.resultText)
        console.log(text.blocks) // BlockData[]
      }}
    />
  )
}
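
The `frameSkipThreshold` option above is described as "process every Nth frame". As a rough illustration of that idea only, the counter-based sketch below shows one way such skipping can work. `createFrameSkipper` is a hypothetical helper, not part of this library, and the library's internal logic (including which frame in each group is processed) may differ.

```typescript
// Hypothetical helper illustrating "process every Nth frame".
// Not part of react-native-vision-camera-ocr-plus; the real implementation may differ.
function createFrameSkipper(threshold: number): () => boolean {
  let counter = 0
  return () => {
    counter += 1
    if (counter >= threshold) {
      counter = 0
      return true // process this frame
    }
    return false // skip this frame
  }
}

// With a threshold of 3, two frames are skipped and the third is processed.
const shouldProcess = createFrameSkipper(3)
const decisions = [1, 2, 3, 4, 5, 6].map(() => shouldProcess())
// decisions: [false, false, true, false, false, true]
```

A higher threshold lowers CPU load at the cost of less frequent results, which is the trade-off the option exposes.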

<Camera /> — live translation

Recognizes and translates text from every camera frame. The callback receives the translated string.

import { Camera } from 'react-native-vision-camera-ocr-plus'
import { useCameraDevice } from 'react-native-vision-camera'

export default function App() {
  const device = useCameraDevice('back')

  return (
    <Camera
      style={{ flex: 1 }}
      device={device}
      isActive
      mode="translate"
      options={{ from: 'fr', to: 'en' }}
      callback={(data) => console.log(data as string)}
    />
  )
}

Hooks — build your own frame processor

Use useTextRecognition or useTranslate to integrate the plugins into a custom frame processor.

useTextRecognition

Returns a TextRecognitionHandle with a worklet-safe scanText function and the raw recognizer HybridObject.

In VisionCamera v5, pixelFormat is configured via useFrameOutput (not a <Camera> prop). Android requires pixelFormat: 'rgb' so the AHardwareBuffer is in RGBA format and can be CPU-locked; on iOS any format works.

import { useState } from 'react'
import { useTextRecognition, type Text } from 'react-native-vision-camera-ocr-plus'
import { Camera, useFrameOutput, useCameraDevice } from 'react-native-vision-camera'
import { scheduleOnRN } from 'react-native-worklets'

function MyCamera() {
  const device = useCameraDevice('back')
  // Latest recognized text, updated from the frame processor via scheduleOnRN
  const [detectedText, setDetectedText] = useState('')
  const { scanText } = useTextRecognition({ language: 'latin', frameSkipThreshold: 5 })

  const frameOutput = useFrameOutput({
    pixelFormat: 'rgb', // required on Android
    onFrame: (frame) => {
      'worklet'
      const result = scanText(frame)
      if (result.resultText) {
        scheduleOnRN(setDetectedText, result.resultText)
      }
      frame.dispose()
    },
  })

  return <Camera device={device} isActive outputs={[frameOutput]} />
}

> **Tip: scan region.** Pass `scanRegion` in the options to restrict OCR to a portion of the frame.
> The coordinates are percentage strings (`"0%"` to `"100%"`) relative to the display-oriented frame.
> Pair it with a matching `<View>` overlay so the visible box aligns with what is actually scanned:
>
> ```tsx
> const scanRegion = { left: '10%', top: '25%', width: '80%', height: '30%' }
> const { scanText } = useTextRecognition({ language: 'latin', scanRegion })
>
> // Render a matching overlay:
> <View style={{ position: 'absolute', left: '10%', top: '25%', width: '80%', height: '30%',
>                borderWidth: 2, borderColor: 'red' }} />
> ```
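
If the overlay needs absolute pixel values rather than percentage strings (for example, to draw on a canvas), the conversion is simple arithmetic. The sketch below assumes the view's width and height are known. `scanRegionToPixels`, `PixelRect`, and the local `Percentage` definition are illustrative assumptions, not exports of this library.

```typescript
// Local stand-ins for the README's types so the sketch is self-contained.
// The library's own Percentage type may be defined differently.
type Percentage = `${number}%`
type ScanRegion = { left: Percentage; top: Percentage; width: Percentage; height: Percentage }
type PixelRect = { x: number; y: number; width: number; height: number }

// Hypothetical helper: convert a percentage-based region to pixels for a given view size.
function scanRegionToPixels(region: ScanRegion, viewWidth: number, viewHeight: number): PixelRect {
  // parseFloat('10%') yields 10; multiply before dividing to keep round values exact.
  const percentOf = (p: Percentage, size: number) => (parseFloat(p) * size) / 100
  return {
    x: percentOf(region.left, viewWidth),
    y: percentOf(region.top, viewHeight),
    width: percentOf(region.width, viewWidth),
    height: percentOf(region.height, viewHeight),
  }
}

const rect = scanRegionToPixels({ left: '10%', top: '25%', width: '80%', height: '30%' }, 400, 800)
// rect: { x: 40, y: 200, width: 320, height: 240 }
```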

#### `useTranslate`

Returns a `TranslatorHandle` with a worklet-safe `scanText` function for OCR and an async `translate` function for translation.

```tsx
import { useState } from 'react'
import { useTranslate } from 'react-native-vision-camera-ocr-plus'
import { Camera, useFrameOutput, useCameraDevice } from 'react-native-vision-camera'
import { scheduleOnRN } from 'react-native-worklets'

function MyCamera() {
  const device = useCameraDevice('back')
  // Latest translated text, updated from the frame processor via scheduleOnRN
  const [translated, setTranslated] = useState('')
  const { scanText, translate } = useTranslate({ from: 'fr', to: 'en' })
  // To restrict OCR to a region, pass scanRegion:
  // const { scanText, translate } = useTranslate({ from: 'fr', to: 'en', scanRegion: { left: '10%', top: '25%', width: '80%', height: '30%' } })

  const frameOutput = useFrameOutput({
    pixelFormat: 'rgb', // required on Android
    onFrame: (frame) => {
      'worklet'
      const result = scanText(frame)
      if (result.resultText) {
        translate(result.resultText).then((translated) => {
          scheduleOnRN(setTranslated, translated)
        })
      }
      frame.dispose()
    },
  })

  return <Camera device={device} isActive outputs={[frameOutput]} />
}
```

Low-level factories

The hooks call createTextRecognitionPlugin and createTranslatorPlugin internally. You can use them directly outside of React components:

import { createTextRecognitionPlugin } from 'react-native-vision-camera-ocr-plus'

const { scanText, recognizer } = createTextRecognitionPlugin({ language: 'latin' })

PhotoRecognizer — still image OCR

Asynchronously recognizes text in a still photo URI.

import { PhotoRecognizer, type Text } from 'react-native-vision-camera-ocr-plus'

const result: Text = await PhotoRecognizer({
  uri: 'file:///path/to/photo.jpg',
  orientation: 'portrait', // 'portrait' | 'portraitUpsideDown' | 'landscapeLeft' | 'landscapeRight'
})

console.log(result.resultText)
console.log(result.blocks) // BlockData[]

The orientation parameter is optional and defaults to 'portrait'. On iOS the file:// scheme is stripped automatically; on Android it is added if missing.
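
To make the URI handling described above concrete, here is a minimal sketch of that normalization. Callers do not need to do this themselves. `normalizePhotoUri` and `PlatformName` are hypothetical names, and the library's actual handling (for instance of other schemes such as `content://`) is not documented here and may differ.

```typescript
type PlatformName = 'ios' | 'android'

// Hypothetical illustration of the behaviour described in the README:
// iOS strips a leading file:// scheme, Android adds it when missing.
function normalizePhotoUri(uri: string, platform: PlatformName): string {
  const scheme = 'file://'
  const hasScheme = uri.startsWith(scheme)
  if (platform === 'ios') {
    return hasScheme ? uri.slice(scheme.length) : uri
  }
  return hasScheme ? uri : scheme + uri
}

const iosPath = normalizePhotoUri('file:///path/to/photo.jpg', 'ios')
// iosPath: '/path/to/photo.jpg'
const androidUri = normalizePhotoUri('/path/to/photo.jpg', 'android')
// androidUri: 'file:///path/to/photo.jpg'
```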

RemoveLanguageModel — free storage

Removes a downloaded ML Kit translation model from the device. Returns true on success.

import { RemoveLanguageModel } from 'react-native-vision-camera-ocr-plus'

const success: boolean = await RemoveLanguageModel('fr')

API Reference

<Camera />

Accepts all standard VisionCamera CameraProps plus:

| Prop | Type | Description |
| --- | --- | --- |
| mode | `'recognize' \| 'translate'` | Whether to run OCR or OCR + translation |
| options | `TextRecognitionOptions \| TranslatorOptions` | Mode-specific options (see below) |
| callback | `(data: Text \| string) => void` | Called with Text in recognize mode or a translated string in translate mode |
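
Because `callback` receives either a `Text` object or a string depending on `mode`, a small type guard can replace the `as` casts used in the examples. The sketch below uses a trimmed local copy of `Text`. `isText` and `describeResult` are hypothetical helpers, not part of the library.

```typescript
// Trimmed local copy of the README's Text type, enough for this sketch.
type Text = { resultText: string; blocks: unknown[] }

// Hypothetical type guard: translate mode delivers a string, recognize mode a Text object.
function isText(data: Text | string): data is Text {
  return typeof data !== 'string'
}

// Example of a single handler that works for either mode.
function describeResult(data: Text | string): string {
  return isText(data) ? `recognized: ${data.resultText}` : `translated: ${data}`
}

const fromRecognize = describeResult({ resultText: 'Bonjour', blocks: [] })
// fromRecognize: 'recognized: Bonjour'
const fromTranslate = describeResult('Hello')
// fromTranslate: 'translated: Hello'
```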

Functions

| Function | Signature | Description |
| --- | --- | --- |
| PhotoRecognizer | `(options: PhotoOptions) => Promise<Text>` | OCR a still image by URI |
| RemoveLanguageModel | `(code: Languages) => Promise<boolean>` | Delete a downloaded translation model |
| createTextRecognitionPlugin | `(options?: TextRecognitionOptions) => TextRecognitionHandle` | Create a frame-processor OCR plugin |
| createTranslatorPlugin | `(options?: TranslatorOptions) => TranslatorHandle` | Create a frame-processor translation plugin |

Hooks

| Hook | Returns | Description |
| --- | --- | --- |
| useTextRecognition(options?) | TextRecognitionHandle | Memoized OCR plugin |
| useTranslate(options?) | TranslatorHandle | Memoized translation plugin |

Types

type Text = {
  resultText: string
  blocks: BlockData[]
}

type BlockData = {
  blockText: string
  blockFrame: FrameType
  blockCornerPoints?: CornerPointsType
  lines: LineData[]
}

type LineData = {
  lineText: string
  lineFrame: FrameType
  lineCornerPoints?: CornerPointsType
  lineLanguages?: string[]
  elements: ElementData[]
}

type ElementData = {
  elementText: string
  elementFrame: FrameType
  elementCornerPoints?: CornerPointsType
}

type FrameType = {
  boundingCenterX: number
  boundingCenterY: number
  height: number
  width: number
  x: number
  y: number
}

type CornerPointsType = { x: number; y: number }[]

type ScanRegion = {
  left: Percentage   // e.g. "10%"
  top: Percentage
  width: Percentage
  height: Percentage
}

type TextRecognitionOptions = {
  language?: 'latin' | 'chinese' | 'devanagari' | 'japanese' | 'korean'
  scanRegion?: ScanRegion
  frameSkipThreshold?: number   // default: 10
  useLightweightMode?: boolean  // Android only — skips corner points, languages, and element data; default: false
}

type TranslatorOptions = {
  from: Languages
  to: Languages
  scanRegion?: ScanRegion  // restrict OCR to a percentage-based region of the frame
}

type PhotoOptions = {
  uri: string
  orientation?: 'portrait' | 'portraitUpsideDown' | 'landscapeLeft' | 'landscapeRight'
}

type TextRecognitionHandle = {
  scanText: (frame: Frame) => Text  // worklet-safe
  recognizer: TextRecognizer        // raw Nitro HybridObject
}

type TranslatorHandle = {
  scanText: (frame: Frame) => Text        // worklet-safe OCR
  translate: (text: string) => Promise<string>  // JS-thread translation
  recognizer: TextRecognizer
  translator: Translator
  from: string
  to: string
}

Languages is a union of BCP-47 language codes: 'af' | 'sq' | 'ar' | 'be' | 'bn' | 'bg' | 'ca' | 'zh' | 'cs' | 'da' | 'nl' | 'en' | ... (full list in src/types.ts).
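
The result types nest as `Text` → `BlockData` → `LineData` → `ElementData`. As a minimal sketch of walking that structure, the example below collects every recognized line into a flat array. The types are trimmed local copies (frames, corner points, and elements are omitted), and `collectLines` is a hypothetical helper rather than a library export.

```typescript
// Trimmed local copies of the README's types; only the text fields are kept.
type LineData = { lineText: string }
type BlockData = { blockText: string; lines: LineData[] }
type Text = { resultText: string; blocks: BlockData[] }

// Hypothetical helper: flatten all blocks into a single list of line strings.
function collectLines(text: Text): string[] {
  const lines: string[] = []
  for (const block of text.blocks) {
    for (const line of block.lines) {
      lines.push(line.lineText)
    }
  }
  return lines
}

// Hand-written sample data in the shape described above (not real OCR output).
const sample: Text = {
  resultText: 'Hello\nWorld\nBye',
  blocks: [
    { blockText: 'Hello\nWorld', lines: [{ lineText: 'Hello' }, { lineText: 'World' }] },
    { blockText: 'Bye', lines: [{ lineText: 'Bye' }] },
  ],
}

const allLines = collectLines(sample)
// allLines: ['Hello', 'World', 'Bye']
```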


Structure

android/           Kotlin HybridObject implementations (HybridTextRecognizer, HybridTranslator)
ios/               Swift HybridObject implementations
src/
  specs/
    TextRecognizer.nitro.ts   Nitro HybridObject spec for OCR
    Translator.nitro.ts       Nitro HybridObject spec for translation
  Camera.tsx                  <Camera> component + useTextRecognition / useTranslate hooks
  scanText.ts                 createTextRecognitionPlugin (frame processor factory)
  translateText.ts            createTranslatorPlugin (frame processor factory)
  PhotoRecognizer.ts          Async still-photo OCR
  RemoveLanguageModel.ts      Delete downloaded translation models
  types.ts                    All shared TypeScript types
  index.ts                    Public API surface
nitro.json         Nitrogen config — registers HybridTextRecognizer & HybridTranslator

🧠 Contributing

Contributions, feature requests, and bug reports are always welcome!
Please open an issue or pull request.


☕ Support the Project

If this library helps you build awesome apps, consider supporting future maintenance and development 💛

Your support helps keep the package updated and open source ❤️


📄 License

MIT © Jamena McInteer

About

React Native Vision Camera plugin for on-device text recognition (OCR) and translation using ML Kit. Maintained fork of react-native-vision-camera-text-recognition
