 # flutter_ort_plugin

-Flutter plugin for [ONNX Runtime](https://onnxruntime.ai/) inference via Dart FFI. Load `.onnx` models and run them natively on Android, iOS, macOS, Linux, and Windows.
+Flutter plugin for [ONNX Runtime](https://onnxruntime.ai/) inference via Dart FFI. Load `.onnx` models and run them natively on Android, iOS, and Linux.

 ONNX Runtime version: **1.24.1**

+## Platform Support
+
+| Platform    | Minimum Version      | Execution Providers         | Status          |
+| ----------- | -------------------- | --------------------------- | --------------- |
+| **Android** | API 24 (Android 7.0) | WebGPU, NNAPI, XNNPACK, CPU | ✅ Full support |
+| **iOS**     | iOS 15.1             | CoreML, CPU                 | ✅ Full support |
+| **Linux**   | Any                  | CPU only                    | ✅ CPU support  |
+
 ## Installation

 ```yaml
@@ -102,24 +110,22 @@ await session.dispose();

 The plugin auto-detects the best provider per platform:

-| Platform      | Default providers   | Supported   | Notes                                        |
-| ------------- | ------------------- | ----------- | -------------------------------------------- |
-| iOS/macOS     | CoreML, CPU         | ✅ Fully    | CoreML via dedicated config                  |
-| Android       | XNNPACK, NNAPI, CPU | ✅ Fully    | NNAPI with flags, XNNPACK with thread config |
-| Linux/Windows | CPU                 | ✅ CPU only | GPU providers via generic API only           |
+| Platform | Default providers           | Supported   | Notes                                         |
+| -------- | --------------------------- | ----------- | --------------------------------------------- |
+| iOS      | CoreML, CPU                 | ✅ Fully    | CoreML via dedicated config                   |
+| Android  | WebGPU, NNAPI, XNNPACK, CPU | ✅ Fully    | WebGPU via Dawn, NNAPI flags, XNNPACK threads |
+| Linux    | CPU                         | ✅ CPU only | CPU execution provider                        |

 ### Provider Implementation Status

-| Provider   | Status          | Notes                                            |
-| ---------- | --------------- | ------------------------------------------------ |
-| CPU        | ✅ Ready        | Always available, built-in                       |
-| CoreML     | ✅ Ready        | iOS/macOS acceleration                           |
-| NNAPI      | ✅ Ready        | Android NPU/GPU with FP16/NCHW flags             |
-| XNNPACK    | ✅ Ready        | Android CPU SIMD optimization with thread config |
-| QNN        | ⚠️ Generic only | Qualcomm - not fully implemented                 |
-| CUDA       | ⚠️ Generic only | NVIDIA - not fully implemented                   |
-| TensorRT   | ⚠️ Generic only | NVIDIA - not fully implemented                   |
-| All others | ⚠️ Generic only | May work via generic API                         |
+| Provider | Status          | Platform | Notes                                     |
+| -------- | --------------- | -------- | ----------------------------------------- |
+| CPU      | ✅ Ready        | All      | Always available, built-in                |
+| CoreML   | ✅ Ready        | iOS      | Apple Neural Engine/GPU acceleration      |
+| WebGPU   | ✅ Ready        | Android  | GPU acceleration via Dawn/WebGPU support  |
+| NNAPI    | ✅ Ready        | Android  | NPU/GPU with FP16/NCHW/CPU-disabled flags |
+| XNNPACK  | ✅ Ready        | Android  | CPU SIMD optimization with thread config  |
+| QNN      | ⚠️ Generic only | Android  | Qualcomm NPU via generic API              |

 ### Automatic (default)

@@ -175,17 +181,70 @@ final session = OrtSessionWrapper.createWithProviders(
 );
 ```

+### WebGPU (Android GPU acceleration)
+
+WebGPU provides hardware-accelerated inference on Android devices with GPU support:
+
+```dart
+final session = OrtSessionWrapper.createWithProviders(
+  'model.onnx',
+  providers: [OrtProvider.webGpu, OrtProvider.cpu],
+  providerOptions: {
+    // WebGPU options can be added here if needed
+    OrtProvider.webGpu: {},
+  },
+);
+```
+
 ### Querying available providers

 ```dart
 final providers = OrtProviders(OnnxRuntime.instance);

 providers.getAvailableProviders();
-// ['CoreMLExecutionProvider', 'CPUExecutionProvider']
+// ['WebGpuExecutionProvider', 'NnapiExecutionProvider', 'CPUExecutionProvider']

-providers.isProviderAvailable(OrtProvider.coreML); // true
+providers.isProviderAvailable(OrtProvider.webGpu); // true
 ```

+## Performance Tuning
+
+Fine-tune session options for optimal performance on your target device:
+
+```dart
+import 'package:flutter_ort_plugin/flutter_ort_plugin.dart';
+
+final session = OrtSessionWrapper.create(
+  'model.onnx',
+  sessionConfig: SessionConfig(
+    intraOpThreads: 4, // Threads within ops (0 = ORT default)
+    interOpThreads: 1, // Threads across ops (0 = ORT default)
+    graphOptimizationLevel: GraphOptLevel.all, // Max graph optimizations
+    executionMode: ExecutionMode.sequential, // Better on mobile
+  ),
+);
+```
+
+### Android Big.LITTLE Optimization
+
+For Android devices with heterogeneous cores, limit intra-op threads to avoid contention:
+
+```dart
+final session = OrtSessionWrapper.create(
+  'model.onnx',
+  sessionConfig: SessionConfig.androidOptimized, // Pre-configured for Android
+);
+```
+
+### Available Options
+
+| Option                   | Values                              | Description                           |
+| ------------------------ | ----------------------------------- | ------------------------------------- |
+| `intraOpThreads`         | `0` (auto) or integer               | Parallelism within a single operation |
+| `interOpThreads`         | `0` (auto) or integer               | Parallelism across independent nodes  |
+| `graphOptimizationLevel` | `disabled`/`basic`/`extended`/`all` | Graph transformation aggressiveness   |
+| `executionMode`          | `sequential`/`parallel`             | Node execution order                  |
+
 ## API Overview

 ### High-Level (no FFI pointers)
@@ -260,12 +319,22 @@ rt.releaseSessionOptions(options);

 ## Example

-The `example/` app demonstrates the high-level API using MNIST and is split into multiple pages:
+The `example/` app demonstrates real-world computer vision inference with YOLO models and includes comprehensive performance tuning:
+
+- **YOLO Setup**: Model selection, provider configuration, and performance tuning UI
+- **Camera Detection**: Real-time YOLO inference on camera feed with FPS/inference stats
+- **Image Detection**: Static image inference with bounding box overlay
+- **Video Detection**: Frame-by-frame inference on video with detection overlay
+- **Performance Tuning**: Configure threading, graph optimization, and execution mode
+- **Execution Providers**: Test different providers (WebGPU, NNAPI, XNNPACK, CoreML)

-- **Basic Inference**: load session + run inference
-- **Execution Providers**: query/select providers
-- **Isolate vs Sync**: visual UI-freeze comparison
-- **Benchmark**: run N inferences and inspect statistics
+Features demonstrated:
+
+- Dynamic model loading (.onnx/.ort formats)
+- Platform-aware provider selection (WebGPU/NNAPI/XNNPACK on Android, CoreML on iOS)
+- Session configuration for Android Big.LITTLE optimization
+- Provider-specific options (NNAPI flags, XNNPACK threads, CoreML compute units)
+- Background isolate inference to prevent UI freezes

 ```bash
 cd example
@@ -278,6 +347,26 @@ flutter run
 dart run ffigen --config ffigen.yaml
 ```

+## Recent Changes
+
+### v1.0.3+
+
+- **WebGPU Support**: Added WebGPU execution provider for Android GPU acceleration
+- **Session Configuration**: New `SessionConfig` class for fine-tuning performance
+  - Intra-op/inter-op thread control
+  - Graph optimization levels (disabled → all)
+  - Execution modes (sequential/parallel)
+  - Android Big.LITTLE optimization preset
+- **Performance Tuning UI**: Example app now includes comprehensive tuning controls
+- **Video Detection**: Fixed playback stuttering with self-scheduling inference loop
+- **Provider Summary**: Fixed provider options display to respect manual selection
+
+### Provider Priority Updates
+
+- Android now prioritizes GPU providers: WebGPU → NNAPI → XNNPACK → CPU
+- iOS: CoreML → CPU
+- Linux: CPU only
+
 ## License

 MIT. See [LICENSE](LICENSE).