Commit 78b33bd

Add platform support table, WebGPU documentation, and session performance tuning API
Update README to document platform support matrix with minimum versions and execution providers. Add WebGPU provider documentation with Android GPU acceleration examples. Document new SessionConfig API with thread control (intra/inter-op), graph optimization levels, execution modes, and androidOptimized preset for Big.LITTLE cores. Add performance tuning options table. Update example app description to highlight
1 parent 5c38289 commit 78b33bd

1 file changed

Lines changed: 112 additions & 23 deletions

README.md

@@ -1,9 +1,17 @@
11
# flutter_ort_plugin
22

3-
Flutter plugin for [ONNX Runtime](https://onnxruntime.ai/) inference via Dart FFI. Load `.onnx` models and run them natively on Android, iOS, macOS, Linux, and Windows.
3+
Flutter plugin for [ONNX Runtime](https://onnxruntime.ai/) inference via Dart FFI. Load `.onnx` models and run them natively on Android, iOS, and Linux.
44

55
ONNX Runtime version: **1.24.1**
66

7+
## Platform Support
8+
9+
| Platform | Minimum Version | Execution Providers | Status |
10+
| ----------- | -------------------- | --------------------------- | --------------- |
11+
| **Android** | API 24 (Android 7.0) | WebGPU, NNAPI, XNNPACK, CPU | ✅ Full support |
12+
| **iOS** | iOS 15.1 | CoreML, CPU | ✅ Full support |
13+
| **Linux** | Any | CPU only | ✅ CPU support |
14+
715
## Installation
816

917
```yaml
@@ -102,24 +110,22 @@ await session.dispose();
102110

103111
The plugin auto-detects the best provider per platform:
104112

105-
| Platform | Default providers | Supported | Notes |
106-
| ------------- | ------------------- | ----------- | -------------------------------------------- |
107-
| iOS/macOS | CoreML, CPU | ✅ Fully | CoreML via dedicated config |
108-
| Android | XNNPACK, NNAPI, CPU | ✅ Fully | NNAPI with flags, XNNPACK with thread config |
109-
| Linux/Windows | CPU | ✅ CPU only | GPU providers via generic API only |
113+
| Platform | Default providers | Supported | Notes |
114+
| -------- | --------------------------- | ----------- | --------------------------------------------- |
115+
| iOS | CoreML, CPU | ✅ Fully | CoreML via dedicated config |
116+
| Android | WebGPU, NNAPI, XNNPACK, CPU | ✅ Fully | WebGPU via Dawn, NNAPI flags, XNNPACK threads |
117+
| Linux | CPU | ✅ CPU only | CPU execution provider |
110118

111119
### Provider Implementation Status
112120

113-
| Provider | Status | Notes |
114-
| ---------- | --------------- | ------------------------------------------------ |
115-
| CPU | ✅ Ready | Always available, built-in |
116-
| CoreML | ✅ Ready | iOS/macOS acceleration |
117-
| NNAPI | ✅ Ready | Android NPU/GPU with FP16/NCHW flags |
118-
| XNNPACK | ✅ Ready | Android CPU SIMD optimization with thread config |
119-
| QNN | ⚠️ Generic only | Qualcomm - not fully implemented |
120-
| CUDA | ⚠️ Generic only | NVIDIA - not fully implemented |
121-
| TensorRT | ⚠️ Generic only | NVIDIA - not fully implemented |
122-
| All others | ⚠️ Generic only | May work via generic API |
121+
| Provider | Status | Platform | Notes |
122+
| -------- | --------------- | -------- | ----------------------------------------- |
123+
| CPU | ✅ Ready | All | Always available, built-in |
124+
| CoreML | ✅ Ready | iOS | Apple Neural Engine/GPU acceleration |
125+
| WebGPU | ✅ Ready | Android | GPU acceleration through WebGPU (Dawn backend) |
126+
| NNAPI | ✅ Ready | Android | NPU/GPU with FP16/NCHW/CPU-disabled flags |
127+
| XNNPACK | ✅ Ready | Android | CPU SIMD optimization with thread config |
128+
| QNN | ⚠️ Generic only | Android | Qualcomm NPU via generic API |
123129

124130
### Automatic (default)
125131

@@ -175,17 +181,70 @@ final session = OrtSessionWrapper.createWithProviders(
175181
);
176182
```
177183

184+
### WebGPU (Android GPU acceleration)
185+
186+
WebGPU provides hardware-accelerated inference on Android devices with GPU support:
187+
188+
```dart
189+
final session = OrtSessionWrapper.createWithProviders(
190+
  'model.onnx',
191+
  providers: [OrtProvider.webGpu, OrtProvider.cpu],
192+
  providerOptions: {
193+
    // WebGPU options can be added here if needed
194+
    OrtProvider.webGpu: {},
195+
  },
196+
);
197+
```
198+
178199
### Querying available providers
179200

180201
```dart
181202
final providers = OrtProviders(OnnxRuntime.instance);
182203
183204
providers.getAvailableProviders();
184-
// ['CoreMLExecutionProvider', 'CPUExecutionProvider']
205+
// ['WebGpuExecutionProvider', 'NnapiExecutionProvider', 'CPUExecutionProvider']
185206
186-
providers.isProviderAvailable(OrtProvider.coreML); // true
207+
providers.isProviderAvailable(OrtProvider.webGpu); // true
187208
```
188209

210+
## Performance Tuning
211+
212+
Fine-tune session options for optimal performance on your target device:
213+
214+
```dart
215+
import 'package:flutter_ort_plugin/flutter_ort_plugin.dart';
216+
217+
final session = OrtSessionWrapper.create(
218+
  'model.onnx',
219+
  sessionConfig: SessionConfig(
220+
    intraOpThreads: 4, // Threads within a single op (0 = ORT default)
221+
    interOpThreads: 1, // Threads across independent ops (0 = ORT default); ORT only uses this in parallel mode
222+
    graphOptimizationLevel: GraphOptLevel.all, // Apply all graph optimizations
223+
    executionMode: ExecutionMode.sequential, // Run nodes one at a time; usually the better choice on mobile
224+
  ),
225+
);
226+
```
227+
228+
### Android Big.LITTLE Optimization
229+
230+
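For Android devices with heterogeneous (Big.LITTLE) cores, limit intra-op threads to avoid contention between fast and slow cores. `SessionConfig.androidOptimized` is a preset for this case: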
231+
232+
```dart
233+
final session = OrtSessionWrapper.create(
234+
  'model.onnx',
235+
  sessionConfig: SessionConfig.androidOptimized, // Pre-configured for Android
236+
);
237+
```
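
The values behind the preset are not shown here. As a rough illustration of the idea only, the sketch below derives an intra-op thread count from the core count, assuming that about half of the cores are performance cores and capping the result at 4. Both the heuristic and the `suggestIntraOpThreads` helper are assumptions made for this example; they are not what `SessionConfig.androidOptimized` actually does.

```dart
import 'dart:io' show Platform;
import 'dart:math' as math;

/// Hypothetical helper: suggests an intra-op thread count for a device with
/// [totalCores] cores, assuming roughly half of them are "big" cores.
int suggestIntraOpThreads(int totalCores, {int cap = 4}) {
  final assumedBigCores = totalCores ~/ 2;
  // Never go below 1 thread and never above the cap.
  return math.max(1, math.min(assumedBigCores, cap));
}

void main() {
  final cores = Platform.numberOfProcessors;
  final threads = suggestIntraOpThreads(cores);
  print('cores=$cores, suggested intraOpThreads=$threads');
}
```

The result could then be passed as `intraOpThreads` in a `SessionConfig`. Measure on real devices before settling on a value, since core layouts differ between chipsets.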
238+
239+
### Available Options
240+
241+
| Option | Values | Description |
242+
| ------------------------ | ----------------------------------- | ------------------------------------- |
243+
| `intraOpThreads` | `0` (auto) or integer | Parallelism within a single operation |
244+
| `interOpThreads` | `0` (auto) or integer | Parallelism across independent nodes |
245+
| `graphOptimizationLevel` | `disabled`/`basic`/`extended`/`all` | Graph transformation aggressiveness |
246+
| `executionMode` | `sequential`/`parallel` | Node execution order |
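
For orientation, the option values in the table correspond to the `GraphOptimizationLevel` and `ExecutionMode` enums of the ONNX Runtime C API. The sketch below mirrors the four listed levels and two modes in Dart. The numeric values are those of the C API; the Dart names (`GraphOptLevelSketch`, `ExecutionModeSketch`, `ortValue`) are hypothetical and are not the plugin's own declarations.

```dart
/// Illustrative mirror of ORT's C `GraphOptimizationLevel` values.
enum GraphOptLevelSketch {
  disabled(0), // ORT_DISABLE_ALL
  basic(1), // ORT_ENABLE_BASIC
  extended(2), // ORT_ENABLE_EXTENDED
  all(99); // ORT_ENABLE_ALL

  const GraphOptLevelSketch(this.ortValue);

  /// Integer passed to the native API.
  final int ortValue;
}

/// Illustrative mirror of ORT's C `ExecutionMode` values.
enum ExecutionModeSketch {
  sequential(0), // ORT_SEQUENTIAL
  parallel(1); // ORT_PARALLEL

  const ExecutionModeSketch(this.ortValue);

  final int ortValue;
}

void main() {
  for (final level in GraphOptLevelSketch.values) {
    print('${level.name} -> ${level.ortValue}');
  }
  for (final mode in ExecutionModeSketch.values) {
    print('${mode.name} -> ${mode.ortValue}');
  }
}
```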
247+
189248
## API Overview
190249

191250
### High-Level (no FFI pointers)
@@ -260,12 +319,22 @@ rt.releaseSessionOptions(options);
260319

261320
## Example
262321

263-
The `example/` app demonstrates the high-level API using MNIST and is split into multiple pages:
322+
The `example/` app demonstrates real-world computer vision inference with YOLO models and includes performance tuning controls. It covers:
323+
324+
- **YOLO Setup**: Model selection, provider configuration, and performance tuning UI
325+
- **Camera Detection**: Real-time YOLO inference on camera feed with FPS/inference stats
326+
- **Image Detection**: Static image inference with bounding box overlay
327+
- **Video Detection**: Frame-by-frame inference on video with detection overlay
328+
- **Performance Tuning**: Configure threading, graph optimization, and execution mode
329+
- **Execution Providers**: Test different providers (WebGPU, NNAPI, XNNPACK, CoreML)
264330

265-
- **Basic Inference**: load session + run inference
266-
- **Execution Providers**: query/select providers
267-
- **Isolate vs Sync**: visual UI-freeze comparison
268-
- **Benchmark**: run N inferences and inspect statistics
331+
Features demonstrated:
332+
333+
- Dynamic model loading (.onnx/.ort formats)
334+
- Platform-aware provider selection (WebGPU/NNAPI/XNNPACK on Android, CoreML on iOS)
335+
- Session configuration for Android Big.LITTLE optimization
336+
- Provider-specific options (NNAPI flags, XNNPACK threads, CoreML compute units)
337+
- Background isolate inference to prevent UI freezes
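
As a minimal sketch of the background-isolate idea in the last bullet, the snippet below moves a blocking computation off the calling isolate with Dart's `Isolate.run`. The `slowSum` function is a hypothetical stand-in for an inference call; how the example app itself hands sessions and tensors to an isolate is not shown here.

```dart
import 'dart:isolate';

/// Stand-in for a blocking inference call; a real app would run the model here.
double slowSum(List<double> input) {
  var total = 0.0;
  for (var i = 0; i < 2000000; i++) {
    total += input[i % input.length];
  }
  return total;
}

Future<void> main() async {
  final input = [1.0, 2.0, 3.0];
  // Isolate.run spawns a short-lived isolate, runs the closure there,
  // and completes with its result, so the calling isolate stays responsive.
  final result = await Isolate.run(() => slowSum(input));
  print('result: $result');
}
```

`Isolate.run` copies its inputs and result between isolates, so for large tensors a long-lived worker isolate may be the better fit.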
269338

270339
```bash
271340
cd example
@@ -278,6 +347,26 @@ flutter run
278347
dart run ffigen --config ffigen.yaml
279348
```
280349

350+
## Recent Changes
351+
352+
### v1.0.3+
353+
354+
- **WebGPU Support**: Added WebGPU execution provider for Android GPU acceleration
355+
- **Session Configuration**: New `SessionConfig` class for fine-tuning performance
356+
  - Intra-op/inter-op thread control
357+
  - Graph optimization levels (disabled → all)
358+
  - Execution modes (sequential/parallel)
359+
  - Android Big.LITTLE optimization preset
360+
- **Performance Tuning UI**: Example app now includes comprehensive tuning controls
361+
- **Video Detection**: Fixed playback stuttering with self-scheduling inference loop
362+
- **Provider Summary**: Fixed provider options display to respect manual selection
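
One way to read "self-scheduling inference loop" is a loop that starts the next inference only after the previous one has finished, so slow frames cannot pile up as they can with a fixed-rate timer. The sketch below illustrates that pattern under this assumption; `InferenceLoop` and its members are hypothetical names, not the example app's code.

```dart
import 'dart:async';

/// Hypothetical self-scheduling loop: each step is awaited before the next begins.
class InferenceLoop {
  InferenceLoop(this.step);

  /// One unit of work, e.g. grab a frame and run inference on it.
  final Future<void> Function() step;
  bool _running = false;

  Future<void> start() async {
    if (_running) return;
    _running = true;
    while (_running) {
      await step(); // The next step starts only after this one completes.
      await Future<void>.delayed(Duration.zero); // Yield to the event loop.
    }
  }

  void stop() => _running = false;
}

Future<void> main() async {
  var frames = 0;
  late final InferenceLoop loop;
  loop = InferenceLoop(() async {
    // Simulated inference latency.
    await Future<void>.delayed(const Duration(milliseconds: 5));
    frames++;
    if (frames == 3) loop.stop();
  });
  await loop.start();
  print('processed $frames frames');
}
```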
363+
364+
### Provider Priority Updates
365+
366+
- Android now tries hardware-accelerated providers first: WebGPU → NNAPI → XNNPACK → CPU
367+
- iOS: CoreML → CPU
368+
- Linux: CPU only
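
The priority orders above amount to filtering a preferred list by what the runtime reports as available. The sketch below shows that selection in plain Dart; `orderByPriority` and `androidPriority` are hypothetical names for illustration, and the plugin's own auto-detection logic may differ. The strings are ONNX Runtime's execution provider names.

```dart
/// Preferred order on Android, highest priority first.
const androidPriority = [
  'WebGpuExecutionProvider',
  'NnapiExecutionProvider',
  'XnnpackExecutionProvider',
  'CPUExecutionProvider',
];

/// Returns the providers from [priority] that appear in [available],
/// keeping the priority order.
List<String> orderByPriority(List<String> priority, Iterable<String> available) {
  final availableSet = available.toSet();
  return [
    for (final provider in priority)
      if (availableSet.contains(provider)) provider,
  ];
}

void main() {
  // Example: a device where only NNAPI and CPU are reported.
  final available = ['CPUExecutionProvider', 'NnapiExecutionProvider'];
  print(orderByPriority(androidPriority, available));
  // prints [NnapiExecutionProvider, CPUExecutionProvider]
}
```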
369+
281370
## License
282371

283372
MIT. See [LICENSE](LICENSE).
