Commit bec2792
Plugin EP event profiling APIs (#27649)
### Description
#### TLDR
This PR ports the existing C++
[EpProfiler](https://github.com/microsoft/onnxruntime/blob/faad20f9d3264c7f3b6d4e4398990e13ee864512/include/onnxruntime/core/framework/execution_provider.h#L359)
interfaces used by provider-bridge EPs to the binary-stable C APIs for
plugin EPs. It introduces C/C++ APIs for creating/querying profiling
events, a container for appending EP events, and callback hooks
(`StartEvent`/`StopEvent`) that give EPs access to ORT event metadata in
real time.
#### Changes to the original C++ API
The original `EpProfiler` C++ interface was adapted for the C API with
the following intentional changes:
1. **`StartProfiling`** now receives an offset indicating the elapsed
time since profiling started, as opposed to receiving an
absolute/epoch-dependent profiling start time. This prevents EPs from
having to do epoch conversions. Credit to @edgchen1 for the idea.
2. **`StartEvent`/`StopEvent` receive an absolute, epoch-based
correlation ID (`ort_event_correlation_id`)** instead of a relative ORT
event ID. The `PluginEpProfiler` bridge layer automatically converts the
C++ `relative_ort_event_id` (microseconds since profiling start) to an
absolute `ort_event_correlation_id` by adding the epoch-based profiling
start time. This means plugin EPs can use the correlation ID directly
with profiling utilities like CUPTI or ROCTracer without computing the
conversion themselves.
3. **`StopEvent` now receives the completed ORT event as a parameter.**
This allows EPs to optionally inspect ORT event metadata (e.g.,
`op_name`, `event_name`) at the time the event ends, facilitating
annotation of correlated EP events.
4. **`EndProfiling` only allows EPs to *append* events (via
`OrtProfilingEventsContainer`), not read or modify the full events
array.** The motivation:
   - It prevents any one EP from modifying events generated by ORT or
     another EP.
   - Certain EPs (VitisAI and WebGPU) already only append events without
     reading the entire events array.
   - The CUDA EP reads the entire events array solely to merge/sort its
     own EP events next to correlated ORT events and add
     `parent_name`/`op_name` metadata. However:
     - Merging/sorting is mostly unnecessary since trace viewers that
       load these files do their own event sorting.
     - This merging/sorting step was previously required to augment CUDA
       EP events with metadata from the correlated ORT event. However,
       that can now be obtained more simply via the new `StopEvent`
       parameter that provides the EP with the full correlated ORT
       event.
     - The [merge algorithm used by CUDA
       EP](https://github.com/microsoft/onnxruntime/blob/faad20f9d3264c7f3b6d4e4398990e13ee864512/include/onnxruntime/core/common/gpu_profiler_common.h#L391-L397)
       **incorrectly** assumes ORT events are sorted by non-decreasing
       *start* time, but they are actually sorted by [non-decreasing
       *end*
       time](https://github.com/microsoft/onnxruntime/blob/faad20f9d3264c7f3b6d4e4398990e13ee864512/onnxruntime/core/common/profiler.cc#L91)
       (also see #13706 (comment)). Fixing this would require sorting
       the entire events array before asking a provider-bridge EP to
       merge its events into the global events array. It is unclear
       whether that is worth the runtime cost.
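The interplay of changes 3 and 4 can be sketched with stand-in types. This is a minimal illustration, not the ORT API: `SketchEvent`, `AppendOnlyEvents`, `SketchEpProfiler`, and `RecordEpEvent` are hypothetical names, and only the `parent_name`/`op_name` keys come from the description above. It shows an EP annotating its own events from the ORT event passed to `StopEvent`, then handing them to an append-only container at `EndProfiling`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a profiling event (not the real OrtProfilingEvent).
struct SketchEvent {
  std::string name;
  int64_t ts_us;
  int64_t dur_us;
  std::unordered_map<std::string, std::string> args;
};

// Hypothetical append-only container: the EP-facing surface is AddEvents only.
class AppendOnlyEvents {
 public:
  void AddEvents(std::vector<SketchEvent> events) {
    for (auto& e : events) events_.push_back(std::move(e));
  }
  // Read access exists only for the "ORT side" of this sketch.
  const std::vector<SketchEvent>& ForOrt() const { return events_; }

 private:
  std::vector<SketchEvent> events_;
};

// Hypothetical EP profiler that annotates EP events when the ORT event stops.
class SketchEpProfiler {
 public:
  void StartEvent(int64_t correlation_id) { pending_[correlation_id]; }

  // Assumed EP-internal hook: an EP event observed while the ORT event is active.
  void RecordEpEvent(int64_t correlation_id, SketchEvent ep_event) {
    pending_[correlation_id].push_back(std::move(ep_event));
  }

  // Copies metadata from the completed ORT event onto correlated EP events.
  void StopEvent(int64_t correlation_id, const SketchEvent& ort_event) {
    auto it = pending_.find(correlation_id);
    if (it == pending_.end()) return;
    for (auto& e : it->second) {
      e.args["parent_name"] = ort_event.name;
      auto op = ort_event.args.find("op_name");
      if (op != ort_event.args.end()) e.args["op_name"] = op->second;
      finished_.push_back(std::move(e));
    }
    pending_.erase(it);
  }

  // Appends finished EP events; nothing already in the container is read.
  void EndProfiling(AppendOnlyEvents& out) {
    out.AddEvents(std::move(finished_));
    finished_.clear();
  }

 private:
  std::unordered_map<int64_t, std::vector<SketchEvent>> pending_;
  std::vector<SketchEvent> finished_;
};

// Usage: one ORT node event with one correlated EP kernel event.
inline AppendOnlyEvents SketchUsage() {
  SketchEpProfiler profiler;
  AppendOnlyEvents out;
  profiler.StartEvent(1000250);
  profiler.RecordEpEvent(1000250, SketchEvent{"gemm_kernel", 260, 40, {}});
  profiler.StopEvent(1000250, SketchEvent{"MatMul_0_kernel_time", 250, 60, {{"op_name", "MatMul"}}});
  profiler.EndProfiling(out);
  assert(out.ForOrt().size() == 1);
  return out;
}
```

Because the annotation happens in `StopEvent`, the EP never needs to search the global events array for the correlated ORT event.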
#### Naming conventions for ORT event IDs
- **C++ `EpProfiler` interface** (existing): Uses
`relative_ort_event_id` — a timestamp offset in microseconds relative to
profiling start.
- **C API `OrtEpProfilerImpl`** (new in this PR): Uses
`ort_event_correlation_id` — an absolute, epoch-based timestamp in
microseconds computed from `std::chrono::high_resolution_clock`
(platform-defined epoch). Unique across concurrent profiling sessions
within the same process.
- **Conversion**: The `PluginEpProfiler` bridge class (in
`ep_event_profiling.cc`) performs `ort_event_correlation_id =
relative_ort_event_id + profiling_start_time_epoch_us_`, mirroring the
pattern in `GPUTracerManager::PushCorrelation`.
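
The conversion above is a single addition. A minimal sketch follows, assuming signed 64-bit microsecond values; the helper names `NowEpochUs` and `ToCorrelationId` are hypothetical and do not appear in the PR.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: microseconds since the clock's platform-defined epoch.
inline int64_t NowEpochUs() {
  using namespace std::chrono;
  return duration_cast<microseconds>(
             high_resolution_clock::now().time_since_epoch())
      .count();
}

// Mirrors the formula in the text:
//   ort_event_correlation_id = relative_ort_event_id + profiling_start_time_epoch_us
inline int64_t ToCorrelationId(int64_t relative_ort_event_id,
                               int64_t profiling_start_time_epoch_us) {
  assert(relative_ort_event_id >= 0);  // offsets are measured from profiling start
  return relative_ort_event_id + profiling_start_time_epoch_us;
}
```

Two sessions that start at different times map the same relative ID to different correlation IDs, which is what makes the absolute value usable across concurrent sessions.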
### New C APIs
| API | Description |
|-----|-------------|
| `CreateProfilingEvent` | Create a profiling event with category, process/thread IDs, name, timestamp, duration, and key-value args |
| `ReleaseProfilingEvent` | Release a profiling event |
| `ProfilingEvent_GetCategory` | Get event category (`SESSION`, `NODE`, `KERNEL`, `API`) |
| `ProfilingEvent_GetName` | Get event name |
| `ProfilingEvent_GetTimestampUs` | Get event start timestamp (µs) |
| `ProfilingEvent_GetDurationUs` | Get event duration (µs) |
| `ProfilingEvent_GetArgValue` | Get an event argument value by key |
| `ProfilingEventsContainer_AddEvents` | Append an array of EP events to the output container |
| `OrtEp::CreateProfiler` | Returns an instance of the EP's profiler implementation |
| `OrtEpProfilerImpl::StartProfiling` | Called by ORT to start a profiling session. Receives elapsed time offset (ns) since ORT profiling started |
| `OrtEpProfilerImpl::StartEvent` | Called by ORT to notify that an ORT event has started. Receives an absolute `ort_event_correlation_id` |
| `OrtEpProfilerImpl::StopEvent` | Called by ORT to notify that an ORT event has ended. Receives the same `ort_event_correlation_id` and ORT event metadata |
| `OrtEpProfilerImpl::EndProfiling` | Called by ORT to end the profiling session and collect EP events into the output container |
| `OrtEpProfilerImpl::Release` | Release the profiler instance |
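
One way an EP might use the elapsed-time offset passed to `StartProfiling` is sketched below. This is a sketch under stated assumptions: the class `OffsetAligner` and its methods are hypothetical, the EP is assumed to keep its own nanosecond clock, and EP event timestamps are assumed to be wanted in microseconds relative to ORT profiling start. No epoch conversion is needed because only differences on the EP's own clock are used.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper that maps EP clock readings onto ORT's profiling timeline.
class OffsetAligner {
 public:
  // ort_offset_ns: elapsed time since ORT profiling started (from StartProfiling).
  // ep_clock_now_ns: the EP's own clock reading at the moment of the call.
  void StartProfiling(int64_t ort_offset_ns, int64_t ep_clock_now_ns) {
    assert(ort_offset_ns >= 0);
    // Where ORT's profiling start falls on the EP's clock.
    ort_start_in_ep_clock_ns_ = ep_clock_now_ns - ort_offset_ns;
  }

  // Converts an EP clock reading to microseconds since ORT profiling start.
  int64_t ToOrtRelativeUs(int64_t ep_event_time_ns) const {
    return (ep_event_time_ns - ort_start_in_ep_clock_ns_) / 1000;
  }

 private:
  int64_t ort_start_in_ep_clock_ns_ = 0;
};
```

For example, if `StartProfiling` is called 2000 ns after ORT profiling began and the EP clock reads 1000000 ns at that moment, an EP event at 1005000 ns lands 7 µs into the ORT timeline.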
### New C++ wrapper classes
| Class | Description |
|-------|-------------|
| `Ort::ConstProfilingEvent` | Non-owning const wrapper for reading fields from an `OrtProfilingEvent` (e.g., in `StopEvent`) |
| `Ort::ProfilingEvent` | Owning wrapper that creates and manages an `OrtProfilingEvent` (e.g., for `EndProfiling`) |
| `Ort::UnownedProfilingEventsContainer` | Non-owning wrapper for adding events to an `OrtProfilingEventsContainer` during `EndProfiling` |
### Example EP profiling implementation
This PR updates an example plugin EP to use the new profiling APIs:
- Plugin EP code: [test/autoep/library/example_plugin_ep_kernel_registry](https://github.com/microsoft/onnxruntime/tree/adrianl/PluginEp_ProfilingApis/onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry)
  - `OrtEpProfilerImpl` implementation: [ep_profiling.h](https://github.com/microsoft/onnxruntime/blob/adrianl/PluginEp_ProfilingApis/onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.h) / [ep_profiling.cc](https://github.com/microsoft/onnxruntime/blob/adrianl/PluginEp_ProfilingApis/onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.cc)
  - `OrtEp::CreateProfiler()` implementation: [ep.cc](https://github.com/microsoft/onnxruntime/blob/adrianl/PluginEp_ProfilingApis/onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep.cc)
### Existing bugs found
Not fixed in this PR.
- The [merge algorithm used by CUDA
EP](https://github.com/microsoft/onnxruntime/blob/faad20f9d3264c7f3b6d4e4398990e13ee864512/include/onnxruntime/core/common/gpu_profiler_common.h#L391-L397)
**incorrectly** assumes ORT events are sorted by non-decreasing *start*
time, but they are actually sorted by [non-decreasing *end*
time](https://github.com/microsoft/onnxruntime/blob/faad20f9d3264c7f3b6d4e4398990e13ee864512/onnxruntime/core/common/profiler.cc#L91)
(also see
#13706 (comment)).
- Run profilers do not handle subgraphs (e.g., the subgraph of a
control-flow operator). This has been the case since run profilers were
[introduced](#26846).
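
The sort-order mismatch in the first bug can be reproduced with two nested events. The sketch below uses a hypothetical `Span` type and made-up timings; it only illustrates why a list ordered by end time is not generally ordered by start time, as happens when an enclosing event finishes after the event it contains.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal event: start time and duration in microseconds.
struct Span {
  int64_t start_us;
  int64_t dur_us;
  int64_t end_us() const { return start_us + dur_us; }
};

inline bool SortedByStart(const std::vector<Span>& v) {
  return std::is_sorted(v.begin(), v.end(), [](const Span& a, const Span& b) {
    return a.start_us < b.start_us;
  });
}

inline bool SortedByEnd(const std::vector<Span>& v) {
  return std::is_sorted(v.begin(), v.end(), [](const Span& a, const Span& b) {
    return a.end_us() < b.end_us();
  });
}

// A child event (10..30) nested inside a parent event (0..100).
// Ordered by end time, the child comes first even though it started later.
inline std::vector<Span> NestedExample() {
  std::vector<Span> events = {{10, 20}, {0, 100}};
  assert(events[0].end_us() <= events[1].end_us());
  return events;
}
```

A merge that walks this list assuming non-decreasing start times would see start times 10 then 0, which violates its precondition.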
### Motivation and Context
Allows plugin EPs to generate profiling events, further closing the
functionality gap between provider-bridge EPs and plugin EPs.
---------
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>

1 parent a997c4f commit bec2792
25 files changed
Lines changed: 1742 additions & 33 deletions
File tree
- cmake
- include/onnxruntime/core
  - common
  - session
- onnxruntime
  - core
    - common
    - framework
    - providers
      - cuda
      - vitisai
      - webgpu
    - session/plugin_ep
  - test
    - autoep
      - library/example_plugin_ep_kernel_registry
        - kernels
    - framework