Skip to content

Commit bbb0cd0

Browse files
authored
Fix plugin EP profiling timestamp skew on macOS (#27994)
## Description This fixes a flaky failure in the plugin EP profiling tests on macOS, where reconstructed plugin event timestamps could land a few microseconds outside the correlated ORT parent event interval. The current example plugin profiler reconstructs EP-relative timestamps by combining ORT's profiling-start offset with elapsed time from the EP clock. That reconstruction is close but not exact across clocks, and on macOS the skew was enough to fail the strict containment checks in `KernelPluginEp_SessionProfiling` with cases like `ep_start < parent_start` by a small margin. Instead of weakening the test, this change keeps the strict contract and fixes the profiler output so child EP events are always emitted within the correlated ORT parent event interval. ## Key Changes | File | Change | |------|--------| | `onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.h` | Stores the correlated ORT parent event start timestamp and duration on each collected EP event, and adds the helper signature updates needed to propagate that metadata. | | `onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.cc` | Captures parent event timing from `Ort::ConstProfilingEvent`, attaches it to EP events during `StopEventImpl`, and clamps the reconstructed EP start/end interval to the parent ORT interval before emitting the final profiling event. | ## Why This Change Is Needed - The plugin EP profiling tests intentionally require strict nesting: EP child events must stay within the ORT parent event interval. - The existing implementation reconstructs EP timestamps from two different clocks, which can drift by a few microseconds depending on platform timing behavior. - macOS exposed that drift often enough to make `KernelPluginEp_SessionProfiling` flaky even though the logical event ordering was correct. - Clamping the emitted child interval to the already-correlated parent interval preserves the expected semantics and removes the platform-specific skew from the final profiling output. ## Testing - `ninja -C build/cuda/Debug onnxruntime_autoep_test` - `cd build/cuda/Debug && ./onnxruntime_autoep_test --gtest_filter=OrtEpLibrary.KernelPluginEp_SessionProfiling` - `cd build/cuda/Debug && ./onnxruntime_autoep_test --gtest_filter=OrtEpLibrary.KernelPluginEp_RunProfiling` ## Notes For Reviewers - This is intentionally scoped to the example plugin EP profiling path used by the AutoEP tests. - The change avoids relaxing any assertions in `test_execution.cc`; it fixes the emitted profiling data instead.
1 parent 58a87dc commit bbb0cd0

2 files changed

Lines changed: 29 additions & 4 deletions

File tree

onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.cc

Lines changed: 24 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
// Copyright (c) Microsoft Corporation. All rights reserved.
22
// Licensed under the MIT License.
33
#include "ep_profiling.h"
4+
#include <algorithm>
45
#include <array>
56
#include <chrono>
67
#include <optional>
@@ -59,7 +60,8 @@ void EpEventManager::PushOrtEvent(uint64_t profiler_id) {
5960
tls_profiling_state_.profiler_id = profiler_id;
6061
}
6162

62-
void EpEventManager::PopOrtEvent(uint64_t profiler_id, const std::string& ort_event_name) {
63+
void EpEventManager::PopOrtEvent(uint64_t profiler_id, const std::string& ort_event_name,
64+
int64_t ort_event_start_us, int64_t ort_event_duration_us) {
6365
std::lock_guard<std::mutex> lock(mutex_);
6466

6567
auto iter = profiler_state_.find(profiler_id);
@@ -77,6 +79,8 @@ void EpEventManager::PopOrtEvent(uint64_t profiler_id, const std::string& ort_ev
7779

7880
if (ep_event.thread_id == current_thread_id && ep_event.ort_event_name.empty()) {
7981
ep_event.ort_event_name = ort_event_name;
82+
ep_event.ort_event_start_us = ort_event_start_us;
83+
ep_event.ort_event_duration_us = ort_event_duration_us;
8084
}
8185
}
8286

@@ -169,9 +173,11 @@ OrtStatus* ORT_API_CALL ExampleKernelEpProfiler::StopEventImpl(OrtEpProfilerImpl
169173

170174
Ort::ConstProfilingEvent ort_event(c_ort_event);
171175
const char* ort_event_name = ort_event.GetName();
176+
const int64_t ort_event_start_us = ort_event.GetTimestampUs();
177+
const int64_t ort_event_duration_us = ort_event.GetDurationUs();
172178

173179
// Annotate all EP events that were collected during this ORT event with metadata from the ORT event.
174-
ep_event_manager.PopOrtEvent(self->profiler_id, ort_event_name);
180+
ep_event_manager.PopOrtEvent(self->profiler_id, ort_event_name, ort_event_start_us, ort_event_duration_us);
175181
return nullptr;
176182
EXCEPTION_TO_RETURNED_STATUS_END
177183
}
@@ -209,6 +215,22 @@ OrtStatus* ORT_API_CALL ExampleKernelEpProfiler::EndProfilingImpl(
209215
raw_ep_event.end_time - raw_ep_event.start_time)
210216
.count();
211217

218+
// The ORT-to-EP clock reconstruction can differ by a few microseconds on some platforms.
219+
// Bound the EP event to the correlated ORT parent interval so the emitted child event remains
220+
// properly nested without weakening the test's containment checks.
221+
if (raw_ep_event.ort_event_start_us >= 0 && raw_ep_event.ort_event_duration_us >= 0) {
222+
const int64_t parent_start_us = raw_ep_event.ort_event_start_us;
223+
const int64_t parent_end_us = parent_start_us + raw_ep_event.ort_event_duration_us;
224+
225+
int64_t rel_end_us = rel_ts_us + std::max<int64_t>(dur_us, 0);
226+
rel_ts_us = std::clamp(rel_ts_us, parent_start_us, parent_end_us);
227+
rel_end_us = std::clamp(rel_end_us, parent_start_us, parent_end_us);
228+
if (rel_end_us < rel_ts_us) {
229+
rel_end_us = rel_ts_us;
230+
}
231+
dur_us = rel_end_us - rel_ts_us;
232+
}
233+
212234
// Set parent_name as an event arg. The parent_name is just the name of the correlated ORT event.
213235
std::unordered_map<std::string, std::string> args = {{"parent_name", raw_ep_event.ort_event_name.c_str()}};
214236

onnxruntime/test/autoep/library/example_plugin_ep_kernel_registry/ep_profiling.h

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -68,7 +68,9 @@ class EpEventManager {
6868
std::chrono::high_resolution_clock::time_point start_time;
6969
std::chrono::high_resolution_clock::time_point end_time;
7070
std::string ort_event_name; // Set from the correlated ORT event
71-
std::thread::id thread_id; // Thread that created this event
71+
int64_t ort_event_start_us = -1;
72+
int64_t ort_event_duration_us = -1;
73+
std::thread::id thread_id; // Thread that created this event
7274
};
7375

7476
static EpEventManager& GetInstance();
@@ -82,7 +84,8 @@ class EpEventManager {
8284
void UnregisterProfiler(uint64_t profiler_id);
8385

8486
void PushOrtEvent(uint64_t profiler_id);
85-
void PopOrtEvent(uint64_t profiler_id, const std::string& ort_event_name);
87+
void PopOrtEvent(uint64_t profiler_id, const std::string& ort_event_name, int64_t ort_event_start_us,
88+
int64_t ort_event_duration_us);
8689

8790
void AddEpEvent(uint64_t profiler_id, Event event);
8891

0 commit comments

Comments
 (0)