Add EPContext data I/O callbacks#28529
2ec23e2 to ff7b449
@copilot review the EPContext data I/O callback changes for API design, memory safety, and test coverage.
@copilot review this PR and focus on possible regressions in existing EPContext behavior.
Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/6c003fbe-7b15-4f84-ae6a-62d265448bae Co-authored-by: GopalakrishnanN <22121842+GopalakrishnanN@users.noreply.github.com>
Reviewed and tightened the EPContext callback path for API/memory safety and coverage. I fixed |
Analyzed for regressions in existing EPContext behavior. No regressions found:
- ✅ Zero production EP code modified - TensorRT, QNN, OpenVINO, NV TensorRT RTX EPs unchanged and still use their own file I/O (std::ifstream)
- ✅ Purely additive changes - New callback APIs are opt-in; existing EPs don't call SessionOptions_GetEpContextConfig/ReadEpContextData/WriteEpContextData
- ✅ Path resolution compatible - New EP API helper
- ✅ Backward compatibility maintained - Default behavior (callbacks=nullptr) falls back to disk I/O; only example_plugin_ep (test code) uses new callbacks

The changes are safe infrastructure additions that don't affect existing EPContext workflows.
auto release_ep_context_config = [factory](OrtEpContextConfig* config_to_release) {
  factory->ep_api.ReleaseEpContextConfig(config_to_release);
};
std::unique_ptr<OrtEpContextConfig, decltype(release_ep_context_config)> ep_context_config{
Would be good to add a C++ version of OrtEpContextConfig so that we don't have to use unique_ptrs.
  }

  std::filesystem::path model_file_path{model_path};
  return model_file_path.parent_path() / data_path;
The function joins a relative file_name to the model directory without validating the result stays within that directory. A malicious model could influence EP-derived filenames (e.g., "../../../etc/passwd") to escape the intended directory. Recommend canonicalizing and verifying containment after path resolution.
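A minimal sketch of the containment check recommended above, assuming C++17 `std::filesystem`. `ResolveWithinModelDir` is a hypothetical helper name, not part of this PR, and the real fix may differ (for example in how it reports the error):

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not from the PR): resolves file_name against the model's
// directory and returns std::nullopt if the result would land outside of it.
std::optional<fs::path> ResolveWithinModelDir(const fs::path& model_path,
                                              const fs::path& file_name) {
  if (file_name.is_absolute()) return std::nullopt;

  fs::path model_dir = model_path.parent_path();
  if (model_dir.empty()) model_dir = ".";  // model given as a bare file name

  std::error_code ec;
  // weakly_canonical resolves symlinks and ".." without requiring the full
  // path to exist yet (the context file may not have been written).
  const fs::path base = fs::weakly_canonical(model_dir, ec);
  if (ec) return std::nullopt;
  const fs::path resolved = fs::weakly_canonical(base / file_name, ec);
  if (ec) return std::nullopt;

  // Containment: every component of base must be a leading component of resolved.
  const auto base_end =
      std::mismatch(base.begin(), base.end(), resolved.begin(), resolved.end()).first;
  if (base_end != base.end()) return std::nullopt;
  return resolved;
}
```

Comparing path components rather than raw strings avoids accepting a sibling directory that merely shares a prefix (e.g. `models_evil` next to `models`).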
  RETURN_IF_ERROR(ep_api.WriteEpContextData(ep_context_config_, ep_ctx.c_str(), graph,
                                            ep_context_data.data(), ep_context_data.size()));
}
attributes[0] = Ort::OpAttr("ep_cache_context", ep_ctx.data(), static_cast<int>(ep_ctx.size()),
only if embed == 0, else raw data
Ort::AllocatorWithDefaultOptions allocator;
void* ep_context_data = nullptr;
size_t ep_context_data_size = 0;
RETURN_IF_ERROR(ep->ep_api.ReadEpContextData(ep->ep_context_config_, ep_cache_context.c_str(), ort_graphs[0],
As discussed in our meeting, it is probably better to give the EP full control of how it calls the application's callback functions. An EP may want to do custom things (e.g., memory-mapping the file) when the application does not specify a callback function for reading, and we don't want ORT to take on that responsibility.
Suggested changes:
- Remove C API OrtEp::ReadEpContextData
- Add C API OrtEpApi::EpContextConfig_GetEpContextDataReadFunc (not sure about name). This API returns the callback function and void* state set by the application. The EP will call the app's callback function directly if it exists. If the app's callback does not exist, then the EP will read the binary data from disk as it normally does.
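To make the suggested control flow concrete, here is a rough sketch of the EP-side logic: call the application's callback if one was registered, otherwise read from disk. All names here (`ReadFunc`, `ReadCallback`, `LoadEpContextData`) are hypothetical stand-ins; the actual `OrtReadEpContextDataFunc` signature and the getter API are defined by the PR and may look different.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical callback shape, used only for illustration.
using ReadFunc = bool (*)(void* state, const char* file_name, std::vector<char>& out);

// What a getter such as the proposed EpContextConfig_GetEpContextDataReadFunc
// might hand back: the app's function pointer plus its opaque state.
struct ReadCallback {
  ReadFunc func = nullptr;
  void* state = nullptr;
};

// The EP's own disk path. An EP could swap this for memory mapping.
bool ReadFromDisk(const std::string& path, std::vector<char>& out) {
  std::ifstream in(path, std::ios::binary);
  if (!in) return false;
  out.assign(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
  return true;
}

// EP-side routing: prefer the app callback, fall back to disk I/O.
bool LoadEpContextData(const ReadCallback& cb, const std::string& file_name,
                       std::vector<char>& out) {
  if (cb.func != nullptr) {
    return cb.func(cb.state, file_name.c_str(), out);
  }
  return ReadFromDisk(file_name, out);
}
```

With this split, ORT only stores and returns the callback; the decision of how to read when no callback is set stays entirely inside the EP.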
std::string ep_ctx = config_.embed_ep_context_in_model ? "binary_data" : fused_node_name + ".ctx";
if (!config_.embed_ep_context_in_model) {
  const std::string ep_context_data = "binary_data";
  RETURN_IF_ERROR(ep_api.WriteEpContextData(ep_context_config_, ep_ctx.c_str(), graph,
For the same reasons in the comment for ReadEpContextData...
Suggested changes:
- Remove C API OrtEp::WriteEpContextData
- Add C API OrtEpApi::EpContextConfig_GetEpContextDataWriteFunc (not sure on name). This API returns the callback function and void* state set by the application. The EP will call the app's callback function directly if it exists. If the app's callback does not exist, then the EP will save the binary data to disk as it normally does.
OrtAllocator* /*allocator*/, void** buffer,
size_t* data_size) {
OrtAllocator* /*allocator*/, void** buffer,
size_t* data_size) {
OrtAllocator* /*allocator*/, void** buffer,
size_t* data_size) {
Pull request overview
This PR adds EPContext binary data I/O callbacks so applications can intercept compiled-model EPContext reads/writes (e.g., for encryption or custom storage), with ORT-provided helpers that route either to callbacks or to a disk fallback.
Changes:
- Introduces public C/C++ APIs for EPContext read/write callbacks and EP-side helper APIs (OrtEpContextConfig, ReadEpContextData, WriteEpContextData).
- Wires the example plugin EP to use the new EPContext I/O helpers for embedded vs external EPContext flows.
- Adds unit and end-to-end tests covering callback invocation, fallback-to-disk behavior, and failure/invalid-argument cases.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| onnxruntime/test/framework/ep_plugin_provider_test.cc | Adds focused unit tests for EPContext data callback + disk fallback routing. |
| onnxruntime/test/autoep/test_execution.cc | Adds end-to-end autoep tests validating external EPContext data read/write callback usage. |
| onnxruntime/test/autoep/library/example_plugin_ep/ep.h | Extends example EP config and constructor to carry EPContext config handle. |
| onnxruntime/test/autoep/library/example_plugin_ep/ep.cc | Uses new EPContext data helpers during compile and EPContext node creation. |
| onnxruntime/test/autoep/library/example_plugin_ep/ep_factory.cc | Extracts EPContext config from session options and passes it to the EP. |
| onnxruntime/core/session/plugin_ep/ep_api.h | Declares new EP API entry points for EPContext config and data I/O. |
| onnxruntime/core/session/plugin_ep/ep_api.cc | Implements config extraction and callback-or-disk EPContext read/write helpers. |
| onnxruntime/core/session/ort_apis.h | Declares new core C API entry for registering EPContext read callback. |
| onnxruntime/core/session/onnxruntime_c_api.cc | Wires the new SessionOptions API into the exported OrtApi table. |
| onnxruntime/core/session/model_compilation_options.h | Adds C++ API for registering EPContext write callback in compilation options. |
| onnxruntime/core/session/model_compilation_options.cc | Stores EPContext write callback in internal model-gen options. |
| onnxruntime/core/session/compile_api.h | Declares compile API entry for setting EPContext data write callback. |
| onnxruntime/core/session/compile_api.cc | Implements compile API plumbing for EPContext data write callback. |
| onnxruntime/core/session/abi_session_options.cc | Implements SessionOptions_SetEpContextDataReadFunc. |
| onnxruntime/core/framework/session_options.h | Stores EPContext read callback + state in SessionOptions. |
| onnxruntime/core/framework/ep_context_options.h | Adds holder for EPContext data write callback in model-gen options. |
| onnxruntime/core/framework/ep_context_options.cc | Implements accessor for EPContext data write callback. |
| include/onnxruntime/core/session/onnxruntime_ep_c_api.h | Adds public EP API surface/docs for EPContext config and data I/O helpers. |
| include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Adds C++ inline wrappers for the new read/write callback setters. |
| include/onnxruntime/core/session/onnxruntime_cxx_api.h | Declares new C++ wrapper methods for EPContext read/write callbacks. |
| include/onnxruntime/core/session/onnxruntime_c_api.h | Adds callback typedefs and the public C API entry for EPContext read callback + compile API write callback. |
/** \brief Registers a callback to provide EPContext binary data during session load.
 *
 * When loading a compiled model with external (non-embedded) EPContext binary data, an execution provider can use
 * OrtEpApi::ReadEpContextData to call this callback instead of reading the binary data from disk.
 *
 * The state pointer is stored as-is and is not owned by ORT. It must remain valid while any session or EP created
 * from these options may call the callback. If the same state may be used by multiple EPs or threads, the application
 * is responsible for synchronization.
 *
 * \param[in] options The OrtSessionOptions instance.
 * \param[in] read_func The OrtReadEpContextDataFunc callback.
 * \param[in] state Opaque state passed to read_func. Can be NULL.
 *
 * \snippet{doc} snippets.dox OrtStatus Return Value
 *
 * \since Version 1.27.
 */
ORT_API2_STATUS(SessionOptions_SetEpContextDataReadFunc, _Inout_ OrtSessionOptions* options,
                _In_ OrtReadEpContextDataFunc read_func, _In_opt_ void* state);
const std::filesystem::path test_dir = std::filesystem::temp_directory_path() / "ort_ep_context_data_test";
std::filesystem::create_directories(test_dir);
const std::filesystem::path data_path = test_dir / "context.bin";
const std::string data_path_utf8 = PathToUTF8String(data_path.native());
auto cleanup = gsl::finally([&]() {
  std::error_code ec;
  std::filesystem::remove(data_path, ec);
  std::filesystem::remove(test_dir, ec);
});
 * @{
 */
ORT_RUNTIME_CLASS(Ep);
ORT_RUNTIME_CLASS(EpContextConfig);
could we also add a C++ API wrapper type for OrtEpContextConfig?
const auto read_status = Env::Default().ReadFileIntoBuffer(
    data_path.native().c_str(), 0, file_size, gsl::make_span(static_cast<char*>(allocated_buffer), file_size));
if (!read_status.IsOK()) {
  allocator->Free(allocator, allocated_buffer);
general - we should wrap the allocated buffer in a RAII type like unique_ptr to avoid leaks if there are exceptions.
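One way this could look, sketched with a stand-in allocator so the example compiles on its own. `FakeAllocator` only mimics the `Free(allocator, ptr)` calling convention seen in the diff; it is not the real `OrtAllocator`, and `AllocatedBuffer` / `AllocateBuffer` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Stand-in for an allocator with C-style function pointers that take the
// allocator itself as the first argument (hypothetical, for illustration).
struct FakeAllocator {
  void* (*Alloc)(FakeAllocator* self, std::size_t size);
  void (*Free)(FakeAllocator* self, void* p);
  int live_allocations = 0;
};

// Deleter that returns the buffer to the allocator it came from.
struct AllocatorDeleter {
  FakeAllocator* allocator;
  void operator()(void* p) const { allocator->Free(allocator, p); }
};

using AllocatedBuffer = std::unique_ptr<void, AllocatorDeleter>;

// Allocates and immediately hands ownership to the RAII wrapper, so early
// returns and exceptions after this point cannot leak the buffer.
AllocatedBuffer AllocateBuffer(FakeAllocator& allocator, std::size_t size) {
  return AllocatedBuffer(allocator.Alloc(&allocator, size), AllocatorDeleter{&allocator});
}

// Test helper: malloc-backed allocator that counts outstanding buffers.
FakeAllocator MakeCountingAllocator() {
  FakeAllocator a{};
  a.Alloc = [](FakeAllocator* self, std::size_t size) -> void* {
    ++self->live_allocations;
    return std::malloc(size);
  };
  a.Free = [](FakeAllocator* self, void* p) {
    --self->live_allocations;
    std::free(p);
  };
  return a;
}
```

On the success path the function would call `release()` on the wrapper and store the raw pointer in the `void** buffer` out-parameter, transferring ownership to the caller; every error path then frees automatically.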
    PathToUTF8String(data_path.native()));

if (buffer_size != 0) {
  ORT_API_RETURN_IF(buffer_size > static_cast<size_t>(std::numeric_limits<std::streamsize>::max()),
is the cast from std::streamsize to size_t safe? not sure if size_t is guaranteed to be as wide as std::streamsize.
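If the relative widths are a concern, one option is to do the comparison in `std::uintmax_t`, which can represent every value of `size_t` and every non-negative value of `std::streamsize`, so neither side can be truncated. `FitsInStreamsize` below is a hypothetical helper name, offered as a sketch rather than the PR's fix:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ios>
#include <limits>

// Returns true if n can be passed as a std::streamsize without overflow.
// Both operands are widened to std::uintmax_t before comparing; the maximum
// of the signed std::streamsize is non-negative, so the conversion is exact.
constexpr bool FitsInStreamsize(std::size_t n) {
  return static_cast<std::uintmax_t>(n) <=
         static_cast<std::uintmax_t>(std::numeric_limits<std::streamsize>::max());
}
```

On common platforms where the two types have the same width, this gives the same answer as the existing check.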
_In_opt_ const OrtEpContextConfig* config,
_In_ const char* file_name,
_In_opt_ const OrtGraph* graph,
_Inout_ OrtAllocator* allocator,
why do we use _Inout_ for allocator?
auto model_compile_options = reinterpret_cast<onnxruntime::ModelCompilationOptions*>(ort_model_compile_options);

if (model_compile_options == nullptr) {
  return OrtApis::CreateStatus(ORT_INVALID_ARGUMENT, "OrtModelCompilationOptions is NULL");
nit: consistent spelling of "null"
  return OrtApis::CreateStatus(ORT_INVALID_ARGUMENT, "OrtModelCompilationOptions is NULL");
  return OrtApis::CreateStatus(ORT_INVALID_ARGUMENT, "OrtModelCompilationOptions is null");
Description
Adds EPContext data I/O callback support for compiled-model flows so applications can intercept external EPContext binary reads and writes for encryption or custom storage.
This PR:
- OrtEpContextConfig and EP API helpers for callback-vs-disk read/write routing

Testing
- onnxruntime_test_all PluginExecutionProviderTest.EpContextData*: 9 passed
- onnxruntime_autoep_test, including example_plugin_ep.dll
- OrtEpLibrary.PluginEp_GenEpContextModel_ExternalDataUsesWriteCallback: passed
- OrtEpLibrary.PluginEp_LoadEpContextModel_ExternalDataUsesReadCallback: passed
- onnxruntime_autoep_test: 59 passed