
Add EPContext data I/O callbacks #28529

Open
GopalakrishnanN wants to merge 10 commits into main from gokrishnan/CryptoSupportChatGPT

Conversation

@GopalakrishnanN
Contributor

Description

Adds EPContext data I/O callback support for compiled-model flows so applications can intercept external EPContext binary reads and writes for encryption or custom storage.

This PR:

  • adds public read/write callback hooks for EPContext data
  • adds OrtEpContextConfig and EP API helpers for callback-vs-disk read/write routing
  • wires the example plugin EP through the new helpers
  • expands focused and end-to-end test coverage for callback, fallback, error, empty-payload, and invalid-argument cases
  • documents callback lifetime, synchronous invocation, write completion, and thread-safety semantics

Testing

  • Built onnxruntime_test_all
  • Ran PluginExecutionProviderTest.EpContextData*: 9 passed
  • Built shared onnxruntime_autoep_test, including example_plugin_ep.dll
  • Ran OrtEpLibrary.PluginEp_GenEpContextModel_ExternalDataUsesWriteCallback: passed
  • Ran OrtEpLibrary.PluginEp_LoadEpContextModel_ExternalDataUsesReadCallback: passed
  • Ran full onnxruntime_autoep_test: 59 passed

@GopalakrishnanN force-pushed the gokrishnan/CryptoSupportChatGPT branch from 2ec23e2 to ff7b449 on May 16, 2026 00:25
Contributor

@github-actions github-actions Bot left a comment

You can commit the suggested changes from lintrunner.

Comment thread include/onnxruntime/core/session/onnxruntime_c_api.h Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_c_api.h Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_c_api.h Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_cxx_inline.h Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_ep_c_api.h Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_ep_c_api.h Outdated
Comment thread include/onnxruntime/core/session/onnxruntime_ep_c_api.h Outdated
Comment thread onnxruntime/test/autoep/library/example_plugin_ep/ep_factory.cc Outdated
Comment thread onnxruntime/test/framework/ep_plugin_provider_test.cc Outdated
Comment thread onnxruntime/test/framework/ep_plugin_provider_test.cc Outdated
@GopalakrishnanN
Contributor Author

@copilot review the EPContext data I/O callback changes for API design, memory safety, and test coverage.

@GopalakrishnanN
Contributor Author

@copilot review this PR and focus on possible regressions in existing EPContext behavior.

Copilot AI and others added 5 commits May 19, 2026 01:43
Agent-Logs-Url: https://github.com/microsoft/onnxruntime/sessions/6c003fbe-7b15-4f84-ae6a-62d265448bae

Co-authored-by: GopalakrishnanN <22121842+GopalakrishnanN@users.noreply.github.com>
Contributor

Copilot AI commented May 19, 2026

@copilot review the EPContext data I/O callback changes for API design, memory safety, and test coverage.

Reviewed and tightened the EPContext callback path for API/memory safety and coverage. I fixed ReadEpContextData so that a failed read callback now clears the output parameters and frees any callback-allocated buffer before returning an error, and added a regression test for allocate-then-fail callback behavior (plus focused PluginExecutionProviderTest.EpContextData* validation). Commit: c47e4f7.
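
The cleanup behavior described in this comment can be sketched roughly as follows. This is an illustration only, not the code from commit c47e4f7: `MockAllocator`, `ReadFn`, `ReadWithCleanup`, and `AllocateThenFail` are hypothetical stand-ins, and the real ORT allocator and callback signatures may differ.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an allocator handle; not the real OrtAllocator.
struct MockAllocator {
  int live_allocations = 0;

  void* Alloc(std::size_t size) {
    void* p = std::malloc(size);
    if (p != nullptr) ++live_allocations;
    return p;
  }

  void Free(void* p) {
    if (p != nullptr) {
      --live_allocations;
      std::free(p);
    }
  }
};

// Hypothetical read callback shape: returns true on success.
using ReadFn = bool (*)(MockAllocator& allocator, void** buffer, std::size_t* size);

// Invokes the callback and guarantees that, on failure, the caller never sees
// a dangling pointer or a stale size.
bool ReadWithCleanup(ReadFn read_fn, MockAllocator& allocator, void** buffer, std::size_t* size) {
  *buffer = nullptr;
  *size = 0;
  if (read_fn(allocator, buffer, size)) {
    return true;
  }
  // The callback may have allocated before failing: free and reset the outputs.
  allocator.Free(*buffer);
  *buffer = nullptr;
  *size = 0;
  return false;
}

// A callback that allocates a buffer and then reports failure.
bool AllocateThenFail(MockAllocator& allocator, void** buffer, std::size_t* size) {
  *buffer = allocator.Alloc(16);
  *size = 16;
  return false;
}
```

Calling `ReadWithCleanup(AllocateThenFail, ...)` in this sketch leaves the output pointer null, the size zero, and no live allocations, which is the allocate-then-fail case the regression test is said to cover.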

Contributor

Copilot AI commented May 19, 2026

@copilot review this PR and focus on possible regressions in existing EPContext behavior.

Analyzed for regressions in existing EPContext behavior. No regressions found:

  • Zero production EP code modified: the TensorRT, QNN, OpenVINO, and NV TensorRT RTX EPs are unchanged and still use their own file I/O (std::ifstream).
  • Purely additive changes: the new callback APIs are opt-in; existing EPs don't call SessionOptions_GetEpContextConfig, ReadEpContextData, or WriteEpContextData.
  • Path resolution compatible: the new EP API helper ResolveEpContextDataPath() uses the same logic as the TensorRT EP, resolving relative paths against the context model's parent directory (model_file_path.parent_path() / data_path).
  • Backward compatibility maintained: the default behavior (callbacks=nullptr) falls back to disk I/O; only example_plugin_ep (test code) uses the new callbacks.

The changes are safe infrastructure additions that don't affect existing EPContext workflows.

Comment thread include/onnxruntime/core/session/onnxruntime_c_api.h
auto release_ep_context_config = [factory](OrtEpContextConfig* config_to_release) {
  factory->ep_api.ReleaseEpContextConfig(config_to_release);
};
std::unique_ptr<OrtEpContextConfig, decltype(release_ep_context_config)> ep_context_config{
Contributor

Would be good to add a C++ version of OrtEpContextConfig so that we don't have to use unique_ptrs.
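
One possible shape for such a wrapper is a small move-only owner type. The sketch below is not the ORT C++ API: `MockEpContextConfig`, `CreateMockConfig`, `ReleaseMockConfig`, and `EpContextConfigHandle` are hypothetical names used only to show the ownership pattern.

```cpp
#include <utility>

// Hypothetical stand-ins for the opaque C handle and its create/release functions.
struct MockEpContextConfig {};
static int g_release_count = 0;  // counts releases, for illustration only

MockEpContextConfig* CreateMockConfig() { return new MockEpContextConfig{}; }

void ReleaseMockConfig(MockEpContextConfig* config) {
  if (config != nullptr) {
    ++g_release_count;
    delete config;
  }
}

// Move-only owner that releases the handle in its destructor, so call sites
// do not need a unique_ptr with a custom lambda deleter.
class EpContextConfigHandle {
 public:
  explicit EpContextConfigHandle(MockEpContextConfig* config = nullptr) noexcept : config_(config) {}
  ~EpContextConfigHandle() { ReleaseMockConfig(config_); }

  EpContextConfigHandle(const EpContextConfigHandle&) = delete;
  EpContextConfigHandle& operator=(const EpContextConfigHandle&) = delete;

  EpContextConfigHandle(EpContextConfigHandle&& other) noexcept
      : config_(std::exchange(other.config_, nullptr)) {}

  EpContextConfigHandle& operator=(EpContextConfigHandle&& other) noexcept {
    if (this != &other) {
      ReleaseMockConfig(config_);
      config_ = std::exchange(other.config_, nullptr);
    }
    return *this;
  }

  MockEpContextConfig* get() const noexcept { return config_; }

 private:
  MockEpContextConfig* config_;
};
```

With a type like this, the factory code could hold the config by value and pass `get()` to C API calls, and the release would happen exactly once even after moves.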

}

std::filesystem::path model_file_path{model_path};
return model_file_path.parent_path() / data_path;
Contributor

The function joins a relative file_name to the model directory without validating the result stays within that directory. A malicious model could influence EP-derived filenames (e.g., "../../../etc/passwd") to escape the intended directory. Recommend canonicalizing and verifying containment after path resolution.
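
A containment check along these lines could look like the sketch below. It assumes C++17 `std::filesystem`; `ResolveContained` is a hypothetical helper name, not the PR's ResolveEpContextDataPath, and the policy of rejecting the directory itself is an assumption made for illustration.

```cpp
#include <filesystem>
#include <optional>
#include <system_error>

namespace fs = std::filesystem;

// Resolves data_path against the model's directory and returns the result only
// if it stays strictly inside that directory. Hypothetical helper.
std::optional<fs::path> ResolveContained(const fs::path& model_path, const fs::path& data_path) {
  std::error_code ec;
  const fs::path model_abs = fs::absolute(model_path, ec);
  if (ec) return std::nullopt;

  // weakly_canonical resolves symlinks and ".." even when the tail of the path does not exist yet.
  const fs::path base = fs::weakly_canonical(model_abs.parent_path(), ec);
  if (ec) return std::nullopt;
  const fs::path resolved = fs::weakly_canonical(base / data_path, ec);
  if (ec) return std::nullopt;

  // lexically_relative is empty when no relative path exists, "." when the two
  // paths are equal, and starts with ".." when resolved lies outside base.
  const fs::path rel = resolved.lexically_relative(base);
  if (rel.empty() || rel == fs::path(".") || *rel.begin() == fs::path("..")) {
    return std::nullopt;
  }
  return resolved;
}
```

In this sketch a name such as "node0.ctx" or "sub/node0.ctx" resolves normally, while "../../../etc/passwd" or an absolute path outside the model directory yields `std::nullopt`. A check like this does not by itself protect against a symlink being swapped in between the check and the later file open.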

RETURN_IF_ERROR(ep_api.WriteEpContextData(ep_context_config_, ep_ctx.c_str(), graph,
ep_context_data.data(), ep_context_data.size()));
}
attributes[0] = Ort::OpAttr("ep_cache_context", ep_ctx.data(), static_cast<int>(ep_ctx.size()),
Contributor

only if embed == 0, else raw data

Ort::AllocatorWithDefaultOptions allocator;
void* ep_context_data = nullptr;
size_t ep_context_data_size = 0;
RETURN_IF_ERROR(ep->ep_api.ReadEpContextData(ep->ep_context_config_, ep_cache_context.c_str(), ort_graphs[0],
Contributor

As discussed in our meeting, it is probably better to give the EP full control over how it calls the application's callback functions. When the application does not specify a read callback, an EP may want to do something custom (e.g., memory-map the file), and we don't want ORT to take on that responsibility.

Suggested changes:

  • Remove C API OrtEpApi::ReadEpContextData
  • Add C API OrtEpApi::EpContextConfig_GetEpContextDataReadFunc (not sure about name). This API returns the callback function and void* state set by the application. The EP will call the app's callback function directly if it exists. If the app's callback does not exist, then the EP will read the binary data from disk as it normally does.
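
The control flow suggested here, seen from the EP side, might look like the following sketch. All types and names in it (`ReadFunc`, `ReadCallback`, `LoadEpContextData`, `ServeFromMemory`) are hypothetical; in particular, the callback signature is an assumption and not the one defined in the PR.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical callback shape: fills `out` with the bytes for `file_name`.
using ReadFunc = bool (*)(void* state, const char* file_name, std::vector<char>& out);

// Hypothetical result of a getter such as the proposed
// EpContextConfig_GetEpContextDataReadFunc: the function plus the app's state.
struct ReadCallback {
  ReadFunc func = nullptr;
  void* state = nullptr;
};

// EP-side routine: prefer the application's callback; otherwise read from disk
// however the EP likes (a plain file read here; an EP could memory-map instead).
bool LoadEpContextData(const ReadCallback& cb, const std::string& file_name, std::vector<char>& out) {
  if (cb.func != nullptr) {
    return cb.func(cb.state, file_name.c_str(), out);
  }
  std::ifstream file(file_name, std::ios::binary);
  if (!file) return false;
  out.assign(std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>());
  return true;
}

// Example application callback: serves bytes from memory and counts calls.
bool ServeFromMemory(void* state, const char* /*file_name*/, std::vector<char>& out) {
  ++*static_cast<int*>(state);
  out = {'c', 't', 'x'};
  return true;
}
```

The design point is that ORT only hands the function pointer and state to the EP; whether and how to fall back to disk stays entirely in EP code.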

std::string ep_ctx = config_.embed_ep_context_in_model ? "binary_data" : fused_node_name + ".ctx";
if (!config_.embed_ep_context_in_model) {
  const std::string ep_context_data = "binary_data";
  RETURN_IF_ERROR(ep_api.WriteEpContextData(ep_context_config_, ep_ctx.c_str(), graph,
Contributor

For the same reasons given in the comment on ReadEpContextData:

Suggested changes:

  • Remove C API OrtEpApi::WriteEpContextData
  • Add C API OrtEpApi::EpContextConfig_GetEpContextDataWriteFunc (not sure on name). This API returns the callback function and void* state set by the application. The EP will call the app's callback function directly if it exists. If the app's callback does not exist, then the EP will save the binary data to disk as it normally does.

Contributor

@github-actions github-actions Bot left a comment

You can commit the suggested changes from lintrunner.

Comment on lines +136 to +137
OrtAllocator* /*allocator*/, void** buffer,
size_t* data_size) {
Contributor

Suggested change
OrtAllocator* /*allocator*/, void** buffer,
size_t* data_size) {
OrtAllocator* /*allocator*/, void** buffer,
size_t* data_size) {

Contributor

Copilot AI left a comment

Pull request overview

This PR adds EPContext binary data I/O callbacks so applications can intercept compiled-model EPContext reads/writes (e.g., for encryption or custom storage), with ORT-provided helpers that route either to callbacks or to a disk fallback.

Changes:

  • Introduces public C/C++ APIs for EPContext read/write callbacks and EP-side helper APIs (OrtEpContextConfig, ReadEpContextData, WriteEpContextData).
  • Wires the example plugin EP to use the new EPContext I/O helpers for embedded vs external EPContext flows.
  • Adds unit and end-to-end tests covering callback invocation, fallback-to-disk behavior, and failure/invalid-argument cases.

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| onnxruntime/test/framework/ep_plugin_provider_test.cc | Adds focused unit tests for EPContext data callback + disk fallback routing. |
| onnxruntime/test/autoep/test_execution.cc | Adds end-to-end autoep tests validating external EPContext data read/write callback usage. |
| onnxruntime/test/autoep/library/example_plugin_ep/ep.h | Extends example EP config and constructor to carry EPContext config handle. |
| onnxruntime/test/autoep/library/example_plugin_ep/ep.cc | Uses new EPContext data helpers during compile and EPContext node creation. |
| onnxruntime/test/autoep/library/example_plugin_ep/ep_factory.cc | Extracts EPContext config from session options and passes it to the EP. |
| onnxruntime/core/session/plugin_ep/ep_api.h | Declares new EP API entry points for EPContext config and data I/O. |
| onnxruntime/core/session/plugin_ep/ep_api.cc | Implements config extraction and callback-or-disk EPContext read/write helpers. |
| onnxruntime/core/session/ort_apis.h | Declares new core C API entry for registering EPContext read callback. |
| onnxruntime/core/session/onnxruntime_c_api.cc | Wires the new SessionOptions API into the exported OrtApi table. |
| onnxruntime/core/session/model_compilation_options.h | Adds C++ API for registering EPContext write callback in compilation options. |
| onnxruntime/core/session/model_compilation_options.cc | Stores EPContext write callback in internal model-gen options. |
| onnxruntime/core/session/compile_api.h | Declares compile API entry for setting EPContext data write callback. |
| onnxruntime/core/session/compile_api.cc | Implements compile API plumbing for EPContext data write callback. |
| onnxruntime/core/session/abi_session_options.cc | Implements SessionOptions_SetEpContextDataReadFunc. |
| onnxruntime/core/framework/session_options.h | Stores EPContext read callback + state in SessionOptions. |
| onnxruntime/core/framework/ep_context_options.h | Adds holder for EPContext data write callback in model-gen options. |
| onnxruntime/core/framework/ep_context_options.cc | Implements accessor for EPContext data write callback. |
| include/onnxruntime/core/session/onnxruntime_ep_c_api.h | Adds public EP API surface/docs for EPContext config and data I/O helpers. |
| include/onnxruntime/core/session/onnxruntime_cxx_inline.h | Adds C++ inline wrappers for the new read/write callback setters. |
| include/onnxruntime/core/session/onnxruntime_cxx_api.h | Declares new C++ wrapper methods for EPContext read/write callbacks. |
| include/onnxruntime/core/session/onnxruntime_c_api.h | Adds callback typedefs and the public C API entry for EPContext read callback + compile API write callback. |


Comment on lines +7531 to +7549
/** \brief Registers a callback to provide EPContext binary data during session load.
 *
 * When loading a compiled model with external (non-embedded) EPContext binary data, an execution provider can use
 * OrtEpApi::ReadEpContextData to call this callback instead of reading the binary data from disk.
 *
 * The state pointer is stored as-is and is not owned by ORT. It must remain valid while any session or EP created
 * from these options may call the callback. If the same state may be used by multiple EPs or threads, the application
 * is responsible for synchronization.
 *
 * \param[in] options The OrtSessionOptions instance.
 * \param[in] read_func The OrtReadEpContextDataFunc callback.
 * \param[in] state Opaque state passed to read_func. Can be NULL.
 *
 * \snippet{doc} snippets.dox OrtStatus Return Value
 *
 * \since Version 1.27.
 */
ORT_API2_STATUS(SessionOptions_SetEpContextDataReadFunc, _Inout_ OrtSessionOptions* options,
                _In_ OrtReadEpContextDataFunc read_func, _In_opt_ void* state);
Comment thread onnxruntime/core/session/onnxruntime_c_api.cc
Comment on lines +1750 to +1758
const std::filesystem::path test_dir = std::filesystem::temp_directory_path() / "ort_ep_context_data_test";
std::filesystem::create_directories(test_dir);
const std::filesystem::path data_path = test_dir / "context.bin";
const std::string data_path_utf8 = PathToUTF8String(data_path.native());
auto cleanup = gsl::finally([&]() {
  std::error_code ec;
  std::filesystem::remove(data_path, ec);
  std::filesystem::remove(test_dir, ec);
});
@GopalakrishnanN requested a review from edgchen1 on May 21, 2026 18:50
* @{
*/
ORT_RUNTIME_CLASS(Ep);
ORT_RUNTIME_CLASS(EpContextConfig);
Contributor

could we also add a C++ API wrapper type for OrtEpContextConfig?

const auto read_status = Env::Default().ReadFileIntoBuffer(
    data_path.native().c_str(), 0, file_size, gsl::make_span(static_cast<char*>(allocated_buffer), file_size));
if (!read_status.IsOK()) {
  allocator->Free(allocator, allocated_buffer);
Contributor

General: we should wrap the allocated buffer in an RAII type such as unique_ptr to avoid leaks if an exception is thrown.
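
One way to do that is a `std::unique_ptr` whose deleter returns the buffer to the allocator it came from. The sketch below uses a hypothetical `MockAllocator` that only mirrors the `Free(allocator, ptr)` calling shape seen in the snippet above; it is not the real OrtAllocator, and `AllocatorDeleter`, `BufferPtr`, `AllocateBuffer`, and `ReadOrThrow` are illustrative names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <stdexcept>

// Hypothetical C-style allocator with function-pointer members.
struct MockAllocator {
  void* (*Alloc)(MockAllocator* self, std::size_t size);
  void (*Free)(MockAllocator* self, void* p);
  int live = 0;  // number of outstanding allocations, for illustration
};

void* MockAlloc(MockAllocator* self, std::size_t size) {
  void* p = std::malloc(size);
  if (p != nullptr) ++self->live;
  return p;
}

void MockFree(MockAllocator* self, void* p) {
  if (p != nullptr) {
    --self->live;
    std::free(p);
  }
}

// Deleter that hands the buffer back to its allocator.
struct AllocatorDeleter {
  MockAllocator* allocator;
  void operator()(void* p) const { allocator->Free(allocator, p); }
};
using BufferPtr = std::unique_ptr<void, AllocatorDeleter>;

BufferPtr AllocateBuffer(MockAllocator* allocator, std::size_t size) {
  return BufferPtr{allocator->Alloc(allocator, size), AllocatorDeleter{allocator}};
}

// Simulates a read that may throw after allocation; the buffer is freed on the
// exception path without any explicit cleanup code.
void* ReadOrThrow(MockAllocator* allocator, bool fail) {
  BufferPtr buffer = AllocateBuffer(allocator, 32);
  if (fail) throw std::runtime_error("read failed");
  return buffer.release();  // success: ownership passes to the caller
}
```

On the success path the function calls `release()` so the raw pointer can be handed out through a C-style output parameter; on every other path the deleter runs automatically.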

PathToUTF8String(data_path.native()));

if (buffer_size != 0) {
ORT_API_RETURN_IF(buffer_size > static_cast<size_t>(std::numeric_limits<std::streamsize>::max()),
Contributor

Is the cast from std::streamsize to size_t safe? I'm not sure size_t is guaranteed to be as wide as std::streamsize.
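
One way to sidestep the width question is to widen both operands to `std::uintmax_t` before comparing, so neither conversion can truncate. The helper name `FitsInStreamsize` below is hypothetical and the snippet is only a sketch of the comparison, not a proposed change to the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <ios>
#include <limits>

// True when `size` can be represented as a std::streamsize. Both sides are
// widened to std::uintmax_t, which can hold every size_t value and the
// (non-negative) maximum of std::streamsize, whatever their relative widths.
constexpr bool FitsInStreamsize(std::size_t size) {
  return static_cast<std::uintmax_t>(size) <=
         static_cast<std::uintmax_t>(std::numeric_limits<std::streamsize>::max());
}
```

A guard such as `if (!FitsInStreamsize(buffer_size)) { /* report error */ }` would then express the same intent as the original check without relying on size_t being at least as wide as std::streamsize.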

_In_opt_ const OrtEpContextConfig* config,
_In_ const char* file_name,
_In_opt_ const OrtGraph* graph,
_Inout_ OrtAllocator* allocator,
Contributor

why do we use _Inout_ for allocator?

auto model_compile_options = reinterpret_cast<onnxruntime::ModelCompilationOptions*>(ort_model_compile_options);

if (model_compile_options == nullptr) {
return OrtApis::CreateStatus(ORT_INVALID_ARGUMENT, "OrtModelCompilationOptions is NULL");
Contributor

nit: consistent spelling of "null"

Suggested change
return OrtApis::CreateStatus(ORT_INVALID_ARGUMENT, "OrtModelCompilationOptions is NULL");
return OrtApis::CreateStatus(ORT_INVALID_ARGUMENT, "OrtModelCompilationOptions is null");

Comment thread onnxruntime/core/session/onnxruntime_c_api.cc
5 participants