- ONNX Runtime version >= 1.23.0
- A dynamic/shared EP library that exports the functions CreateEpFactories() and ReleaseEpFactory().
- ORT GPU Python wheel installed.
Please see plugin_ep_inference.py for a full example.
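The version prerequisite above can be checked before any plugin EP call is made. The following is a minimal sketch, not part of plugin_ep_inference.py; the helper names parse_version and meets_minimum are hypothetical and were chosen for illustration only.

```python
import re

# Minimum ONNX Runtime version named in the prerequisites.
MIN_ORT_VERSION = (1, 23, 0)


def parse_version(version: str) -> tuple:
    """Extract the leading numeric 'major.minor[.patch]' part of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(version: str, minimum: tuple = MIN_ORT_VERSION) -> bool:
    """Return True if the version is at least the required minimum."""
    return parse_version(version) >= minimum
```

With ONNX Runtime installed, the check would typically be called as meets_minimum(onnxruntime.__version__).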
- Register plugin EP library with ONNX Runtime
onnxruntime.register_execution_provider_library(ep_registration_name, "plugin_ep.so")
- Find the OrtEpDevice for that EP
ep_devices = onnxruntime.get_ep_devices()
for ep_device in ep_devices:
    if ep_device.ep_name == ep_name:
        target_ep_device = ep_device
- Append the EP to the ORT session options
sess_options = onnxruntime.SessionOptions()
sess_options.add_provider_for_devices([target_ep_device], {})
- Create ORT session with the EP
sess = onnxruntime.InferenceSession("/path/to/model", sess_options=sess_options)
- Run ORT session
res = sess.run([], {input_name: x})
- Unregister plugin EP library
onnxruntime.unregister_execution_provider_library(ep_registration_name)
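The steps above can be combined into one helper that always unregisters the library, even when device lookup or session creation fails. This is a hedged sketch rather than the code in plugin_ep_inference.py: the function names find_ep_device and run_with_plugin_ep are hypothetical, the onnxruntime module is passed in as the ort argument, and the sketch assumes the registration call takes a registration name followed by the library path.

```python
def find_ep_device(ep_devices, ep_name):
    """Return the first device whose ep_name matches, or raise if none does."""
    for ep_device in ep_devices:
        if ep_device.ep_name == ep_name:
            return ep_device
    raise RuntimeError(f"no OrtEpDevice found for EP {ep_name!r}")


def run_with_plugin_ep(ort, registration_name, library_path, ep_name, model_path, feeds):
    """Register a plugin EP, run one inference with it, then unregister.

    `ort` is the imported onnxruntime module; `feeds` maps input names to arrays.
    """
    # Assumption: the call takes (registration_name, library_path).
    ort.register_execution_provider_library(registration_name, library_path)
    try:
        target_ep_device = find_ep_device(ort.get_ep_devices(), ep_name)
        sess_options = ort.SessionOptions()
        sess_options.add_provider_for_devices([target_ep_device], {})
        sess = ort.InferenceSession(model_path, sess_options=sess_options)
        try:
            # Passing None as output names requests all model outputs.
            return sess.run(None, feeds)
        finally:
            # Cautious choice in this sketch: drop the session before the
            # EP library is unregistered.
            del sess
    finally:
        ort.unregister_execution_provider_library(registration_name)
```

Raising when no matching device is found avoids the silent failure mode of the plain loop, where target_ep_device would simply remain undefined.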
The workflow is the same as above, except that steps 2 and 3 (finding the OrtEpDevice and appending the EP to the session options) are replaced by setting the selection policy directly:
sess_options.set_provider_selection_policy(policy)
Available "policy" values:
- onnxruntime.OrtExecutionProviderDevicePolicy_DEFAULT
- onnxruntime.OrtExecutionProviderDevicePolicy_PREFER_CPU
- onnxruntime.OrtExecutionProviderDevicePolicy_PREFER_NPU
- onnxruntime.OrtExecutionProviderDevicePolicy_PREFER_GPU
- onnxruntime.OrtExecutionProviderDevicePolicy_MAX_PERFORMANCE
- onnxruntime.OrtExecutionProviderDevicePolicy_MAX_EFFICIENCY
- onnxruntime.OrtExecutionProviderDevicePolicy_MIN_OVERALL_POWER
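As a rough illustration of the policy-based variant, the session setup could be wrapped as follows. The function name create_session_with_policy is hypothetical, the onnxruntime module is passed in as the ort argument, and the policy argument is assumed to be one of the values listed above. The plugin EP library is still expected to be registered beforehand, as in the first workflow.

```python
def create_session_with_policy(ort, model_path, policy):
    """Create a session whose EP is chosen by a selection policy.

    `ort` is the imported onnxruntime module; `policy` is one of the
    selection policy values listed above.
    """
    sess_options = ort.SessionOptions()
    # Replaces the explicit device lookup and add_provider_for_devices() call.
    sess_options.set_provider_selection_policy(policy)
    return ort.InferenceSession(model_path, sess_options=sess_options)
```

The returned session is then run and released in the same way as in the explicit-device workflow.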
For additional APIs and details on plugin EP usage, see the official documentation: https://onnxruntime.ai/docs/execution-providers/plugin-ep-libraries.html#using-a-plugin-ep-library