Commit 081633e

Add EP Device Compatibility APIs (microsoft#26922)
### Description

This PR adds a new C API for querying compatibility between a hardware device and an execution provider (EP), including the reasons for any incompatibility. The API allows applications to determine why a specific EP may not be compatible with a given hardware device before attempting to use it. The EP implementor decides how device compatibility is defined for their stack and communicates this via the possible states in the `OrtDeviceEpIncompatibilityDetails` struct.

New API Functions
- `GetHardwareDeviceEPIncompatibilityReasons` - Query incompatibility reasons between an EP and a hardware device
- `DeviceEpIncompatibilityDetails_GetReasonsBitmask` - Get a bitmask of incompatibility reasons
- `DeviceEpIncompatibilityDetails_GetErrorCode` - Get an EP-specific error code
- `DeviceEpIncompatibilityDetails_GetNotes` - Get optional human-readable notes about incompatibility
- `ReleaseDeviceEpIncompatibilityDetails` - Release the details object

New Types
- `OrtDeviceEpIncompatibilityDetails` - Opaque type holding incompatibility information
- `OrtDeviceEpIncompatibilityReason` - Enum defining standard incompatibility reason flags

Testing
- Added unit tests in `hardware_device_compatibility_test.cc` for the new C API functions
- Added plugin EP compatibility tests in `test_execution.cc`. Tests verify both compatible (CPU device with CPU-supporting EP) and incompatible (GPU device with CPU-only EP) scenarios
- Updated the example plugin EP to implement the compatibility checking interface

### Motivation and Context

Applications using ONNX Runtime need a way to understand why an execution provider cannot run on a particular hardware device. This is especially important for:
- User feedback - Providing meaningful error messages when hardware is incompatible
- EP selection - Choosing the best available EP based on compatibility
- Diagnostics - Understanding driver/dependency requirements

Previously, there was no standardized way to query this information. EPs would simply fail at session creation, model load, or inference time without providing actionable feedback.
1 parent c343143 commit 081633e

17 files changed

Lines changed: 996 additions & 49 deletions

include/onnxruntime/core/session/environment.h

Lines changed: 11 additions & 0 deletions
@@ -146,6 +146,17 @@ class Environment {
     return execution_devices_;
   }
 
+  /// Get hardware device incompatibility details for a specific EP.
+  /// @param ep_name The name of the execution provider to check.
+  /// @param hw The hardware device to check for incompatibility.
+  /// @param details Output: Incompatibility details including reasons for incompatibility if any.
+  /// @returns Status indicating success or failure.
+  Status GetHardwareDeviceEpIncompatibilityDetails(const std::string& ep_name,
+                                                   const OrtHardwareDevice* hw,
+                                                   std::unique_ptr<OrtDeviceEpIncompatibilityDetails>& details) const;
+
+  const std::vector<const OrtHardwareDevice*>& GetSortedOrtHardwareDevices() const;
+
   Status CreateSharedAllocator(const OrtEpDevice& ep_device,
                                OrtDeviceMemoryType mem_type, OrtAllocatorType allocator_type,
                                const OrtKeyValuePairs* allocator_options, OrtAllocator** allocator);

include/onnxruntime/core/session/onnxruntime_c_api.h

Lines changed: 127 additions & 1 deletion
@@ -1,4 +1,4 @@
-// Copyright (c) Microsoft Corporation. All rights reserved.
+// Copyright (c) Microsoft Corporation. All rights reserved.
 // Licensed under the MIT License.
 
 // See docs\c_cxx\README.md on generating the Doxygen documentation from this file
@@ -333,6 +333,7 @@ ORT_RUNTIME_CLASS(ExternalInitializerInfo);
 ORT_RUNTIME_CLASS(ExternalResourceImporter);  // Capability object for external resource import
 ORT_RUNTIME_CLASS(ExternalMemoryHandle);      // EP-imported view of shared external allocation
 ORT_RUNTIME_CLASS(ExternalSemaphoreHandle);   // EP-imported view of shared external semaphore
+ORT_RUNTIME_CLASS(DeviceEpIncompatibilityDetails);
 
 #ifdef _MSC_VER
 typedef _Return_type_success_(return == 0) OrtStatus* OrtStatusPtr;
@@ -510,6 +511,16 @@ typedef enum OrtExecutionProviderDevicePolicy {
   OrtExecutionProviderDevicePolicy_MIN_OVERALL_POWER,
 } OrtExecutionProviderDevicePolicy;
 
+/** \brief Reasons why an execution provider might not be compatible with a device
+ */
+typedef enum OrtDeviceEpIncompatibilityReason {
+  OrtDeviceEpIncompatibility_NONE = 0,
+  OrtDeviceEpIncompatibility_DRIVER_INCOMPATIBLE = 1 << 0,
+  OrtDeviceEpIncompatibility_DEVICE_INCOMPATIBLE = 1 << 1,
+  OrtDeviceEpIncompatibility_MISSING_DEPENDENCY = 1 << 2,
+  OrtDeviceEpIncompatibility_UNKNOWN = 1 << 31
+} OrtDeviceEpIncompatibilityReason;
+
 /** \brief Delegate to allow providing custom OrtEpDevice selection logic
  *
  * This delegate is called by the EP selection code to allow the user to provide custom device selection logic.
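
Because the reason values in the hunk above are bit flags, several of them can be set in one bitmask. A minimal sketch of how an application might decode such a bitmask into readable labels follows. It does not include the ONNX Runtime header: the constants are local mirrors of the enum values shown in the diff (written as unsigned literals so that bit 31 is well defined here), and `DescribeReasons` is a hypothetical helper, not part of the API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Local mirrors of the OrtDeviceEpIncompatibilityReason flag values from the diff.
// Assumption for this sketch: unsigned constants are used instead of the enum itself.
constexpr uint32_t kDriverIncompatible = 1u << 0;
constexpr uint32_t kDeviceIncompatible = 1u << 1;
constexpr uint32_t kMissingDependency = 1u << 2;
constexpr uint32_t kUnknown = 1u << 31;

// Hypothetical helper: returns one label per flag set in the bitmask.
// An empty result means no known incompatibility was reported.
std::vector<std::string> DescribeReasons(uint32_t reasons_bitmask) {
  std::vector<std::string> labels;
  if (reasons_bitmask & kDriverIncompatible) labels.push_back("driver incompatible");
  if (reasons_bitmask & kDeviceIncompatible) labels.push_back("device incompatible");
  if (reasons_bitmask & kMissingDependency) labels.push_back("missing dependency");
  if (reasons_bitmask & kUnknown) labels.push_back("unknown");
  return labels;
}
```

An application could pass the value obtained from `DeviceEpIncompatibilityDetails_GetReasonsBitmask` to a helper like this before building a user-facing message.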
@@ -6784,6 +6795,121 @@ struct OrtApi {
   ORT_API2_STATUS(SessionGetEpDeviceForOutputs, _In_ const OrtSession* session,
                   _Out_writes_(num_outputs) const OrtEpDevice** outputs_ep_devices,
                   _In_ size_t num_outputs);
+  /** \brief Get the number of available hardware devices.
+   *
+   * Returns the count of hardware devices discovered on the system.
+   * Use this to allocate an array before calling GetHardwareDevices().
+   *
+   * \param[in] env The OrtEnv instance where device discovery results are stored.
+   * \param[out] num_devices The number of OrtHardwareDevice instances available.
+   *
+   * \snippet{doc} snippets.dox OrtStatus Return Value
+   *
+   * \since Version 1.24.
+   */
+  ORT_API2_STATUS(GetNumHardwareDevices, _In_ const OrtEnv* env, _Out_ size_t* num_devices);
+
+  /** \brief Get the list of available hardware devices.
+   *
+   * Enumerates hardware devices available on the system.
+   * Populates a user-provided array with pointers to OrtHardwareDevice instances. The caller is responsible
+   * for allocating the array with sufficient space (use GetNumHardwareDevices() to get the count).
+   *
+   * The returned pointers reference internal ORT data structures that are discovered once at process
+   * startup and remain valid for the lifetime of the OrtEnv. The caller does not need to release these
+   * pointers, but should not use them after calling ReleaseEnv().
+   *
+   * \param[in] env The OrtEnv instance where device discovery results are stored.
+   * \param[out] devices User-allocated array to receive pointers to OrtHardwareDevice instances.
+   *                     The array must have space for at least num_devices elements.
+   * \param[in] num_devices The size of the user-allocated devices array.
+   *
+   * \snippet{doc} snippets.dox OrtStatus Return Value
+   *
+   * \since Version 1.24.
+   */
+  ORT_API2_STATUS(GetHardwareDevices, _In_ const OrtEnv* env,
+                  _Out_writes_(num_devices) const OrtHardwareDevice** devices,
+                  _In_ size_t num_devices);
+
+  /** \brief Check for known incompatibility issues between hardware device and a specific execution provider.
+   *
+   * This function checks for known incompatibility issues between the specified hardware device
+   * and a specific execution provider.
+   * If returned incompatibility details have non-zero reasons, it indicates the device is not compatible.
+   * However, if returned detail have reason == 0, it doesn't guarantee 100% compatibility for all models,
+   * as models may have specific requirements.
+   *
+   * Note: This method should only be called when the OrtEnv has been initialized with execution
+   * providers (after RegisterExecutionProviderLibrary is called).
+   *
+   * \param[in] env The OrtEnv instance with registered execution providers.
+   * \param[in] ep_name The name of the execution provider to check. Required and cannot be null or empty.
+   * \param[in] hw The hardware device to check for incompatibility.
+   * \param[out] details Compatibility details including reasons for incompatibility if any.
+   *                     Must be freed with OrtApi::ReleaseDeviceEpIncompatibilityDetails.
+   *
+   * \snippet{doc} snippets.dox OrtStatus Return Value
+   *
+   * \since Version 1.24.
+   */
+  ORT_API2_STATUS(GetHardwareDeviceEpIncompatibilityDetails, _In_ const OrtEnv* env,
+                  _In_ const char* ep_name,
+                  _In_ const OrtHardwareDevice* hw,
+                  _Outptr_ OrtDeviceEpIncompatibilityDetails** details);
+
+  /// \name OrtDeviceEpIncompatibilityDetails
+  /// Accessor functions for device incompatibility details
+  /// @{
+
+  /** \brief Get the incompatibility reasons bitmask from OrtDeviceEpIncompatibilityDetails.
+   *
+   * \param[in] details The OrtDeviceEpIncompatibilityDetails instance to query.
+   * \param[out] reasons_bitmask Pointer to store the bitmask of incompatibility reasons.
+   *
+   * \snippet{doc} snippets.dox OrtStatus Return Value
+   *
+   * \since Version 1.24.
+   */
+  ORT_API2_STATUS(DeviceEpIncompatibilityDetails_GetReasonsBitmask,
+                  _In_ const OrtDeviceEpIncompatibilityDetails* details,
+                  _Out_ uint32_t* reasons_bitmask);
+
+  /** \brief Get the notes from OrtDeviceEpIncompatibilityDetails.
+   *
+   * \param[in] details The OrtDeviceEpIncompatibilityDetails instance to query.
+   * \param[out] notes Pointer to the notes string. May be nullptr if no notes are available.
+   *                   The returned string is owned by the details object and should not be freed.
+   *
+   * \snippet{doc} snippets.dox OrtStatus Return Value
+   *
+   * \since Version 1.24.
+   */
+  ORT_API2_STATUS(DeviceEpIncompatibilityDetails_GetNotes,
+                  _In_ const OrtDeviceEpIncompatibilityDetails* details,
+                  _Outptr_result_maybenull_ const char** notes);
+
+  /** \brief Get the execution provider error code from OrtDeviceEpIncompatibilityDetails.
+   *
+   * This allows Independent Hardware Vendors (IHVs) to define their own error codes
+   * to provide additional details about device incompatibility.
+   *
+   * \param[in] details The OrtDeviceEpIncompatibilityDetails instance to query.
+   * \param[out] error_code Pointer to store the EP-specific error code. A value of 0 indicates no error code was set.
+   *
+   * \snippet{doc} snippets.dox OrtStatus Return Value
+   *
+   * \since Version 1.24.
+   */
+  ORT_API2_STATUS(DeviceEpIncompatibilityDetails_GetErrorCode,
+                  _In_ const OrtDeviceEpIncompatibilityDetails* details,
+                  _Out_ int32_t* error_code);
+
+  /** \brief Release an OrtDeviceEpIncompatibilityDetails instance.
+   *
+   * \since Version 1.24.
+   */
+  ORT_CLASS_RELEASE(DeviceEpIncompatibilityDetails);
 
   /// @}
 };
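
The doc comment for `GetHardwareDeviceEpIncompatibilityDetails` states that a non-zero reasons bitmask means the device is not compatible, while a zero bitmask does not guarantee that every model will run. A minimal sketch of how an application might use that rule for EP selection is shown below. It assumes the application has already collected one bitmask per candidate EP through the accessors above; `EpCheckResult` and `PickFirstCompatibleEp` are hypothetical names introduced for illustration, and the sketch requires C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record: an EP name plus the reasons bitmask the application read
// for it via DeviceEpIncompatibilityDetails_GetReasonsBitmask.
struct EpCheckResult {
  std::string ep_name;
  uint32_t reasons_bitmask;
};

// Returns the first EP, in the caller's preference order, whose bitmask is zero.
// A zero bitmask only means no known incompatibility was reported; session
// creation for a particular model can still fail for model-specific reasons.
std::optional<std::string> PickFirstCompatibleEp(const std::vector<EpCheckResult>& results) {
  for (const auto& result : results) {
    if (result.reasons_bitmask == 0) {
      return result.ep_name;
    }
  }
  return std::nullopt;  // every candidate reported at least one incompatibility reason
}
```

Ordering the candidates by preference before the call keeps the selection policy in the application, with the API only supplying the compatibility signal.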

include/onnxruntime/core/session/onnxruntime_ep_c_api.h

Lines changed: 45 additions & 0 deletions
@@ -1308,6 +1308,27 @@ struct OrtEpApi {
    */
   ORT_API2_STATUS(KernelInfo_GetEp, _In_ const OrtKernelInfo* info, _Outptr_ const OrtEp** ep);
 
+  /** \brief Set the details of an OrtDeviceEpIncompatibilityDetails instance.
+   *
+   * Used by execution provider factories to set incompatibility details in their
+   * GetHardwareDeviceIncompatibilityDetails implementation. ORT creates and initializes the object
+   * before passing it to the EP, so calling this function is optional. The EP uses this function
+   * to set incompatibility information when the device is not compatible.
+   *
+   * \param[in,out] details The OrtDeviceEpIncompatibilityDetails instance to update.
+   * \param[in] reasons_bitmask Bitmask of OrtDeviceEpIncompatibilityReason values. (0 = no incompatibility).
+   * \param[in] error_code Optional EP-specific error code (0 = no error).
+   * \param[in] notes Optional human-readable notes. Can be null.
+   *
+   * \snippet{doc} snippets.dox OrtStatus Return Value
+   *
+   * \since Version 1.24.
+   */
+  ORT_API2_STATUS(DeviceEpIncompatibilityDetails_SetDetails, _Inout_ OrtDeviceEpIncompatibilityDetails* details,
+                  _In_ uint32_t reasons_bitmask,
+                  _In_ int32_t error_code,
+                  _In_opt_z_ const char* notes);
+
   /** \brief Creates an OrtKernelImpl instance for an If operator.
    *
    * Control flow operators require access to ORT session internals to orchestrate subgraph operations.
@@ -1990,6 +2011,30 @@ struct OrtEpFactory {
    */
   ORT_API2_STATUS(SetEnvironmentOptions, _In_ OrtEpFactory* this_ptr, _In_ const OrtKeyValuePairs* options);
 
+  /** \brief Check for known incompatibility reasons between a hardware device and this execution provider.
+   *
+   * This function allows an execution provider to check if a specific hardware device is compatible
+   * with the execution provider. The EP can set specific incompatibility reasons via the
+   * OrtDeviceEpIncompatibilityDetails parameter using OrtEpApi::DeviceEpIncompatibilityDetails_SetDetails.
+   *
+   * \param[in] this_ptr The OrtEpFactory instance.
+   * \param[in] hw The hardware device to check for incompatibility.
+   * \param[in,out] details Pre-allocated incompatibility details object created and initialized by ORT.
+   *                        The EP can use OrtEpApi::DeviceEpIncompatibilityDetails_SetDetails to set
+   *                        incompatibility information. If the device is compatible, the EP can
+   *                        leave the object unchanged (it defaults to no incompatibility).
+   *
+   * \note Implementation of this function is optional.
+   *       If not implemented, ORT will assume the device is compatible with this EP.
+   *
+   * \snippet{doc} snippets.dox OrtStatus Return Value
+   *
+   * \since Version 1.24.
+   */
+  ORT_API2_STATUS(GetHardwareDeviceIncompatibilityDetails, _In_ OrtEpFactory* this_ptr,
+                  _In_ const OrtHardwareDevice* hw,
+                  _Inout_ OrtDeviceEpIncompatibilityDetails* details);
+
   /** \brief Create an OrtExternalResourceImporterImpl for external resource import.
    *
    * This is used to create an external resource importer that enables zero-copy import of
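
The two hunks above describe the EP-side contract: ORT passes in a details object that defaults to "no incompatibility", and the factory only needs to set details when the device is not compatible. The sketch below models that contract with local stand-ins so it can be compiled without ONNX Runtime. `DetailsMirror`, `DeviceKind`, `SetDetails`, `CheckCpuOnlyEp`, the error code `1001`, and the notes string are all assumptions made for illustration; a real factory would receive an `OrtHardwareDevice` and call `OrtEpApi::DeviceEpIncompatibilityDetails_SetDetails`.

```cpp
#include <cstdint>
#include <string>

// Local mirror of the fields in the OrtDeviceEpIncompatibilityDetails struct from the diff.
struct DetailsMirror {
  uint32_t reasons_bitmask{0};
  int32_t error_code{0};
  std::string notes;
};

// Mirror of OrtDeviceEpIncompatibility_DEVICE_INCOMPATIBLE (1 << 1).
constexpr uint32_t kDeviceIncompatible = 1u << 1;

// Hypothetical stand-in for the device type an EP would read from OrtHardwareDevice.
enum class DeviceKind { kCpu, kGpu, kNpu };

// Stand-in for DeviceEpIncompatibilityDetails_SetDetails; notes may be null.
void SetDetails(DetailsMirror& details, uint32_t reasons_bitmask, int32_t error_code,
                const char* notes) {
  details.reasons_bitmask = reasons_bitmask;
  details.error_code = error_code;
  details.notes = (notes != nullptr) ? notes : "";
}

// Hypothetical check for a CPU-only EP: compatible devices leave the defaults untouched.
void CheckCpuOnlyEp(DeviceKind kind, DetailsMirror& details) {
  if (kind == DeviceKind::kCpu) {
    return;  // defaults already mean "no incompatibility"
  }
  SetDetails(details, kDeviceIncompatible, /*error_code=*/1001,
             "This EP only supports CPU devices.");
}
```

This matches the scenarios named in the commit description: a CPU device with a CPU-supporting EP stays at the defaults, and a GPU device with a CPU-only EP reports a device incompatibility.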

onnxruntime/core/session/abi_devices.h

Lines changed: 6 additions & 0 deletions
@@ -75,3 +75,9 @@ struct OrtEpDevice {
   // get/create methods to be as flexible as possible. this helper converts to a non-const factory instance.
   OrtEpFactory* GetMutableFactory() const { return ep_factory; }
 };
+
+struct OrtDeviceEpIncompatibilityDetails {
+  uint32_t reasons_bitmask{0};  // Bitmask of OrtDeviceEpIncompatibilityReason values
+  int32_t error_code{0};        // EP-specific error code (0 = no error)
+  std::string notes;            // Additional human-readable notes
+};
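
The three fields of this struct correspond to the three accessor functions added to `OrtApi`. As a rough illustration of the diagnostics use case, the sketch below combines the three values into a single log line. `FormatDiagnostic` and the EP name used with it are hypothetical; the function takes plain values so it compiles without ONNX Runtime, and it treats `notes` as possibly null, in line with the documented `GetNotes` contract.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: formats the values an application would read through
// GetReasonsBitmask, GetErrorCode and GetNotes. `notes` may be null.
std::string FormatDiagnostic(const std::string& ep_name, uint32_t reasons_bitmask,
                             int32_t error_code, const char* notes) {
  std::ostringstream out;
  out << ep_name << ": ";
  if (reasons_bitmask == 0) {
    out << "no known incompatibility";
    return out.str();
  }
  out << "incompatible (reasons=0x" << std::hex << reasons_bitmask << std::dec << ")";
  if (error_code != 0) {
    out << ", ep_error=" << error_code;  // 0 means the EP set no error code
  }
  if (notes != nullptr && *notes != '\0') {
    out << ", notes: " << notes;
  }
  return out.str();
}
```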
