Python API for HOST_ACCESSIBLE OrtValue allocation #28038

Draft
ericcraw wants to merge 1 commit into microsoft:main from ericcraw:python-host-accessible-api

ericcraw (Contributor) commented Apr 10, 2026

Description

Adds a memory_info= parameter to OrtValue.ortvalue_from_shape_and_type(), backed by two new C-level factory methods that look up the registered shared allocator using the full OrtMemoryInfo (including mem_type).

This is required because the current shared allocator query doesn't include the memory type, which makes HOST_ACCESSIBLE allocators invisible to Python. GetPyObjFromTensor now uses UsesCpuMemory() so that tensors in HOST_ACCESSIBLE memory are returned as zero-copy numpy views.

Motivation and Context

Enables zero-copy interop between numpy and OrtValue.

This is a follow-up to #28037.

:param memory_info: An OrtMemoryInfo from an OrtEpDevice (e.g. via ep_device.memory_info(OrtDeviceMemoryType.HOST_ACCESSIBLE)). When provided, the allocator matching this memory info is used directly, which allows allocating HOST_ACCESSIBLE memory for zero-copy numpy interop. The device_type, device_id, and vendor_id parameters are ignored when memory_info is provided.
"""

if memory_info is not None:
Member

When memory_info is not None, the other device parameters are silently ignored. The docstring documents this. This is acceptable, but a warnings.warn() or a check that the caller didn't set both memory_info and non-default device params would be more user-friendly.
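
The check suggested above could look roughly like the following. This is a minimal sketch, not code from the PR: the helper name warn_if_device_args_ignored is hypothetical, and the default values ("cpu", 0, -1) are assumptions about the signature of ortvalue_from_shape_and_type() that should be confirmed against the actual method.

```python
import warnings

# Assumed defaults of the existing device parameters (not confirmed by the PR text).
_ASSUMED_DEFAULTS = {"device_type": "cpu", "device_id": 0, "vendor_id": -1}


def warn_if_device_args_ignored(memory_info, device_type="cpu", device_id=0, vendor_id=-1):
    """Hypothetical helper: warn when memory_info overrides explicit device arguments.

    Returns the names of the ignored parameters so callers and tests can inspect them.
    """
    if memory_info is None:
        return []

    passed = {
        "device_type": device_type.lower(),
        "device_id": device_id,
        "vendor_id": vendor_id,
    }
    # A parameter counts as "set by the caller" when it differs from its assumed default.
    ignored = [name for name, value in passed.items() if value != _ASSUMED_DEFAULTS[name]]
    if ignored:
        warnings.warn(
            "memory_info was provided; ignoring " + ", ".join(ignored),
            UserWarning,
            stacklevel=2,
        )
    return ignored
```

Comparing against defaults cannot detect a caller who explicitly passes a default value; a sentinel default would be needed for that, at the cost of changing the public signature.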

@yuslepukhin
Member

There is no Python test exercising the new memory_info= parameter or verifying that HOST_ACCESSIBLE OrtValues produce zero-copy numpy views.

Contributor

Copilot AI left a comment

Pull request overview

Adds Python-level support for allocating OrtValue tensors using an explicit OrtMemoryInfo (including mem_type) so plugin EP HOST_ACCESSIBLE shared allocators can be selected, enabling zero-copy numpy interop for those tensors.

Changes:

  • Update tensor-to-numpy conversion to treat HOST_ACCESSIBLE tensors as CPU-memory-compatible via OrtDevice::UsesCpuMemory().
  • Add new pybind factory methods to allocate OrtValue from shape/type using a full OrtMemoryInfo lookup.
  • Extend OrtValue.ortvalue_from_shape_and_type() Python API with an optional memory_info= parameter to route allocations through those new factories.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| onnxruntime/python/onnxruntime_pybind_state.cc | Enables zero-copy numpy views for HOST_ACCESSIBLE tensors via UsesCpuMemory(). |
| onnxruntime/python/onnxruntime_pybind_ortvalue.cc | Adds OrtMemoryInfo-based OrtValue allocation factories using shared allocator lookup. |
| onnxruntime/python/onnxruntime_inference_collection.py | Exposes memory_info= on OrtValue.ortvalue_from_shape_and_type() and dispatches to new C++ factories. |

Comment on lines +88 to +93
auto& env = GetOrtEnv()->GetEnvironment();
AllocatorPtr allocator = env.GetRegisteredSharedAllocator(memory_info);

if (!allocator) {
  throw std::runtime_error("No shared allocator found for the given OrtMemoryInfo.");
}

Copilot AI Apr 16, 2026

The new OrtValueFromShapeAndTypeWithMemoryInfo throws a generic error when no shared allocator is found. This can be hard to diagnose (e.g., mem_type mismatch between DEFAULT vs HOST_ACCESSIBLE). Consider including key details from the requested memory_info (device type/vendor/id, device mem type, and OrtMemType) in the exception message so callers can see what was looked up.
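
To illustrate why the requested details help, here is a toy Python model of an allocator lookup keyed on the full memory info. It is only a sketch: MemoryInfoKey and find_shared_allocator are hypothetical stand-ins, not ONNX Runtime types, and the real change would be made in the C++ factory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryInfoKey:
    """Hypothetical stand-in for the identifying fields of an OrtMemoryInfo."""

    device_type: str
    vendor_id: int
    device_id: int
    mem_type: str  # e.g. "DEFAULT" or "HOST_ACCESSIBLE"


def find_shared_allocator(registry, key):
    """Return the allocator registered under `key`, or raise with the lookup details."""
    allocator = registry.get(key)
    if allocator is None:
        # Echo every field of the request so a mem_type mismatch is visible at a glance.
        raise RuntimeError(f"No shared allocator found for {key!r}.")
    return allocator


# Only a DEFAULT allocator is registered, so a HOST_ACCESSIBLE request fails.
registry = {MemoryInfoKey("GPU", 0x1234, 0, "DEFAULT"): "device-allocator"}
try:
    find_shared_allocator(registry, MemoryInfoKey("GPU", 0x1234, 0, "HOST_ACCESSIBLE"))
except RuntimeError as err:
    message = str(err)
```

With the request echoed in the message, a caller can tell immediately that the lookup asked for HOST_ACCESSIBLE memory rather than guessing which field did not match.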

Comment on lines +1098 to +1113
if memory_info is not None:
    if isinstance(element_type, int):
        return cls(
            C.OrtValue.ortvalue_from_shape_and_onnx_type_for_memory_info(
                shape,
                element_type,
                memory_info,
            )
        )
    return cls(
        C.OrtValue.ortvalue_from_shape_and_type_for_memory_info(
            shape,
            element_type,
            memory_info,
        )
    )

Copilot AI Apr 16, 2026

The new memory_info allocation path and the UsesCpuMemory() zero-copy numpy conversion path don’t appear to have test coverage. Adding a Python test that allocates an OrtValue using memory_info=ep_device.memory_info(OrtDeviceMemoryType.HOST_ACCESSIBLE) and validates ort_value.numpy() works (and ideally is zero-copy) would protect this behavior and prevent regressions.
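
One possible shape for such a test is sketched below. It assumes that onnxruntime exposes get_ep_devices() and an OrtDeviceMemoryType enum at module level, and that the memory_info= parameter from this PR is available; the class and method names are hypothetical. The test skips itself when no registered EP device offers HOST_ACCESSIBLE memory, which will be the case unless a suitable plugin EP library has been registered.

```python
import importlib.util
import unittest

# Guard the optional dependencies so the module still imports without them.
HAVE_DEPS = all(
    importlib.util.find_spec(name) is not None for name in ("numpy", "onnxruntime")
)


@unittest.skipUnless(HAVE_DEPS, "numpy and onnxruntime are required")
class TestHostAccessibleOrtValue(unittest.TestCase):
    def test_numpy_view_is_zero_copy(self):
        import numpy as np
        import onnxruntime as ort

        # Assumption: the enum is exported at module level under this name.
        mem_type_enum = getattr(ort, "OrtDeviceMemoryType", None)
        if mem_type_enum is None:
            self.skipTest("OrtDeviceMemoryType is not available in this build")

        # Find the first EP device that reports HOST_ACCESSIBLE memory.
        memory_info = None
        for ep_device in ort.get_ep_devices():
            try:
                memory_info = ep_device.memory_info(mem_type_enum.HOST_ACCESSIBLE)
            except Exception:
                memory_info = None
            if memory_info is not None:
                break
        if memory_info is None:
            self.skipTest("no EP device exposes HOST_ACCESSIBLE memory")

        value = ort.OrtValue.ortvalue_from_shape_and_type(
            [2, 3], np.float32, memory_info=memory_info
        )
        view = value.numpy()

        self.assertEqual(view.shape, (2, 3))
        # Zero-copy means the numpy buffer starts at the OrtValue's own data pointer.
        self.assertEqual(view.__array_interface__["data"][0], value.data_ptr())
```

Comparing the numpy buffer address with data_ptr() checks for zero-copy directly; writing through the view and reading the OrtValue back would be a complementary check.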

const auto device_type = device.Type();
// Create an numpy array on top of the OrtValue memory, no copy
if (device_type == OrtDevice::CPU) {
// Create an numpy array on top of the OrtValue memory, no copy.

Copilot AI Apr 16, 2026

Grammar: use "a numpy array" (not "an numpy array").

Suggested change
- // Create an numpy array on top of the OrtValue memory, no copy.
+ // Create a numpy array on top of the OrtValue memory, no copy.
