Clarify USM context restrictions#9
Conversation
Various clarifications about the meaning of the context that is used to allocate USM memory: * If a kernel dereferences a USM pointer, that USM allocation must have the same context as the queue to which the kernel is submitted. * A USM pointer that is passed to an explicit memory operation command must come from a USM allocation that uses the same context as the queue to which the command is submitted. * A USM pointer that is passed to an explicit memory operation command must be accessible on the queue's device. * For the explicit copy operation commands that take both an accessor and a pointer, clarify that the pointer must be a host pointer. * When allocating USM "host" memory, at least one of the devices in the context must support USM "host" memory. I also cleaned up the descriptions of some of the USM allocation functions by moving some common wording to the corresponding section preamble. Closes KhronosGroup#184
cf49255 to
05da9f0
Compare
bashbaug
left a comment
There was a problem hiding this comment.
Mostly LGTM. I left a few detailed comments but nothing that should hold up this PR. This is a great improvement!
| Each type of USM memory allocation has different rules for where that memory is | ||
| accessible. Attempting to dereference a USM pointer in violation of these | ||
| rules results in undefined behavior. The explicit memory operation | ||
| <<command, commands>> that take USM pointers have a similar restriction, and | ||
| violating these restrictions results in undefined behavior. |
There was a problem hiding this comment.
This paragraph is a little awkward, the last sentence is a little redundant with the preceding paragraph, and it and seems to be missing part of the paragraph it is replacing, namely that some types of USM may not be supported by a device. Consider changing to something like:
| Each type of USM memory allocation has different rules for where that memory is | |
| accessible. Attempting to dereference a USM pointer in violation of these | |
| rules results in undefined behavior. The explicit memory operation | |
| <<command, commands>> that take USM pointers have a similar restriction, and | |
| violating these restrictions results in undefined behavior. | |
| Each type of USM memory allocation has different rules for where that memory is | |
| accessible. Attempting to dereference a USM pointer in violation of these | |
| rules results in undefined behavior. Attempting to access a type of USM memory | |
| that is not supported by the device also results in undefined behavior.. |
There was a problem hiding this comment.
I assume the last sentence in your suggestion is supposed to mean that it is UB to pass a USM pointer to one of the "explicit memory operation" functions unless that pointer is accessible on the device? This doesn't seem very clear, though, because a reader might not think of these functions as "accessing" the memory at that pointer.
How about if the last sentence says:
Passing a USM pointer to one of the explicit memory functions where the pointer is not accessible to the device generally results in undefined behavior. See <subsec:explicitmemory> for the exact rules.
I wanted to be a bit vague here to make it easier in the future to allow cases where (e.g.) the application passes a USM host pointer which is not accessible to the device. I want the description of the functions in section <<subsec:explicitmemory>> to be the single place where these rules are defined.
Note that this paragraph differs from the previous one because the previous paragraph talks about the rules regarding "context". By contrast, this paragraph talks about the rules for "accessibility" of the pointer. In fact, I see this entire paragraph as the replacement of the old sentence "Attempting to access a kind of USM allocation that is not supported by the device results in undefined behavior" because it clarifies what "access" means. I'm saying it means that you cannot dereference the pointer in a kernel and you cannot pass the pointer to the explicit memory operation functions.
There was a problem hiding this comment.
Just to be clear, the sentence from the old description that seems to be missing is:
Attempting to access a kind of USM allocation that is not supported by the device results in undefined behavior.
I read this as saying: If you try to use a type of USM (say, shared USM) and the device does not support that type of USM (say, the device does not have aspect::usm_shared_allocations) then it won't work. I would actually hope this is an error and not undefined behavior, but maybe I'm missing some nuance.
I tried to keep the previous sentence more-or-less unchanged but I'm fine rephrasing it or adding more detail. Note though, I do think this statement should apply to all USM accesses, and not just the "explicit memory functions".
There was a problem hiding this comment.
I would actually hope this is an error and not undefined behavior, but maybe I'm missing some nuance.
I would also like the "explicit memory operation" functions to return an error if the USM pointer has the wrong context or is not accessible on the device. However, I did not want to write this into the spec if it results in significant overhead. Do the underlying OpenCL and Level Zero "copy" APIs return an error in these cases? I think their specs do not specifically say whether this is the case.
In general, I've been leaning towards UB for error conditions in these "clarification" PRs when I'm not sure about the cost of error checking. I figure it's easier to change the spec later to say that an error must be returned, if we decide that it is reasonable.
Note though, I do think this statement should apply to all USM accesses, and not just the "explicit memory functions".
Sure, 100% agree. I think there are only two ways to use a USM pointer: you can dereference it in code, or you can pass it to one of the "explicit memory operation" functions. I tired to address the dereferencing case with this sentence:
Attempting to dereference a USM pointer in violation of these rules results in undefined behavior.
Thinking about it more, maybe it should be clarified a bit to:
Attempting to dereference a USM pointer on the host or on a device in violation of these rules results in undefined behavior.
Note that the actual rules for whether a USM pointer is "accessible" on a device are not clear in this PR (or before this PR). These are the rules in #8 that we are still debating.
| SYCL defines an allocator class named [code]#usm_allocator# that satisfies the | ||
| {cpp} named requirement [code]#Allocator#. The [code]#AllocKind# template | ||
| parameter can be either [code]#usm::alloc::host# or [code]#usm::alloc::shared#, | ||
| causing the allocator to allocate either USM host or USM shared memory. |
There was a problem hiding this comment.
This is admittedly rather pedantic, but do we want to talk about "USM host" and "USM shared" memory as adjectives, or are there other terms we should use instead?
The current SYCL 2020 spec doesn't seem to use either "USM host" or "host USM" as a descriptive term, except in one table heading.
The OpenCL extension spec uses "host USM" in the descriptions for various issues, but the spec text itself uses "Unified Shared Memory allocation returned by clHostMemAllocINTEL".
I think I have a slight preference for "host USM" vs. "USM host", or perhaps rephrasing to something like "host or shared unified shared memory".
There was a problem hiding this comment.
I think we need some term because there are 3 functions that could allocate "host USM" memory as well as the usm_allocator. It would get very verbose to say "Unified Shared Memory allocation returned by malloc_host, aligned_alloc_host, sycl::malloc (when called with usm::alloc::host), or the usm_allocator (when instantiated with usm::alloc::host)"
I don't have a preference between "USM host allocations" vs. "host USM allocations". If we use the latter already in the OpenCL spec, then let's standardize on that.
I think this term will be more clearly specified once we merge the remaining parts of the clarifications I want to do. Note that the internal PR #8 adds new sub-sections named "USM device allocations", "USM host allocations", and "USM shared allocations". I'll change the word order in #8 to say "Device USM allocations", etc.
Do you have any concern if I continue to use the phrase "host USM memory" as a shorthand for "memory in a host USM allocation"?
There was a problem hiding this comment.
Do you have any concern if I continue to use the phrase "host USM memory" as a shorthand for "memory in a host USM allocation"?
No concern, this sounds good, and I also prefer "host USM" over "USM host".
I have a slight preference for "host USM allocations" or even just "host USM" instead of "host USM memory", just because the M in USM stands for "memory", but that's awfully pedantic.
There was a problem hiding this comment.
I reversed the word order in terms like "USM host allocation" to be "host USM allocation". I also changed occurrences like "USM memory" to avoid the redundant "memory". These were done in bcff418
| a@ Returns a pointer to the newly allocated shared memory, which is allocated | ||
| on [code]#syclDevice#. This allocation is specified in bytes. Throws a |
There was a problem hiding this comment.
Pedantic: the allocated shared memory is not necessarily allocated on syclDevice, though it is associated with syclDevice.
| [code]#dest# must be a host pointer and must have at least | ||
| as many bytes as the range accessed by [code]#src#. |
There was a problem hiding this comment.
This is OK and it matches the intro paragraph to this section, but this is very restrictive and there are likely to be at least some cases where dest can be a pointer to a USM allocation. It might be better to leave this section ambiguous until we can work out the details, or adopting text similar to memcpy.
(Same goes for the other copy cases below.)
There was a problem hiding this comment.
All the "copy" APIs that take an accessor were originally intended to copy between the host and a device. I do not think the committee intended to expand these functions when it added USM. Even if we wanted to expand them, I think it should not be done as part of this PR. Therefore, I do not want this PR to expand the definition to say that src can be a USM pointer that is accessible on the handler's device.
My point in the meeting yesterday was that src should be allowed to be any pointer that is accessible on the host. This would include, for example, any USM host or shared pointer. However, since we agree that we are not ready for this, I left the wording vague by saying it must be a "host pointer".
There was a problem hiding this comment.
I'm OK with this for now, and it is easier to relax restrictions than it is to add them, but I do suspect we have code in the wild that is copying from an accessor to a USM allocation and therefore violating the updated description.
There was a problem hiding this comment.
Do you think the code "in the wild" is trying to copy between an accessor and a Device USM pointer (i.e. some USM memory that is not accessible on the host)? This is the case that I think the committee did not intend to allow.
All other cases would be allowed by the wording I have in this PR if we interpret "host pointer" to mean "a pointer that is accessible on the host".
There was a problem hiding this comment.
I suspect the most common case by far will be copying to a host USM allocation.
There was a problem hiding this comment.
No change here ... I think we agreed that the current wording is OK for now.
Update this paragraph to reflect code review comments.
* Use terms "device USM allocation", etc. instead of "USM device allocation". This matches existing wording in the OpenCL extension. * Avoid the term "USM memory" since the word "memory" is redundant here. * A shared USM allocation is not "allocated on" a device. It is "associated with" a device.
Various clarifications about the meaning of the context that is used
to allocate USM memory:
If a kernel dereferences a USM pointer, that USM allocation must have
the same context as the queue to which the kernel is submitted.
A USM pointer that is passed to an explicit memory operation command
must come from a USM allocation that uses the same context as the
queue to which the command is submitted.
A USM pointer that is passed to an explicit memory operation command
must be accessible on the queue's device.
For the explicit copy operation commands that take both an accessor
and a pointer, clarify that the pointer must be a host pointer.
When allocating USM "host" memory, at least one of the devices in
the context must support USM "host" memory.
I also cleaned up the descriptions of some of the USM allocation
functions by moving some common wording to the corresponding section
preamble.
Closes KhronosGroup#184