-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][Bindless][UR][L0][E2E] Fix linear interop memory and L0 V1 adapter leaks. #18353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][Bindless][UR][L0][E2E] Fix linear interop memory and L0 V1 adapter leaks. #18353
Conversation
…L0 V1 adapter leaks. This patch fixes the mapping of interop memory to a linear range represented by a void*. The SYCL Bindless Images spec is clarified with wording to indicate which free APIs should be used on mapped imported memory. Using `sycl::free` for freeing mapped imported linear memory regions does not work for the LevelZero adapter, as the backend needs to call `zeMemFree` directly - it cannot go through the regular `sycl::free` code as that will use UMF free functions by default. This is invalid for memory regions mapped using `map_external_linear_memory`. The `free_mappep_linear_memory` SYCL API has been added to the Bindless Images specification to remedy this problem. This patch also introduces an E2E test which verifies that a Vulkan test buffer's memory can be correctly imported into SYCL and its data retrieved from and written to within a SYCL kernel by using purely USM pointers (i.e. without using bindless images). This is a step towards extending the test coverage of interop functionality currently in the Bindless Images extension, with the goal being at some point splitting the interop functionality into one or more separate extensions. Two other fixes have been made in the LevelZero UR adapter. A leaked `ze_image_handle_t` object has been fixed in the LevelZero V1 adapter. The previous implementation incorrectly reinterpreted bindless memory handles as `ur_mem_handle_t` types. The implementation of `urBindlessImagesImageFreeExp` is now unified between V1 and V2, removing the leak. As such the `level_zero/v2/image.cpp` file has been removed. The `urMemoryHandle` member of `ur_ze_external_memory_data` has been removed. Aside from being incorrectly reinterpreted from `ur_bindless_mem_handle_t`, the LevelZero adapter does not need to track the memory handle for external memory resources.
`free_mapped_linear_memory`. | ||
|
||
Imported external memory handle should be released using | ||
`release_external_memory`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From these, it is unclear to me whether a user needs to call both free_mapped_linear_memory
/free_image_mem
and release_external_memory
or just one. I.e., say I have:
external_mem EM = import_external_memory(...);
void *MEIM = map_external_image_memory(EM, ...);
...
free_image_mem(MEIM, ...);
release_external_memory(EM, ...);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've clarified the wording in the spec. It should hopefully be clear now that the user should first unmap_
the mapped memory, and then call release_external_memory
.
I've also added a paragraph stating that an external_mem
handle can only be mapped to a single image_mem_handle
or void *
at any one time. While this use case should be possible, at least for mapping multiple linear memory regions, it is currently untested, so I am reluctant to state that it is possible (and I do not want to leave this underspecified in the spec). We will need to consider this use case in a future PR, ensure that all backends support what we need (or introduce device queries if necessary), and amend the spec with an accompanying test case.
Memory mapped using `map_external_image_memory` should be freed using | ||
`free_image_mem`. | ||
|
||
Memory mapped using `map_external_linear_memory` should be freed using | ||
`free_mapped_linear_memory`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It irks me that free_image_mem
uses mem
while all these other APIs use memory
. I suppose it's a little late to address that though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is unfortunate. alloc_image_mem
is in the same boat. We should address this in the future, perhaps by deprecating and phasing out the _mem
function naming first. Not something for this PR though.
Memory mapped using `map_external_image_memory` should be freed using | ||
`free_image_mem`. | ||
|
||
Memory mapped using `map_external_linear_memory` should be freed using | ||
`free_mapped_linear_memory`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the intention is that the user is supposed to call release_external_memory
after calling one of these, then I would prefer the following names instead: unmap_external_image_memory
and unmap_external_linear_memory
, where the former would be a new API. It is less overloading of the term "free" and makes a much stronger association with the map_*
APIs they are associated with.
If we want to go that way, I think these deserve a separate section, instead of being bundled together with release_external_memory
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've amended the naming to follow your suggestion. The unmap_
and release_
functions now also have their separate code sections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
8562d66
to
bdfc206
Compare
bdfc206
to
b9e9be7
Compare
Ping @intel/dpcpp-nativecpu-reviewers @intel/unified-runtime-reviewers @intel/unified-runtime-reviewers-level-zero @intel/unified-runtime-reviewers-level-zero A quick review and approval for this PR would be very much appreciated, as getting these changes in to the upcoming DPC++ release will be very beneficial. CI has previously been failing |
I can confirm that my testing code (with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UR + CL adapter LGTM
|
Ping @intel/dpcpp-nativecpu-reviewers @intel/unified-runtime-reviewers-level-zero |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NativeCPU part looks fine to me
This patch fixes the mapping of interop memory to a linear range represented by a
void*
. The SYCL Bindless Images spec is clarified with wording to indicate which free APIs should be used on mapped imported memory.Using
sycl::free
for freeing mapped imported linear memory regions does not work for the LevelZero adapter, as the backend needs to callzeMemFree
directly - it cannot go through the regularsycl::free
code as that will use UMF free functions by default. This is invalid for memory regions mapped usingmap_external_linear_memory
.The
unmap_external_linear_memory
SYCL API has been added to the Bindless Images specification to remedy this problem. Anunmap_external_image_memory
has also been added.This patch also introduces an E2E test which verifies that a Vulkan buffer's memory can be correctly imported into SYCL and its data retrieved from and written to within a SYCL kernel by using purely USM pointers (i.e. without using bindless images). This is a step towards extending the test coverage of interop functionality currently in the Bindless Images extension, with the goal being at some point splitting the interop functionality into one or more separate extensions.
Two other fixes have been made in the LevelZero UR adapter.
A leaked
ze_image_handle_t
object has been fixed in the LevelZero V1 adapter. The previous implementation incorrectly reinterpreted bindless memory handles asur_mem_handle_t
types. The implementation ofurBindlessImagesImageFreeExp
is now unified between V1 and V2, removing the leak. As such thelevel_zero/v2/image.cpp
file has been removed.The
urMemoryHandle
member ofur_ze_external_memory_data
has been removed. Aside from being incorrectly reinterpreted fromur_bindless_mem_handle_t
, the LevelZero adapter does not need to track the memory handle for external memory resources.