Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL] Evict program cache on PI_ERROR_OUT_OF_RESOURCES #11987

Merged
merged 8 commits into from
Dec 8, 2023

Conversation

jzc
Copy link
Contributor

@jzc jzc commented Nov 22, 2023

This PR adds resource cleanup mechanisms when PI_ERROR_OUT_OF_RESOURCES occurs on piProgramBuild and piProgramLink. Currently. the whole cache is cleared, but this behavior can be extended in the future. In order to ensure thread safety of the cache clearing mechanism, the maps of the cache now hold their BuildResult's in shared_ptr's, which allows the cache to be cleared even if there are still threads that still hold references to a BuildResult.

@jzc jzc requested a review from a team as a code owner November 22, 2023 23:53
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the approach, but we should probably address the race conditions before we consider merging this. Should it be converted to draft for good measure?

sycl/source/detail/program_manager/program_manager.cpp Outdated Show resolved Hide resolved
@jzc jzc marked this pull request as draft November 27, 2023 17:30
@jzc jzc marked this pull request as ready for review December 5, 2023 15:10
// CHECK-CACHE: piKernelRelease
// CHECK-CACHE: piKernelRelease
// CHECK-CACHE: piProgramRelease
// CHECK-CACHE: piProgramRelease
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cache is destructed in a different order now, which is why these are swapped. Also the previous version of this test only had 2 checks for release of program/kernel even though there were 3 programs/kernels that should have been created, which is why any extra check was added.

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just need to avoid the error, but just to avoid post-commit failure we should fix the unused variable.

@steffenlarsen steffenlarsen merged commit 6cf1ae0 into intel:sycl Dec 8, 2023
11 checks passed
KornevNikita added a commit to KornevNikita/llvm that referenced this pull request Jun 10, 2024
I'm observing cache overflow when running heavy tests on OCL backend
with gpu. Clear cache in case of PI_ERROR_OUT_OF_HOST_MEMORY as well as
for PI_ERROR_OUT_OF_RESOURCES.
Using as reference: intel#11987
dm-vodopyanov pushed a commit that referenced this pull request Jun 13, 2024
I'm observing cache overflow when running heavy tests on OCL backend
with gpu. Clear cache in case of PI_ERROR_OUT_OF_HOST_MEMORY as well as
for PI_ERROR_OUT_OF_RESOURCES.
Using as reference: #11987
ianayl pushed a commit to ianayl/sycl that referenced this pull request Jun 13, 2024
I'm observing cache overflow when running heavy tests on OCL backend
with gpu. Clear cache in case of PI_ERROR_OUT_OF_HOST_MEMORY as well as
for PI_ERROR_OUT_OF_RESOURCES.
Using as reference: intel#11987
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants