Implemented memory pool for SYCL device/host memory #1039

Merged
merged 5 commits into from
Dec 4, 2023

do-jason
Collaborator

@do-jason do-jason commented Nov 30, 2023

Implemented memory pool for SYCL device/host memory. (added src/acc/sycl/sycl_memory_pool.*)

  • Environment variables other than ZE_* and ZEX_* are no longer necessary.
  • Raising the limit with "ulimit -n" is no longer necessary in most cases.

Tuned SYCL kernel workgroup size.

  • This gave a significant improvement in 2D classification performance.

Fixed AccPtr host memory leak in PassWeights.weights. (src/acc/acc_ml_optimiser_impl.h)
Prevented threads of the same MPI rank from being assigned to different SYCL devices in automatic mapping. (ml_optimiser_mpi.cpp and ml_optimiser.cpp line 1740)
Fixed Intel IPP library header file handling for oneAPI 2024 and older releases. (CMakeLists.txt and src/acc/utilities.h)
Minor updates for the SYCL 2020 standard.
Made tab/space usage consistent in the source code under the src/acc/sycl directory.
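
For readers unfamiliar with the idea behind the first item, here is a minimal sketch of a caching memory pool. It is not the code in src/acc/sycl/sycl_memory_pool.*; the class name `CachingPool`, the 256-byte rounding granule, and the use of `std::malloc`/`std::free` as a stand-in for the device/host backend allocator are all assumptions made for illustration. It only shows the general technique: released blocks are kept in per-size free lists and handed out again, so repeated allocations of similar size do not reach the backend each time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <unordered_map>
#include <vector>

// Hypothetical caching pool, for illustration only.
// std::malloc/std::free stand in for the real backend allocator.
class CachingPool {
public:
    CachingPool() = default;
    CachingPool(const CachingPool&) = delete;
    CachingPool& operator=(const CachingPool&) = delete;

    ~CachingPool() {
        // Return cached blocks and any blocks still handed out to the backend.
        for (auto& bucket : free_blocks_)
            for (void* p : bucket.second) std::free(p);
        for (auto& entry : in_use_) std::free(entry.first);
    }

    // Hand out a block of at least `bytes` bytes, reusing a cached one if possible.
    void* allocate(std::size_t bytes) {
        const std::size_t size = round_up(bytes);
        void* p = nullptr;
        auto it = free_blocks_.find(size);
        if (it != free_blocks_.end() && !it->second.empty()) {
            p = it->second.back();      // cache hit: no backend call
            it->second.pop_back();
        } else {
            p = std::malloc(size);      // cache miss: ask the backend
            if (p == nullptr) return nullptr;
            ++backend_calls_;
        }
        in_use_[p] = size;
        return p;
    }

    // Give a block back to the pool; it is cached, not freed.
    void release(void* p) {
        auto it = in_use_.find(p);
        if (it == in_use_.end()) {
            assert(false && "pointer was not allocated by this pool");
            return;
        }
        free_blocks_[it->second].push_back(p);
        in_use_.erase(it);
    }

    // Number of times the backend allocator was actually called.
    std::size_t backend_calls() const { return backend_calls_; }

private:
    // Round sizes up to a fixed granule so nearby sizes share a bucket.
    static std::size_t round_up(std::size_t bytes) {
        constexpr std::size_t kGranule = 256;  // assumed value
        if (bytes == 0) bytes = 1;
        return (bytes + kGranule - 1) / kGranule * kGranule;
    }

    std::map<std::size_t, std::vector<void*>> free_blocks_;  // size -> cached blocks
    std::unordered_map<void*, std::size_t> in_use_;          // block -> rounded size
    std::size_t backend_calls_ = 0;
};
```

In this sketch, allocating 100 bytes, releasing the block, and then allocating 200 bytes returns the same pointer, because both requests round up to the same 256-byte bucket. A real device pool would also need thread safety and a policy for trimming the cache, which are left out here.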

@biochem-fan
Member

Thank you very much for your continuing contributions!

I am a bit busy right now (MicroED and XFEL beam time) but will hopefully try to compile this by next Wednesday. (Or can @scheres test this when you recompile the LMB binary?)

By the way, please don't squash your commits when sending your PR next time so that I can see which change corresponds to which goal.

@do-jason
Collaborator Author

do-jason commented Dec 1, 2023

@biochem-fan I have updated the PR notes with a few more details. Sorry for the single commit containing all the changes. This PR was made from an Intel private branch that has many messy debug commits, so I made a single clean commit for it. I will use multiple commits in future PRs that contain many changes.
Only the change in src/acc/acc_ml_optimiser_impl.h affects all architectures; all the other changes affect the SYCL code path only.

@biochem-fan
Member

I confirmed 6bba191 compiled and worked. Is this ready to be merged, or do you have more commits?

@jonggwan

jonggwan commented Dec 4, 2023

@biochem-fan I have no more planned commits for this PR.

@biochem-fan biochem-fan merged commit e368e1c into ver5.0 Dec 4, 2023
0 of 4 checks passed
@biochem-fan
Member

OK, I merged this. Thanks!
