GPU Device Variable on Intel GPUs #4056

Merged
merged 4 commits into from
Aug 13, 2024


WeiqunZhang
Member

This adds GPU device variable support on Intel GPUs using Intel oneAPI compiler's experimental feature.

To make the user interface consistent, we have added a macro, AMREX_DEVICE_GLOBAL_VARIABLE. For example, the user can define a device variable as follows for all GPUs and CPUs.

AMREX_DEVICE_GLOBAL_VARIABLE(amrex::Real, my_dg1);  // amrex::Real my_dg1;
AMREX_DEVICE_GLOBAL_VARIABLE(amrex::Real, 4, my_dg2);  // amrex::Real my_dg2[4];

Below are their declarations.

extern AMREX_DEVICE_GLOBAL_VARIABLE(amrex::Real, my_dg1);
extern AMREX_DEVICE_GLOBAL_VARIABLE(amrex::Real, 4, my_dg2);

GPU and CPU kernels can use the global variables if they see the declarations.
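
As an illustration of the two-argument and three-argument forms, here is a minimal host-only sketch of how such a macro could dispatch on its argument count. It is not AMReX's actual definition, which also has to handle the GPU backends; the `DEMO_*` macros, the `demo_dg1` and `demo_dg2` variables, and `demo_weighted` are all hypothetical names, and `double` stands in for `amrex::Real`. The argument-counting trick assumes a GCC- or Clang-style preprocessor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-only stand-in, NOT the AMReX macro.
#define DEMO_DGV_SCALAR(T, name)   T name
#define DEMO_DGV_ARRAY(T, n, name) T name[n]
// Select the fourth argument: the array form for 3 arguments, the scalar form for 2.
#define DEMO_DGV_PICK(a1, a2, a3, macro, ...) macro
#define DEMO_DEVICE_GLOBAL_VARIABLE(...) \
    DEMO_DGV_PICK(__VA_ARGS__, DEMO_DGV_ARRAY, DEMO_DGV_SCALAR, unused)(__VA_ARGS__)

// Declarations, as they would appear in a header.
extern DEMO_DEVICE_GLOBAL_VARIABLE(double, demo_dg1);
extern DEMO_DEVICE_GLOBAL_VARIABLE(double, 4, demo_dg2);

// Definitions, in exactly one source file.
DEMO_DEVICE_GLOBAL_VARIABLE(double, demo_dg1);     // double demo_dg1;
DEMO_DEVICE_GLOBAL_VARIABLE(double, 4, demo_dg2);  // double demo_dg2[4];

static_assert(sizeof(demo_dg2) == 4 * sizeof(double),
              "the three-argument form expands to double[4]");

// Any function that sees the declarations can use the variables.
inline double demo_weighted (std::size_t i)
{
    assert(i < 4);
    return demo_dg1 * demo_dg2[i];
}
```

In this sketch the `extern` keyword simply prefixes the macro, mirroring the declaration style shown above.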

We have also added two functions for copying data to and from device global variables.

//! Copy `nbytes` bytes from host to device global variable. `offset` is the
//! offset in bytes from the start of the device global variable.
template <typename T>
void memcpy_from_host_to_device_global_async (T& dg, const void* src,
                                              std::size_t nbytes,
                                              std::size_t offset = 0)

//! Copy `nbytes` bytes from device global variable to host. `offset` is the
//! offset in bytes from the start of the device global variable.
template <typename T>
void memcpy_from_device_global_to_host_async (void* dst, T const& dg,
                                              std::size_t nbytes,
                                              std::size_t offset = 0)
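
To show how the byte `offset` is meant to be used, here is a host-only sketch with the same signatures implemented through `std::memcpy`. The `demo` namespace, `demo_buf`, and `demo_round_trip` are hypothetical names, and the bodies are stand-ins rather than the AMReX implementation. The real functions are asynchronous, so on a GPU the host would presumably need to synchronize the stream (for example with `amrex::Gpu::streamSynchronize()`) before reading the destination buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace demo {

// Host-only stand-in: copy `nbytes` bytes from `src` into `dg`, starting
// `offset` bytes past the beginning of `dg`.
template <typename T>
void memcpy_from_host_to_device_global_async (T& dg, const void* src,
                                              std::size_t nbytes,
                                              std::size_t offset = 0)
{
    assert(offset + nbytes <= sizeof(T));
    std::memcpy(reinterpret_cast<char*>(&dg) + offset, src, nbytes);
}

// Host-only stand-in: copy `nbytes` bytes out of `dg`, starting `offset`
// bytes past its beginning, into `dst`.
template <typename T>
void memcpy_from_device_global_to_host_async (void* dst, T const& dg,
                                              std::size_t nbytes,
                                              std::size_t offset = 0)
{
    assert(offset + nbytes <= sizeof(T));
    std::memcpy(dst, reinterpret_cast<char const*>(&dg) + offset, nbytes);
}

} // namespace demo

// Plays the role of a 4-element device global variable.
double demo_buf[4] = {0.0, 0.0, 0.0, 0.0};

// Overwrite elements 2 and 3 only, then read the whole array back.
inline bool demo_round_trip ()
{
    const double tail[2] = {3.0, 4.0};
    demo::memcpy_from_host_to_device_global_async(demo_buf, tail, sizeof(tail),
                                                  2 * sizeof(double));
    double host[4];
    demo::memcpy_from_device_global_to_host_async(host, demo_buf, sizeof(host));
    return host[0] == 0.0 && host[1] == 0.0 && host[2] == 3.0 && host[3] == 4.0;
}
```

Because the offset is counted in bytes, not elements, writing to element 2 of a `double` array means passing `2 * sizeof(double)`.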

Summary

Additional background

Checklist

The proposed changes:

  • fix a bug or incorrect behavior in AMReX
  • add new capabilities to AMReX
  • changes answers in the test suite to more than roundoff level
  • are likely to significantly affect the results of downstream AMReX users
  • include documentation in the code and/or rst files, if appropriate

@ax3l ax3l added the GPU label Aug 6, 2024
Contributor

@kngott kngott left a comment


Commit looks good and straightforward.

My only question would be: do we want to name the function calls and macro after the generic Intel naming of the object? I don't have a good alternative, but "global" and "device" are extremely overloaded at this point, and it feels like I'm going to forget what this is and need to look it up occasionally. Maybe "device_global_var"?

@WeiqunZhang
Member Author

To me, "device global variable" means (device (global variable)). "Global variable" is a common term, and "device" is a noun adding an attribute, like "cheese" in "cheesecake".

@kngott
Contributor

kngott commented Aug 13, 2024

Fair enough. As long as it works generally.

@WeiqunZhang WeiqunZhang merged commit 5dedac0 into AMReX-Codes:development Aug 13, 2024
72 checks passed
@WeiqunZhang WeiqunZhang deleted the device_global branch August 13, 2024 17:29