Request for additional attribute preferred_work_group_size_multiple
#886
Comments
(global and local cache sizes would be a useful addition too)
Querying it requires a kernel, which can only be extracted from a kernel bundle in the executable state.
Device object could definitely expose
There does not seem to be any descriptor to query local cache size. Only the @fcharras Could you please clarify what you mean by "local cache sizes" and perhaps refer to it in
Is there a way to access the information elsewhere from python (maybe in
Not yet, but perhaps exposing it for
BTW, notice that in OpenCL the property is also specific to kernel and device, see
Thank you very much.
If I understand correctly, the information on the preferred work group size is only accessible once the kernel has been compiled (which makes sense because the compiler could have extra information on this value). Then if it is exposed with
edit: that's probably the right workflow considering https://github.com/IntelPython/numba-dpex/blob/12cbcf80f09da38bad23cfc7327266da6a4fc5e1/numba_dpex/decorators.py#L28
You're right, that doesn't seem to exist in clinfo, sorry for the misdirection. The point only holds for FYI, in soda-inria/sklearn-numba-dpex#2 we've been using
Thank you, that would be awesome
These are DPCTLDevice_GetGlobalMemCacheSize, DPCTLDevice_GlobalMemCacheLineSize, and DPCTLDevice_GetGlobalMemCacheType. To support the latter, introduced the DPCTLGlobalMemCacheType enum in dpctl_sycl_enum_types.h. Tests are added to the test_capi target.
Thank you for the PR @oleksandr-pavlyk. I've also come across papers (e.g. this one for top k) where the authors use some other information about the device for the finest-grained optimizations, like bank size or size of SIMD units; it's also exposed in
gh-886: Added 3 new device attributes and kernel's device-specific attributes
These are DPCTLDevice_GetGlobalMemCacheSize, DPCTLDevice_GlobalMemCacheLineSize, and DPCTLDevice_GetGlobalMemCacheType. To support the latter, introduced the DPCTLGlobalMemCacheType enum in dpctl_sycl_enum_types.h. Tests are added to the test_capi target.
dpctl.SyclDevice exposes a number of useful attributes when scheduling kernels, but preferred_work_group_size_multiple seems to be an important piece of information (used when choosing kernel local size) that is missing.
It is available with clinfo, or exposed in pyopencl.
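To illustrate why this value matters when choosing a kernel local size, here is a minimal sketch in plain Python. The helper name choose_work_sizes and the numbers in the usage example are hypothetical and are not part of dpctl or pyopencl; in practice the preferred multiple and the maximum work group size would come from a kernel/device query.

```python
def choose_work_sizes(n_items, preferred_multiple, max_work_group_size):
    """Pick (global_size, local_size) for a 1-D kernel launch.

    Hypothetical helper for illustration only. The inputs are assumed to
    come from device/kernel queries (e.g. the preferred work group size
    multiple and the maximum work group size).
    """
    if n_items <= 0 or preferred_multiple <= 0 or max_work_group_size <= 0:
        raise ValueError("all arguments must be positive")

    if preferred_multiple > max_work_group_size:
        # The preferred multiple does not fit; fall back to the hard limit.
        local_size = max_work_group_size
    else:
        # Largest multiple of preferred_multiple within the hard limit.
        local_size = (max_work_group_size // preferred_multiple) * preferred_multiple
        # Do not use a bigger group than the problem needs.
        needed = -(-n_items // preferred_multiple) * preferred_multiple
        local_size = min(local_size, needed)

    # Round the global size up so it is divisible by the local size.
    global_size = -(-n_items // local_size) * local_size
    return global_size, local_size


# Illustrative values only: 1000 work items, multiple of 32, limit of 256.
print(choose_work_sizes(1000, 32, 256))
```

Because the global size is rounded up, the kernel body would need a bounds check so that the padding work items do nothing.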