Enables adding overload to DpexExpKernelTarget and fully inline them into the final module. #1230

diptorupd · 2023-11-25T20:55:52Z

Have you provided a meaningful PR description?
Add a new device_func decorator to compile DpexKernelTarget overloads
- A new generate_device_ir option is introduced for DpexTargetDescriptor. The option can be used to prevent a module from being compiled to a device IR binary.
- A new device_func decorator is registered to compile overloads in DpexExpKernelTarget. Functions decorated with device_func
  are not compiled to device IR.
- The kernel decorator is updated to compile a finalized module to device IR.
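The split described above can be pictured with a small pure-Python toy model. This is only a sketch: `CompilationMode`, `ToyModule`, and `toy_compile` are hypothetical names invented for illustration and are not part of numba_dpex. The sketch assumes only what the description states, namely that device functions stop before device IR generation and that kernels produce device IR from a finalized module.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CompilationMode(Enum):
    """Hypothetical stand-in for the two compilation modes."""
    KERNEL = "kernel"
    DEVICE_FUNC = "device_func"


@dataclass
class ToyModule:
    """Toy stand-in for a compiled module; not a numba_dpex type."""
    name: str
    mode: CompilationMode
    linked_overloads: List[str] = field(default_factory=list)
    device_ir: Optional[bytes] = None


def toy_compile(name, mode, overloads=()):
    """Mimic the described flow: only kernels are lowered to device IR."""
    module = ToyModule(name=name, mode=mode)
    if mode is CompilationMode.KERNEL:
        # "Finalize" the kernel module by recording the overloads linked in,
        # then pretend to generate a device IR binary from the result.
        module.linked_overloads = [ol.name for ol in overloads]
        module.device_ir = ("device-ir:" + name).encode()
    # Device functions are returned without any device IR attached.
    return module


helper = toy_compile("scalar_add", CompilationMode.DEVICE_FUNC)
kernel = toy_compile("vec_add", CompilationMode.KERNEL, overloads=[helper])
```

In this model `helper.device_ir` stays `None`, while `kernel` carries both the linked overload names and a device IR payload.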
Have you added a test, reproducer or referred to an issue with a reproducer?
Have you tested your changes locally for CPU and GPU devices?
Have you made sure that new changes do not introduce compiler warnings?
If this PR is a work in progress, are you filing the PR as a draft?

diptorupd · 2023-11-28T16:03:56Z

Unit tests design sketch

- Basic overload, end to end
- compilation_mode option in the decorator (negative test case)
- Check that overloads are available in DpexExpKernelTarget._defns and in no other target
- ~~kernel return error (copy from existing core test case)~~
- Inlining: check that it can be set using the decorator
- Inlining: check whether the overload gets fully inlined or not (two cases, at the LLVM IR level)
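One possible shape for the LLVM IR level inlining check is a plain text scan of the generated IR for a remaining `call` to the overload. The sketch below is an assumption about how such a check could look, not the test added in this PR. The helper `calls_function` and both IR snippets are hand-written for illustration, and the function names `kernel` and `scalar_add` are hypothetical.

```python
import re


def calls_function(llvm_ir: str, func_name: str) -> bool:
    """Return True if the IR text still contains a call to func_name."""
    # Match a call instruction whose callee is @func_name or @"func_name".
    # The negative lookahead avoids matching longer names with the same prefix.
    pattern = re.compile(
        r'\bcall\b[^\n]*@"?' + re.escape(func_name) + r"(?![\w.$])"
    )
    return bool(pattern.search(llvm_ir))


# Hand-written sample IR: the overload is still called (not inlined).
NOT_INLINED_IR = """
define spir_kernel void @kernel(i64 %a, i64 %b) {
entry:
  %r = call spir_func i64 @scalar_add(i64 %a, i64 %b)
  ret void
}
"""

# Hand-written sample IR: the body of the overload replaced the call.
INLINED_IR = """
define spir_kernel void @kernel(i64 %a, i64 %b) {
entry:
  %r = add i64 %a, %b
  ret void
}
"""

not_inlined_result = calls_function(NOT_INLINED_IR, "scalar_add")
inlined_result = calls_function(INLINED_IR, "scalar_add")
```

A real test would obtain the IR from the compiled kernel instead of a string literal and would need to account for name mangling, which this sketch ignores.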

- A new target option was added to DpexKernelTarget so that functions compiled with the experimental KernelDispatcher are handled differently depending on whether they are "kernels" or "device functions". Kernels use the spir_kernel calling convention, cannot return a value, enforce execution queue equivalence, and are always compiled down to device IR (SPIR-V). Device functions use the spir_func calling convention, are not subject to the same restrictions on return values and input arguments, and are compiled only to LLVM bitcode.

- A device_func decorator was added to the experimental module. The new decorator is roughly equivalent to numba_dpex.func but uses the new KernelDispatcher and the device function compilation mode. The `device_func` decorator is registered to compile overloads in DpexExpKernelTarget.

- In the kernel compilation mode, the final LLVM module is now "finalized" before conversion to SPIR-V. During finalization, all overload calls are linked into the main (kernel) module and optionally inlined.

- A new target option "inlining threshold" was added to DpexKernelTarget to define how the LLVM inlining passes should optimize the final codegen library. The decorator-level option will supersede any global configuration setting.

- Addresses review comments on how to properly set the inline_threshold target option
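The precedence rule for the inlining threshold (a decorator-level value supersedes the global setting) can be sketched in a few lines. This is a minimal illustration, not the code in this PR: `GLOBAL_INLINE_THRESHOLD` and `resolve_inline_threshold` are hypothetical names, and the sketch assumes that an unset decorator option is represented as `None` and that `0` is a meaningful threshold value.

```python
from typing import Optional

# Hypothetical global configuration value, used only for this illustration.
GLOBAL_INLINE_THRESHOLD = 2


def resolve_inline_threshold(
    decorator_value: Optional[int],
    global_value: int = GLOBAL_INLINE_THRESHOLD,
) -> int:
    """Pick the threshold to apply: the decorator option wins when set."""
    # Compare against None explicitly so that an explicit 0 is not
    # mistaken for "not set" and silently replaced by the global value.
    if decorator_value is not None:
        return decorator_value
    return global_value


unset_result = resolve_inline_threshold(None)
zero_result = resolve_inline_threshold(0)
override_result = resolve_inline_threshold(3, global_value=1)
```

The explicit `is not None` test is the design point worth noting: a truthiness check would discard a decorator-level `0`.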

Enables adding overload to DpexExpKernelTarget and fully inline them into the final module. d1700ff

diptorupd requested a review from adarshyoga November 25, 2023 20:55

diptorupd marked this pull request as draft November 25, 2023 20:55

diptorupd force-pushed the experimental/enable_overloads branch 3 times, most recently from 83806ee to c22f8a3 Compare November 28, 2023 04:41

diptorupd marked this pull request as ready for review November 28, 2023 21:57

diptorupd force-pushed the experimental/enable_overloads branch from 0685c43 to cf04418 Compare November 28, 2023 21:58

ZzEeKkAa reviewed Nov 29, 2023

numba_dpex/core/descriptor.py (outdated, resolved)

ZzEeKkAa reviewed Nov 29, 2023

numba_dpex/experimental/kernel_dispatcher.py (outdated, resolved)

diptorupd force-pushed the experimental/enable_overloads branch from cf04418 to 788053d Compare December 1, 2023 15:33

Diptorup Deb added 3 commits December 3, 2023 19:46

Add a decorator-level option to set inlining threshold.

db5e318

Add unit test for overload compilation.

5becde4

diptorupd force-pushed the experimental/enable_overloads branch from 788053d to f4533b4 Compare December 4, 2023 01:46

Diptorup Deb added 2 commits December 6, 2023 17:59

Print a more descriptive name for a DpctlSyclQueue type object.

ad7cf73

Extract inline_threshold value from targetdescr before use.

4f9616b

diptorupd force-pushed the experimental/enable_overloads branch from f4533b4 to ca1ef70 Compare December 7, 2023 00:00

Diptorup Deb added 3 commits December 6, 2023 22:15

Unit test to validate inline_threshold target option.

06d379e

Unit to test if warning raised if compilation mode set by user.

38edc59

Unit test to check if inline_threshold works

0af4414

diptorupd force-pushed the experimental/enable_overloads branch from ca1ef70 to 0af4414 Compare December 7, 2023 04:15

adarshyoga approved these changes Dec 7, 2023

diptorupd merged commit d1700ff into main Dec 7, 2023
40 of 44 checks passed

diptorupd deleted the experimental/enable_overloads branch December 7, 2023 06:05

github-actions bot added a commit that referenced this pull request Dec 7, 2023

Merge pull request #1230 from IntelPython/experimental/enable_overloads

a755ffe
