[RELAX][PASS] Annotate Custom Scope layout pass for Adreno GPU #17599
base: main
Texture scope annotation is handled by the following steps:
- Layout conversion from 4D to 5D with the convert_layout pass.
- Legalization of ops with Adreno-specific legalization.
- FuseOps and FuseTIR.
- Over the fused TIR, annotate the scopes with hint_on_device.
- RealizeVDevice takes care of injecting to_device as needed.
- SpecializeTIRParams is introduced to update the buffer var map of the fused TIR prim function with the new scope information.
Changes in FuseOps and FuseTIR forward the op attr and op pattern info. This info is used for texture-specific scoping decisions.
fe15d5b to 1549733
@tvm-bot rerun
@Hzfengsy do you mind taking a look, given that it touches FuseOps/TIR?
src/script/printer/relax/utils.h
@@ -141,7 +141,7 @@ inline int FindVDeviceIndexByTargetKind(const VDevice& vdevice, const IRDocsifie
   int kind_index = 0;
   for (size_t i = 0; i < vdevices.size(); ++i) {
     auto vdev = Downcast<VDevice>(vdevices[i]);
-    if (vdev.same_as(vdevice)) {
+    if (vdev == vdevice) {
Ideally we would like to deduplicate vdevice globally via pointer equality. Is it possible to get the pass to generate the same vdevice instead?
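One way to picture "generate the same vdevice" is an interning table that hands back a single shared object per distinct device description, so identity checks keep working. The sketch below is a toy Python model, not TVM code: `VDev`, `VDevPool`, and the field names are hypothetical and only stand in for the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VDev:
    """Toy stand-in for a virtual device (hypothetical, not the TVM class)."""
    target_kind: str
    device_id: int
    memory_scope: str


class VDevPool:
    """Hands out one shared object per distinct (kind, id, scope) triple."""

    def __init__(self):
        self._pool = {}

    def get(self, target_kind, device_id, memory_scope):
        key = (target_kind, device_id, memory_scope)
        # Reuse the existing object so that later `is` checks succeed.
        if key not in self._pool:
            self._pool[key] = VDev(*key)
        return self._pool[key]


pool = VDevPool()
a = pool.get("opencl", 0, "global")
b = pool.get("opencl", 0, "global")
c = pool.get("opencl", 0, "global.texture")
assert a is b        # same description, same object
assert a is not c    # different scope, different object
```

With a table like this, a pass that needs a device for a given scope asks the pool rather than constructing a fresh object, which is what keeps pointer-style comparisons valid downstream.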
Should be possible (I think RealizeVDevice pass does similar). Let me explore...
@tqchen I have made changes to maintain pointer equality for the pass (AnnotateCustomMemoryScope), but we have an issue, as detailed below.
The vdevices GlobalInfo is populated from the pass (AnnotateCustomMemoryScope). When the mod is printed in Python after this pass, the pointer equality check fails, i.e. the vdevice id is printed as -1 (vdevice="opencl:-1:global"). Printing the mod just before returning from AnnotateCustomMemoryScope, or from within the next pass (C++), works fine.
Also, if the vdevices are populated from Python (test cases) and accessed in C++, everything seems to work fine.
Functionally, I can proceed here, as the rest of the pipeline is in C++. Please advise.
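The symptom described above (an index of -1 once the mod is printed from Python) is what an identity-based lookup yields when it is handed an equal but distinct object. The sketch below is a toy Python model, not TVM code: `VDev` and `find_index_by_identity` are hypothetical names, and the loop body is only loosely modelled on the few lines of FindVDeviceIndexByTargetKind visible in the diff. It does not explain why identity is lost across the Python boundary; it only illustrates the failure mode.

```python
import copy
from dataclasses import dataclass


@dataclass
class VDev:
    """Toy stand-in for a virtual device (hypothetical, not the TVM class)."""
    target_kind: str
    memory_scope: str


def find_index_by_identity(vdevice, vdevices):
    """Return the position of `vdevice` among same-kind entries, or -1.

    Assumed, simplified logic: entries are matched by object identity,
    mirroring a pointer-equality check.
    """
    kind_index = 0
    for v in vdevices:
        if v is vdevice:
            return kind_index
        if v.target_kind == vdevice.target_kind:
            kind_index += 1
    return -1


registered = [VDev("opencl", "global"), VDev("opencl", "global.texture")]
same_object = registered[1]
rebuilt = copy.deepcopy(registered[1])  # equal fields, different object

print(find_index_by_identity(same_object, registered))  # 1
print(find_index_by_identity(rebuilt, registered))      # -1
print(rebuilt == registered[1])                         # True
```

The last line shows why a structural comparison would hide the problem: the rebuilt object still compares equal field by field, even though it is no longer the registered instance.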
@@ -1092,6 +1095,10 @@ def LegalizeOps(
 legalization function is not registered. By default we don't print
 warnings.

+add_attributes : bool
Is it possible to compose instead, e.g. run an attribute-attach pass afterwards?
After the legalization pass, we don't have any trace of operator-specific attributes.
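The point about composition can be illustrated with a toy model: once a high-level op call is lowered, a later pass has nothing left to read the attributes from unless the lowering step carried them over. The Python sketch below is not TVM code. `OpCall`, `TirCall`, and `legalize` are hypothetical, and the behaviour given to `add_attributes` here (copying op attributes onto the lowered call) is an assumption about the flag's intent, since its docstring is not shown in the diff.

```python
from dataclasses import dataclass, field


@dataclass
class OpCall:
    """Toy high-level op call carrying operator attributes."""
    op: str
    attrs: dict


@dataclass
class TirCall:
    """Toy lowered call; attributes are empty unless forwarded."""
    prim_func: str
    attrs: dict = field(default_factory=dict)


def legalize(call, add_attributes=False):
    """Lower an OpCall to a TirCall (assumed, simplified behaviour)."""
    lowered = TirCall(prim_func=call.op.replace(".", "_"))
    if add_attributes:
        # Forward a copy so later passes can still see the op attributes.
        lowered.attrs = dict(call.attrs)
    return lowered


conv = OpCall("relax.nn.conv2d", {"data_layout": "NCHW4c"})
assert legalize(conv).attrs == {}
assert legalize(conv, add_attributes=True).attrs == {"data_layout": "NCHW4c"}
```

In this model a separate attribute-attach pass run after `legalize(conv)` would only see the empty `attrs`, which is the situation the reply describes.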
also cc @yongwww for memory scope related changes
some initial comments
@tvm-bot rerun
LGTM