
[RELAX][PASS] Annotate Custom Scope layout pass for Adreno GPU #17599

Open · wants to merge 6 commits into main from annotate_texture_scope

Conversation

@srkreddy1238 (Contributor) commented Jan 21, 2025

Texture scope annotation is handled by:

  • Layout conversion from 4D to 5D with the convert_layout pass
  • Legalization of ops with Adreno-specific legalization and fallback legalization
  • FuseOps & FuseTIR
  • Then, over the fused TIR, annotating the texture scopes via hint_on_device
  • RealizeVDevice takes care of injecting to_device as needed
  • Also introduced SpecializeTIRParams, which updates the fused TIR PrimFunc's buffer var map with the new scope information

The changes in FuseOps and FuseTIR forward op attribute and op pattern info. This info is used for texture-specific scoping decisions.
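The pass ordering described above can be sketched as a toy pipeline. This is a hedged illustration only: the real PR composes actual TVM passes (ConvertLayout, LegalizeOps, FuseOps, FuseTIR, the new AnnotateCustomMemoryScope, RealizeVDevice, SpecializeTIRParams); the `sequential`/`make_pass` helpers below are made-up stand-ins, not TVM's API.

```python
# Toy sketch of the pass ordering in the PR description.
# Stage names mirror the PR; the machinery here is illustrative only.

def sequential(*passes):
    """Compose module-to-module passes, applied left to right."""
    def run(mod):
        for p in passes:
            mod = p(mod)
        return mod
    return run

def make_pass(name, log):
    """Each toy 'pass' just records that it ran, in order."""
    def p(mod):
        log.append(name)
        return mod
    return p

log = []
pipeline = sequential(
    make_pass("ConvertLayout", log),               # 4D -> 5D layout conversion
    make_pass("LegalizeOps", log),                 # Adreno-specific + fallback legalization
    make_pass("FuseOps", log),
    make_pass("FuseTIR", log),
    make_pass("AnnotateCustomMemoryScope", log),   # hint_on_device over fused TIR
    make_pass("RealizeVDevice", log),              # injects to_device as needed
    make_pass("SpecializeTIRParams", log),         # updates buffer var map scopes
)
pipeline(object())  # run the toy pipeline on a dummy "module"
```

The point of the ordering is that scope annotation happens only after fusion, and the TIR-level specialization runs last, once RealizeVDevice has settled device placement.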

@srkreddy1238 force-pushed the annotate_texture_scope branch from fe15d5b to 1549733 on January 21, 2025, 16:31
@srkreddy1238 (Contributor, Author) commented:

@tvm-bot rerun

@tqchen (Member) commented Jan 25, 2025

@Hzfengsy do you mind taking a look, given it touches FuseOps/FuseTIR?

@@ -141,7 +141,7 @@ inline int FindVDeviceIndexByTargetKind(const VDevice& vdevice, const IRDocsifie
   int kind_index = 0;
   for (size_t i = 0; i < vdevices.size(); ++i) {
     auto vdev = Downcast<VDevice>(vdevices[i]);
-    if (vdev.same_as(vdevice)) {
+    if (vdev == vdevice) {
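The distinction this diff hinges on can be illustrated in plain Python (a toy analogy, not TVM's object system): TVM's `same_as` is reference identity, while `==` can be structural, so two separately-constructed but equivalent vdevices compare equal under `==` yet fail `same_as`. `VDeviceToy` below is a made-up stand-in.

```python
# Toy analogy for the same_as-vs-== distinction in the diff above.
# In TVM, Object::same_as is pointer identity; structural equality can
# hold between two distinct-but-equivalent objects.

class VDeviceToy:
    """Illustrative stand-in for a VDevice: (target kind, device id, scope)."""
    def __init__(self, kind, dev_id, scope):
        self.key = (kind, dev_id, scope)

    def __eq__(self, other):
        # Structural equality: compare the description, not the identity.
        return isinstance(other, VDeviceToy) and self.key == other.key

    def __hash__(self):
        return hash(self.key)

a = VDeviceToy("opencl", 0, "global.texture")
b = VDeviceToy("opencl", 0, "global.texture")

assert a == b      # structurally equal
assert a is not b  # distinct objects: the same_as analogue would be False
```

This is why switching the lookup from `same_as` to `==` makes the index search succeed for a vdevice that was rebuilt by a pass rather than shared by pointer.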
@tqchen (Member) commented:
Ideally we would like to deduplicate vdevice globally via ptr equality; is it possible to get the pass to generate the same vdevice instead?

@srkreddy1238 (Contributor, Author) replied:
Should be possible (I think RealizeVDevice pass does similar). Let me explore...

@srkreddy1238 (Contributor, Author) replied:

@tqchen I have made changes to maintain ptr equality for the pass (AnnotateCustomMemoryScope), but we have an issue, as detailed below.

The vdevices GlobalInfo is populated from the pass (AnnotateCustomMemoryScope). Now, when the mod is printed in Python after this pass, the ptr equality fails, i.e., the vdevice id is printed as -1 (vdevice="opencl:-1:global"). However, printing mod just before returning from AnnotateCustomMemoryScope, or while in the next pass (C++), works fine.

Also, if the vdevices are populated from Python (test cases) and accessed in C++, things seem to work fine.

Functionally, I can proceed here, as the rest of the pipeline is in C++. Please advise.
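The deduplication being discussed (generating "the same vdevice" so pointer equality holds) is essentially interning: every structurally-identical description maps to one canonical object. A hedged sketch of the idea, with made-up names rather than TVM's actual API:

```python
# Hedged sketch of interning vdevice descriptions so that equal
# descriptions yield one canonical object, making reference equality
# usable downstream. intern_vdevice and the pool are illustrative only.

_vdevice_pool = {}

def intern_vdevice(kind, dev_id, scope):
    """Return the canonical object for this (kind, dev_id, scope) key."""
    key = (kind, dev_id, scope)
    if key not in _vdevice_pool:
        # Stand-in tuple where a real implementation would build a VDevice.
        _vdevice_pool[key] = ("vdevice", key)
    return _vdevice_pool[key]

x = intern_vdevice("opencl", 0, "global.texture")
y = intern_vdevice("opencl", 0, "global.texture")
assert x is y  # reference equality holds after interning
```

The printing issue described above would be consistent with the interned identity being lost somewhere across the C++/Python FFI boundary, so the printer's pointer-based index lookup falls back to -1; this sketch only shows the interning half of the story.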

@@ -1092,6 +1095,10 @@ def LegalizeOps(
legalization function is not registered. By default we don't print
warnings.

add_attributes : bool
@tqchen (Member) commented:
Is it possible to compose instead? E.g., run an attribute-attach pass afterwards.

@srkreddy1238 (Contributor, Author) replied:

After the legalization pass we don't have any trace of operator-specific attributes.
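The ordering constraint in this reply can be shown with a toy lowering (illustrative names, not TVM's LegalizeOps implementation): once a high-level op is rewritten into low-level form, its op-level attributes are gone, so a separate attach-pass run afterwards has nothing to read; the capture must happen during legalization.

```python
# Toy illustration: op-level attrs vanish at legalization, so attribute
# capture composed as a later pass would find nothing to attach.

high_level_op = {"name": "conv2d", "attrs": {"data_layout": "NCHW4c"}}

def legalize(op, forward_attrs):
    """Lower to a low-level call; optionally forward attrs while they still exist."""
    lowered = {"name": "tir_call_" + op["name"]}
    if forward_attrs:
        lowered["op_attrs"] = dict(op["attrs"])  # captured during legalization
    return lowered

with_attrs = legalize(high_level_op, forward_attrs=True)
without = legalize(high_level_op, forward_attrs=False)

assert "op_attrs" in with_attrs       # attrs survive only if forwarded in-pass
assert "op_attrs" not in without      # composing capture afterwards is too late
```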

include/tvm/relax/transform.h — review thread resolved
@tqchen (Member) commented Jan 25, 2025

also cc @yongwww for memory scope related changes

@Hzfengsy (Member) left a comment:

some initial comments

python/tvm/relax/transform/optimize_batchnorm.py — review thread resolved (outdated)
src/relax/op/tensor/binary.cc — review thread resolved
@srkreddy1238 (Contributor, Author) commented:
@tvm-bot rerun

@Hzfengsy (Member) left a comment:

LGTM

Labels: none yet
Projects: none yet
4 participants