Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Generic] Forward MS and AS rewrites for generic schedules #13754

Closed

Conversation

AndrewZhaoLuo
Copy link
Contributor

Hitting generic strategies for ARM targets, which don't properly forward layout rewrite info.

@tvm-bot
Copy link
Collaborator

tvm-bot commented Jan 10, 2023

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

  • No users to tag found in teams: generic See #10317 for details

Generated by tvm-bot

@AndrewZhaoLuo
Copy link
Contributor Author

cc @tkonolige

Copy link
Contributor

@tkonolige tkonolige left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct. Specifying need_meta_schedule_layout should be set from the specific target strategy. Like here: https://github.com/apache/tvm/blob/main/python/tvm/relay/op/strategy/arm_cpu.py#L581-L582

@AndrewZhaoLuo
Copy link
Contributor Author

I don't think this is correct. Specifying need_meta_schedule_layout should be set from the specific target strategy. Like here: https://github.com/apache/tvm/blob/main/python/tvm/relay/op/strategy/arm_cpu.py#L581-L582

Is the a concern of correctness or style? There is no target-specific strategy in this case (batch_matmul on arm) so it hits the generic strategies.

@tkonolige
Copy link
Contributor

Currently auto/metaschedule layout transform is enabled on a target-by-target basis. You're changing the default to be enabled for all targets that do not specify. In the past we've had some problems with layout transform not working on all targets, so I don't think this is a safe default.

@AndrewZhaoLuo
Copy link
Contributor Author

Hmm so my bug though has us hit a generic strategy, but metaschedule still rewrites the layout during tuning.

During compilation step we get an error since the rewriting is not forwarded properly. Is layout rewriting also enabled on a per-target basis? Otherwise all targets which hit generic strategy will get same problem.

@junrushao
Copy link
Member

For context: tuning-based layout rewriting on Relay is quite hacky and buggy because Relay is not designed to support this feature (even if it's necessary to deliver better performance). Relax natively supports this optimization and will be more likely to work decently.

@AndrewZhaoLuo
Copy link
Contributor Author

@tkonolige would you be fine if I made copies of the generic strategies for my target, and properly forward rewrite info there?

We still have the problem with generic strategies getting rewritten, but my problem will be solved.

@AndrewZhaoLuo
Copy link
Contributor Author

AndrewZhaoLuo commented Jan 11, 2023

Ah so chatting with @tkonolige and others a bit more, this problem is a bit more complicated and maybe exposes some past inconsistency with tvm.target.Target. Anyway, here are my findings.

  1. If we use generic strategy we should not allow LayoutRewrite (this is different from AlterOpLayout) pass used in MS. This is because LayoutRewrite really only makes sense for CPU platforms where non-managed cache means tensor layout should match access patterns from the get go. It is also buggy potentially so turning it on for only CPU can help manage risk area for bugs.

  2. I was constructing my target object in TVM incorrectly. arm_cpu key should imply cpu key. Doing tvm.target.Target("...") does not do this properly (it implies arm_cpu but not cpu). Evidently I have been doing this wrong all my life. We should be doing tvm.target.arm_cpu("..."). There was a cpu strategy that forwards information correctly

Based on this here are future work:

  1. This PR should be closed as if we hit generic strategy, we are not in CPU and we should not be doing LayoutRewrite
  2. We should probably look into making tvm.target.Target make it so arm_cpu also implies cpu. I've been doing things this way all my life and no doubt have accidentally dispatched to generic strategies on accident.
  3. Use of target through codebase might be a bit inconsistent. In some parts the hardware target is used (e.g. cpu, arm_cpu, gpu). In other places the compilation target is used (e.g. llvm, cuda). This is fine as long as usage is consistent between the two or you really care about the hardware/compilation target.

@AndrewZhaoLuo
Copy link
Contributor Author

AndrewZhaoLuo commented Jan 12, 2023

#13775 <-- is probably the best solution to my problem for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants