[SCHEDULE] ElemwiseReverseComputeAt #102

Closed
tqchen opened this issue Aug 27, 2020 · 4 comments

Comments

@tqchen
Contributor

tqchen commented Aug 27, 2020

This primitive is very useful for autoscheduling because it keeps all tiling decisions in one stage.

Example:

# Before
for i:
    for k:
        opaque_produce(C[i*4, i*4+4])
for j:
    D[j] = C[j] + 1

s.elemwise_reverse_compute_at(D, C, i)

# After
for i:
    for k:
        opaque_produce(C[i*4, i*4+4])
    for j in range(4):
        D[i*4 + j] = C[i*4 + j] + 1

Why: it is useful to make all tiling decisions at the reduction point, and then "reverse inline" the later elementwise stage back into the loop tiles of the original computation.

Correctness:

  • Because D is elementwise, we know that the domains of C and D are exactly the same, so it is fine to reverse inline.
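
The equivalence of the two loop nests in the example can be checked with a plain-Python simulation. This is only a sketch: `opaque_produce` is modeled as a hypothetical function that accumulates into one 4-element tile of C, and the extents `N_TILES` and `K` and the function names `before` and `after` are assumptions made for illustration, not part of any scheduling API.

```python
N_TILES = 3   # extent of loop i (assumed for illustration)
TILE = 4      # tile width, matching the i*4 .. i*4+4 region in the example
K = 2         # extent of loop k (assumed for illustration)


def opaque_produce(C, lo, hi, k):
    # Hypothetical stand-in for the opaque producer: on every k iteration
    # it accumulates into the tile C[lo:hi].
    for x in range(lo, hi):
        C[x] += (k + 1) * x


def before():
    C = [0] * (N_TILES * TILE)
    D = [0] * (N_TILES * TILE)
    for i in range(N_TILES):
        for k in range(K):
            opaque_produce(C, i * TILE, i * TILE + TILE, k)
    # The elementwise consumer runs as a separate loop over the whole domain.
    for j in range(N_TILES * TILE):
        D[j] = C[j] + 1
    return D


def after():
    C = [0] * (N_TILES * TILE)
    D = [0] * (N_TILES * TILE)
    for i in range(N_TILES):
        for k in range(K):
            opaque_produce(C, i * TILE, i * TILE + TILE, k)
        # The consumer now sits under loop i and only touches the tile
        # that the k loop above has just finished producing.
        for j in range(TILE):
            D[i * TILE + j] = C[i * TILE + j] + 1
    return D


assert before() == after()
```

The key point is that the moved consumer is placed after the whole `k` loop, so every element of the tile is final before D reads it.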

I am not too happy about the name, and would love to see if we have better ideas.

@tqchen
Contributor Author

tqchen commented Aug 27, 2020

@spectrometerHBH
Collaborator

spectrometerHBH commented Aug 27, 2020

I think there is a chance to generalize this transformation.
If D is complete, and the newly generated loops make the read region of D cover the write region of opaque_produce, we can ensure correctness.
This is exactly the reverse of the check&mutate of compute_at.
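
A minimal sketch of the cover condition described above, assuming regions are 1-D half-open intervals as in the tiled example. All helper names here (`covers`, `producer_write_region`, `consumer_read_region`, `reverse_compute_at_is_safe`) are hypothetical; the sketch models only the region check, not the requirement that D is complete, and a real implementation would need multi-dimensional region analysis.

```python
def covers(outer, inner):
    """Return True if half-open interval `outer` contains `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def producer_write_region(i, tile=4):
    # Region of C written by opaque_produce at iteration i of the outer loop.
    return (i * tile, i * tile + tile)


def consumer_read_region(i, inner_extent, tile=4):
    # Region of C read by D's new inner loop `for j in range(inner_extent)`
    # once that loop is placed under iteration i.
    return (i * tile, i * tile + inner_extent)


def reverse_compute_at_is_safe(n_tiles, inner_extent, tile=4):
    # The move is accepted only if, at every outer iteration, the consumer's
    # read region covers what the producer wrote in that iteration.
    return all(
        covers(consumer_read_region(i, inner_extent, tile),
               producer_write_region(i, tile))
        for i in range(n_tiles)
    )


print(reverse_compute_at_is_safe(3, 4))  # inner loop spans the full tile
print(reverse_compute_at_is_safe(3, 2))  # inner loop too short to cover it
```

With an inner extent equal to the tile width the check passes; with a shorter inner loop part of each produced tile would never be consumed, so the check fails.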

@tqchen
Contributor Author

tqchen commented Aug 27, 2020

It would be nice to land this primitive, as it can simplify quite a bit of scheduling logic by pushing the tiling decisions to a single stage rather than two. It would also be useful to think about how it interacts with the cache-write case on GPU.

@junrushao
Member

I agree with @spectrometerHBH that there is a chance to generalize this primitive, and maybe we should have a better name for it. Could you take a shot when you have time?
