[MetaSchedule] Introducing MemHammer #14164
Merged
This PR introduces MemHammer, which performs automatic data movement at the threadblock level in MetaSchedule. The goal of MemHammer is to free users from writing laborious manual schedules at the threadblock level. With MemHammer, the only thing a user needs to do is mark a specific block with the `auto_copy` annotation, and MemHammer will lower it with automatic thread index binding, vectorization, and wmma API calls. We also introduce two new schedule primitives, `read_at` and `write_at`, which let users perform a cache read or write in an easy-to-use manner, without arduous `cache_read`, `compute_at`, and other manual optimizations.

Given a data movement description like this:
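For readers without a TVM build at hand, the idea can be modeled in plain Python. The sketch below is only an illustration: `copy_tile_naive`, `copy_tile_cooperative`, `num_threads`, and `vector_len` are hypothetical names, not part of MemHammer or the TVM API, and the loop structure (serial outer loop, thread-bound loop, vectorized inner loop) is an assumption about what a cooperative fetch usually looks like, not the code this PR generates.

```python
def copy_tile_naive(src, rows, cols):
    """The data movement as a user might describe it: dst[i][j] = src[i][j]."""
    dst = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dst[i][j] = src[i][j]
    return dst


def copy_tile_cooperative(src, rows, cols, num_threads=4, vector_len=2):
    """Emulate a cooperative fetch of the same tile.

    The flattened tile is split so that, on each outer step, every
    (hypothetical) thread moves `vector_len` contiguous elements.
    """
    total = rows * cols
    step = num_threads * vector_len
    if total % step != 0:
        raise ValueError("tile size must be a multiple of num_threads * vector_len")
    dst = [[None] * cols for _ in range(rows)]
    for outer in range(total // step):       # serial loop
        for tx in range(num_threads):        # would be bound to a thread index
            base = (outer * num_threads + tx) * vector_len
            for v in range(vector_len):      # would become one vectorized access
                i, j = divmod(base + v, cols)
                dst[i][j] = src[i][j]
    return dst


# Both versions move exactly the same data.
src = [[r * 8 + c for c in range(8)] for r in range(4)]
assert copy_tile_cooperative(src, 4, 8) == copy_tile_naive(src, 4, 8)
```

The point of the sketch is that the user-facing description stays a plain element-wise copy, while the thread split and vector width are details a lowering pass can choose.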
By annotating the block with `T.block_attr({"auto_copy": 1})` and other optional arguments, it will be lowered to the following code with cooperative fetch, vectorization, and other specified features:

For more examples, see `tests/python/unittest/test_tir_transform_memhammer_lower_auto_copy.py`.

All supported features: