Commit
[MetaSchedule][Hexagon] Improve vectorization for standalone elementwise op (#14408)

Motivation:
For standalone elementwise operations (add, sub, etc.), MetaScheduler generates poorly performing code for some input tensor shapes because no vector code is produced. The current implementation cannot vectorize when the innermost loop extent is not a multiple of the vector length.

What was done:
Core changes: the current loop nest is checked, and if all loops are "simple" (no annotations, bindings, or reduce axes), the following is done:
1) Fuse all loops into a single loop.
2) Split the fused loop into two parts, outer and inner. The split factor for the inner loop equals the 'max_vectorize_extent' MetaScheduler parameter.
3) Parallelize the outer loop and vectorize the inner loop.

Performance measurement:
Measured on Qualcomm Snapdragon 888. As expected, cases 1 and 2 got a significant performance boost, while cases 3 and 4 are essentially unchanged.

N | op      | Dtype | Shape            | Before fix, ms | After fix, ms | Speedup |
--|---------|-------|------------------|----------------|---------------|---------|
1 | add     | uint8 | 1, 8, 56, 56, 32 | 1.264          | 0.167         | 7.5x    |
2 | qnn.add | uint8 | 1, 8, 56, 56, 32 | 2.213          | 0.336         | 6.6x    |
3 | add     | int32 | 1, 8, 56, 56, 32 | 0.161          | 0.150         | 1.07x   |
4 | seq*    | uint8 | 1, 64, 56, 56    | 2.634          | 2.679         | 0.98x   |

seq* - test of the ops sequence: qnn.conv2d + bias_add + qnn.requantize, weights shape = [256, 64, 1, 1]
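The fuse/split/parallelize/vectorize steps described above can be modeled with a small pure-Python sketch. This is not the TVM implementation from this commit: the function names `fuse_and_split` and `elementwise_add` are hypothetical, the value 128 for 'max_vectorize_extent' is an assumption chosen for illustration, and the tail guard is an assumption about how a non-divisible fused extent could be handled. It only shows why fusing helps: an innermost extent of 32 is too short for a 128-element vector, but the fused extent of the whole nest splits evenly.

```python
from math import prod


def fuse_and_split(extents, max_vectorize_extent):
    """Model steps 1 and 2: fuse all loops, then split into (outer, inner).

    Returns the extents of the outer and inner loops. The outer extent is
    rounded up so that a fused extent that is not a multiple of the split
    factor is still fully covered (assumption for this sketch).
    """
    fused = prod(extents)                 # step 1: one loop over all elements
    inner = max_vectorize_extent          # step 2: inner loop has the split factor
    outer = -(-fused // inner)            # ceiling division
    return outer, inner


def elementwise_add(a, b, extents, max_vectorize_extent):
    """Model step 3 on flat buffers: the outer loop is the one that would be
    parallelized, the inner loop is the one that would be vectorized."""
    fused = prod(extents)
    outer, inner = fuse_and_split(extents, max_vectorize_extent)
    out = [0] * fused
    for o in range(outer):                # candidate for parallelization
        base = o * inner
        for i in range(inner):            # candidate for vectorization
            idx = base + i
            if idx < fused:               # guard for a partial last chunk
                out[idx] = a[idx] + b[idx]
    return out


# Shape from cases 1-3 of the table; the innermost extent is only 32.
shape = (1, 8, 56, 56, 32)
outer_extent, inner_extent = fuse_and_split(shape, 128)
print(outer_extent, inner_extent)
```

For the shape above, the fused extent is 1 * 8 * 56 * 56 * 32 = 802816, which divides evenly by 128, so every iteration of the inner loop is a full vector even though no single original loop has an extent that is a multiple of 128.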
1 parent a0edf24, commit 14ddb37
Showing 2 changed files with 114 additions and 11 deletions.