Fix MaxPool Block / Width Sharding with Large Kernels / Wide Reductions #14531

wransom-TT · 2024-10-31T18:36:17Z

Max Pool produced incorrect results for block and width sharding with large kernels. This PR fixes that and adds support for large kernels with wide reductions, a combination that was previously unsupported and untested.

Ticket

Link to Github Issue

Problem description

When block and width sharding were introduced, the large reader and compute kernels were not updated, and no test cases covered these configurations. Additionally, the large kernels did not support wide reductions.

What's changed

Test cases have been expanded to include large kernels, wide reductions, and more strides for the standard height/width/block sharding tests
The large compute and reader kernels have been updated to support width and block sharding
The large compute and reader kernels have been updated to support wide reductions
All the max pool kernels have been refactored slightly to better align the large / wide / standard kernel implementations
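As a rough illustration of why a large-kernel path can reduce a pooling window in several passes and still match a single-pass result: max is associative, so per-chunk maxima can be folded into a running partial. The sketch below runs on the host and is not the device kernel code. The chunk size `kChunk` and both function names are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical chunk size: how many window elements are reduced per pass.
// The real kernels' limit is not stated in this PR description.
constexpr std::size_t kChunk = 16;

// Single-pass reduction over the whole pooling window.
float max_direct(const std::vector<float>& window) {
    assert(!window.empty());
    return *std::max_element(window.begin(), window.end());
}

// Chunked reduction: reduce up to kChunk elements at a time and fold each
// chunk's maximum into a running partial result.
float max_chunked(const std::vector<float>& window) {
    assert(!window.empty());
    float partial = -std::numeric_limits<float>::infinity();
    for (std::size_t start = 0; start < window.size(); start += kChunk) {
        const std::size_t end = std::min(start + kChunk, window.size());
        const auto first = window.begin() + static_cast<std::ptrdiff_t>(start);
        const auto last = window.begin() + static_cast<std::ptrdiff_t>(end);
        partial = std::max(partial, *std::max_element(first, last));
    }
    return partial;
}
```

A test for a large window (for example 5x5 = 25 elements, which needs two passes at this chunk size) can compare `max_chunked` against `max_direct` to catch a partial result that is dropped or overwritten between passes.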

Checklist

Post commit CI passes
[N/A] Blackhole Post commit (if applicable)
[N/A] Model regression CI testing passes (if applicable)
[N/A] Device performance regression CI testing passes (if applicable)
New/Existing tests provide coverage for changes

tests/ttnn/unit_tests/operations/test_maxpool2d.py

ttnn/cpp/ttnn/operations/pool/maxpool/device/kernels/compute/max_pool_multi_core.cpp

.../device/kernels/dataflow/reader_max_pool_2d_multi_core_sharded_with_halo_large_kernel_v2.cpp

...ool/maxpool/device/kernels/dataflow/reader_max_pool_2d_multi_core_sharded_with_halo_wide.cpp

ttnn/cpp/ttnn/operations/pool/maxpool/device/max_pool2d_multi_core_program_factory.cpp

nkpatel-tt · 2024-11-05T08:42:54Z

ttnn/cpp/ttnn/operations/pool/maxpool/device/max_pool2d_multi_core_program_factory.cpp

    if (is_large_kernel) {
        uint32_t max_pool_partials_cb_id = tt::CB::c_intermed1;  // max_pool partials
-        uint32_t max_pool_partials_cb_pagesize = in_cb_sz;
+        uint32_t max_pool_partials_cb_pagesize = out_cb_pagesize;


Minor comment: this variable should be min(out_cb_pagesize, TILE_SIZE * 8 * nbytes).

Hey, looks like a small typo. I updated it to min(out_cb_pagesize, TILE_SIZE * 8 * out_nbytes); let me know if this isn't what you meant!
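The clamp agreed on in this thread can be sketched in isolation. This is a minimal illustration, not the program factory's code: `TILE_SIZE = 32` is an assumed value, and `partials_cb_pagesize` is a hypothetical helper name. Only the expression `min(out_cb_pagesize, TILE_SIZE * 8 * out_nbytes)` comes from the comment above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed value for illustration only; the real TILE_SIZE is defined
// elsewhere in the codebase and may differ.
constexpr std::uint32_t TILE_SIZE = 32;

// Hypothetical helper: page size for the max pool partials circular buffer,
// capped at TILE_SIZE * 8 elements of out_nbytes bytes each.
std::uint32_t partials_cb_pagesize(std::uint32_t out_cb_pagesize, std::uint32_t out_nbytes) {
    assert(out_nbytes > 0);
    return std::min(out_cb_pagesize, TILE_SIZE * 8 * out_nbytes);
}
```

Under these assumptions, a 2-byte output type caps the page at 512 bytes, so a narrow output keeps its own page size while a wide one is clamped.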

...cpp/ttnn/operations/pool/maxpool/device/kernels/compute/max_pool_multi_core_large_kernel.cpp

.../device/kernels/dataflow/reader_max_pool_2d_multi_core_sharded_with_halo_large_kernel_v2.cpp

tests/ttnn/unit_tests/operations/test_maxpool2d.py

…ns (tenstorrent#14531) * tenstorrent#14249: Fixed bug for width and block sharding with large kernel sizes and wide reductions

#14249: Fixed bug for width and block sharding with large kernel sizes and wide reductions

af60c98

wransom-TT requested a review from nkpatel-tt October 31, 2024 18:36

wransom-TT requested review from ayerofieiev-tt, dmakoviichuk-tt, rfurko-tt, cfjchu, TT-BrianLiu, razorback3, dongjin-na, mywoodstock, shwetankTT, sankarmanoj-tt and pavlejosipovic as code owners October 31, 2024 18:36

wransom-TT marked this pull request as draft October 31, 2024 18:42

wransom-TT added 3 commits October 31, 2024 20:22

#0: PR cleanup

f625a25

#0: test skips to pass grayskull

94d2ed7

#0: test skips to pass N300

ecc2bd0

wransom-TT force-pushed the wransom/LargeKernel branch from 8b1f490 to ecc2bd0 Compare November 1, 2024 18:53

wransom-TT added 2 commits November 1, 2024 19:27

#0: minor cleanup

697ccbd

#0: fix for 16 row reductions

c029d9a

wransom-TT marked this pull request as ready for review November 2, 2024 05:36

mywoodstock reviewed Nov 5, 2024

nkpatel-tt reviewed Nov 5, 2024

...cpp/ttnn/operations/pool/maxpool/device/kernels/compute/max_pool_multi_core_large_kernel.cpp (outdated, resolved)

nkpatel-tt reviewed Nov 5, 2024

.../device/kernels/dataflow/reader_max_pool_2d_multi_core_sharded_with_halo_large_kernel_v2.cpp (resolved)

wransom-TT added 3 commits November 6, 2024 09:13

Merge branch 'main' into wransom/LargeKernel

e5934e3

#0: PR updates 1

4eaddf1

Merge branch 'main' into wransom/LargeKernel

12f82ee

mywoodstock approved these changes Nov 6, 2024

tests/ttnn/unit_tests/operations/test_maxpool2d.py (resolved)

wransom-TT requested a review from nkpatel-tt November 6, 2024 23:44

wransom-TT added 2 commits November 7, 2024 10:04

Merge branch 'main' into wransom/LargeKernel

18b3008

#0: PR updates 2

476e18d

TT-BrianLiu approved these changes Nov 8, 2024

wransom-TT merged commit 8ca460b into main Nov 8, 2024
118 checks passed

wransom-TT deleted the wransom/LargeKernel branch November 8, 2024 17:43
