Address issues of top-k op #16670

asandhupatlaTT · 2025-01-13T17:13:23Z

Ticket

Link to Github Issue

Problem description

A few flag and input combinations (k, largest, sorted) are not supported.

What's changed

Change the compute kernel to support those flags, and add code to support k=64 on the tt-metal side.

Checklist

Post commit CI passes : https://github.com/tenstorrent/tt-metal/actions/runs/12860740063
Blackhole Post commit (if applicable): https://github.com/tenstorrent/tt-metal/actions/runs/12860740804
Model regression CI testing passes (if applicable) : https://github.com/tenstorrent/tt-metal/actions/runs/12860741430
Device performance regression CI testing passes (if applicable) : https://github.com/tenstorrent/tt-metal/actions/runs/12860742232
(For models and ops writers) Full new models tests passes
New/Existing tests provide coverage for changes

github-actions

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

ttnn/cpp/ttnn/operations/reduction/topk/topk.cpp

asandhupatlaTT · 2025-01-20T06:07:22Z

Please refer to #13235 (comment) for support of sorted=False

ttnn/cpp/ttnn/operations/reduction/topk/topk.cpp

bbradelTT · 2025-01-20T21:41:57Z

tests/ttnn/unit_tests/operations/test_topk.py

-        (1, 1, 2048, 64, 32),
-        (1, 1, 32, 32768, 32),
-        (1, 1, 8192, 64, 32),
+        (1, 1, 64, 64, 2, 32),


The issue mentions a k of 50. That should be tested as well.

Yes, I tried k=50, but if I'm not mistaken, tt-budha (and thereby my code) only supports powers of 2.
I'll see what needs to be done to support non-powers-of-2 values.

One idea: convert K to the nearest power of 2, run the LLK/compute kernel, then reshape or slice the output to the desired shape.

It would be easiest to use the separation of ExecuteTopK and TopK and do what you suggested, except convert K to either 32 or 64.

ExecuteTopK:

invokes TopK with k either 32 or 64

then reshape or slice output to desired shape

It is correct that the TopK algorithm only supports powers of 2 for K. Any non-power-2 values need to be rounded UP to the nearest supported K value, and then you can truncate the output if needed. Rounding down won't work.
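The round-up-then-truncate approach described here can be sketched in plain Python. This is only a host-side illustration, not the tt-metal implementation: `SUPPORTED_K`, `padded_k`, `topk_reference`, and `topk_with_truncation` are hypothetical names, and the reference top-k stands in for the device kernel. It also shows why rounding down cannot work: truncation can only drop results, never produce the missing ones.

```python
# Assumption: the kernel-level K values are 32 and 64, as suggested in this thread.
SUPPORTED_K = (32, 64)


def padded_k(k):
    """Return the smallest supported K that is >= k (always rounds up)."""
    if k < 1:
        raise ValueError("k must be positive")
    for candidate in SUPPORTED_K:
        if k <= candidate:
            return candidate
    raise ValueError(f"k={k} exceeds the largest supported K ({SUPPORTED_K[-1]})")


def topk_reference(row, k):
    """Stand-in for the device kernel: k largest values and their indices, sorted descending."""
    order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
    return [row[i] for i in order], order


def topk_with_truncation(row, k):
    """Run top-k with the padded K, then slice the output back down to the requested k."""
    values, indices = topk_reference(row, padded_k(k))
    return values[:k], indices[:k]
```

For the k=50 case from the issue, `padded_k(50)` selects 64 and the final slice keeps the first 50 entries of the sorted result.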

@asandhupatlaTT I see only 32 and 64 in tests/ttnn/unit_tests/operations/test_topk.py
What happens when you try 2, 4, 8, and 16?

@bbradelTT Since my kernel is similar to tt-budha, it should work, but I'll add a few test cases in the next patch.

ttnn/cpp/ttnn/operations/reduction/topk/device/topk_op.cpp

bbradelTT · 2025-01-20T21:49:32Z

ttnn/cpp/ttnn/operations/reduction/topk/device/kernels/dataflow/writer_local_topk.cpp

@@ -34,9 +34,9 @@ void kernel_main() {
    uint32_t final_indices_cb_addr = get_write_ptr(final_indices_cb_index);

    uint64_t noc_final_addr_values =
-        get_noc_addr(noc_final_x, noc_final_y, final_values_cb_addr) + start_wt * tile_bytes_values;
+        get_noc_addr(noc_final_x, noc_final_y, final_values_cb_addr) + start_wt * tile_bytes_values * Kt;
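One way to read the change above is as a stride fix. The sketch below is a simplified model, not the kernel: it assumes each writer deposits Kt tiles and that `start_wt` steps by one per writer, neither of which is confirmed by the diff alone. The helper names and the tile size are hypothetical.

```python
def writer_regions(num_writers, tile_bytes, kt, scale_by_kt):
    """Byte range (start, end) each writer fills, assuming it writes kt tiles."""
    regions = []
    for start_wt in range(num_writers):
        stride = tile_bytes * (kt if scale_by_kt else 1)
        start = start_wt * stride
        regions.append((start, start + kt * tile_bytes))
    return regions


def has_overlap(regions):
    """True if any two byte ranges intersect."""
    ordered = sorted(regions)
    return any(prev_end > next_start
               for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]))


TILE_BYTES = 2048  # hypothetical tile size, for illustration only

# With Kt == 1 (k = 32) both formulas give the same, non-overlapping layout.
k32_old = writer_regions(4, TILE_BYTES, kt=1, scale_by_kt=False)
k32_new = writer_regions(4, TILE_BYTES, kt=1, scale_by_kt=True)

# With Kt == 2 (k = 64) the unscaled offset makes neighbouring writers collide.
k64_old = writer_regions(4, TILE_BYTES, kt=2, scale_by_kt=False)
k64_new = writer_regions(4, TILE_BYTES, kt=2, scale_by_kt=True)
```

Under these assumptions the old offset is only correct when Kt is 1, which is consistent with the commit titled "fix bug where we get wrong writer pointer for k > 32".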


It would be good to clean up how Kt is defined as well in the program factory.

Sorry, I don't understand this comment.

In ttnn/cpp/ttnn/operations/reduction/topk/device/topk_program_factory.hpp Kt is defined by

uint32_t Kt = k % TILE_WIDTH == 0 ? k / TILE_WIDTH : k / TILE_WIDTH + 1;

That indicates that k could be something other than a multiple of TILE_WIDTH (e.g. 1, 2, 3, etc.). If that is not the case, it would be good to at least put a comment, and possibly change the code to just

uint32_t Kt = k / TILE_WIDTH;
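The difference between the two definitions of Kt can be checked with a small Python sketch. `TILE_WIDTH = 32` is assumed here, and `kt_ceil` and `kt_exact` are hypothetical names that mirror the two C++ expressions above.

```python
TILE_WIDTH = 32  # assumption: width of a tile in elements


def kt_ceil(k):
    """Mirrors the current definition: ceiling division of k by TILE_WIDTH."""
    return k // TILE_WIDTH if k % TILE_WIDTH == 0 else k // TILE_WIDTH + 1


def kt_exact(k):
    """Mirrors the simplified definition; only valid when k is a multiple of TILE_WIDTH."""
    if k % TILE_WIDTH != 0:
        raise ValueError(f"k={k} is not a multiple of TILE_WIDTH={TILE_WIDTH}")
    return k // TILE_WIDTH
```

The two agree for k = 32 and k = 64. They only diverge for values such as k = 50, where plain integer division would yield 1 tile instead of 2, so the simplification is safe only if k is always padded to a tile multiple before Kt is computed.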

bbradelTT · 2025-01-20T21:50:12Z

ttnn/cpp/ttnn/operations/reduction/topk/device/kernels/compute/topk_local.cpp

+            cb_wait_front(input_transposed_cb_index, Wt);
+            cb_wait_front(index_transposed_cb_index, Wt);
+
+            while (idx < num_k_sequences) {


This should be a for loop.

I was sticking with the tt-budha code. OK, I'll convert the while loop to a for loop.

bbradelTT · 2025-01-20T21:51:13Z

ttnn/cpp/ttnn/operations/reduction/topk/device/kernels/compute/topk_local.cpp

+
+            cb_push_back(input_transposed_cb_index, Wt);
+            cb_push_back(index_transposed_cb_index, Wt);
+            // print_all_tiles(input_transposed_cb_index, 0);


Commented-out code should be deleted unless there is an explanation for keeping it.

That's a good place to print in case people want to debug this in the future.

I see people adding such commented-out lines in other kernels, for example line 46 (at d0b0f9b) of tt-metal/ttnn/cpp/ttnn/operations/reduction/argmax/device/kernels/reader_argmax_interleaved_multicore.cpp:

// DPRINT << cb_id_intermed0 << " "<< cb_id_intermed1 << " " <<intermed0_addr << " " << intermed1_addr <<" " <<

But I can remove it in the next patch.

bbradelTT · 2025-01-20T21:53:42Z

ttnn/cpp/ttnn/operations/reduction/topk/device/kernels/compute/topk_final.cpp

-                release_dst();
-                direction = !direction;
+
+            while (idx < num_k_sequences) {


Please use for loops where possible. while loops should only be used when for loops would be too awkward.

I was sticking with the tt-budha code. OK, I'll convert the while loop to a for loop.

ttnn/cpp/ttnn/operations/reduction/topk/topk.cpp

bbradelTT · 2025-01-20T22:03:12Z

ttnn/cpp/ttnn/operations/reduction/topk/device/kernels/compute/topk_local.cpp


+    int end_phase = (K <= 64) ? logk - 1 : 5;


Why do you need the ternary operator and this variable? If K == 64, isn't logk - 1 equal to 5?

I was sticking with the tt-budha code.
For K > 64, it's capped at 5.

Will this code be able to deal with K larger than 64? E.g. 128 or 256?

That's the goal, but Radomir said the implementation is not there yet, so 64 is the maximum value we can support for now.
@rdjogoTT let me know if I missed anything.

K=64 was the max value we have tested in the past, however the LLKs should be able to support K>64 as well. It should only require compute kernel-level changes, but it will be complex.

@rdjogoTT If K=128 was supported, would this end phase still be 5?

Yes. Since we load 2 tiles into Dest at a time, the maximum subsequence length we can sort is 64. K=128 would then require some additional steps to get to 128.

Ok, thanks. In that case, we can leave this as is. @asandhupatlaTT please add a comment about only supporting up to 64.
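The cap discussed in this thread can be written out as a small Python sketch. `end_phase` is a hypothetical helper mirroring the C++ expression, and it assumes K is a power of two of at least 2.

```python
def end_phase(k):
    """Mirrors `(K <= 64) ? logk - 1 : 5` for power-of-two K >= 2."""
    if k < 2 or k & (k - 1) != 0:
        raise ValueError("k must be a power of two and at least 2")
    logk = k.bit_length() - 1  # exact log2 for a power of two
    return logk - 1 if k <= 64 else 5


def end_phase_capped(k):
    """Equivalent form that makes the cap explicit."""
    return min(k.bit_length() - 2, 5)
```

For K = 64 both branches of the conditional give 5, so the conditional only matters for K > 64, where the phase stays at 5 because at most a 64-element subsequence can be sorted with two tiles in Dest.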

Signed-off-by: Amruth Sandhupatla <asandhupatla@tenstorrent.com>

asandhupatlaTT mentioned this pull request Jan 18, 2025

Issues to support aten.topk to ttnn.topk conversion #13235

Open

github-actions bot reviewed Jan 18, 2025

asandhupatlaTT force-pushed the asandhupatla/13235 branch from 6e27e82 to d0be909 Compare January 18, 2025 07:32

bbradelTT reviewed Jan 20, 2025

asandhupatlaTT force-pushed the asandhupatla/13235 branch from 2ffbb3d to f39f716 Compare January 21, 2025 14:10

github-actions bot reviewed Jan 21, 2025

github-actions bot reviewed Jan 21, 2025

asandhupatlaTT added 16 commits January 22, 2025 02:53

Each commit carries the sign-off "Signed-off-by: Amruth Sandhupatla <asandhupatla@tenstorrent.com>".

c8c2e3b  add skeleton support for largest & sorted
3d67320  potential patch for k = 64
bf1a99b  k=32 passes but for k=64 faiks: patch1
212eb7b  missing one last sort in rebuild
5a65a8f  passes for all cases except when W=8192
953d389  add test cases for testing
7a0ee2d  fix bug when we rebuild same tile twice at same time
d146cc0  single core top-k works for k=64
e9288d1  patch 1 for multicore support
6743b03  fix bug where we get wrong writer pointer for k > 32
cab6201  support all dims
67772b1  fix test cases of top-k
71d4360  cleanup
4f86077  add sorted flags to test case
1f3d380  support K which is not power of 2
9d4eef9  support n-d tensor

asandhupatlaTT force-pushed the asandhupatla/13235 branch from 22fe8fd to 9d4eef9 Compare January 22, 2025 02:54
