-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
get_noc_addr_from_bank_id
needed for first eltwise PyKernel
#2050
Conversation
8817e25
to
50deef5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to deal with the template arg
50deef5
to
41aa689
Compare
@@ -22,6 +22,8 @@ module attributes {} { | |||
%4 = "ttkernel.get_noc_addr_xy"(%c0_i32, %c0_i32, %c262208_i32) : (i32, i32, i32) -> !ttkernel.noc_addr | |||
// CHECK: emitc.call_opaque "noc_async_read"[[C:.*]] | |||
"ttkernel.noc_async_read"(%4, %c262432_i32, %c32_i32) : (!ttkernel.noc_addr, i32, i32) -> () | |||
// CHECK: [[C:.*]] = emitc.call_opaque "get_noc_addr_from_bank_id"[[C:.*]] {template_args = [#emitc.opaque<"true">]} [[C:.*]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a unit test for the ttkernel -> emitc conversion. I'm not sure if this belongs in this PR and if so, should I add it for the other ttkernel ops that were recently added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for testing your changes! I propose some small tweaks. This existing test is a little hard to read because it abuses the [[C:.*]]
to capture some substrings into C
that is never used. It is also not necessary to match the end of the line. I suggest we make things more readable with this version of your change:
%bank_id = arith.constant 1 : i32
%addr_offset = arith.constant 262400 : i32
%noc_addr = "ttkernel.get_noc_addr_from_bank_id"(%bank_id, %addr_offset) : (i32, i32) -> !ttkernel.noc_addr
// CHECK: = emitc.call_opaque "get_noc_addr_from_bank_id"({{.*}}) {template_args = [#emitc.opaque<"true">]}
This makes the call more readable and we are not matching anything we don't need.
(We also have test/ttmlir/Translate/TTKernel/ttkernel_{noc, tensix}.mlir
that test cpp translation, but I want to clean those up via a separate ticket.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, ty for the pointer!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleaner version of ttkernel.mlir test, per conversation
41aa689
to
449a235
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good.
This PR implements the last TTKernel op needed by eltwise kernel example
get_noc_addr_from_bank_id
is templated indataflow_api.h
on the metal sideFurther investigation shows that every kernel written sets
DRAM
to true.Similar API like
get_noc_addr
is templated the same way and this template is not exposed in the tablegen for mlir (would love to know how to expose it), so I will also not expose it.