[CUDNN] Add cuDNN as a Relay partitioning target (BYOC) #10871
Conversation
LGTM modulo two nits.
python/tvm/relay/op/contrib/cudnn.py
Outdated
assert isinstance(partition.body.op, relay.Function)

global_name = str(partition.attrs.global_symbol)
target = tvm.target.cuda()
Just noticed this: I think Target.current() is better so that the CUDA params are not lost.
I had a go at this based on your suggestion, but it seems the Target only ends up in the context if you use the 'with' way of specifying targets (not when passing them directly to relay.build). In an ideal world, this would get plumbed directly through the BYOC/partitioning mechanism, but I think that exceeds the immediate scope of this PR. Perhaps we could leave it as a future improvement for now?
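For illustration, a minimal runnable sketch of the distinction being described here (standard TVM API, nothing specific to this PR):

```python
import tvm

target = tvm.target.cuda()

# Outside any target scope the context stack is empty, so a BYOC hook
# calling Target.current() would see nothing, even if the same target
# were passed directly to relay.build.
assert tvm.target.Target.current(allow_none=True) is None

# The 'with' form pushes the target onto the context stack, so
# Target.current() can recover it (including its CUDA params).
with target:
    assert tvm.target.Target.current().kind.name == "cuda"
```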
python/tvm/relay/op/contrib/cudnn.py
Outdated
return _register


@tvm._ffi.register_func("relay.ext.cudnn")
Now would be a good time to hoist this boilerplate into a library_byoc.py helper or something similar?
Done - let me know what you think.
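For readers following along, a rough sketch of the kind of helper being discussed. The module name library_byoc.py comes from the comment above, and register_external_codegen is a hypothetical name for illustration, not the code that actually landed:

```python
import tvm


def register_external_codegen(compiler_name, lower_partition):
    """Hypothetical helper hoisting the per-library boilerplate: register
    `lower_partition` as the external codegen hook relay.ext.<compiler_name>.
    """

    @tvm._ffi.register_func(f"relay.ext.{compiler_name}")
    def _codegen(partition):
        # Hand each partitioned Relay function to the library-specific
        # lowering routine (e.g. one that emits cuDNN calls).
        return lower_partition(partition)

    return _codegen
```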
LGTM, just trigger CI again. It looks like a random failure.
Compare: 5c2e300 to 0e06391
It looks like some flaky test on aarch64.
This adds infrastructure to support offloading of Relay patterns to cuDNN. In this initial commit, only softmax is supported. Later PRs will add support for more operators, including some limited fused patterns.
cc @masahi @mbrookhart @junrushao1994 PTAL and merge if you're happy :)
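As a usage sketch (hedged: this assumes the new module follows the usual BYOC convention of exposing a partition_for_cudnn helper, and requires a CUDA/cuDNN-enabled TVM build):

```python
import tvm
from tvm import relay
from tvm.relay.op.contrib.cudnn import partition_for_cudnn

# A trivial module containing the one op this PR offloads: softmax.
x = relay.var("x", shape=(1, 8), dtype="float32")
mod = tvm.IRModule.from_expr(relay.nn.softmax(x))

# Partition the graph so supported patterns are wrapped in cuDNN regions.
mod = partition_for_cudnn(mod)

# Build with the target in scope (see the Target.current() discussion above);
# the partitioned regions are lowered via the relay.ext.cudnn hook.
with tvm.target.cuda():
    lib = relay.build(mod)
```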
SGTM!
* [CUDNN] Add cuDNN as a Relay partitioning target (BYOC): adds infrastructure to support offloading of Relay patterns to cuDNN. In this initial commit, only softmax is supported.
* Refactor common TE BYOC code into separate file
* Add test guard