Conversation
(force-pushed from f6aa9e9 to 28f07a7)
(force-pushed from 28f07a7 to 195cf6c)
@@ -194,6 +194,100 @@ MXNET_BINARY_MATH_OP_NC(right, b);
MXNET_BINARY_MATH_OP_NC(mul, a * b);
#ifndef _WIN32
Could you elaborate why Windows is not supported?
It's supported with a different implementation due to limitations of the Windows VS compiler.
The fact that the corresponding unit tests do not discriminate against Windows machines, and that they passed both the Windows CPU and GPU checks, means this feature is also supported on Windows.
Please do make sure you have some grasp of the big picture of a PR before you block one.
I'm well aware of the unit tests passing on Windows, thanks for the helpful hint.
Still, can you elaborate which part exactly is not supported by the Windows compiler? Basically the whole PR excludes Windows, and that seems off. Having an entirely different implementation for a different OS is not something I see regularly, so I'm eager to learn.
It was due to the C1002: out of heap space error we've encountered many times.
We are not generating the extra kernels (code) on Windows, to prevent hitting that error on Windows machines.
If you still think Windows is excluded, I can only think you have not given the code changes a complete look:
- There are also some parts where we have `#ifdef _WIN32`, such as: https://github.com/apache/incubator-mxnet/pull/16699/files#diff-c383124e9cb87f51ac456a96b799615aR73
- We also have parts with `#else` blocks, such as: https://github.com/apache/incubator-mxnet/pull/16699/files#diff-c383124e9cb87f51ac456a96b799615aR73
This is indeed a workaround for an issue that we could not solve on our own. I also tried upgrading the VS compiler locally, and that does not get this issue out of our way, so that's why we have different implementations of this same feature; otherwise we would have to drop this new feature for Windows users entirely.
It's good to be eager to learn, but IMHO blocking a PR without a complete look and a very solid reason is not a good (nor polite) way of demonstrating your eagerness.
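A minimal sketch, assuming nothing beyond what is described above, of the guard pattern being discussed (this is not the actual MXNet code; the kernel names `MixedBinaryKernel` and `SameTypeBinaryKernel` are hypothetical): outside Windows, a dedicated kernel can be instantiated per input-type pair, while the `_WIN32` branch falls back to a single same-type kernel so that far fewer templates are generated and the C1002 out-of-heap-space error is avoided.

```cpp
#include <cstddef>

#ifndef _WIN32
// Non-Windows: a dedicated kernel is instantiated for every (LType, RType, OType)
// combination, which is fast but generates a lot of code.
template <typename LType, typename RType, typename OType>
void MixedBinaryKernel(const LType* lhs, const RType* rhs, OType* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<OType>(lhs[i]) + static_cast<OType>(rhs[i]);
  }
}
#else
// Windows: inputs are assumed to have been cast to the output type beforehand,
// so only one kernel per output type is instantiated and the compiler heap is spared.
template <typename OType>
void SameTypeBinaryKernel(const OType* lhs, const OType* rhs, OType* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = lhs[i] + rhs[i];
  }
}
#endif  // _WIN32
```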
@marcoabreu Appreciate your review. I can assure you that Windows is absolutely not excluded from supporting mixed precision the way Unix does. @haojin2 has gone through thorough trial and error to make it work with the Windows compilation toolchain, which I believe very few of us would be willing to get our hands dirty to make happen, as compilation on Windows platforms is outside our domain knowledge. This is an extremely non-trivial task that took @haojin2 many days and nights to accomplish. So kudos to @haojin2. We are trying to merge this to meet a deadline. If you feel your concerns/questions have not been addressed after @haojin2's explanation, could you raise them so that we can help close the gap? Thanks.
(force-pushed from 195cf6c to 1ae73ea)
(force-pushed from 1ae73ea to b2d501f)
I've seen that out-of-heap-space error a lot. Really annoying. You may know this already, but just FYI: sometimes I was able to track it down to too much template nesting that was killing the compiler, when nesting too many MXNET_TYPE_SWITCH and other similar macros within templates, where they end up generating lots of permutations of code.
@cjolivier01 Yeah, that's exactly part of what I did in #16711; also in this PR you can see that there are some places where I gave up using the type switches in order to only generate kernels for what I really need. I think we probably need a big revisit of our operator implementations to optimize the macro usage. Thanks for your insight.
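As a rough, self-contained illustration of the bloat being discussed (this is not MXNet's real MXNET_TYPE_SWITCH macro; the dtype list and the `AddKernel`/`DispatchAdd` names are made up): nesting one type switch inside another instantiates a kernel for every (lhs, rhs) dtype pair, so the number of instantiations grows with the square of the number of supported dtypes, and restricting either switch to only the types actually needed cuts that down quickly.

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>

enum class DType { kFloat32, kInt32, kBool };

// The kernel that gets stamped out once per (LType, RType) pair.
template <typename LType, typename RType>
void AddKernel(const LType* lhs, const RType* rhs, float* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = static_cast<float>(lhs[i]) + static_cast<float>(rhs[i]);
}

// Toy stand-in for a type-switch macro: maps a runtime dtype to a compile-time type.
#define TYPE_SWITCH(dtype, CType, ...)                                     \
  switch (dtype) {                                                         \
    case DType::kFloat32: { using CType = float;   __VA_ARGS__ } break;    \
    case DType::kInt32:   { using CType = int32_t; __VA_ARGS__ } break;    \
    case DType::kBool:    { using CType = bool;    __VA_ARGS__ } break;    \
  }

// Nesting the switch: 3 x 3 = 9 instantiations of AddKernel are generated here.
// With ~10 dtypes and many operators, this kind of growth is what exhausts the compiler heap.
void DispatchAdd(DType ltype, DType rtype,
                 const void* lhs, const void* rhs, float* out, std::size_t n) {
  TYPE_SWITCH(ltype, LType, {
    TYPE_SWITCH(rtype, RType, {
      AddKernel(static_cast<const LType*>(lhs),
                static_cast<const RType*>(rhs), out, n);
    });
  });
}

int main() {
  int32_t a[2] = {1, 2};
  float b[2] = {0.5f, 0.25f};
  float out[2];
  DispatchAdd(DType::kInt32, DType::kFloat32, a, b, out, 2);
  std::cout << out[0] << " " << out[1] << "\n";  // prints: 1.5 2.25
}
```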
* support mixed-precision binary operations
* improvements to documentation and error messages
…, #16792) (#16832)
* Fix nightly build (#16773)
* Remove dependency on tvmop.conf
* Fix binaries dependencies for ni nightly
* Add comments
* Update tvmop.py
* Fix rebase
* Fix (#16781)
* Speed fused_op compilation by caching ptx and jit-compiled functions (#16783)
* [Numpy] Fix collect_params().zero_grad() in gluon numpy interface (#16716)
* fix zero_grad
* Update parameter.py
* add test
* fix
* Mixed data type binary ops (#16699)
* support mixed-precision binary operations
* improvement for documentations and error messages
* Support boolean elemwise/broadcast binary add, multiply and true_divide (#16728)
* support pure boolean elemwise/broadcast binary op
* switch to unique_tpr
* fix the test error
* Fix rtrue_divide grad (#16769)
* Fix rtrue_divide_scalar
* More tests
* Fix numpy-compatible mean output type for integer inputs (#16792)
* fix mean output type for integer inputs
* enable for windows
Description
- Coverage for true_divide between floating types and integer types (including boolean).
- Coverage for multiply between floating types and boolean type (mainly for ).
- Also a side fix for cumsum with boolean inputs.
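To make the intended behaviour concrete, here is a small standalone sketch (plain C++, not MXNet code; the `promoted_t` and `mixed_true_divide` names are illustrative only) of the kind of type promotion the description refers to: integer and boolean operands are promoted to a floating type before a true division, so `int / float` and `bool / int` both produce floating-point results.

```cpp
#include <cstdint>
#include <iostream>
#include <type_traits>

// Promote integral operands (including bool) to float; leave floating types alone.
template <typename T>
using promoted_t = std::conditional_t<std::is_integral<T>::value, float, T>;

// A true division that always yields a floating-point result, mirroring the
// mixed-precision behaviour described above.
template <typename LType, typename RType>
auto mixed_true_divide(LType a, RType b) {
  using OType = std::common_type_t<promoted_t<LType>, promoted_t<RType>>;
  return static_cast<OType>(a) / static_cast<OType>(b);
}

int main() {
  std::cout << mixed_true_divide(std::int32_t{3}, 2.0f) << "\n";  // int / float    -> 1.5
  std::cout << mixed_true_divide(true, std::int8_t{2}) << "\n";   // bool / int     -> 0.5
  std::cout << mixed_true_divide(7.0, 2.0) << "\n";               // double / double -> 3.5
}
```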
Changes
- true_divide mixed-precision support
- multiply mixed-precision support

Comments