[Large Tensor] Add LT support for NN optimizers and 1 activation function #17444
Conversation
@ChaiBapchya can you paste the opperf test run logs showing these ops run fine without giving SIGSEGV?
Looks like a lot of ops didn't have the correct type for the index that maps over input values. LGTM, but I would like to see logs confirming this doesn't segfault.
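For reference, a minimal sketch of the kind of opperf check being requested here, following the usage shown in the repo's benchmark/opperf README. The op choice, shapes, and the passing of a scalar hyper-parameter (`lr`) through the inputs dict are assumptions, not the author's actual command:

```python
import mxnet as mx
from benchmark.opperf.utils.benchmark_utils import run_performance_test

# Benchmark one optimizer update op on a >2**32-element tensor. Assumes a
# large-tensor build (USE_INT64_TENSOR_SIZE=1) and ~35 GB of free RAM.
res = run_performance_test(
    mx.nd.sgd_update,
    run_backward=False,                 # optimizer updates have no backward pass
    dtype='float32',
    ctx=mx.cpu(),
    inputs=[{"weight": (2 ** 32 + 1,),  # shapes are expanded to NDArrays
             "grad": (2 ** 32 + 1,),
             "lr": 0.1}],               # assumes scalars pass through as-is
    warmup=1,
    runs=1,
)
print(res)  # per-op timing; the point of the check is simply "no SIGSEGV"
```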
cc @szhengac |
I think this needs to be tested by training a large model.
@szhengac Which model? Which dataset? Can you give some specifics?
I think a toy example with a very wide dense layer is good.
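A minimal sketch of such a toy test, assuming Gluon and plain SGD, with sizes chosen so the Dense weight matrix alone exceeds 2**32 elements (all names and shapes here are illustrative):

```python
import mxnet as mx
from mxnet import autograd, gluon

# With 4100 input features and 2**20 output units, the Dense weight alone has
# 2**20 * 4100 > 2**32 elements (~17 GB in float32), so each optimizer step
# updates a genuinely large tensor. Needs ~35 GB of free device memory.
ctx = mx.gpu(0)                          # or mx.cpu()
net = gluon.nn.Dense(2 ** 20)
net.initialize(ctx=ctx)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
loss_fn = gluon.loss.L2Loss()

x = mx.nd.random.normal(shape=(32, 4100), ctx=ctx)
y = mx.nd.zeros((32, 2 ** 20), ctx=ctx)
for epoch in range(10):
    with autograd.record():
        loss = loss_fn(net(x), y)
    loss.backward()
    trainer.step(batch_size=32)          # exercises the fixed *_update kernels
    print(epoch, loss.mean().asscalar())
```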
So I tested MXNet (built from source using this branch).
Results for training 10 epochs on 8 GPUs:
Is this fine?
@mxnet-label-bot add [pr-awaiting-review]
Previously they all used to give SIGSEGV; now they don't. @access2rohit
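A minimal sketch of the kind of smoke test behind this claim, assuming a build compiled with USE_INT64_TENSOR_SIZE=1 and enough RAM for three ~17 GB arrays (the exact ops and shapes the author ran are not shown in this thread):

```python
import mxnet as mx

LARGE = (2 ** 32 + 1,)       # just past the old int32 index limit
weight = mx.nd.ones(LARGE)
grad = mx.nd.ones(LARGE)
mom = mx.nd.zeros(LARGE)

# With int indices these kernels overflowed and crashed with SIGSEGV;
# with index_t they complete.
mx.nd.sgd_update(weight, grad, lr=0.1, out=weight)
mx.nd.sgd_mom_update(weight, grad, mom, lr=0.1, momentum=0.9, out=weight)
mx.nd.waitall()              # force execution so a crash would surface here
print("updates completed without SIGSEGV")
```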
Can you also test the optimizer ops with a large sparse tensor? Currently, SGD, Adagrad, Adam, and FTRL support sparse updates.
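A hedged sketch of what such a sparse test can look like: a row_sparse weight/gradient pair with a huge leading dimension but only a few stored rows, fed to the SGD update (shapes, indices, and values here are illustrative):

```python
import mxnet as mx

# row_sparse arrays: huge logical shape, only three rows actually stored,
# so no ~17 GB allocation is needed to exercise the large-index code path.
shape = (2 ** 32 + 1, 2)
indices = mx.nd.array([0, 2 ** 31, 2 ** 32], dtype='int64')
data = mx.nd.ones((3, 2))
weight = mx.nd.sparse.row_sparse_array((data, indices), shape=shape)
grad = mx.nd.sparse.row_sparse_array((data, indices), shape=shape)

# sgd_update dispatches on storage type, so row_sparse inputs take the
# sparse code path; wd defaults to 0.
mx.nd.sgd_update(weight, grad, lr=0.1, out=weight)
mx.nd.waitall()
print(weight.data.asnumpy())   # stored rows: 1 - 0.1 * 1 = 0.9
```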
Thanks for the help with passing a sparse array.
Output:
LGTM
[Large Tensor] Add LT support for NN optimizers and 1 activation function (apache#17444) * fix hard sigmoid * change int i to index_t i for all Kernel Map functions * fix lint * size_t/index_t fix
Description
Add large tensor support to the NN optimizer *_update operators and one activation function (hard_sigmoid).
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
Tested hard_sigmoid with LT input: Pass.
The rest of the *_update functions can't be validated numerically with random_normal inputs, as those produce NaNs (even for shapes < 2**32), so their outputs were not checked. They no longer give a segmentation fault, however, which was previously the problem due to the lack of large tensor support.
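A sketch of the hard_sigmoid check under the same assumptions as above (a large-tensor build and ~17 GB of RAM for the input); using mx.nd.ones also sidesteps the NaN issue seen with random_normal, and the expected output is easy to verify by hand:

```python
import mxnet as mx

a = mx.nd.ones((2 ** 32 + 1,))   # > 2**32 elements, ~17 GB in float32
out = mx.nd.hard_sigmoid(a)      # previously segfaulted at this size
mx.nd.waitall()
# default alpha=0.2, beta=0.5, so hard_sigmoid(1) = 0.2 * 1 + 0.5 = 0.7
print(out[0].asscalar())         # expect 0.7
```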