Commit
[OneEmbedding] Try quantize in Embedding Shuffle (#7912)
* try to add quantize in embedding_shuffle
* add half support
* add numpy quantized test
* Add embedding gradient shuffle quantize
* fix row blockreduce
* use segment quantize and dump right data
* decouple quantize and dequantize kernel
* add first version of quantize, need test and debug
* fix quantize factor gather error
* fix quantize and dequantize kernel
* remove useless code
* try to add embedding gradient shuffle
* fix shuffle factor size bug
* Add simple env_var control, need refine
* configure the cuda stream
* Add simple warp impl
* Add fully warp impl
* use pack to optimize dequantize kernel
* fix half accuracy
* add more check for embedding shuffle kernel
* Add fully quantize unittest
* simplify code
* Support ComputeType and fix unittest
* fix env name as ONEFLOW_ONE_EMBEDDING_ENABLE_QUANTIZE_COMM
* fix env name
* refine with ComputeType
* fix embedding shuffle unittest
* add round half quantize func
* fix unittest
* fix local test to use different id
* fix jc comment
* fix ONEFLOW_ONE_EMBEDDING_ENABLE_QUANTIZE_COMM to ONEFLOW_ONE_EMBEDDING_ENABLE_QUANTIZED_COMM
* fix name for unittest
* add check for column limit
* add check in enable
* remove unused variable
* add log
* add ifdef with cuda
* fix compile error
* use std::min

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Juncheng <liujuncheng1022@gmail.com>
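
For readers unfamiliar with the idea behind the log entries above (a per-row quantize factor, separate quantize and dequantize kernels, a round-half quantize function), here is a minimal CPU sketch of row-wise symmetric int8 quantization. It is an illustration under assumptions, not the code from this commit: the function names `round_half_away`, `quantize_row`, and `dequantize_row` are hypothetical, the int8 range of [-127, 127] and the max-abs factor are assumed, and the actual kernels run on CUDA and are enabled through `ONEFLOW_ONE_EMBEDDING_ENABLE_QUANTIZED_COMM`.

```python
import math
from typing import List, Tuple


def round_half_away(x: float) -> int:
    """Round to nearest integer, ties away from zero (assumed rounding mode)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def quantize_row(row: List[float]) -> Tuple[List[int], float]:
    """Quantize one embedding row to int8 values plus a per-row factor.

    The factor is the row's maximum absolute value (an assumption); it has to
    travel with the quantized row so the receiver can dequantize.
    """
    factor = max((abs(v) for v in row), default=0.0)
    if factor == 0.0:
        # An all-zero row has nothing to scale.
        return [0] * len(row), 0.0
    scale = 127.0 / factor
    quantized = [max(-127, min(127, round_half_away(v * scale))) for v in row]
    return quantized, factor


def dequantize_row(quantized: List[int], factor: float) -> List[float]:
    """Recover approximate float values from int8 values and the row factor."""
    return [q * factor / 127.0 for q in quantized]


if __name__ == "__main__":
    row = [0.5, -1.0, 0.25, 0.0]
    q, factor = quantize_row(row)
    print(q, factor)
    print(dequantize_row(q, factor))
```

In this sketch each value is sent as one byte plus one shared factor per row, which is where the communication saving would come from; the cost is a rounding error of at most about half a quantization step (`factor / 254`) per element.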