Commit
Update swiglu and geglu forward: zeros_like -> empty_like (#217)
## Summary

This PR improves the performance of the swiglu and geglu forward passes by replacing `zeros_like` with `empty_like`. `zeros_like` allocates the output tensor and then launches a separate kernel to fill it with zeros, whereas `empty_like` only allocates, so the extra kernel launch is avoided.

## Testing Done

Testing is covered by the existing `test_geglu.py` and `test_swiglu.py`.

- Hardware Type: A100-80G-PCIe
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence

---------

Co-authored-by: Byron Hsu <byronhsu1230@gmail.com>
Co-authored-by: Shao Tang <tangshao28@gmail.com>
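
The `zeros_like` to `empty_like` swap is only safe if the forward kernel writes every element of the output buffer, which is assumed here rather than shown in this commit message. The sketch below illustrates that condition in plain Python. It is not the actual Triton kernel: `swiglu_forward_into` is a hypothetical stand-in, and a list pre-filled with random values plays the role of the uninitialized memory that `empty_like` would return.

```python
import math
import random


def silu(x: float) -> float:
    # SiLU activation: x * sigmoid(x)
    return x / (1.0 + math.exp(-x))


def swiglu_forward_into(a, b, out):
    # Hypothetical stand-in for the forward kernel (not the real Triton code).
    # It overwrites every element of `out`, so the prior contents never matter.
    for i in range(len(out)):
        out[i] = silu(a[i]) * b[i]
    return out


random.seed(0)
a = [random.uniform(-3, 3) for _ in range(8)]
b = [random.uniform(-3, 3) for _ in range(8)]

# Plays the role of zeros_like: the buffer is filled with zeros up front.
zero_init = [0.0] * len(a)
# Plays the role of empty_like: the buffer holds arbitrary leftover values.
garbage_init = [random.uniform(-1e6, 1e6) for _ in a]

out_zero = swiglu_forward_into(a, b, zero_init)
out_garbage = swiglu_forward_into(a, b, garbage_init)

# Both buffers end up identical because every element was overwritten.
results_match = out_zero == out_garbage
```

If a kernel skipped some output elements (for example, padded positions), the two buffers would differ and the zero fill would still be needed.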