-
Notifications
You must be signed in to change notification settings - Fork 163
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat: support cached cos/sin in rope APIs (#585)
As requested in #530 , this PR implements the RoPE with cached cos/sin embeddings, which is more flexible in some use cases. In our previous RoPE implementations, cos/sin values are computed on-the-fly inside kernels with float32 instead using cached values. In this PR we found that if we use f16 cos/sin cache, the rope result will have a large discrepancy compared to our original implementation `flashinfer.apply_rope` (which stores cos/sin with fp32). So we require the `cos_cache` and `sin_cache` to use fp32 data type. cc @dreaming-panda @ByronHsu
- Loading branch information
Showing
8 changed files
with
465 additions
and
208 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.