oneDNN v3.7 release notes #2481
base: rls-v3.7
* [experimental] Extended microkernel API:
  * Introduced int4 quantization support.
  * Fpmath mode API
I don't think we did that in external microkernel API (and cannot find the related commits).
However, we did add a new query for the B matrix packing type.
Co-authored-by: Mourad Gouicem <mourad.gouicem@intel.com>
* Improved fp16/bf16 softmax performance with relaxed [accumulation mode](https://oneapi-src.github.io/oneDNN/dev_guide_attributes_accumulation_mode.html#doxid-dev-guide-attributes-accumulation-mode).
* Added support and improved performance for fp8 matmul with bf16/fp16.

## Intel Graphics Products
@karturov, please review and update this section.
* Improved performance of the following subgraphs with Graph API:
  * Scaled dot-product Attention (SDPA) [with causal mask](https://oneapi-src.github.io/oneDNN/dev_guide_graph_sdpa.html#doxid-dev-guide-graph-sdpa)
  * Scaled dot-product Attention (SDPA) [with compressed key and value](https://oneapi-src.github.io/oneDNN/dev_guide_graph_sdpa_compressed_kv.html#doxid-dev-guide-graph-sdpa-compressed-kv)

## AArch64-based Processors
@jondea, @theComputeKid, could you please help summarize the AArch64 improvements?
Co-authored-by: Tao Lv <tao.a.lv@intel.com>
Co-authored-by: Vadim Pirogov <vadim.o.pirogov@intel.com>
Minor changes suggested, please incorporate as you see fit
Minor comments suggested, please incorporate as you see fit! Thanks!
Co-authored-by: Ranu Kundu <ranu.kundu@intel.com>
This PR includes a release notes draft based on the information from the PRs for the contributors to review. Your additions and corrections are highly appreciated.