chore(main): release 0.0.5 #232

Merged
merged 5 commits into from
Jun 20, 2024

Conversation

github-actions[bot]

@github-actions github-actions bot commented May 4, 2024

🤖 I have created a release beep boop

0.1.0 (2024-06-20)

Highlights

Acknowledgement

We thank @ibsidorenko, @LiuXiaoxuanPKU, @Yard1, @AgrawalAmey, @xuzhenqi, @mgerstgrasser, @esmeetu, @yz-tang, @HSQ79815, @Qubitium, @shreygupta2809, @sighingnow, @vinx13,
@tqchen, @merrymercy, @comaniac and many others for their contributions and helpful discussions for the 0.0.5 release.

Refactor

  • support any GQA group size for tensor-cores kernels (#301) (c111ca)
  • support any page size for tensor-cores kernels (#306) (82fd8c)

Features

  • add use_tensor_cores option to decode kernels to accelerate GQA (#317) (3b50dd5)
  • add group gemm operators (#282) (e08ba42)
  • initial support of distributed operators (#289) (03553da)
  • initial support of logits hook (#298) (ab1e2ad)
  • Separate Q and KV dtypes for decode (#286) (5602659)
  • support CUDA graph for batched multi-query (prefill/append) attention (#275) (83ceb67)
  • support CUDA graph for batched multi-query (prefill/append) attention (#277) (24cc583)
  • support custom attention mask in prefill/append attention kernels (#266) (7304282)
  • fused speculative sampling kernels (#259) (cea2bb)
  • expose sampling APIs in PyTorch (#238) (092902)

Performance Improvements


This PR was generated with Release Please. See documentation.

@github-actions github-actions bot force-pushed the release-please--branches--main branch from 8516bd5 to f30a8ef Compare May 4, 2024 00:13
@github-actions github-actions bot force-pushed the release-please--branches--main branch 2 times, most recently from e0f50ac to da332d5 Compare May 27, 2024 10:11
@github-actions github-actions bot changed the title chore(main): release 0.0.5 chore(main): release 0.1.0 May 28, 2024
@github-actions github-actions bot force-pushed the release-please--branches--main branch 6 times, most recently from c2a4b09 to 2d495a7 Compare June 4, 2024 05:22
@github-actions github-actions bot force-pushed the release-please--branches--main branch 6 times, most recently from c2f98d5 to ad28cf8 Compare June 11, 2024 05:15
@github-actions github-actions bot force-pushed the release-please--branches--main branch 6 times, most recently from b35df6b to 588ed9d Compare June 20, 2024 06:47
@github-actions github-actions bot force-pushed the release-please--branches--main branch from 588ed9d to d803bed Compare June 20, 2024 08:14
@yzh119 yzh119 changed the title chore(main): release 0.1.0 chore(main): release 0.0.5 Jun 20, 2024
@yzh119 yzh119 merged commit 5c05676 into main Jun 20, 2024
@yzh119 yzh119 deleted the release-please--branches--main branch June 20, 2024 17:15