
[Pallas] Make a FlashAttention Wrapper #6785

Merged
merged 6 commits into master from alanwaketan/flash_attention_3 on Mar 20, 2024

Conversation

alanwaketan
Collaborator

Summary:
This pull request introduces a FlashAttention wrapper that aims to:

  1. Override some default settings to get the best performance out of the box.
  2. Ease the UX so that users don't need to do all the custom_kernel paperwork (see the usage sketch after the test plan).

Test Plan:
PJRT_DEVICE=TPU python test/test_pallas.py -v -k test_flash_attention_wrapper
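
As a rough illustration of the "ease the UX" goal, here is a minimal sketch of how the wrapper might be called. The import path and signature are assumptions inferred from the test name (`test_flash_attention_wrapper`) and the `custom_kernel` mention in the summary, not confirmed by this PR text:

```python
import torch
import torch_xla.core.xla_model as xm
# Assumed import path; the PR only says the wrapper lives behind custom_kernel.
from torch_xla.experimental.custom_kernel import flash_attention

device = xm.xla_device()

# Toy inputs shaped [batch, num_heads, seq_len, head_dim]; TPU Pallas kernels
# generally want dimensions aligned to the kernel's block sizes.
q = torch.randn(1, 2, 128, 128, device=device)
k = torch.randn(1, 2, 128, 128, device=device)
v = torch.randn(1, 2, 128, 128, device=device)

# One call dispatches to the Pallas FlashAttention kernel with the tuned
# defaults, so no manual custom-kernel registration is needed.
o = flash_attention(q, k, v)
print(o.shape)  # same shape as q
```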

@alanwaketan
Collaborator Author

Thanks, Jack, for approving.

@alanwaketan merged commit fcf24b6 into master on Mar 20, 2024
18 checks passed
@alanwaketan deleted the alanwaketan/flash_attention_3 branch on March 20, 2024 at 23:10