Commit
Results on A100:
```
[------------------------------------------------------------------ attn_decodingfw ------------------------------------------------------------------]
                                             |  pytorch  |  optimized[flash-decoding]  |  optimized[triton_splitK]  |  optimized[flash-attention2.0.9]
1 threads: --------------------------------------------------------------------------------------------------------------------------------------------
      B=256 Mq=1 Mkv=256 Hq=16 Hkv=1 K=128   |   1883.4  |                       40.4  |                      50.3  |                            394.8
      B=128 Mq=1 Mkv=512 Hq=16 Hkv=1 K=128   |   1955.9  |                       44.2  |                      48.4  |                            366.5
      B=64 Mq=1 Mkv=1024 Hq=16 Hkv=1 K=128   |   2012.0  |                       34.7  |                      48.8  |                            368.4
      B=32 Mq=1 Mkv=2048 Hq=16 Hkv=1 K=128   |   2101.4  |                       33.7  |                      47.3  |                            351.4
      B=16 Mq=1 Mkv=4096 Hq=16 Hkv=1 K=128   |   2057.9  |                       31.9  |                      50.0  |                            403.2
      B=8 Mq=1 Mkv=8192 Hq=16 Hkv=1 K=128    |   2078.3  |                       34.2  |                      51.6  |                            527.5
      B=4 Mq=1 Mkv=16384 Hq=16 Hkv=1 K=128   |   2135.2  |                       37.4  |                      47.0  |                            581.6
      B=2 Mq=1 Mkv=32768 Hq=16 Hkv=1 K=128   |   2163.6  |                       45.0  |                      57.4  |                           1154.1
      B=1 Mq=1 Mkv=65536 Hq=16 Hkv=1 K=128   |    413.3  |                       61.0  |                      80.7  |                           2299.2
      B=1 Mq=1 Mkv=131072 Hq=16 Hkv=1 K=128  |    803.5  |                       81.7  |                     144.3  |                           4585.0
      B=256 Mq=1 Mkv=256 Hq=16 Hkv=2 K=128   |   3059.6  |                      402.6  |                      73.7  |                            393.8
      B=128 Mq=1 Mkv=512 Hq=16 Hkv=2 K=128   |   3148.5  |                      377.0  |                      72.0  |                            369.3
      B=64 Mq=1 Mkv=1024 Hq=16 Hkv=2 K=128   |   3161.7  |                      375.0  |                      70.3  |                            368.0
      B=32 Mq=1 Mkv=2048 Hq=16 Hkv=2 K=128   |   3157.6  |                      363.8  |                      70.6  |                            354.4
      B=16 Mq=1 Mkv=4096 Hq=16 Hkv=2 K=128   |   3154.0  |                      417.2  |                      72.3  |                            405.0
      B=8 Mq=1 Mkv=8192 Hq=16 Hkv=2 K=128    |   3173.3  |                      532.8  |                      76.8  |                            528.3
      B=4 Mq=1 Mkv=16384 Hq=16 Hkv=2 K=128   |   3222.5  |                      195.5  |                      78.8  |                            582.8
      B=2 Mq=1 Mkv=32768 Hq=16 Hkv=2 K=128   |   3221.3  |                      222.1  |                      72.1  |                           1154.3
      B=1 Mq=1 Mkv=65536 Hq=16 Hkv=2 K=128   |   1333.9  |                      222.6  |                      96.3  |                           2298.2
      B=1 Mq=1 Mkv=131072 Hq=16 Hkv=2 K=128  |   2656.6  |                      427.5  |                     169.4  |                           4583.7

Times are in microseconds (us).
```

ghstack-source-id: f3e0817f6e9be418eda7afbf72f1797d33acf60e
Pull Request resolved: https://github.com/fairinternal/xformers/pull/797
__original_commit__ = fairinternal/xformers@381ad8088345b4c61051b6c767597dc3e320076c