
Commit a81d681

LucasWilkinson authored and liuzijing2014 committed
[Attention] FA3 decode perf improvement - single mma warp group support for head dim 128 (vllm-project#16864)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
1 parent 9577c8a commit a81d681


1 file changed, +1 -1 lines changed


cmake/external_projects/vllm_flash_attn.cmake

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ else()
   FetchContent_Declare(
     vllm-flash-attn
     GIT_REPOSITORY https://github.com/vllm-project/flash-attention.git
-    GIT_TAG 0a721daebe4fa7149f06ecf3d3eabeb6dcd0f1fa
+    GIT_TAG e93779c59ba4905e56e5c39dc2c1904ada71fa21
     GIT_PROGRESS TRUE
     # Don't share the vllm-flash-attn build between build types
     BINARY_DIR ${CMAKE_BINARY_DIR}/vllm-flash-attn
