From a00c4d059eda29e93439ba719a63dadd0d701869 Mon Sep 17 00:00:00 2001
From: Hongxia Yang
Date: Wed, 24 Jul 2024 16:24:49 +0000
Subject: [PATCH 1/2] add links for mi300x tuning guide for mi300x users

---
 docs/source/getting_started/amd-installation.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/docs/source/getting_started/amd-installation.rst b/docs/source/getting_started/amd-installation.rst
index 71d7527a3e706..2418499ec0836 100644
--- a/docs/source/getting_started/amd-installation.rst
+++ b/docs/source/getting_started/amd-installation.rst
@@ -142,3 +142,10 @@ Alternatively, wheels intended for vLLM use can be accessed under the releases.
 - Triton flash attention does not currently support sliding window attention. If using half precision, please use CK flash-attention for sliding window support.
 - To use CK flash-attention or PyTorch naive attention, please use this flag ``export VLLM_USE_TRITON_FLASH_ATTN=0`` to turn off triton flash attention.
 - The ROCm version of PyTorch, ideally, should match the ROCm driver version.
+
+
+.. tip::
+    - For MI300x (gfx942) users, to achieve optimal performance, please refer to `MI300x tuning guide `_ for performance optimzation and tuning tips on system and workflow level.
+      For vLLM, please refer to `vLLM performance optimzation `_.
+
+

From 18d51d661023f1b202781080ec8130919c89ecdf Mon Sep 17 00:00:00 2001
From: Hongxia Yang
Date: Wed, 24 Jul 2024 16:38:16 +0000
Subject: [PATCH 2/2] format

---
 docs/source/getting_started/amd-installation.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/getting_started/amd-installation.rst b/docs/source/getting_started/amd-installation.rst
index 2418499ec0836..1c7d274b7c47e 100644
--- a/docs/source/getting_started/amd-installation.rst
+++ b/docs/source/getting_started/amd-installation.rst
@@ -145,7 +145,7 @@ Alternatively, wheels intended for vLLM use can be accessed under the releases.


 .. tip::
-    - For MI300x (gfx942) users, to achieve optimal performance, please refer to `MI300x tuning guide `_ for performance optimzation and tuning tips on system and workflow level.
-      For vLLM, please refer to `vLLM performance optimzation `_.
+    - For MI300x (gfx942) users, to achieve optimal performance, please refer to `MI300x tuning guide `_ for performance optimization and tuning tips on system and workflow level.
+      For vLLM, please refer to `vLLM performance optimization `_.

