[Fix] Adjust default chunked prefill size and cuda graph max bs according to GPU memory capacity #1927

Annotations

1 warning

performance-test-1-gpu-part-2

succeeded Nov 15, 2024 in 13m 10s