[Fix] Adjust default chunked prefill size and cuda graph max bs according to GPU memory capacity #1927

Annotations

2 errors

performance-test-2-gpu

cancelled Nov 15, 2024 in 9m 5s