[Fix] Adjust default chunked prefill size and cuda graph max bs according to GPU memory capacity #1927

Annotations

2 errors

unit-test-backend-part-1

cancelled Nov 15, 2024 in 15m 5s