[Fix] Adjust default chunked prefill size and cuda graph max bs according to GPU memory capacity #1927
Annotations
2 errors
Run test
The operation was canceled.