[Fix] Adjust default chunked prefill size and cuda graph max bs according to GPU memory capacity #1927
Annotations
2 errors
Evaluate data parallelism accuracy (DP=2)
The operation was canceled.