[Fix] Adjust default chunked prefill size and cuda graph max bs according to GPU memory capacity #1927
Annotations
1 warning
The following actions use a deprecated Node.js version and will be forced to run on node20: actions/checkout@v3. For more info: https://github.blog/changelog/2024-03-07-github-actions-all-actions-will-run-on-node20-instead-of-node16-by-default/