💡 [REQUEST] - MiniCPM-V-2_6 cannot be loaded on a single 48G GPU #392
Comments
I ran into this problem as well: loading the model blows up GPU memory badly, and the out-of-memory error always points at GPU 0.
I tested on a single A100-80G. When loading with vLLM, memory first climbs to 16GB (reading the model weights); at some point after loading finishes it spikes to a 29GB peak, then drops back to 19GB. The cause is unclear.
Found a workaround: vLLM's max-num-seqs defaults to 256, which causes very high memory consumption during the initialization/startup phase. Lowering it to 32 and setting gpu-memory-utilization to 1 lets the model run successfully on a single 3090. See the sketch below.
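For reference, here is a minimal sketch of those settings using vLLM's offline Python API. The parameter values mirror the comment above; the Hugging Face model ID, the prompt, and the optional max_model_len value are assumptions, and real multimodal use would attach images through vLLM's multi-modal input path rather than a plain text prompt.

```python
# Sketch: shrink the startup profiling footprint by lowering max_num_seqs
# and let vLLM use the full GPU memory budget.
from vllm import LLM, SamplingParams

llm = LLM(
    model="openbmb/MiniCPM-V-2_6",  # assumed Hugging Face model ID
    trust_remote_code=True,         # MiniCPM-V ships custom modeling code
    max_num_seqs=32,                # default is 256; large values inflate the init-time dummy run
    gpu_memory_utilization=1.0,     # allow vLLM to use (almost) the whole card
    # max_model_len=4096,           # optional; per the thread, lowering this alone is not enough
)

# Plain text prompt just to confirm the engine starts.
outputs = llm.generate(
    ["Describe the image."],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```

The same two settings map to the CLI flags --max-num-seqs and --gpu-memory-utilization when serving the model instead of running it offline.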
When vLLM initializes, it does a dry run with some dummy data to profile memory; for the image inputs it uses …
When I tested, simply lowering max_model_len was not enough; even at 3072 it still would not run.
Start Date
No response
Implementation PR
No response
Reference Issues
No response
Summary
An out-of-memory error is reported when loading the model.
Basic Example
Drawbacks
Unresolved questions
No response