I have a model under the PaddleDetection framework (a GFL object detection model). During inference, the model's FPS (a speed metric) decreases as the batch size increases, while its FLOPs (total compute) increases with batch size. Why does this happen? (I tried increasing CPU memory and the num_workers value, but it didn't help.) Partial results and experiment settings are shown below:
Which model is it? We will re-test it on our side. Many factors affect performance; it can be directly related to the machine's throughput, multi-GPU communication efficiency, and so on. When testing, we suggest recording CPU utilization, memory utilization, and GPU memory and utilization, to check whether the GPU has become the bottleneck.
The model structure is exactly identical to the model defined by the config file. @changdazhou
This behavior seems normal: image shapes within a batch must be aligned. For example, with batchsize=2, if image1.shape=[800,1216] and image2.shape=[1216,800], then both images are zero-padded to [1216,1216]. The larger the batch size, the more shape mismatches occur and the more padding is required, so the total compute increases.
Yes, shapes within the same batch must be aligned.
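The padding effect described above can be sketched in a few lines of plain Python. This is an illustrative example only (`padded_pixels` is a hypothetical helper, not a PaddleDetection API); it shows why mixing portrait and landscape images in one batch inflates the pixel count that the network must process:

```python
# Hypothetical sketch: inside a batch, every image is zero-padded to the
# per-batch maximum height and width, so mixed orientations waste compute.

def padded_pixels(shapes):
    """Total pixels after padding each image to the batch's max H and W."""
    max_h = max(h for h, w in shapes)
    max_w = max(w for h, w in shapes)
    return len(shapes) * max_h * max_w

# batchsize=1: each image runs at its own shape, so no padding is needed.
bs1 = padded_pixels([(800, 1216)]) + padded_pixels([(1216, 800)])

# batchsize=2: both images are padded to [1216, 1216].
bs2 = padded_pixels([(800, 1216), (1216, 800)])

print(bs1, bs2)  # the batched case processes ~52% more pixels here
```

Since convolutional FLOPs scale roughly linearly with input pixels, the padded batch does strictly more work per image, which matches the observed drop in FPS as batch size grows.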