TVM inference performance gets worse when batch size is larger #5

VertexC · 2021-03-02T07:20:51Z

Currently, TVM schedules are mostly optimized for batch_size=1. To use a batch size larger than 1, one needs to compile the model with AutoTVM.

https://discuss.tvm.apache.org/t/more-slower-use-tvm-than-mxnet-when-i-use-batch-forward/1810/2

VertexC · 2021-03-02T07:23:52Z

Cannot find config for target=cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32, workload=('conv2d_nchw.cuda', ('TENSOR', (2, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32, workload=('conv2d_nchw.cuda', ('TENSOR', (2, 64, 224, 224), 'float32'), ('TENSOR', (64, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32, workload=('conv2d_nchw.cuda', ('TENSOR', (2, 64, 112, 112), 'float32'), ('TENSOR', (128, 64, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32, workload=('conv2d_nchw.cuda', ('TENSOR', (2, 128, 112, 112), 'float32'), ('TENSOR', (128, 128, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, which may bring great performance regression.
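
Each warning above names one conv2d workload that has no tuned config, and the first dimension of the input tensor shape is the batch size (2 here). As a rough sketch, the warnings can be parsed with the Python standard library to list which workloads would need tuning. The helper name `parse_fallback_warning` and the returned dictionary keys are hypothetical, and the regular expression assumes the exact message wording shown in the log above; nothing here calls TVM itself.

```python
import ast
import re

# Assumes the warning wording shown in the log above; captures the target
# string and the workload tuple literal.
_WARNING_RE = re.compile(
    r"Cannot find config for target=(?P<target>.+?), "
    r"workload=(?P<workload>\(.*\))\. A fallback"
)


def parse_fallback_warning(line):
    """Return workload details from one fallback warning, or None if no match.

    Hypothetical helper for illustration; the dictionary keys are our own naming.
    """
    match = _WARNING_RE.search(line)
    if match is None:
        return None
    # The workload is printed as a Python tuple literal, so literal_eval
    # can read it back safely.
    workload = ast.literal_eval(match.group("workload"))
    data = workload[1]    # ('TENSOR', input_shape, dtype)
    kernel = workload[2]  # ('TENSOR', kernel_shape, dtype)
    return {
        "target": match.group("target"),
        "task": workload[0],
        "data_shape": data[1],
        "kernel_shape": kernel[1],
        "batch_size": data[1][0],  # NCHW layout: first dimension is the batch
    }


sample = (
    "Cannot find config for target=cuda -keys=cuda,gpu -max_num_threads=1024 "
    "-model=unknown -thread_warp_size=32, workload=('conv2d_nchw.cuda', "
    "('TENSOR', (2, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 3, 3), 'float32'), "
    "(1, 1), (1, 1, 1, 1), (1, 1), 'float32'). A fallback configuration is used, "
    "which may bring great performance regression."
)
info = parse_fallback_warning(sample)
print(info["task"], info["data_shape"], info["batch_size"])
```

Collecting the distinct `data_shape` and `kernel_shape` pairs from a full log gives the set of workloads that are falling back at this batch size.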
