Question about the inference speed for using the model in real-time #20
Hi, we reported results on 640x360 image resolution to compare with ENet, and computed the number of floating-point operations using TensorFlow's profiler tool. MobileNet-SkipNet and ShuffleNet-SkipNet came out at 6.2 and 2 GFLOPs respectively, so ShuffleNet should be faster. We did record the frame rate, but we haven't reported it yet as we are still working on improving it; on a TITAN X the frame rate for ShuffleNet was 143 fps and for MobileNet 141 fps at the same image resolution. However, for efficient inference there are a couple of tricks you need to apply, such as the NCHW optimization; there is also TensorFlow's optimize_for_inference tool, which merges some operations and performs batch-norm folding. But even without these, when I measured the frame rate, ShuffleNet was faster than MobileNet.

If you check out our optimize_inference branch and run ./run.sh you will get a graph.pb; then running ./optimize.sh produces optimized_graph.pb. To measure inference, we ran infer_optimize.py. We're still working on it, as this version doesn't have the NCHW optimization yet. Please let us know if you find any bugs in this version of the code.
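The GFLOP figures quoted above (6.2 vs 2) suggest ShuffleNet should be roughly 3x faster than MobileNet if inference were purely compute-bound, yet the measured frame rates (143 vs 141 fps) are nearly identical. A minimal sketch of that back-of-the-envelope reasoning, where `theoretical_fps` and the `efficiency` fraction are illustrative assumptions, not anything from the repo:

```python
def theoretical_fps(gflops_per_frame, device_gflops_per_s, efficiency=0.25):
    """Upper-bound FPS if inference were purely compute-bound.

    `efficiency` is an assumed fraction of peak throughput actually
    achieved; real values depend heavily on the kernel mix.
    """
    return device_gflops_per_s * efficiency / gflops_per_frame

# Per-frame cost at 640x360 as reported in the thread:
mobilenet_skipnet = 6.2   # GFLOPs
shufflenet_skipnet = 2.0  # GFLOPs

ratio = mobilenet_skipnet / shufflenet_skipnet
print(f"ShuffleNet does {ratio:.1f}x fewer operations per frame")
```

The measured gap being far smaller than this ratio is consistent with the thread's point: at this model size, memory layout (NHWC vs NCHW) and per-op overheads, not raw FLOPs, dominate runtime.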
Hey @MSiam,
@MSiam Hey,
@msson, in TF 1.8 we have
Has anyone been able to get ShuffleSeg's inference speed faster than 1.21 s/it?
@msson You have to run in inference mode in run.sh: this will measure the average running time. We are using TF 1.4. @hellochick Yes, we are using time.time; you can check the train.py code.
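The time.time-based measurement described above can be sketched as a small benchmarking helper. This is an illustrative example, not the repo's actual code; `fn` stands in for a single `sess.run` call, and the warm-up count is an assumption:

```python
import time

def benchmark(fn, n_iters=100, warmup=10):
    """Average per-call time of fn, mirroring a time.time-based
    measurement; warm-up iterations are discarded."""
    for _ in range(warmup):
        fn()
    start = time.time()
    for _ in range(n_iters):
        fn()
    elapsed = time.time() - start
    avg = elapsed / n_iters
    return avg, 1.0 / avg  # seconds/frame, frames/second

# Example with a dummy CPU workload in place of sess.run:
avg_s, fps = benchmark(lambda: sum(range(10000)), n_iters=50)
print(f"{avg_s * 1000:.3f} ms/frame, {fps:.1f} fps")
```

Averaging over many iterations matters because a single timed call is dominated by launch and allocation overheads, which is also why single-run FPS numbers in this thread vary so much.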
@MSiam Thanks for your reply. I measured the ShuffleSeg model's running time at about 9 fps (not always) at 512x384 resolution and about 3.3 fps at 1024x768 resolution in inference mode.
The reason I am asking is that in the paper you mentioned ShuffleSeg's speed is about 15 fps.
Thanks.
What platform are you running on? Are you running on the Jetson TX1? Otherwise this is much slower than what we reported: 143 fps is the result we got on a TITAN X, and around 16 fps on a TX2 (without NCHW optimization) at 640x360 resolution. If you need more speed, use the NCHW optimization, and you can also check TensorRT.
@MSiam Thanks, I successfully checked the running time using graph_optimized.pb.
Hi @msson, if possible, would you mind sharing how you obtained graph_optimized.pb? Also, are you training ShuffleSeg from scratch? Thanks!
@zeroAska Hi, yes, I trained ShuffleSeg from scratch. I got graph_optimized.pb by simply following the author's instructions earlier in this thread. (You should check out and download optimize.sh and infer_optimize.py from their branch.)
Hi @MSiam, I get an inference speed of about 60 fps at 640x360 resolution and around 70 fps at 512x256 on a GTX 1080 GPU, using the pre-trained fcn8s_shufflenet model weights with the master branch (without optimization). Is this speed normal or slow?
The number we reported above is the one after optimization; that is what we measured to compare with ENet, since its author mentions in his paper that he was also fusing operations and folding batch norm. It was around 143 fps on a TITAN X; I am not sure how fast it was before optimization. Let us know if you face issues with the optimization. Another difference is that ENet uses Torch, which by default has CHW order, while our initial TF implementation was HWC, which is slower; you would have to use the CHW implementation if you want it even faster. However, the number of operations should be a more stable way to compare, as it has no dependencies on the environment. That is what we reported in the paper.
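The CHW-vs-HWC layout difference discussed above can be illustrated with plain NumPy. This is a sketch of the data-layout concept only; the shapes are taken from the 640x360 resolution mentioned in the thread:

```python
import numpy as np

# TensorFlow's default layout is NHWC (channels last), while Torch
# (used by ENet) stores tensors as NCHW (channels first), the layout
# cuDNN convolution kernels generally prefer on GPU.
batch_nhwc = np.zeros((1, 360, 640, 3), dtype=np.float32)   # NHWC
batch_nchw = np.transpose(batch_nhwc, (0, 3, 1, 2))         # -> NCHW
print(batch_nchw.shape)  # (1, 3, 360, 640)
```

In a TF 1.x graph the equivalent is building conv layers with data_format='NCHW' rather than transposing tensors at runtime; runtime transposes add memory copies and can negate the layout gain.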
After using the optimize_inference branch I get a speed of around 67.5 fps, which is still quite low on a GTX 1080 considering you got 143 fps. Is there anything I can do to improve the speed? Thanks!
I just realized an issue in the inference time we reported earlier: it was 143 fps, but on a TITAN X with the Pascal architecture, while the ENet author (whom I contacted) was working with the Maxwell architecture. We're still trying to get access to a GPU with the Maxwell architecture to measure the inference time and finalize the optimized + NCHW implementation. However, the GFLOPs reported in the paper are correct; no issues with that. Meanwhile, I compared against the Caffe implementation of ENet that is referred to by the original repo:
Update:
Hello @MSiam,
1- Run ./run.sh, which will create a graph.pb
The problem is that when I ran infer_optimize.py several times, the FPS was different on every run, ranging from 8 fps to 16 fps. Do you have any suggestions as to why that might occur? Thank you.
Hello,
I am trying to test the skipnet-mobilenet and ShuffleSeg models with my own dataset (1024x768 resolution).
I've seen in the paper that those models can be used in real-time situations, since the inference speed is faster than 10 fps on a PC.
However, when I run your code for inference on my own data, the speed is about 1.5 fps, and both models' speeds are similar, even though you mentioned here that ShuffleSeg is faster than skipnet-mobilenet.
Please give me any advice for using the models in real time.
Thank you.