Question about the efficiency with different num_threads set #332

Open
huhanwj opened this issue Dec 17, 2024 · 1 comment

huhanwj commented Dec 17, 2024

I am trying to speed up frame extraction from a UHD video with a 30M bitrate. I extract 12 frames with:

frame_batch = vr.get_batch(frame_indices).asnumpy()

I am using a Xeon Platinum 8360Y, which has 36 cores, and cProfile to track the execution time of each function call. Surprisingly, the time spent in that line of code changes with the thread setting as follows:

  • num_threads=0 (auto): 22.733s
  • num_threads=1: 12.784s
  • num_threads=2: 14.255s
  • num_threads=4: 15.855s
  • num_threads=6: 17.310s
  • num_threads=8: 18.587s
  • num_threads=10: 19.813s
  • num_threads=12: 20.735s
  • num_threads=16: 22.702s
  • num_threads=24: 25.357s
  • num_threads=36: 25.763s
  • num_threads=48: 25.817s
  • num_threads=64: 26.388s
  • num_threads=72: 26.486s

Does this mean that fewer threads give better speed?
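
For anyone who wants to reproduce the comparison, here is a minimal sketch of how such timings could be collected. It assumes the reader is decord's `VideoReader` (the post does not name the library explicitly). The helper names `time_get_batch` and `make_decord_reader`, and the path `video.mp4`, are hypothetical. It uses `time.perf_counter` rather than cProfile, so the absolute numbers will not match the ones above.

```python
import time


def time_get_batch(make_reader, frame_indices, thread_counts, repeats=3):
    """Return {num_threads: best wall-clock seconds} for get_batch().asnumpy().

    make_reader is a callable taking num_threads and returning a reader
    object that exposes get_batch(indices).asnumpy().
    """
    results = {}
    for n in thread_counts:
        # A fresh reader per setting, so the thread count actually changes.
        reader = make_reader(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            reader.get_batch(frame_indices).asnumpy()
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results


def make_decord_reader(path):
    """Hypothetical factory; assumes decord is installed."""
    from decord import VideoReader, cpu  # imported lazily on purpose

    def factory(num_threads):
        return VideoReader(path, ctx=cpu(0), num_threads=num_threads)

    return factory


# Example usage (the path and frame indices are placeholders):
# timings = time_get_batch(make_decord_reader("video.mp4"),
#                          frame_indices=list(range(0, 120, 10)),
#                          thread_counts=[0, 1, 2, 4, 8, 16, 36])
# for n, seconds in timings.items():
#     print(f"num_threads={n}: {seconds:.3f}s")
```

Taking the best of several repeats reduces noise from disk caching, and building a new reader for each setting keeps the runs independent of each other.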

huhanwj commented Dec 30, 2024

I see the same behavior on an AMD Ryzen 9 7900X and an Intel Core i9-12900K, and it only happens with UHD or higher-resolution videos (4K, 8K). Does anyone have a clue what causes it?
