Question about the efficiency with different num_threads set #332

huhanwj · 2024-12-17T10:21:10Z

I am trying to speed up frame extraction from a UHD video with a 30M bitrate. The 12 frames are extracted with:

frame_batch = vr.get_batch(frame_indices).asnumpy()

I am using a Xeon Platinum 8360Y, which has 36 cores, and cProfile to track the execution time of each function call. Surprisingly, I find that when I set

num_threads=0 (which is auto), the line of code takes 22.733s
num_threads=1, 12.784s
num_threads=2, 14.255s
num_threads=4, 15.855s
num_threads=6, 17.310s
num_threads=8, 18.587s
num_threads=10, 19.813s
num_threads=12, 20.735s
num_threads=16, 22.702s
num_threads=24, 25.357s
num_threads=36, 25.763s
num_threads=48, 25.817s
num_threads=64, 26.388s
num_threads=72, 26.486s

Does this mean that the fewer threads I use, the faster it runs?


huhanwj · 2024-12-30T11:47:29Z

I see the same behavior on an AMD Ryzen 9 7900X and an Intel Core i9-12900K. It only happens with UHD or higher-resolution videos (4K, 8K). Does anyone have a clue about what causes this?
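
For anyone who wants to check whether this reproduces on their own machine, here is a minimal timing sketch. It is not the exact script used for the numbers above: the helper names `make_decord_reader` and `time_get_batch` are hypothetical, it uses `time.perf_counter` rather than cProfile, and it assumes the reader is opened on the CPU with `decord.VideoReader(path, ctx=cpu(0), num_threads=n)`. Only the `get_batch(...).asnumpy()` call is timed, and a fresh reader is created for each run so that one setting does not benefit from state left over by another.

```python
import time
from typing import Callable, Dict, Sequence


def make_decord_reader(path: str, num_threads: int):
    """Open a video with decord (imported lazily so the timer below has no hard dependency)."""
    from decord import VideoReader, cpu

    return VideoReader(path, ctx=cpu(0), num_threads=num_threads)


def time_get_batch(
    make_reader: Callable[[int], object],
    frame_indices: Sequence[int],
    thread_counts: Sequence[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best wall-clock time (seconds) of get_batch for each thread count.

    make_reader is any callable that takes a thread count and returns an object
    with a get_batch(indices) method whose result has .asnumpy().
    """
    results: Dict[int, float] = {}
    for n in thread_counts:
        best = float("inf")
        for _ in range(repeats):
            reader = make_reader(n)  # fresh reader per run; construction is not timed
            start = time.perf_counter()
            reader.get_batch(list(frame_indices)).asnumpy()
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results
```

A possible way to call it, with `video.mp4` and the frame indices as placeholders for your own file and sampling: `time_get_batch(lambda n: make_decord_reader("video.mp4", n), range(0, 360, 30), [0, 1, 2, 4, 8, 16])`. Taking the best of a few repeats is a design choice to reduce noise from disk caching; reporting the mean instead is equally reasonable.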
