Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenMP disable for Windows #1201

Merged
merged 6 commits into from
Oct 26, 2022
Merged

OpenMP disable for Windows #1201

merged 6 commits into from
Oct 26, 2022

Conversation

AramMu
Copy link
Contributor

@AramMu AramMu commented Jul 18, 2022

Using OpenMP on Windows makes it run around 4x slower than without OpenMP, the slowness stands out from Linux and MacOS versions, consider better to disable until the issue is handled.

Using OpenMP on Windows makes it run around 4x slower than without OpenMP, the slowness stands out from Linux and MacOS versions, consider better to disable until the issue is handled.
@saudet
Copy link
Member

saudet commented Jul 18, 2022 via email

@AramMu
Copy link
Contributor Author

AramMu commented Jul 26, 2022

This has also happened to all Windows machines that we tested not only mine, kindly please take a deeper look on it, it's not much reasonable to ask all customers to do additional things unless if it's necessary.

@saudet
Copy link
Member

saudet commented Jul 26, 2022

Since we've moved away from Autotools to build Leptonica and Tesseract, see issue #1163, it should now be possible to build them with Visual Studio. Could you try to do that instead?

@AramMu
Copy link
Contributor Author

AramMu commented Jul 27, 2022

Ok later I'll try with Visual Studio

@saudet
Copy link
Member

saudet commented Aug 10, 2022

I think I understand why it doesn't perform well on Windows. It looks like it's hard coding the num_threads(), whose values are most likely not optimal for any OS, especially not Windows, so removing them should solve this issue without disabling OpenMP: https://github.com/tesseract-ocr/tesseract/search?q=num_threads

@stweil What do you think?

@stweil
Copy link
Contributor

stweil commented Aug 10, 2022

I don't expect that would improve the performance on Windows or any other OS unless when running on a host which does not support at least four threads. But of course you can try removing the hard coded values and test the performance of the resulting code.

@saudet
Copy link
Member

saudet commented Aug 10, 2022

The problem on Windows is that thread scheduling isn't as efficient as on Linux or Mac, so when we start like 100 threads, everything slows down to a crawl. It's recommend to limit the number of threads to the number of available processing cores, which OpenMP does by default anyway, so we don't (usually) need to set num_threads() manually.

@saudet
Copy link
Member

saudet commented Aug 11, 2022

@AramMu I've pushed changes to that effect to your branch. Could you give that a try?

@AramMu
Copy link
Contributor Author

AramMu commented Aug 12, 2022

Ok I'll try the changes

@AramMu
Copy link
Contributor Author

AramMu commented Aug 30, 2022

Hi, I compiled and tried with the change and it crashes when running, can later investigate further or wait

@saudet
Copy link
Member

saudet commented Aug 31, 2022

Could you give me some sample code that crashes like that and that I can try to run here?

@saudet
Copy link
Member

saudet commented Sep 23, 2022

It looks like Tesseract doesn't always work right with more than 4 threads, so I've updated it to force it to always use 4 threads. This way it doesn't crash and performance should be good enough on all platforms, including Windows which should be about as fast as Linux now. Please give it a try!

@saudet
Copy link
Member

saudet commented Oct 6, 2022

Nope, unfortunately, it becomes slower on Windows even with just 2 threads. 🤔

@saudet
Copy link
Member

saudet commented Oct 6, 2022

With OpenMP, it doesn't seem to get a whole lot faster on Linux either, so we might as well disable it there and on Mac as well.
@AramMu What do you think?

@saudet saudet merged commit 8ed0c16 into bytedeco:master Oct 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants