Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mpl and multithreading #84

Closed
wants to merge 8 commits into from
Closed

Fix mpl and multithreading #84

wants to merge 8 commits into from

Conversation

Kojobu
Copy link

@Kojobu Kojobu commented Jun 25, 2024

  1. Use modern matplotlib approach to register colormaps
  2. Fix multiprocessing (might be an issue with local hardware, please test before merging this!)

Kojobu added 2 commits June 25, 2024 20:20
2) fix mulitprocessing, Pool by adding mp_context
2) fix mulitprocessing, Pool by adding mp_context
@steven-murray
Copy link
Member

Thanks @Kojobu for the PR. I think the modern MPL part is a 21cmFAST thing, not 21cmMC -- there's actually an open PR there that fixes it. The mp_context thing I'll have to look into and make sure it doesn't mess things up for other hardware. Thanks again!

@Kojobu
Copy link
Author

Kojobu commented Jun 26, 2024

The multiprocessing seems to be a local problem, you may skip this. The small fix for matplotlib may be worth a look though.

@steven-murray
Copy link
Member

@Kojobu OK thanks for noting that. The fix for matplotlib is not actually in this PR (see the changed files tab above). The problem is in 21cmFAST.

@Kojobu
Copy link
Author

Kojobu commented Jun 26, 2024

@steven-murray You are right, my bad. So, I'm on Linux, Python version 3.11 and I consistently get memory leaks when running with INHOMO_RECO and TS_FLUCT. After too many trial and errors I found it's some issue with Python's concurrent.futures.ProcessPoolExecutor.
I fixed it by changing the Pool in mcmc.py to

set_start_method('spawn')
pool = mcmc_options.pop(
    "pool",
    ProcessPoolExecutor(
        max_workers=mcmc_options.get("threadCount", 1),
        max_tasks_per_child=1,
    ),
)

What this does is, it explicitly uses the spawn method (fork, which is standard in Linux, will be removed in Python3.14 according to the docs anyway) which creates a new process every run. It's kind of a brute force approach admittedly but performance-wise it shouldn't be much slower and it works.
I'm not sure if it's worth merging with main because it messes up with the logs a bit but it may be a workaround for future scientists running into the same issue.
I think that's all from my side, thanks for the help! It's much appreciated. :)

@Kojobu Kojobu closed this Jun 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants