
Frozen executable - loky (spawn method) does not work #236

Open
samuelstjean opened this issue Feb 27, 2020 · 11 comments · May be fixed by #375

Comments

@samuelstjean

I'm wondering whether the issue reported in #124 and #125 is fully fixed. I'm using this through joblib 145.2 (which apparently vendors loky 2.6.0) and it throws a ton of errors on the default loky backend.
What I mean is that limiting the run to 1 core, so that it uses the sequential backend, works fine, but even adding the freeze_support line as required doesn't help.
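For reference, the canonical freeze_support placement from the multiprocessing docs (which is what "adding the freeze_support line" refers to) looks like this minimal sketch:

```python
from multiprocessing import freeze_support


def main():
    # application code that eventually starts worker processes
    print("running main")


if __name__ == "__main__":
    # Must be the first statement under the __main__ guard so that a
    # frozen worker process can take over before main() re-runs.
    freeze_support()
    main()
```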

I'm really not sure whether the fault lies here or with pyinstaller, but the same code used to work with plain old multiprocessing. Here's what I mean by the error:

[screenshot of the error output]

It looks like it works fine at first, and then fails to dispatch more jobs. Pyinstaller itself uses a kind of workaround for multiprocessing (see https://github.com/pyinstaller/pyinstaller/wiki/Recipe-Multiprocessing), so maybe that interferes with however loky dispatches work, since it rewrites a bunch of methods in the process.

@samuelstjean
Author

It really looks like the frozen version cannot deal with some path trickery or rewriting of arguments that happens in loky. On Linux it simply complains that I didn't give the correct arguments when it calls the multiprocessed parts. I can hand out a frozen version of the (quite complex) code for any platform if that would help.

@ogrisel
Collaborator

ogrisel commented Mar 10, 2020

Indeed loky needs to introspect the python executable and pass specific command line arguments to be able to launch worker processes. Could you please provide a minimal reproduction script that shows how you use pyinstaller and joblib? Or maybe even a reproduction script with just pyinstaller and loky directly?
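The introspection ogrisel describes can be observed with the stdlib spawn machinery itself: on a regular interpreter, the spawn start method builds a worker command line around the Python executable, and a pyinstaller binary has to stand in for that executable. A minimal probe (non-frozen interpreter assumed; `pipe_handle=7` is just a placeholder value):

```python
import multiprocessing.spawn as spawn

# Command line the stdlib spawn start method would use to launch a
# worker from this interpreter (pipe_handle is a placeholder fd number).
cmd = spawn.get_command_line(pipe_handle=7)
print(cmd)
# Unfrozen, this is roughly:
#   [sys.executable, '-c',
#    'from multiprocessing.spawn import spawn_main; spawn_main(pipe_handle=7)',
#    '--multiprocessing-fork']
# When frozen, sys.executable is the bundled app binary itself, so any
# extra worker arguments land in the application's own argv.
```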

@samuelstjean
Author

samuelstjean commented Mar 10, 2020

Well, this should be a working example: https://gist.github.com/samuelstjean/7286b3377e448b8ca7370bc6dc628fd5
The first comment indicates how to run the whole thing.

If I comment out the parser so that it runs without asking for arguments (lines 30 and 31), it won't crash, but it spawns a lot of processes which stay around even after I close the terminal, making the whole computer unresponsive in the process.

If I change the backend to multiprocessing it works as expected.

@samuelstjean
Author

Well, I tested it with multiprocessing, threading and loky, and it definitely works for the other two (even on Linux), so loky seems to do something weird, possibly on all platforms, rendering it useless in frozen applications. This is what I get (I updated the gist to run all backends by itself):

(testinst) samuel ~ $ ./dist/test aa aa
Entering inner loop
Inner loop finished for backend threading
Inner loop finished for backend multiprocessing
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 17
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 23
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 20
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 19
usage: test [-h] input output
test: error: the following arguments are required: output
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 18
exception calling callback for <Future at 0x7f4a0a1e35d0 state=finished raised TerminatedWorkerError>
Traceback (most recent call last):
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 347, in __call__
    self.parallel.dispatch_next()
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 780, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 847, in dispatch_one_batch
    self._dispatch(tasks)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 765, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 529, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/reusable_executor.py", line 178, in submit
    fn, *args, **kwargs)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 1102, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {EXIT(2), EXIT(2)}
usage: test [-h] input output
test: error: unrecognized arguments: -m --process-name --pipe 21
usage: test [-h] input output
/tmp/_MEIs3o3Wc/joblib/externals/loky/backend/resource_tracker.py:120: UserWarning: resource_tracker: process died unexpectedly, relaunching.  Some folders/sempahores might leak.
test: error: unrecognized arguments: -m --process-name --pipe 22
Traceback (most recent call last):
  File "test.py", line 60, in <module>
    main()
  File "test.py", line 40, in main
    out = estimate_from_dwis()
  File "test.py", line 50, in estimate_from_dwis
    output = Parallel(n_jobs=ncores, verbose=verbose)(delayed(_inner)(data[i]) for i in ranger)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 1042, in __call__
    self.retrieve()
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 921, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 540, in wrap_future_result
    return future.result(timeout=timeout)
  File "anaconda3/envs/testinst/lib/python3.7/concurrent/futures/_base.py", line 435, in result
    return self.__get_result()
  File "anaconda3/envs/testinst/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 347, in __call__
    self.parallel.dispatch_next()
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 780, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 847, in dispatch_one_batch
    self._dispatch(tasks)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/parallel.py", line 765, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 529, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/reusable_executor.py", line 178, in submit
    fn, *args, **kwargs)
  File "anaconda3/envs/testinst/lib/python3.7/site-packages/joblib/externals/loky/process_executor.py", line 1102, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {EXIT(2), EXIT(2)}
[10938] Failed to execute script test
(testinst) samuel ~ $ usage: test [-h] input output
test: error: the following arguments are required: output

@samuelstjean samuelstjean changed the title Frozen executable - Errno 6 the handle is invalid Frozen executable - loky (spawn method) does not work Jul 19, 2020
@samuelstjean
Author

Looks like it will never work for the time being, on any platform, due to loky using spawn:
https://bugs.python.org/issue32146
Not sure if it's worth pursuing anymore, as the only option I can see would be to add a custom freeze_support for all platforms and not just Windows (I quickly tried it myself for fun, and it somehow shot up my RAM in a few seconds).
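Any custom cross-platform freeze_support of that kind would have to detect worker re-entry somehow. Purely as a hypothetical sketch: the flags `--process-name` and `--pipe` below are taken from the error lines in the output above, not from any documented loky API, and `looks_like_loky_worker_argv` is an invented helper name:

```python
import sys


def looks_like_loky_worker_argv(argv):
    """Hypothetical heuristic: the error output above shows loky workers
    being launched with '-m --process-name --pipe N', flags which leak
    into the frozen app's own sys.argv."""
    return "--pipe" in argv or "--process-name" in argv


if __name__ == "__main__" and looks_like_loky_worker_argv(sys.argv):
    # Probably a re-spawned worker that missed the worker entry point;
    # bail out instead of re-running the application's main().
    sys.exit(0)
```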

@ogrisel
Collaborator

ogrisel commented Sep 10, 2020

Thanks for the reproducer. To get the loky and spawn start methods to work with a frozen executable, we need to be able to generate a command line that starts the worker process using the Python interpreter embedded in the pyinstaller-generated executable. This is probably related to pyinstaller/pyinstaller#4865. I will need time to investigate the details, but unfortunately I don't have much time at hand right now.

@JimMcDonough

Is there a workaround for this yet?

Traceback (most recent call last):
  File "DDP_GUI_JM_Drop_Down_Prototype_11_16_20.py", line 2367, in <module>
  File "C:\Users\jimmc\Python\Python39\Lib\site-packages\PyInstaller\hooks\rthooks\pyi_rth_multiprocessing.py", line 50, in _freeze_support
    name, value = arg.split('=')
ValueError: not enough values to unpack (expected 2, got 1)
[13496] Failed to execute script DDP_GUI_JM_Drop_Down_Prototype_11_16_20
Traceback (most recent call last):
  File "DDP_GUI_JM_Drop_Down_Prototype_11_16_20.py", line 2367, in <module>
  File "C:\Users\jimmc\Python\Python39\Lib\site-packages\PyInstaller\hooks\rthooks\pyi_rth_multiprocessing.py", line 50, in _freeze_support
    name, value = arg.split('=')
ValueError: not enough values to unpack (expected 2, got 1)
[2144] Failed to execute script DDP_GUI_JM_Drop_Down_Prototype_11_16_20
Traceback (most recent call last):
  File "DDP_GUI_JM_Drop_Down_Prototype_11_16_20.py", line 2240, in on_page_changing
  File "Semantic_Trend_GUI.py", line 203, in semantic_trend
  File "pyLDAvis\gensim_models.py", line 125, in prepare
  File "pyLDAvis\_prepare.py", line 442, in prepare
  File "pyLDAvis\_prepare.py", line 278, in _topic_info
  File "joblib\parallel.py", line 1054, in __call__
  File "joblib\parallel.py", line 933, in retrieve
  File "joblib\_parallel_backends.py", line 542, in wrap_future_result
  File "concurrent\futures\_base.py", line 445, in result
  File "concurrent\futures\_base.py", line 390, in __get_result
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

@gdoras

gdoras commented Nov 16, 2021

Dear all, is there any news regarding this issue? Apparently, using PyInstaller to build an executable from a script using joblib's Parallel still causes the same symptoms reported above. Here is a piece of code reproducing the issue. When processed by PyInstaller, the main method is called endlessly (when using the loky or multiprocessing backends -- it works fine with the threading backend, but that does not take advantage of a multi-core environment). Using freeze_support does not help, unfortunately. Do you have any hints toward a possible solution? Thanks in advance.

test.py:

import argparse
from multiprocessing import freeze_support
from joblib import Parallel, delayed

def main():
    arguments_parser = argparse.ArgumentParser()
    arguments_parser.add_argument("-n", default=None, type=int)
    flags, _ = arguments_parser.parse_known_args()

    # safe guard otherwise new processes are spawned forever
    if flags.n is None:
        raise RuntimeError('main() was called again.')

    with Parallel(n_jobs=2, verbose=5) as parallel:
        parallel(delayed(print)(i) for i in range(flags.n))


if __name__ == '__main__':
    freeze_support()
    main()

distribute.sh

pyinstaller \
--noconfirm \
--log-level=WARN \
--onedir \
--nowindow \
test.py
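For what it's worth, the threading backend mentioned above sidesteps the spawn machinery entirely (at the cost of the GIL for CPU-bound work), since the workers run inside the frozen process itself; a minimal sketch:

```python
from joblib import Parallel, delayed

# Thread-based workers never spawn a new executable, so a frozen app's
# main() is not re-entered and argparse never sees loky's worker flags.
results = Parallel(n_jobs=2, backend="threading")(
    delayed(pow)(i, 2) for i in range(5)
)
print(results)  # [0, 1, 4, 9, 16]
```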

@JimMcDonough

JimMcDonough commented Nov 16, 2021 via email

@heygy

heygy commented Jan 3, 2023

> Dear all, is there some news regarding this issue ? [...]

I have been struggling with the same issue for a long time. Have you found a solution?

@tomMoral tomMoral linked a pull request Jan 4, 2023 that will close this issue
@samuelstjean
Author

Does this need more help with testing, or something else? I'd say that anyone wanting to ship binaries simply can't use joblib with loky right now because of this, and the dask backend also seems to have some freezing issues.
