regression from 0.3.4 to 0.3.5.x - class decorators #483

Closed
bitranox opened this issue May 23, 2022 · 41 comments · Fixed by #486

@bitranox

Dear @anivegesana,

I am the author of the wrapt_timeout_decorator

Since version 0.3.5 of dill, two of our tests have been failing under Windows.

In those tests we use our decorator to decorate some class methods, which are then called in a process via multiprocess (which uses dill implicitly).
(see this test)

Those tests are fine with version 0.3.4 on Windows/Linux/macOS, or with 0.3.5.x on Linux/macOS, but they fail with dill > 0.3.4 on Windows.

What seems to happen is that the decorator recursively calls itself, creating a chain of subprocesses.
To debug that behaviour You might set the timeout to 3600 seconds in those tests and print some debug information to stderr.

I really struggle to create a minimal reproduction of that regression, therefore I would kindly ask You for help,
in order to be able to use dill 0.3.5 and upwards for my community.

You can find the failed jobs with dill 0.3.5.1 here,
and the same tests passing with dill 0.3.4.0 here.

yours sincerely

Robert, Vienna

@anivegesana
Contributor

anivegesana commented May 24, 2022

I don't have a Windows computer and I am a little bit hesitant about running a potential fork bomb on a computer that isn't mine. Looking into VMs.

In the meantime, can you send the output of the dumps line in

dill.detect.trace(True)
c = dill.dumps(ClassTest1.f)
dill.loads(c)

with both dill 0.3.4 and 0.3.5.1, and tell us whether dumps or loads is causing the infinite loop.

If dumps succeeds, can you share c as well?
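
For reference, a minimal sketch of how that could be captured (assuming ClassTest1 is importable from the wrapt_timeout_test_helper test module; the output file name is just an example):

import dill
import dill.detect

from wrapt_timeout_test_helper import ClassTest1  # test helper used in the failing tests

dill.detect.trace(True)          # print a line for every object dill pickles
c = dill.dumps(ClassTest1.f)     # serialize the decorated class method
with open("classtest1_f.pkl", "wb") as fh:
    fh.write(c)                  # keep the payload so it can be shared here
dill.loads(c)                    # check whether the round trip succeeds as well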

@bitranox
Author

Dear anivegesana,
Oh, it's not a fork bomb - the timeout is about 3 seconds in the original test,
so after 4-5 forks (well, on Windows there is no real fork, therefore it's very slow ...)
the timeout kicks in.
I guess the dill dump will be difficult, because it happens in a subprocess - kind of difficult to get it out of there?
But I will try to share the requested data for windows/linux/0.3.4/0.3.5.x ....
thanks for Your effort
greetings from Vienna, Austria
Robert

@bitranox
Author

Dear anivegesana,
it is as I thought - I cannot even pickle the decorated class method.
And I usually don't do that - I just use multiprocess, which uses dill.

With dill 0.3.4 on Windows I can't pickle the function either, but multiprocess works as intended:
def test_dill():
    """
    >>> test_dill()
    """
    import dill
    dill.detect.trace(True)
    c = dill.dumps(ClassTest1.f)
    dill.loads(c)
Failed example:
    test_dill()
Exception raised:
    Traceback (most recent call last):
      File "C:\Program Files\JetBrains\PyCharm 2022.1.1\plugins\python\helpers\pycharm\docrunner.py", line 138, in __run
        exec(compile(example.source, filename, "single",
      File "<doctest test_dill[0]>", line 1, in <module>
        test_dill()
      File "C:\opt/rotek-apps/lib/bitranox/wrapt_timeout_decorator/tests\wrapt_timeout_test_helper.py", line 35, in test_dill
        c = dill.dumps(ClassTest1.f)
      File "C:\opt\rotek-apps\lib\bitranox\wrapt_timeout_decorator\venv\lib\site-packages\dill\_dill.py", line 304, in dumps
        dump(obj, file, protocol, byref, fmode, recurse, **kwds)#, strictio)
      File "C:\opt\rotek-apps\lib\bitranox\wrapt_timeout_decorator\venv\lib\site-packages\dill\_dill.py", line 276, in dump
        Pickler(file, protocol, **_kwds).dump(obj)
      File "C:\opt\rotek-apps\lib\bitranox\wrapt_timeout_decorator\venv\lib\site-packages\dill\_dill.py", line 498, in dump
        StockPickler.dump(self, obj)
      File "C:\Program Files\Python310\lib\pickle.py", line 487, in dump
        self.save(obj)
      File "C:\Program Files\Python310\lib\pickle.py", line 578, in save
        rv = reduce(self.proto)
    NotImplementedError: object proxy must define __reduce_ex__()

thanks for Your effort
greetings from Vienna, Austria
Robert

@mmckerns
Member

@anivegesana: my build and test boxes are (docker instances inside) VMs. FYI.

@bitranox
Author

Dear anivegesana,
here is the next attempt to shed some light on the issue:

def test_timeout_class_method(use_signals: bool) -> None:
    """
    >>> test_timeout_class_method(False)
    """
    import dill.detect
    dill.detect.trace(True)
    assert ClassTest1().f(dec_timeout="instance.x", dec_allow_eval=True, use_signals=use_signals) is None

Under dill 0.3.4 it works. Under dill 0.3.5.1 I get the same output, but it loops until the timeout (of course with different object addresses in each loop):

D2: <dict object at 0x000002BE74FBF680>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
T4: <class 'multiprocess.context.Process'>
# T4
D2: <dict object at 0x000002BE74FD7480>
D2: <dict object at 0x000002BE72CE6140>
T4: <class 'multiprocess.process.AuthenticationString'>
# T4
# D2
F2: <function _target at 0x000002BE74FB84C0>
# F2
T4: <class 'wrapt_timeout_decorator.wrap_helper.WrapHelper'>
# T4
D2: <dict object at 0x000002BE74FBD600>
T1: <class 'TimeoutError'>
F2: <function _load_type at 0x000002BE74A243A0>
# F2
# T1
Me: <bound method ClassTest1.f of <wrapt_timeout_test_helper.ClassTest1 object at 0x000002BE74FC1F00>>
T1: <class 'method'>
# T1
F1: <function ClassTest1.f at 0x000002BE74FB8D30>
F2: <function _create_function at 0x000002BE74A244C0>
# F2
Co: <code object f at 0x000002BE74FAF050, file "C:\opt/rotek-apps/lib/bitranox/wrapt_timeout_decorator/tests\wrapt_timeout_test_helper.py", line 13>
F2: <function _create_code at 0x000002BE74A24550>
# F2
# Co
D4: <dict object at 0x000002BE74A2AC40>
# D4
D2: <dict object at 0x000002BE74FF0C40>
# D2
# F1
T4: <class 'wrapt_timeout_test_helper.ClassTest1'>
# T4
D2: <dict object at 0x000002BE72C9AC40>
# D2
# Me
D2: <dict object at 0x000002BE74FBE580>
# D2
F2: <function rebuild_pipe_connection at 0x000002BE74FEAA70>
# F2
T4: <class 'multiprocess.reduction.DupHandle'>
# T4
D2: <dict object at 0x000002BE74FF1E80>
# D2
# D2
D2: <dict object at 0x000002BE72CA00C0>
# D2
# D2

thanks for Your effort
greetings from Vienna, Austria
Robert

@anivegesana
Contributor

That output is very strange. It is balanced (every object that starts pickling also finishes), which means that dill is not going into infinite regress; multiprocess is just executing the same code over and over. It is even more bizarre that it only affects dill.

@bitranox
Author

Dear anivegesana,
I guess it is the combination of some black magic from the wrapt decorator together with multiprocess and dill. Just guessing.
The thing is, it worked fine up to (and including) dill version 0.3.4.
It's a pity, because my users use that timeout decorator exactly because it works well on Windows.

I pinned the dill version to 0.3.4 in the requirements.txt on Windows for the time being, but I hope You can find the bug ...

Yours sincerely
Robert

@anivegesana
Contributor

anivegesana commented May 26, 2022

I am still having trouble reproducing this bug. I am getting the error you described when you pickled the function. I'll try other Windows computers, but it might depend on the environment, and I might need to try it on a CI, which will be incredibly painful to debug. First, I think I should make a small patch for the issue with pickling wrapt classes, which will hopefully resolve the problem without resorting to that.

Python 3.7.11 [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> from wrapt_timeout_decorator import timeout
>>> class ClassTest1(object):
...    def __init__(self) -> None:
...        self.x = 3.0
...    @timeout("instance.x/10", dec_allow_eval=True)  # type: ignore
...    def f(self) -> None:
...        time.sleep(0.5)

>>> ClassTest1().f()
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    ClassTest1().f()
  File "wrapt_timeout_decorator-1.3.12-py3-none-any.whl\wrapt_timeout_decorator\wrapt_timeout_decorator.py", line 139, in wrapper
    return wrapped_with_timeout(wrap_helper)
  File "wrapt_timeout_decorator-1.3.12-py3-none-any.whl\wrapt_timeout_decorator\wrapt_timeout_decorator.py", line 148, in wrapped_with_timeout
    return wrapped_with_timeout_process(wrap_helper)
  File "wrapt_timeout_decorator-1.3.12-py3-none-any.whl\wrapt_timeout_decorator\wrapt_timeout_decorator.py", line 162, in wrapped_with_timeout_process
    return timeout_wrapper()
  File "wrapt_timeout_decorator-1.3.12-py3-none-any.whl\wrapt_timeout_decorator\wrap_function_multiprocess.py", line 41, in __call__
    self.__process.start()
  File "multiprocess-0.70.13-py37-none-any.whl\multiprocess\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "multiprocess-0.70.13-py37-none-any.whl\multiprocess\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "multiprocess-0.70.13-py37-none-any.whl\multiprocess\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "multiprocess-0.70.13-py37-none-any.whl\multiprocess\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "multiprocess-0.70.13-py37-none-any.whl\multiprocess\reduction.py", line 63, in dump
    ForkingPickler(file, protocol, *args, **kwds).dump(obj)
  File "dill-0.3.5.1-py2.py3-none-any.whl\dill\_dill.py", line 620, in dump
    StockPickler.dump(self, obj)
  File "C:\Program Files\Python\lib\pickle.py", line 437, in dump
    self.save(obj)
  File "C:\Program Files\Python\lib\pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Program Files\Python\lib\pickle.py", line 662, in save_reduce
    save(state)
  File "C:\Program Files\Python\lib\pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "dill-0.3.5.1-py2.py3-none-any.whl\dill\_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "C:\Program Files\Python\lib\pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Program Files\Python\lib\pickle.py", line 885, in _batch_setitems
    save(v)
  File "C:\Program Files\Python\lib\pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Program Files\Python\lib\pickle.py", line 774, in save_tuple
    save(element)
  File "C:\Program Files\Python\lib\pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Program Files\Python\lib\pickle.py", line 662, in save_reduce
    save(state)
  File "C:\Program Files\Python\lib\pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "dill-0.3.5.1-py2.py3-none-any.whl\dill\_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "C:\Program Files\Python\lib\pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Program Files\Python\lib\pickle.py", line 885, in _batch_setitems
    save(v)
  File "C:\Program Files\Python\lib\pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Program Files\Python\lib\pickle.py", line 638, in save_reduce
    save(args)
  File "C:\Program Files\Python\lib\pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Program Files\Python\lib\pickle.py", line 774, in save_tuple
    save(element)
  File "C:\Program Files\Python\lib\pickle.py", line 549, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Program Files\Python\lib\pickle.py", line 633, in save_reduce
    save(cls)
  File "C:\Program Files\Python\lib\pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "dill-0.3.5.1-py2.py3-none-any.whl\dill\_dill.py", line 1840, in save_type
    )), obj=obj, postproc_list=postproc_list)
  File "dill-0.3.5.1-py2.py3-none-any.whl\dill\_dill.py", line 1140, in _save_with_postproc
    pickler.save_reduce(*reduction, obj=obj)
  File "C:\Program Files\Python\lib\pickle.py", line 638, in save_reduce
    save(args)
  File "C:\Program Files\Python\lib\pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Program Files\Python\lib\pickle.py", line 789, in save_tuple
    save(element)
  File "C:\Program Files\Python\lib\pickle.py", line 504, in save
    f(self, obj) # Call unbound method with explicit self
  File "dill-0.3.5.1-py2.py3-none-any.whl\dill\_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "C:\Program Files\Python\lib\pickle.py", line 859, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Program Files\Python\lib\pickle.py", line 885, in _batch_setitems
    save(v)
  File "C:\Program Files\Python\lib\pickle.py", line 524, in save
    rv = reduce(self.proto)
NotImplementedError: object proxy must define __reduce_ex__()

@anivegesana
Contributor

anivegesana commented May 26, 2022

Based on GrahamDumpleton/wrapt#158, I believe this is likely mostly a conscious choice by the wrapt author as opposed to a problem in dill or multiprocess. multiprocess should not spin into an infinite loop and silently fail in very strange corner cases, so I believe that this issue should still be left up or moved over to multiprocess in case we can find out what is causing that strange behavior.

The change to wrapt that disabled pickling of ObjectProxy objects was made in this commit: GrahamDumpleton/wrapt@13c55faf7d3e83e184b8a150f0df3de294698b25
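
As a side note, that refusal can be reproduced with a bare proxy, outside of dill and multiprocess entirely; a minimal sketch, assuming a recent wrapt is installed:

import pickle

import wrapt

proxy = wrapt.ObjectProxy(42)    # any wrapped value will do
try:
    pickle.dumps(proxy)
except NotImplementedError as exc:
    # wrapt deliberately raises here unless a subclass defines __reduce_ex__()
    print(exc)                   # "object proxy must define __reduce_ex__()"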

So it is still a mystery how the new version of dill changed behavior such that this error bubbles up now, and why it was ignored earlier. Also, what was the monkey patch you referenced in bitranox/wrapt_timeout_decorator#13 (comment)?

@bitranox
Author

Dear anivegesana,

I am still having trouble reproducing this bug

Which trouble? It's kind of hard to debug in a subprocess, but I think You should concentrate on debugging multiprocess & pickling together, instead of only pickling the function itself. That also did not work in the past.

I am getting the error you described when you pickled the function

#1 Yes, this pickle error also happens with dill 0.3.4 - but multiprocess is working with my wrapt_timeout_decorator and dill 0.3.4. So it seems it does NOT have anything to do with that NotImplementedError: object proxy must define __reduce_ex__()

#2 it works under Linux, also with dill 0.3.5 - so You might concentrate on the differences between the Linux and win32 implementation? There should be some if sys.platform == 'win32' or similar ...

#3 You might also check which change between 0.3.4 and 0.3.5 is causing the trouble - maybe You can git bisect those changes

#4 the error NotImplementedError: object proxy must define __reduce_ex__() does NOT pop up in the trace when using multiprocess

#5

Also, what was the monkey patch you referenced in bitranox/wrapt_timeout_decorator#13 (comment)?

I don't remember - I did not tinker around with dill or multiprocess.

I really don't like mysteries ;-), it means there is a hidden bug in the code .... which should be found, even if it is "hard to debug".

You have all the tools on hand to find that bug - just by replaying the changes between 0.3.4 and 0.3.5 - Your argument "I don't have a Windows machine" seems a little bit odd in times of virtual machines ...

Please be so kind as to consider helping here, because a number of astronomers rely on that package.

yours sincerely

Robert

@mmckerns
Member

Well, the mystery is indeed what is causing the strange behavior. The root cause is unknown, and it could be a bug in multiprocess, dill, or wrapt. It being a Windows-only bug makes it more difficult, even in the day of VMs, as one needs a copy of Windows and potentially some time investment to set up the VM. Let's keep it civil, and figure it out together.

So step (1) is to make sure whatever issue is being reported can be reproduced. Then step (2) is to use diagnostic tools to find the root cause. dill has dill.detect.trace and some other tools in the dill.detect module. You can see if the issue shows up with the serialization variants in dill.settings. You can also check if it shows up in multiprocess when the settings are changed -- such as trying the spawn multiprocess context on a mac/linux box, or similarly trying to pickle with ForkingPickler from multiprocess. multiprocess also has a number of loggers, including a debugger and a subprocess debugger. pathos has subprocess profilers in pathos.profile that can help see what functions are called that might not show up in the traceback or in the debugger.
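
For illustration, a rough sketch of the spawn-context and ForkingPickler checks suggested above (module and class names are taken from the test helper seen in the tracebacks; treat this as an outline under those assumptions, not a finished test):

import io

import multiprocess
from multiprocess.reduction import ForkingPickler

from wrapt_timeout_test_helper import ClassTest1  # assumed importable test helper

if __name__ == "__main__":
    # Mimic the Windows behaviour on mac/linux by forcing the 'spawn' start
    # method before the decorator creates any child processes.
    multiprocess.set_start_method("spawn", force=True)

    # Pickle the decorated bound method with the same pickler that multiprocess
    # uses when it sends the Process object to the child.
    buf = io.BytesIO()
    try:
        ForkingPickler(buf).dump(ClassTest1().f)
        print("ForkingPickler produced", len(buf.getvalue()), "bytes")
    except Exception as exc:
        print("ForkingPickler failed:", type(exc).__name__, exc)

    # Finally, call the decorated method itself and watch the dill trace.
    import dill.detect
    dill.detect.trace(True)
    ClassTest1().f()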

@anivegesana
Contributor

anivegesana commented May 26, 2022

I did not intend to make it seem that your problem is not a problem. In fact, I wanted to imply that there are likely multiple problems in multiple libraries, which makes it very hard to diagnose, and we should try to break it up into smaller pieces.

All I meant by not being able to reproduce it was that on some Windows computers it has one behavior and on other Windows computers it has another, which is very bad and is the first problem that should be tackled. It is impossible to get anywhere else without consistency.

I do have access to VMs and am trying it on different VM platforms, hoping to find a difference that is causing the behavior.

pathos.profile will probably help out a lot and I'll use it once I find a VM that triggers the issue.

The most likely reason was that a change in dill broke a monkey patch that multiprocess made for Windows, but it breaks it differently based on other properties of the environment depending on what that patch does. I think it is in multiprocess because the other libraries are platform agnostic and there shouldn't be a reason that it performs differently on different machines with the same OS.

@anivegesana
Contributor

One thing I noticed was that you ran your test that passed on Windows with Python 3.9 and the one that failed on Windows with Python 3.10. I tried it out on Python 3.7. That might be a confounding factor here. When I find time, I'll try it out on different Python versions and see if that is the cause of the varying results.

@bitranox
Author

bitranox commented May 26, 2022

Dear mmckerns,
Dear anivegesana,

Well, the mystery is indeed what is causing the strange behavior. The root cause is unknown, and it could be a bug in multiprocess, dill, or wrapt.

I did not intend to make it seem that your problem is not a problem. In fact, I wanted to imply that there are likely multiple problems in multiple libraries that makes it very hard to diagnose and we should try to break it up into smaller pieces.

Everything is absolutely true, but dill is the only component that changed - and we know the bug was introduced between 0.3.4 and 0.3.5, and it only occurs on Windows - so this should help to find the issue.

It being a Windows-only bug makes it more difficult, even in the day of VMs, as one needs a copy of Windows and potentially some time investment to set up the VM.

I disagree here - tests on Windows are absolutely necessary, because many things work differently on Windows - especially forking for multiprocess. Therefore a Windows test VM should always be in Your toolbox ...

One thing I noticed was that you ran your test that passed on Windows with Python 3.9 and the one that failed on Windows with Python 3.10. I tried it out on Python 3.7. That might be a confounding factor here.

I just updated the tests from 3.9 to 3.10 AFTER working on that issue. It makes no difference whether it is 3.9 or 3.10.

Let's keep it civil, and figure it out together

Sure - if I came across as rude, please accept my sincere apologies - please consider the German/Austrian mentality, which is absolutely straightforward. (well, Austrians a bit less than Germans ...)

The most likely reason was that a change in dill broke a monkey patch that multiprocess made for Windows, but it breaks it differently based on other properties of the environment depending on what that patch does. I think it is in multiprocess because the other libraries are platform agnostic and there shouldn't be a reason that it performs differently on different machines with the same OS.

Sounds good. You can keep all the components the same and just change dill from 0.3.4 to 0.3.5.
I totally agree with Your opinion, in fact I mentioned it before:

#2 it works under Linux, also with dill 0.3.5 - so You might concentrate on the differences between the Linux and win32 implementation? There should be some if sys.platform == 'win32' or similar ...

greetings from Vienna, Austria,
and thanks for Your help !

yours sincerely

Robert

@anivegesana
Contributor

anivegesana commented May 26, 2022

If the only difference is the dill version and it works on some Windows computers and not others, I would assume that the timing has something to do with it. I tried it out on a big VM with 10 CPU cores, and GitHub Actions probably has just 2. If I were to throw out a wild guess, I would assume that it is related to #482, where a bug fix to dill's handling of recursive functions (#443) caused deeper searches for objects to pickle. Does adding

dill.settings['recurse'] = True

to the beginning of the test help? I am at a loss though, and the only way that I could hope to help is by recreating the entire setup on GitHub Actions (as opposed to Azure/others like I am trying now) and running your tests there.
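
For reference, a minimal sketch of where that setting would go in the test quoted earlier (same names and arguments as in the comment above, with ClassTest1 assumed importable from the test helper module):

import dill
import dill.detect

from wrapt_timeout_test_helper import ClassTest1  # assumed test helper module

def test_timeout_class_method(use_signals: bool) -> None:
    dill.settings['recurse'] = True   # suggested setting, applied before anything is dumped
    dill.detect.trace(True)
    assert ClassTest1().f(dec_timeout="instance.x", dec_allow_eval=True,
                          use_signals=use_signals) is None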

@mmckerns
Member

The most likely reason was that a change in dill broke a monkey patch that multiprocess made for Windows, but it breaks it differently based on other properties of the environment depending on what that patch does. I think it is in multiprocess because the other libraries are platform agnostic and there shouldn't be a reason that it performs differently on different machines with the same OS.

What specifically are you referring to? I know the multiprocess code pretty well... and I should be aware of any patches I've made for Windows. One thing to note about multiprocess is that the source code is different for each version of Python, and there are portions that differ for each platform.

@anivegesana
Contributor

I haven't looked into the multiprocess source code yet. That was more a list of places I plan to look next than a particular fix I was suggesting.

@bitranox
Author

Dear anivegesana,

adding dill.settings['recurse'] = True does not make any difference.
The Windows version seems not to matter - I have exactly the same behaviour on GitHub Actions and on my local machine with 8 cores and Windows 11 ...
yours sincerely
Robert

@mmckerns
Member

The mods to multiprocess relative to multiprocessing are pretty small, so generally if there's an issue in multiprocess, it's also in the standard library's multiprocessing as well.

@bitranox: Can you specify a matrix of versions of python, dill, multiprocess and wrapt that you see the issue in, and where you don't?

@bitranox
Author

Dear mmckerns,
that matrix seems easy - it has worked all the way since 2017 until now, on Windows 7 / Windows 10 / Windows 11,
with Python 3.7/3.8/3.9/3.10, so the difference is really only the new dill version ...

I guess rolling back the changes gradually from 0.3.5 to 0.3.4 should be the easiest way!

yours sincerely
Robert

@anivegesana
Contributor

anivegesana commented May 26, 2022

#443 was definitely the beefiest change, so testing 04aba03 vs 0392e14 would give a half-way point between 0.3.4 and 0.3.5.

Thank you for your patience with this bug.

@mmckerns
Member

@bitranox: can you generate a matrix, even if it seems easy? I haven't seen mention of the versions of multiprocess or wrapt that show the issue. I think we are all assuming that a change to dill is what triggered the issue. However, if you can explore version combinations a bit, it might help diagnose what's going on. @anivegesana: stepping through the commits to dill (starting with the two quoted above) is probably the right thing to do.

@bitranox
Author

Dear anivegesana,
04aba03 is working

Dear,mmckerns,
creating the matrix would be a lot of work for no benefit. I have found no combination so far that works with 0.3.5, and I haven't found any combination NOT working with 0.3.4.

I really try to make it easy for You folks, but let's concentrate on something that looks more beneficial:

stepping through the commits to dill (starting with the two quoted above) is probably the right thing to do.

Yours sincerely
Robert

@bitranox
Author

Dear anivegesana,
0392e14 is also working

@anivegesana
Contributor

anivegesana commented May 26, 2022

Ok. Not expected. Most other issues from this week were concerning that PR. That narrows it down significantly. How about e2831d0?

@bitranox
Author

e2831d0 is working

@anivegesana
Contributor

d2f916c?

@bitranox
Author

d2f916c NOT working

@anivegesana
Contributor

Ok. It is somewhere between. 23c4745?

@bitranox
Author

23c4745 NOT working

@anivegesana
Contributor

5bd56a8?

@bitranox
Author

5bd56a8 NOT working

@anivegesana
Contributor

Last one: 914d47f? Then we know for sure.

@bitranox
Author

914d47f NOT working
let's double check - which one was the commit before that?

@anivegesana
Contributor

The parent of that is e2831d0. So I know what to look for now. Will get back to you with a fix. Thank you for your patience and help with debugging this!

@bitranox
Author

Yes, double checked, e2831d0 IS working.

@anivegesana
Contributor

anivegesana commented May 27, 2022

Ok. I found out that running the test suite with PyTest instead of just running the test manually lets me reproduce your error. This is not an issue with your library or multiprocess; it is an issue between dill and PyTest, like #482. The error message I was getting further up in this thread was because the console was implicitly __main__, which would break multiprocess anyway.

Now that I understand multiprocess and this issue, I can be more useful. I initially thought it was probably not exactly the same as #482 and planned to do some more poking to see whether it is solved by #486; as far as I can tell, it is exactly the same as #482 and the tests pass. Rest assured, a fix will be available in dill 0.3.5.2 (if that is still on @mmckerns' radar in June, else dill 0.3.6).
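
For anyone who hits the interactive-console variant of this: a minimal sketch of the module layout that spawn-based multiprocess needs, reusing the class definition from earlier in this thread (the __main__ guard is the important part):

import time

from wrapt_timeout_decorator import timeout

class ClassTest1(object):
    def __init__(self) -> None:
        self.x = 3.0

    @timeout("instance.x/10", dec_allow_eval=True)  # type: ignore
    def f(self) -> None:
        time.sleep(0.5)

if __name__ == "__main__":
    # Spawned children re-import this module; without the guard the call below
    # would run again in every child, and an interactive console (where the code
    # only lives in __main__'s memory) cannot be re-imported at all.
    ClassTest1().f()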

@bitranox
Author

Dear anivegesana,
that sounds great, thank You for Your hard work.
Yours sincerely
Robert

@mmckerns
Member

mmckerns commented May 27, 2022

Yes, we should expect dill to emit a new release in June or soon thereafter. The milestone is tagged as dill-0.3.6. If you want to apply it to this and other PRs, go ahead. @anivegesana if you are sure it's the case, then we should tag and close this issue as a duplicate of #482.

@anivegesana
Contributor

And thank you both for being patient. This was a very frustrating bug that kept slipping out from underneath my nose due to all of the moving parts and because I kept searching in the wrong places.

@anivegesana
Contributor

I don't think I have permissions to do that. If it would be more convenient for me to do that because of regular contributions, I believe the relevant role is triage:

https://docs.github.com/en/organizations/managing-access-to-your-organizations-repositories/repository-roles-for-an-organization#permissions-for-each-role

It is hopefully broad enough to allow contributors to help with managing the repo, while not being powerful enough to cause a dominictarr/event-stream situation, because there is no way to change the code on master without going through you.

@mmckerns mmckerns added this to the dill-0.3.6 milestone May 29, 2022