multithreading bug of refcounting #137

Closed
trouger opened this issue Jul 17, 2014 · 2 comments

@trouger

trouger commented Jul 17, 2014

Assume d is a netref to a remote dict. The following line may raise KeyError in rpyc/lib/colls.py (RefCountingColl.__getitem__) when executed in a working thread:

d.get("a", [])

Here is the explanation:

  1. The list object [] is boxed (added to conn._local_objects) and sent to the remote side. The working thread blocks, waiting for the reply to d.get("a", []).
  2. On the remote side, a netref to the list object is created. Because "a" is not in d, the remote side returns the netref of the list object to the local side.
  3. The netref to the list object is garbage collected on the remote side, and an async_request of HANDLE_DEL is sent to the local side.
  4. On the local side, the network "message queue" now holds the following two messages, in the correct order: (1) the reply to the d.get("a", []) call, which contains the id of the local list object, and (2) a request to delete the local list object.
  5. On the local side, there is a BgServingThread which processes messages. The working thread receives the first of the above two messages, and before dispatching the reply, the BgServingThread gets scheduled. The BgServingThread receives and dispatches the second message, i.e., deletes the list object from conn._local_objects.
  6. When the working thread gets scheduled, it dispatches the reply, trying to get the list object from conn._local_objects by object id, and a KeyError is raised.
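
The sequence above can be replayed deterministically without any networking or threads. The sketch below is not rpyc code: `RefCountingColl` is a simplified stand-in for the collection in rpyc/lib/colls.py, and the message tuples are made up for illustration. It only shows why looking up the object by id after the delete has been dispatched fails.

```python
class RefCountingColl:
    """Simplified stand-in for rpyc's ref-counting collection (assumption)."""

    def __init__(self):
        self._dict = {}  # object id -> [object, refcount]

    def add(self, obj):
        slot = self._dict.setdefault(id(obj), [obj, 0])
        slot[1] += 1

    def decref(self, key):
        slot = self._dict[key]
        slot[1] -= 1
        if slot[1] <= 0:
            del self._dict[key]

    def __getitem__(self, key):
        return self._dict[key][0]


local_objects = RefCountingColl()

# Step 1: the default argument [] is boxed before the request is sent.
default = []
oid = id(default)
local_objects.add(default)

# Step 4: two messages are queued in this order (hypothetical tuples).
reply = ("reply", oid)        # result of d.get("a", [])
delete = ("HANDLE_DEL", oid)  # the remote netref was garbage collected

# Step 5: the background thread dispatches the delete first.
local_objects.decref(delete[1])

# Step 6: the working thread now looks the reply's object up by id.
try:
    local_objects[reply[1]]
    outcome = "ok"
except KeyError:
    outcome = "KeyError"

print(outcome)
```
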
@tomerfiliba
Collaborator

i agree with your analysis, but i can't think of any easy way to solve it. storing the list in a temp variable will solve it, but that's out of rpyc's scope.

@coldfix
Contributor

coldfix commented Dec 22, 2017

I think this exact scenario can no longer happen due to 4c8f004:

  • when we are at point 5 in this sequence, _dispatch is called for the two messages in order. This method unboxes the message, and with it the object itself, before the HANDLE_DEL has a chance to run.
  • at point 6, the handler now executes the message with the previously unboxed object, without error.
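
A minimal sketch of that idea, assuming a plain dict in place of conn._local_objects and made-up `unbox` and `handle_del` helpers (this is not the code from 4c8f004): once each message is unboxed on arrival, the reply holds a direct reference to the list, so a later delete cannot invalidate it.

```python
local_objects = {}  # object id -> object; stand-in for conn._local_objects

default = []
oid = id(default)
local_objects[oid] = default


def unbox(message):
    """Resolve the object id to the object as soon as the message is read."""
    kind, key = message
    return kind, local_objects[key]


def handle_del(obj):
    """Hypothetical delete handler that receives the object, not its id."""
    local_objects.pop(id(obj), None)


# Both messages are unboxed in arrival order, before any handler runs.
queue = [("reply", oid), ("HANDLE_DEL", oid)]
unboxed = [unbox(message) for message in queue]

# The handlers may now run in either order; the delete goes first here.
(_, reply_obj), (_, del_obj) = unboxed
handle_del(del_obj)
result = reply_obj  # still a valid reference to the original list

print(result is default, oid in local_objects)
```
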

Let me know if you think that I'm wrong.

Best, Thomas

@coldfix coldfix closed this as completed Dec 22, 2017
coldfix added a commit that referenced this issue Jun 11, 2018
This release brings a few minor backward incompatibilities, so be sure to read
on before upgrading. However, fear not: the ones that are most likely relevant
to you have a relatively simple migration path.

Backward Incompatibilities
^^^^^^^^^^^^^^^^^^^^^^^^^^

* ``classic.teleport_function`` now executes the function in the connection's
  namespace by default. To get the old behaviour, use
  ``teleport_function(conn, func, conn.modules[func.__module__].__dict__)``
  instead.

* Changed signature of ``Service.on_connect`` and ``on_disconnect``, adding
  the connection as argument.

* Changed signature of ``Service.__init__``, removing the connection argument

* no longer store connection as ``self._conn``. (allows services that serve
  multiple clients using the same service object, see `#198`_).

* ``SlaveService`` is now split into two asymmetric classes: ``SlaveService``
  and ``MasterService``. The slave exposes functionality to the master but can
  no longer access remote objects on the master (`#232`_, `#248`_).
  If you were previously using ``SlaveService``, you may experience problems
  when feeding the slave with netrefs to objects on the master. In this case, do
  any of the following:

  * use ``ClassicService`` (acts exactly like the old ``SlaveService``)
  * use ``SlaveService`` with a ``config`` that allows attribute access etc
  * use ``rpyc.utils.deliver`` to feed copies rather than netrefs to
    the slave

* ``RegistryServer.on_service_removed`` is once again called whenever a service
  instance is removed, making it symmetric to ``on_service_added`` (`#238`_).
  This reverts PR `#173`_ on issue `#172`_.

* Removed module ``rpyc.experimental.splitbrain``. It's too confusing and
  undocumented for me and I won't be developing it, so better remove it
  altogether. (It's still available in the ``splitbrain`` branch)

* Removed module ``rpyc.experimental.retunnel``. Seemingly unused anywhere, no
  documentation, no clue what this is about.

* ``bin/rpyc_classic.py`` will bind to ``127.0.0.1`` instead of ``0.0.0.0`` by
  default

* ``SlaveService`` no longer serves exposed attributes (i.e., it now uses
  ``allow_exposed_attrs=False``)

* Exposed attributes no longer hide plain attributes if one otherwise has the
  required permissions to access the plain attribute. (`#165`_)
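
The new ``Service`` signatures can be sketched without rpyc installed. In the example below, ``Service`` is a local stand-in for the real base class, ``CounterService`` is a made-up name, and the two connections are plain placeholder objects; only the shape of ``__init__``, ``on_connect`` and ``on_disconnect`` follows the notes above.

```python
class Service:
    """Stand-in for rpyc's Service base class (assumption, not the real one)."""

    def on_connect(self, conn):
        pass

    def on_disconnect(self, conn):
        pass


class CounterService(Service):
    def __init__(self):
        # No connection argument any more; this state is shared by all clients.
        self.active = set()

    def on_connect(self, conn):
        # The connection is passed in instead of being stored as self._conn.
        self.active.add(conn)

    def on_disconnect(self, conn):
        self.active.discard(conn)


service = CounterService()
conn_a, conn_b = object(), object()  # placeholders for two client connections
service.on_connect(conn_a)
service.on_connect(conn_b)
service.on_disconnect(conn_a)

print(len(service.active))
```
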

.. _#165: #165
.. _#172: #172
.. _#173: #173
.. _#198: #198
.. _#232: #232
.. _#238: #238
.. _#248: #248
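
What it means for a function to run in a different namespace can be illustrated locally, without rpyc. The sketch below rebuilds a function around another globals dict using ``types.FunctionType``. It is only an analogy for the choice of globals, not how ``teleport_function`` is implemented, and ``greet``, ``rebind`` and the two namespace dicts are made-up names.

```python
import types


def greet():
    return "hello " + WHO  # WHO is resolved in the function's globals at call time


conn_namespace = {"WHO": "connection namespace"}
module_namespace = {"WHO": "module namespace"}


def rebind(func, namespace):
    """Return a copy of func that resolves global names in `namespace`."""
    return types.FunctionType(func.__code__, namespace, func.__name__)


in_conn = rebind(greet, conn_namespace)
in_module = rebind(greet, module_namespace)

print(in_conn())
print(in_module())
```
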

What else is new
^^^^^^^^^^^^^^^^

* teleported functions will now be defined by default in the globals dict

* Can now explicitly specify globals for teleported functions

* Can now use streams as context manager

* keep a hard reference to connection in netrefs, may fix some ``EOFError``
  issues, in particular Jython-related ones (`#237`_)

* handle synchronous and asynchronous requests uniformly

* fix deadlock with connections talking to each other multithreadedly (`#270`_)

* handle timeouts cumulatively

* fix possible performance bug in ``Win32PipeStream.poll`` (oversleeping)

* use readthedocs theme for documentation (`#269`_)

* actually time out sync requests (`#264`_)

* clarify documentation concerning exceptions in ``Connection.ping`` (`#265`_)

* fix ``__hash__`` for netrefs (`#267`_, `#268`_)

* rename ``async`` module to ``async_`` for py37 compatibility (`#253`_)

* fix ``deliver()`` from IronPython to CPython2 (`#251`_)

* fix brine string handling in py2 IronPython (`#251`_)

* add gevent_ Server. For now, this requires using ``gevent.monkey.patch_all()``
  before importing rpyc. Client connections can already be made without
  further changes to rpyc, just using gevent's monkey patching. (`#146`_)

* add function ``rpyc.lib.spawn`` to spawn daemon threads

* fix several bugs in ``bin/rpycd.py`` that crashed this script on startup
  (`#231`_)

* fix problem with MongoDB, or more generally any remote objects that have a
  *catch-all* ``__getattr__`` (`#165`_)

* fix bug when copying remote numpy arrays (`#236`_)

* added ``rpyc.utils.helpers.classpartial`` to bind arguments to services (`#244`_)

* can now pass services either as instance or as class (previously only as
  class, `#244`_)

* The service is now charged with setting up the connection, doing so in
  ``Service._connect``. This allows using custom protocols by e.g. subclassing
  ``Connection``.  More discussions and related features in `#239`_-`#247`_.

* service can now easily override protocol handlers, by updating
  ``conn._HANDLERS`` in ``_connect`` or ``on_connect``. For example:
  ``conn._HANDLERS[HANDLE_GETATTR] = self._handle_getattr``.

* most protocol handlers (``Connection._handle_XXX``) now directly get the
  object rather than its ID as first argument. This makes overriding
  individual handlers feel much more high-level. And by the way it turns out
  that this fixes two long-standing issues (`#137`_, `#153`_)

* fix bug with proxying context managers (`#228`_)

* expose server classes from ``rpyc`` top level module

* fix logger issue on Jython

.. _#137: #137
.. _#146: #146
.. _#153: #153
.. _#165: #165
.. _#228: #228
.. _#231: #231
.. _#236: #236
.. _#237: #237
.. _#239: #239
.. _#244: #244
.. _#247: #247
.. _#251: #251
.. _#253: #253
.. _#264: #264
.. _#265: #265
.. _#267: #267
.. _#268: #268
.. _#269: #269
.. _#270: #270
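
Binding arguments to a service class, as ``classpartial`` is described above, can be sketched in plain Python. The helper below is a hypothetical ``classpartial_sketch`` written for illustration; the real ``rpyc.utils.helpers.classpartial`` may differ in signature and behaviour. ``GreeterService`` is likewise a made-up class.

```python
def classpartial_sketch(cls, *args, **kwargs):
    """Return a subclass of cls whose constructor has some arguments pre-bound."""

    class Partial(cls):
        def __init__(self, *more_args, **more_kwargs):
            merged = dict(kwargs, **more_kwargs)  # later keywords win
            super().__init__(*args, *more_args, **merged)

    Partial.__name__ = cls.__name__
    return Partial


class GreeterService:
    def __init__(self, greeting, name="world"):
        self.message = "{}, {}".format(greeting, name)


# The result is still a class, so it can be handed to code that expects one.
HelloService = classpartial_sketch(GreeterService, "hello")

print(HelloService().message)
print(HelloService(name="rpyc").message)
```
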

.. _gevent: http://www.gevent.org/
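
The handler override mentioned above can be sketched with a stand-in connection. Everything here is an assumption made for illustration: ``Connection`` is a toy class rather than rpyc's, the ``HANDLE_GETATTR`` value is a placeholder, and ``AuditingService`` is a made-up name. The points taken from the notes are that handlers live in ``conn._HANDLERS`` and receive the object itself rather than its id.

```python
HANDLE_GETATTR = "getattr"  # placeholder value; the real constant lives in rpyc


class Connection:
    """Toy connection with a per-instance handler table (assumption)."""

    def __init__(self):
        self._HANDLERS = {HANDLE_GETATTR: self._handle_getattr}

    def _handle_getattr(self, obj, name):
        return getattr(obj, name)

    def dispatch(self, handler_id, obj, *args):
        # The handler receives the object directly, not an object id.
        return self._HANDLERS[handler_id](obj, *args)


class AuditingService:
    def __init__(self):
        self.seen = []

    def on_connect(self, conn):
        default = conn._HANDLERS[HANDLE_GETATTR]

        def audited_getattr(obj, name):
            self.seen.append(name)  # record the access, then delegate
            return default(obj, name)

        conn._HANDLERS[HANDLE_GETATTR] = audited_getattr


conn = Connection()
service = AuditingService()
service.on_connect(conn)

print(conn.dispatch(HANDLE_GETATTR, "abc", "upper")())
print(service.seen)
```
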