Find fast way to check for all running greenlets #27

Open
rgalanakis opened this issue Jul 11, 2014 · 10 comments

Comments

@rgalanakis
Owner

The select function will detect a deadlock condition (see #25), but this only happens before it starts to loop and wait for a channel to be ready. So the code could potentially get past this check, then a greenlet dies, and there's an undetected deadlock.

This choice was made because the deadlock detection is expensive in gevent. It requires looping over gc.get_objects() and looking for alive greenlets.

If that detection process could be sped up, the check could happen inside the select loop, and a deadlock could be more reliably detected.
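For reference, the expensive scan described above might look roughly like the sketch below. This is not the actual goless code: `FakeGreenlet` is a hypothetical stand-in so the example runs without gevent installed, and `count_alive` is an illustrative name. With the real library one would presumably pass `greenlet.greenlet` and rely on its `dead` attribute.

```python
import gc


class FakeGreenlet:
    """Hypothetical stand-in for greenlet.greenlet, for illustration only."""

    def __init__(self):
        self.dead = False


def count_alive(cls):
    """Walk every gc-tracked object and count live instances of cls.

    The cost grows with the total number of tracked objects in the
    process, not with the number of greenlets, hence the expense.
    """
    return sum(
        1
        for ob in gc.get_objects()
        if issubclass(type(ob), cls) and not ob.dead
    )


tasks = [FakeGreenlet() for _ in range(3)]
tasks[0].dead = True  # simulate one greenlet finishing
alive = count_alive(FakeGreenlet)
```

Running this inside the select loop would repeat the full walk on every iteration, which is why the check currently happens only once, before the loop starts.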

@MichaelAz
Contributor

Perhaps monkey-patching greenlet creation? Or, if we sacrifice some interoperability with gevent, we can keep track of only those greenlets we created. This will work, but it assumes no use of gevent is done outside of the goless interface.

@rgalanakis
Owner Author

Regarding using goless for everything, definitely not. It's a road I've been down before. For example, we had to use 'bluepy.TaskletEx' instead of 'stackless.tasklet' for some things at CCP to get instrumentation and some other important things. I then wrote a gevent/stackless compatibility library (uthread), which worked great, except it used stackless.tasklet instead of bluepy.TaskletEx. Which meant I needed to write another compatibility backend for bluepy, and a way for it to be chosen/set/extended, and keep that backend system open for external use (so uthread wasn't coupled to bluepy). And then how to handle libraries that depend on gevent (or greenlet!) directly? Not an experience I want to repeat :)

I'd also like to avoid patching greenlet creation. We would also have to patch greenlet death, and do whatever other bookkeeping is required. And it's still brittle to callers, potentially.

Ultimately it's a question of what makes goless most useful. My feeling is that it's more useful as a library that performs well but with this slight deadlock chance (that we should fix). At the very least, it keeps the problem local.

My feeling is we can figure out a clever and reliable way to do this. Perhaps instead of monkeypatching greenlet creation, we can monkeypatch greenlet switching, and count how many switches to different greenlets occur. Something like that.
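One possible reading of the switch-counting idea, as a rough sketch only. The `SwitchCounter` class and its `take` method are hypothetical names. The callback shape, `callback(event, args)` with `args` being `(origin, target)`, is assumed to match what `greenlet.settrace` passes for `'switch'` and `'throw'` events; the demo below feeds it events by hand so it runs without greenlet installed.

```python
class SwitchCounter:
    """Counts switches between distinct greenlets (illustrative only)."""

    def __init__(self):
        self.switches = 0

    def __call__(self, event, args):
        # Assumed callback shape: event is 'switch' or 'throw',
        # args is an (origin, target) pair.
        if event in ("switch", "throw"):
            origin, target = args
            if origin is not target:
                self.switches += 1

    def take(self):
        """Return the count since the last call and reset it."""
        n, self.switches = self.switches, 0
        return n


counter = SwitchCounter()
# With greenlet available this would presumably be installed via
# greenlet.settrace(counter); here the events are simulated.
g1, g2 = object(), object()
counter("switch", (g1, g2))
counter("switch", (g2, g2))  # same greenlet, not counted
counter("throw", (g2, g1))
first = counter.take()
second = counter.take()
```

The select loop could call `take()` on each pass: if it keeps returning zero while no channel becomes ready, nothing else ran in between, which hints at a deadlock. Whether that signal is reliable enough is exactly the open question.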

@ctismer
Contributor

ctismer commented Jul 18, 2014

Not sure if I mentioned this before, but the problems that you are facing
look typical for a lot of competing, "wrong" concepts.

Tasklet, greenlet, bluepy, uthread, ...

These are all wrong in the sense that they try to be the center of the world, and dominate
everything. These are the wrong building blocks, because they do not "compose".

Know about composability? That's a thing that was invented as part of the stackless for PyPy
work, and the current composable building blocks are "continulet"s.
Things built on top of them are really composable, because they are able to ignore the
implementation of things that are irrelevant in a context. Such contexts form a partition
of the world into disjoint views, which can live together without interference.

Just to mention that, we can discuss that later (and it is a thing where we will move Stackless
to, see Kristjan's tealet)

http://pypy.readthedocs.org/en/latest/stackless.html#theory-of-composability

@rgalanakis
Owner Author

@MichaelAz So I haven't come up with anything in gevent. gevent.get_hub().loop has activecnt and pendingcnt but they don't seem to be what we want. Maybe something deeper in libev directly (ctypes or something could be used to ask it maybe)? I also tried speeding up the current implementation by using gc.get_referrers instead of gc.get_objects. The former was slower, even when I primed Python with tons of uncollected garbage.

@ctismer Hey, uthread (like goless) tries to hide those separate worlds, not be a world itself! :) You are right about composability, but is there anything we can do? I understand the talk about tealet/continulet, but ultimately doesn't it come down to some asynchronicity primitive being supplied by the language, instead of a library? Or, at the very least, something like asyncio to tie the asynchronous systems together?

@MichaelAz
Contributor

I can't think of any non-hacky way to do this so I opened an issue with gevent (gevent/gevent#465). I'll try to get it implemented.

@ctismer
Contributor

ctismer commented Jul 25, 2014

Why not derive your tasklet class and keep a record of the instances?
Is that hacky?
I did not implement this because it is so simple to add ...

@rgalanakis
Owner Author

Thanks @MichaelAz let's see what happens...

@ctismer, I'd like goless to be compatible with 'native' usages of the backend, so it can integrate properly into an application. For example, I'd like for someone to be able to do:

import stackless
def calc():
    # Receive a value and send back its square, forever.
    while True:
        value = channel.recv()
        channel.send(value ** 2)
# The tasklet is created natively, before goless is even imported.
t = stackless.tasklet(calc)
import goless
channel = goless.chan()
t()  # schedule the tasklet
for i in range(2, 5):
    channel.send(i)
    squared = channel.recv()
    print('%s squared is %s' % (i, squared))

Since that code uses stackless.tasklet directly, goless would report a deadlock if we try any of our own bookkeeping (patching, deriving, whatever). Note also that goless is imported after the tasklet is created, so even monkeypatching won't work.

The rationale for this decision was explained in a previous comment: #27 (comment)

@ctismer
Contributor

ctismer commented Jul 26, 2014

So what exactly would you wish/propose?

Note that all tasklets are either in the runnables queue, or hidden in a channel,
and those are all reachable if you know their channel.
What is missing?

Maybe it is time to collect thoughts and check if we all have the same picture
of things. I am asking right now, because I'm at EuroPython in a sprint on
Stackless and PyPy.
I want to fix a few things. Minimalistic, as always...

cheers - Chris

@rgalanakis
Owner Author

I would wish that gevent (or greenlet?) had a runcount attribute that showed how many greenlets were alive, just like stackless does. Nothing to do with stackless or pypy, just gevent.

> Note that all tasklets are either in the runnables queue, or hidden in a channel,
> and those are all reachable if you know their channel.

If you are talking about gevent, I would love an elaboration on this.

@ctismer
Contributor

ctismer commented Jul 27, 2014

My fault. I did not read the topic correctly.
Everything I said was about stackless, not greenlet.

Sorry ;-)


3 participants