PesosSchedulerDriver.join() prevents signal handling #23

Open
anthonyrisinger opened this issue Apr 29, 2015 · 3 comments

@anthonyrisinger

With the current implementation, it's not possible to handle signals like SIGINT (KeyboardInterrupt) because join() calls self.lock.wait() with no timeout. The signal is caught and queued for handling, but the main thread never gets a chance to respond: it stays blocked until the compactor thread calls notify().

Adding a timeout to the wait gives the main thread a periodic chance to respond.

Workaround:

class SchedulerDriver(scheduler.PesosSchedulerDriver):

    @scheduler.PesosSchedulerDriver.locked.__func__
    def join(self):
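        # Compare enum values with ==/!=; identity checks on ints are fragile.
        if self.status != mesos_pb2.DRIVER_RUNNING:
            return self.status

        # Wake up every second so pending signals (e.g. SIGINT) get handled.
        while self.status == mesos_pb2.DRIVER_RUNNING:
            self.lock.wait(1)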

        scheduler.log.info(
            "Scheduler driver finished with status %d",
            self.status,
            )
        assert self.status in (
            mesos_pb2.DRIVER_ABORTED,
            mesos_pb2.DRIVER_STOPPED,
            )
        return self.status
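
For anyone who wants to try the pattern without pesos installed, here is a minimal stand-alone sketch of the same idea using only the standard library. The `Worker` class, its status constants, and the `poll_interval` parameter are hypothetical names made up for illustration; they are not part of pesos. The uninterruptible untimed wait is a Python 2 behaviour (on POSIX, Python 3.2+ lets signals interrupt lock acquisition), but the bounded-wait loop is harmless on newer interpreters too.

```python
import threading


class Worker(object):
    """Hypothetical stand-in for a driver whose status is guarded by a Condition."""

    RUNNING = "running"
    STOPPED = "stopped"

    def __init__(self):
        self.lock = threading.Condition()
        self.status = self.RUNNING

    def stop(self):
        # Change the status under the lock and wake up anyone blocked in join().
        with self.lock:
            self.status = self.STOPPED
            self.lock.notify_all()

    def join(self, poll_interval=1.0):
        with self.lock:
            while self.status == self.RUNNING:
                # Bounded wait: returns after poll_interval even if nobody
                # calls notify(), so the calling thread regularly gets back
                # to the interpreter, where pending signals can be handled.
                self.lock.wait(poll_interval)
            return self.status
```

Calling `Worker().join()` from the main thread and pressing Ctrl-C should then raise KeyboardInterrupt within roughly one `poll_interval`, assuming the default SIGINT handler is installed.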
@tarnfeld
Contributor

Good point, we should get that fixed. For reference, this is taken from a framework we're using pesos with; we're not using the join() method there.

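import threading
import time

# Kick off the pesos scheduler and watch the magic happen
thread = threading.Thread(target=driver.run)
thread.daemon = True  # attribute form; setDaemon() is deprecated
thread.start()

# Wait here until the tasks are done
while thread.is_alive():  # isAlive() was removed in Python 3.9
    time.sleep(0.5)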

@anthonyrisinger
Author

run() calls join() internally, but what you have still allows SIGINT to be handled, because the main thread only ever blocks in sleep() for a bounded time (effectively a "timeout"), so the interpreter gets a chance to raise KeyboardInterrupt.

Your pesos thread will not shut down cleanly though (it is marked daemon and stop() is never called)... not necessarily a problem, but something to keep in mind if you perform any state/syncing activities.
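
If a clean shutdown does matter, one option is to keep the thread non-daemon and call stop() when the interrupt arrives. The sketch below is only an illustration under assumptions: `FakeDriver` and `run_until_interrupted` are hypothetical names, and the only things taken from this thread are that the driver exposes run() and stop(). It has not been tested against pesos itself.

```python
import threading


class FakeDriver(object):
    """Hypothetical stand-in for a scheduler driver; not the pesos API."""

    def __init__(self):
        self._stopped = threading.Event()
        self.cleaned_up = False

    def run(self):
        # Block until stop() is called, then pretend to flush state.
        self._stopped.wait()
        self.cleaned_up = True

    def stop(self):
        self._stopped.set()


def run_until_interrupted(driver, poll_interval=0.5):
    # Non-daemon thread: the interpreter waits for it to finish at exit.
    thread = threading.Thread(target=driver.run)
    thread.start()
    try:
        # Join with a timeout so the main thread keeps returning to the
        # interpreter and KeyboardInterrupt can be raised here.
        while thread.is_alive():
            thread.join(poll_interval)
    except KeyboardInterrupt:
        # Ask the driver to stop, then wait for run() to return.
        driver.stop()
        thread.join()
```

With this shape, Ctrl-C leads to stop() followed by a final join(), so whatever cleanup run() performs after being stopped has a chance to complete before the process exits.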

@tarnfeld
Contributor

tarnfeld commented May 6, 2015

> Your pesos thread will not shutdown cleanly though (marked daemon and never called stop())... not necessarily a problem, but something to keep in mind if you perform any state/syncing activities.

Good shout, I'll double-check this. Thanks!
