Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Robustify Process handling and extend a bit #127

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

krono
Copy link
Contributor

@krono krono commented Oct 1, 2019

  • Isolate persistence writing to a single thread
    (trivial in default exclusive case, not possible in non-exclusive case)
  • Add a shaed-queue-based scheduler for parallel execution
  • Caveat lector: Adds a new dependency: subprocess32

Python from 3.2 on support timeouts on subprocesses directly
subprocess32 backports that to Python 2
Easier to read. Also, `timeout=None` because that's what modern
subprocess expects
There are many caveats, but at least Ctrl-C now works as expected.
There were out-of-order writes to the persitence file when there were
multiple threads. This confines file-writing to a dedicated thread that
works a Queue.
@smarr
Copy link
Owner

smarr commented Oct 1, 2019

Sorry, I don't think I will get to this or the other PR in a while.
Here you seem to be changing one of the most crucial bits, and I want to be sure that I understand the consequences.

So, just briefly:
"Robustify subprocess handling" is unfortunately not helping me in understanding what was broken with the current solution.

I suppose, I am getting a bit defensive here. But so far, the current solution seem to have worked. While the new one is, well, new to me.

@krono
Copy link
Contributor Author

krono commented Oct 1, 2019

What was broken?

  • Ctrl-C did kill neither rebench nor the executable(s) in certain circumstances
  • Rebench wrote results out-of-order to the data file, which meant they did not read correctly on re-run and were hard to analyze.

This fixes both.

@krono
Copy link
Contributor Author

krono commented Oct 1, 2019

Also, there were differences for Python2 and Python3 regarding process handling.

With subprocess32 this is more unified and does not require the custom Timeout handling anymore.

The existing parallel scheduler behaves as is
(With common behavior lifted to a new base clase).

A new parallel scheulder maintains a work queue and the workers retrieve
directly from there.

a distinction is made for py2 vs py3, to make use of the built-in thread
pool executor when possible.

No new magic so far
@krono
Copy link
Contributor Author

krono commented Oct 7, 2019

I reverted the changes to the existing parallel scheduler and made a new one.
The new one has a Queue where worker threads pull from.
Apparently, Queue.join() does not suffer the same problem as Thread.join on Python 2, so the previous hack is not necessary.

@smarr smarr added this to the later milestone May 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants