-
Notifications
You must be signed in to change notification settings - Fork 5
Terminating invest subprocesses | Automated browser tests #50
Terminating invest subprocesses | Automated browser tests #50
Conversation
This reverts commit a9621a4.
… values during SetupTab initialization.
…rgs dicts and datastacks - specifically hidden args and n_workers.
src/InvestJob.jsx
Outdated
this.investRun.terminate = () => { | ||
if (this.state.jobStatus === 'running') { | ||
// process.kill(-this.investRun.pid, 'SIGTERM'); // does not kill | ||
// this.investRun.kill(); // does not kill | ||
// This kills, but no chance to handle it and show 'Run Cancelled' | ||
exec(`taskkill /pid ${this.investRun.pid} /t /f`); | ||
// exec(`taskkill /pid ${this.investRun.pid} /t`) // does not kill | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a little bit of looking into this and I probably found what you did; a lot of nodejs issues about how process.kill
doesn't function correctly on Windows. It also seems like Windows doesn't use Signals in the same way.
The threads I was on:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, looks like only POSIX systems implement signals, but even Windows should handle interrupts of some sort. One SO post I found mentioned Win32 messages, which makes me wonder if there's just an alternate, windows-specific way to handle events. If that's the case, then how would we do this in node? All of the child_process
docs talk about processes like they're expecting POSIX signals, which sounds like it would work great until signals don't exist on Windows.
tests/app.test.js
Outdated
.toBeInTheDocument(); | ||
const cancelButton = await findByText('Cancel Run'); | ||
fireEvent.click(cancelButton); | ||
expect(await findByText('Run Canceled')).toBeInTheDocument(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So on Windows this will error because we can't properly kill
a process?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Tehcnically it fails because that "Run Canceled" message only shows when the subprocess has time to listen and run code on an exit
event. Right now on Windows we're resorting to a force kill and there's no such time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting! So then the callback is executed by the child process rather than the parent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@phargogh Maybe I mistated that. The callback is executed by the parent. Node (the parent) has a reference to the child process (this.investRun
) that is listening for events from the child. So maybe the problem here is that the child never has the chance to emit the event?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So now that I'm actually testing this on Windows, my last two statements are both wrong. The exit code on Windows after the force-kill is 1
.
// In the CI the flask app takes more than 10x as long to startup. | ||
// Especially so on macos. | ||
// So, allowing many retries, especially because the error | ||
// that is thrown if all retries fail is swallowed by jest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
graphic!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🦈
Looks like it's not working on my mac because of the meddlesome space character in |
Aha! Great catch, @emlys ! Some quotes should do that job. I'll patch that up. |
# add rtree dependency dynamic libraries from conda environment | ||
invest_a.binaries += [ | ||
(os.path.basename(name), name, 'BINARY') for name in | ||
glob.glob(os.path.join(conda_env, 'lib/libspatialindex*'))] | ||
elif is_win: | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we've made the switch to conda for the invest build workflows, could we add:
# add rtree dependency dynamic libraries from conda environment
invest_a.binaries += [
(os.path.basename(name), name, 'BINARY') for name in
glob.glob(os.path.join(conda_env, 'Library/bin/spatialindex*.dll'))]
This would also mean a change on line 9,10 to set conda_env
for windows as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @dcdenu4 , I'm on the fence about touching this stuff here in this PR vs. moving the whole python side of this project into invest. If we changed this PyInstaller script here, we'd also have to change the Actions workflow, because it's not using conda for Windows right now.
Now that you have all that set up for invest in the release branch, it could be time to make a new branch like release/workbench-alpha
on natcap/invest and move all of the python & PyInstaller stuff out of the workbench project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Hi, Rich)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me @davemfish ! That node behavior on process exit is pretty surprising and I do wonder about what alternatives/workarounds we might be able to implement, like we talked about on the coffee call this morning. Maybe monitoring the process for cancellation would be best? Or maybe we can just trust that cancellation will always go through? I suppose if all else fails, there's always task manager.
src/InvestJob.jsx
Outdated
// This kills, but no chance to handle it and show 'Run Cancelled' | ||
exec(`taskkill /pid ${this.investRun.pid} /t /f`); | ||
// exec(`taskkill /pid ${this.investRun.pid} /t`) // does not kill |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we have to run a program to kill the process on windows, this looks right. So then it looks like the this.investRun.on('exit', (code) ...
part down below is the one that won't execute once this.investRun
exits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how I understand it as well.
src/InvestJob.jsx
Outdated
this.investRun.terminate = () => { | ||
if (this.state.jobStatus === 'running') { | ||
// process.kill(-this.investRun.pid, 'SIGTERM'); // does not kill | ||
// this.investRun.kill(); // does not kill | ||
// This kills, but no chance to handle it and show 'Run Cancelled' | ||
exec(`taskkill /pid ${this.investRun.pid} /t /f`); | ||
// exec(`taskkill /pid ${this.investRun.pid} /t`) // does not kill | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, looks like only POSIX systems implement signals, but even Windows should handle interrupts of some sort. One SO post I found mentioned Win32 messages, which makes me wonder if there's just an alternate, windows-specific way to handle events. If that's the case, then how would we do this in node? All of the child_process
docs talk about processes like they're expecting POSIX signals, which sounds like it would work great until signals don't exist on Windows.
tests/app.test.js
Outdated
.toBeInTheDocument(); | ||
const cancelButton = await findByText('Cancel Run'); | ||
fireEvent.click(cancelButton); | ||
expect(await findByText('Run Canceled')).toBeInTheDocument(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting! So then the callback is executed by the child process rather than the parent?
// In the CI the flask app takes more than 10x as long to startup. | ||
// Especially so on macos. | ||
// So, allowing many retries, especially because the error | ||
// that is thrown if all retries fail is swallowed by jest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🦈
puppet.test.js still isn't working for me :( It's still failing on |
Thanks for continuing to test this @emlys . That's a good thought because the Log tab does not become active immediately, but I just double-checked and the test should be waiting and retrying that
That same |
…treat it the same as other exit events
Oh 🤦♀️ I needed to |
Thanks for reviewing this, everyone. I pushed some changes to address the platform-dependent logic around killing the subprocess. I was partially mistaken when describing the problem earlier. The callback that listens for the The only thing outstanding now is the Mac Build Action. I've made an issue for that (#52 ). Mac builds are good, it's just the puppeteer test that's not running properly in Actions.I don't want that to stop us from merging this PR though. |
This PR adds a feature to gracefully terminate an invest run during execution, with a cancel button. It also adds an automated browser test, using puppeteer to drive the browser, to test this functionality on the final built application.
I'm especially interested in review and advice on spawning and terminating subprocesses from node on different operating systems. This comes up in a few places:
src/InvestJob.jsx:investExecute
method. -- the invest model subprocesssrc/main_helpers:createPythonFlaskProcess
function -- launching the flask apptests/binary_tests/puppet.test.js
-- launching the built electron app as a spawned processI believe the builds should all be successful, but
puppet.test.js
fails/errors in CI for various reasons:Unable to create basic Accelerated OpenGL renderer. Core Image is now using the software OpenGL renderer. This will be slow.
I think this only comes up in the context of puppeteer, not when installing and launching the electron app. Would be great if @phargogh or @emlys could try running the app and this puppet test locally.This PR also fixes #47, fixes #46, fixes #36