Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add forked docker process #39

Merged
merged 17 commits into from
Jun 24, 2024
Merged

Conversation

Courey
Copy link
Contributor

@Courey Courey commented May 31, 2024

Description

This work is to allow docker to be run as a child process that gets cleaned up when the process is terminated.

A child process is forked, the fork allows for communication between the parent and child processes.
The child monitors docker and notifies the parent if the container is exited.
The parent can send a signal to the child to exit the docker container and then the child process.
The child process DOES get the signal from the parent when the parent is terminated with a sigint, and so I am catching this signal and exiting the docker container before exiting the process.

Related Issue(s)

Resolves #40

Testing

  • There is a video on the developers channel in slack showing several tests I ran.
  • I was able to locally run a docker instance and search worked as expected
  • Dakota also tested the branch by pointing to my branch in the package.json, hard coding the variable bin to be null so that it will launch docker, confirming docker was launched, terminating the process, and confirming that the docker container had exited.

@Courey Courey marked this pull request as ready for review June 5, 2024 17:29
Copy link
Member

@lpsinger lpsinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? The design is a bit different from what we discussed. The proposed design consisted of:

  • Parent forks process and detaches it from its process group
  • Parent periodically sends an IPC message to the child to indicate that it is still alive
  • The child listens for the periodic keep-alive messages; when the keep-alives fail to arrive after a specified timeout, the child process terminates the Docker container and then exits

package.json Outdated Show resolved Hide resolved
run.ts Outdated Show resolved Hide resolved
launchSearch.ts Outdated Show resolved Hide resolved
@Courey Courey changed the title add forked docker process WIP: add forked docker process Jun 5, 2024
@Courey
Copy link
Contributor Author

Courey commented Jun 6, 2024

This is still a work in progress. I'm testing it with Dakota Friday morning.

Does this work?

Yes. Look in developers in slack. I posted a video walkthrough of testing it. Let me know if you want me to explain anything or show you anything different or if you see some issue that I haven't seen. I'm happy to run any tests you think would be good to run. Dakota and I are running the tests on windows tomorrow before retro.

The design is a bit different from what we discussed. The proposed design consisted of:

Parent forks process and detaches it from its process group
Parent periodically sends an IPC message to the child to indicate that it is still alive
The child listens for the periodic keep-alive messages; when the keep-alives fail to arrive after a specified timeout, the child process terminates the Docker container and then exits

I did several designs. What I discovered is that there was no reason to detach the forked process if it would behave as expected. I also discovered it was unnecessary to have communication at regular intervals between the processes.

To handle a control + c (SIGINT), I put signal handlers on the child process. These signal handlers listen for the signal that is propagated from the parent to the child, catch the signal, exit docker, and then exit themselves. The forked process receives the signal from the parent as expected. Quitting out of the process in the terminal kills the docker container and quits the child process.

To handle the waiting functionality, I have the process itself waiting. Once the await in the child process resolves, it alerts the parent process that the container has stopped. The parent process's promise then resolves because it got a message that the container had stopped from the child. I figured a one time message was a lot more efficient than sending one at regular intervals to accomplish the same thing.

To handle the kill functionality, I send the child process a message to kill the docker container, it does kill the container, then exits the child process.

@Courey Courey changed the title WIP: add forked docker process add forked docker process Jun 7, 2024
@Courey Courey requested a review from lpsinger June 7, 2024 19:53
run.ts Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
run.ts Show resolved Hide resolved
launchSearch.ts Outdated Show resolved Hide resolved
launchSearch.ts Outdated Show resolved Hide resolved
launchSearch.ts Outdated Show resolved Hide resolved
@lpsinger
Copy link
Member

lpsinger commented Jun 8, 2024

Does this work?

Yes. Look in developers in slack. I posted a video walkthrough of testing it. Let me know if you want me to explain anything or show you anything different or if you see some issue that I haven't seen. I'm happy to run any tests you think would be good to run. Dakota and I are running the tests on windows tomorrow before retro.

That's pretty cool, but why does this work? This relies on Architect calling the sandbox end entry point when its the sandbox process receives SIGINT, which I thought is what was broken.

@Courey
Copy link
Contributor Author

Courey commented Jun 11, 2024

That's pretty cool, but why does this work? This relies on Architect calling the sandbox end entry point when its the sandbox process receives SIGINT, which I thought is what was broken.

I can not and have not been able to catch a signal in architect. So I can not catch the signal and handle cleanup in architect.

My best guess as to why this works is that I think the signal gets sent to the signal group, which sends to both the parent and the child. So while the parent doesn't catch the signal (even if I add in a signal handler), the child does.

A signal sent to the process group after the fork() should be delivered to both parent and child.

from the rationale section in these docs

So the forked process I can catch signals, while I still can't catch any in the parent process. Why exactly I can't catch anything in the architect process I don't know. I haven't been able to figure out what happens there to prevent it.

@Courey Courey requested a review from lpsinger June 11, 2024 15:45
run.ts Outdated Show resolved Hide resolved
run.ts Outdated
options,
}
const subprocess = fork(
'./node_modules/@nasa-gcn/architect-plugin-search/launchSearch.js',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not hardcode this path. Can you __filename like in the example code in https://nodejs.org/api/child_process.html#child_processforkmodulepath-args-options?

Copy link
Contributor Author

@Courey Courey Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 3 main ways to get where a file is located. __dirname to get the directory, __filename to get the file, or process.cwd() to get the current working directory of the process. I had to do a little bit of fiddling to get __dirname and __filename to work:
Screenshot 2024-06-12 at 1 24 52 PM

As long as the package is not symlinked it works as expected:
Screenshot 2024-06-12 at 1 37 54 PM

however, if you symlink the package to work locally, it will not work:
Screenshot 2024-06-12 at 1 23 36 PM
The problem is that it points to the path of code that you are symlinking, not the copy in .node_modules (even though it does exist in .node_modules, which is why hard coding worked)

I am not sure how often not being able to symlink is a dealbreaker for local development. I use it, but I don't know if it's widely used by others using the package.

Would you still like me to use __dirname + the file name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added code in the last commit that checks to see if the directory includes node_modules in the path. if it does it just uses __dirname but if it doesn't (which would mean a symlink) then it returns the hard coded path to the node_modules file.
It's not too much cleaner than just hard coding it, but a little better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take another look at the sample code fork child_process.fork in the Node.js documentation:

if (process.argv[2] === 'child') {
  setTimeout(() => {
    console.log(`Hello from ${process.argv[2]}!`);
  }, 1_000);
} else {
  const { fork } = require('node:child_process');
  const controller = new AbortController();
  const { signal } = controller;
  const child = fork(__filename, ['child'], { signal });
  child.on('error', (err) => {
    // This will be called with err being an AbortError if the controller aborts
  });
  controller.abort(); // Stops the child process
}

It's the same file. You have some variable --- in this case process.argv --- that you use to check if you are in the parent process or the child process.

This is analogous to the typical usage of the fork() syscall. See example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you are saying exactly here.
It sounds like you are giving me advice on how to tell what is the parent vs child process. This is not an issue that I am having.

The issue that you initially pointed out was that the filename was hard coded. You wanted me to use something similar to __filename. (__filename doesn't work out of the box in es6 so you have to do a few extra steps to define it.)

The issue I encountered is that if have symlinked architect-plugin-search in another application for development, the path to the launchSearch.js file is different than if it is not symlinked. When symlinked the path to the file points to my local architect-plugin-search repo which is a different path than when it is in node_modules.

Presumably the two are in sync since they are a simlink, so whatever changes you make in the imported package from the node_modules directory will be reflected in your local repo. Which makes this not really a big deal (it will run either way). It just is weird that the path would point to the linked repo instead of the node_module within the application. So my solution is just to define __dirname and leave it at that. It can run in the symlinked repo or it can run in the node_modules if not symlinked. Either way works.

So I just ended up removing the check to see if node_modules is in the path. It'll work either way, it'll just be pointing to the external repo instead of the node_modules if symlinked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand now that __filename only works in CommonJS modules. Just use import.meta.url instead. No need to convert it to a path, because child_process.fork accepts URLs.

Why is it a problem that this might return the target of the symlink?

Copy link
Contributor Author

@Courey Courey Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import.meta.url is architect-plugin-search/index.js so that can't be used directly. Here is what I'm doing there and why it needs to have multiple steps:

  1. import.meta.url: file:///Users/courey/dev/architect-plugin-search/index.js this is getting the name of the file that is currently running. in this case it's index.js. I don't want index.js, I want launchSearch.js.
  2. so we then pass that into fileURLToPath so __filename would then be: /Users/courey/dev/architect-plugin-search/index.js. This is preparing it so that we can use path.
  3. I don't want the file, just the directory. Since we converted that to a path with fileURLToPath we can use path.dirname() to get just the path of the directory instead of the file that is running. that makes __dirname: /Users/courey/dev/architect-plugin-search
  4. I add /launchSearch.js to the end of that path to have the proper path and proper file name.

stuff that doesn't work

If I just used import.meta.url we get this error:

Error: Cannot find module '/Users/courey/dev/gcn.nasa.gov/file:/Users/courey/dev/architect-plugin-search/index.js'

if I added /launchSearch.js to the import.meta.url we get:

Error: Cannot find module '/Users/courey/dev/gcn.nasa.gov/file:/Users/courey/dev/architect-plugin-search/index.js/launchSearch.js

import.meta.url passed to fileURLToPath and appended with file gives us this: /Users/courey/dev/architect-plugin-search/index.js/launchSearch.js

import.meta.url passed to path.dirname(): file:///Users/courey/dev/architect-plugin-search

Error: Cannot find module '/Users/courey/dev/gcn.nasa.gov/file:/Users/courey/dev/architect-plugin-search/launchSearch.js'

if you pass import.meta.url directly into path.dirname() you get this:
file:///Users/courey/dev/architect-plugin-search

If I skipped all that and went straight to using import.meta.dirname (which should exist in nodejs 20.11) instead it is undefined, which would give us:

Error: Cannot find module '/Users/courey/dev/gcn.nasa.gov/undefined/launchSearch.js'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it undefined because you are doing a CommonJS build rather than an ESM build?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for posterity in case someone gets here and doesn't read all the other comments, we answered this in another comment. It is not supported prior to node v.20.11.0, so it is undefined in any version earlier than that.

launchSearch.ts Outdated Show resolved Hide resolved
@Courey Courey requested a review from lpsinger June 12, 2024 20:02
run.ts Outdated
Comment on lines 75 to 77
const __filename = fileURLToPath(import.meta.url)
const __dirname = path.dirname(__filename)
const subprocess = fork(`${__dirname}/launchSearch.js`, [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const __filename = fileURLToPath(import.meta.url)
const __dirname = path.dirname(__filename)
const subprocess = fork(`${__dirname}/launchSearch.js`, [
join(import.meta.dirname, 'launchSearch.js`, [

And of course add import { join } from 'node:path'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2024-06-13 at 3 27 50 PM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I was one version behind. so if your local is not using the proper node version it's undefined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know if most people are using nvm? because we can prevent it from being undefined because of the wrong version if we add an .nvmrc file. That tells nvm what version to use for the project. But that would only be effective if people are using nvm.
Because if I use import.meta.dirname and they are not at the proper version, doing it this way will break whereas the other way would work regardless of node version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think we can expect that everyone is using NVM. What are you trying to do? Is importlib.meta.dirname only supported in recent Node.js versions? What is the minimum version that you need?

Copy link
Contributor Author

@Courey Courey Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

upon further testing, the way that requires a node version of 20.11 doesn't seem to be an option right now.
I just ran into something interesting. As soon as I updated my node version to be 20.11.0, in order to terminate the process I started having to hit control + c twice. I changed my architect-plugin-search branch back to main, re-ran npm prepare and ensured that the package had updated in gcn. I then ran GCN at main after an npm install. I had to control + c twice to exit the process.

Screenshot 2024-06-13 at 4 32 38 PM

when I roll my version back to v20.9.0 I can quit out of the process with one control + c.

Screenshot 2024-06-13 at 4 37 30 PM

To be clear, for the import.meta.dirname to work GCN would have to be run using node version >= 20.11 and it is when I am running GCN with version 20.11.0 that I encounter this issue.

I haven't looked into what is making the process stick around in version v20.11.0 yet, so I don't know why it's doing it. Just that it happens with both branches on main if my version is 20.11.0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

upon further testing, the way that requires a node version of 20.11 doesn't seem to be an option right now.

Alright. import.meta.url may be a better option, then. That has been supported at least as far back as Node.js 18. Please use import.meta.url.

Copy link
Contributor Author

@Courey Courey Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that brings us back to this part of my other comment (cutting and pasting it here since it's relevant in this thread too):

import.meta.url is architect-plugin-search/index.js so that can't be used directly. Here is what I'm doing there and why it needs to have multiple steps:

  1. import.meta.url: file:///Users/courey/dev/architect-plugin-search/index.js this is getting the name of the file that is currently running. in this case it's index.js. I don't want index.js, I want launchSearch.js.
  2. so we then pass that into fileURLToPath so __filename would then be: /Users/courey/dev/architect-plugin-search/index.js. This is preparing it so that we can use path.
  3. I don't want the file, just the directory. Since we converted that to a path with fileURLToPath we can use path.dirname() to get just the path of the directory instead of the file that is running. that makes __dirname: /Users/courey/dev/architect-plugin-search
  4. I add /launchSearch.js to the end of that path to have the proper path and proper file name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make this a lot simpler. There is no need to compute the path. Just bundle a single entry point with esbuild, as we did originally. Then launch the subprocess like this:

const subprocess = fork(import.meta.url, 'launch-docker-subprocess', JSON.stringify({
    dataDir,
    logsDir,
    engine,
    port,
    options,
  })

At file scope, add this code:

const [, , command, jsonifiedArgs] = process.argv
if (command === 'launch-docker-subprocess') {
  // put everything that you want the subprocess to do in here:
  // launch the container, attach signal handlers, wait for container to exit
}

@Courey Courey requested a review from lpsinger June 13, 2024 19:37
@Courey
Copy link
Contributor Author

Courey commented Jun 13, 2024

Carry on....
DO NOT MERGE THIS YET. I don't know that anything is funky with it, I was doing more testing and changed things, so it's likely that, but right now I am having to control + c twice to exit the process. that hasn't happened before, let me check into it and make sure I didn't mess anything up with recent changes.

UPDATE IN THIS COMMENT

run.ts Outdated
Comment on lines 75 to 77
const __filename = fileURLToPath(import.meta.url)
const __dirname = path.dirname(__filename)
const subprocess = fork(`${__dirname}/launchSearch.js`, [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make this a lot simpler. There is no need to compute the path. Just bundle a single entry point with esbuild, as we did originally. Then launch the subprocess like this:

const subprocess = fork(import.meta.url, 'launch-docker-subprocess', JSON.stringify({
    dataDir,
    logsDir,
    engine,
    port,
    options,
  })

At file scope, add this code:

const [, , command, jsonifiedArgs] = process.argv
if (command === 'launch-docker-subprocess') {
  // put everything that you want the subprocess to do in here:
  // launch the container, attach signal handlers, wait for container to exit
}

@Courey
Copy link
Contributor Author

Courey commented Jun 14, 2024

I implemented most of your requests, however import.meta.url is still not going to work. So we still have to do:
const __filename = fileURLToPath(import.meta.url)

This is the result if you just use import.meta.url:
Screenshot 2024-06-14 at 1 30 13 PM

@Courey Courey requested a review from lpsinger June 14, 2024 17:53
@Courey
Copy link
Contributor Author

Courey commented Jun 14, 2024

I did figure out a way to get import.meta.url to work, even though they are the same when you print them, if you instantiate a new URL it does work. So it either gets fileURLToPath(import.meta.url) or new URL(import.meta.url)

const subprocess = fork(new URL(import.meta.url), [
    "launch-docker-subprocess",
    JSON.stringify(argv)
  ]);

Since they both do the same thing and we're having to convert one way or another, I don't know that it makes any difference what we we do it.

@lpsinger
Copy link
Member

I implemented most of your requests, however import.meta.url is still not going to work. So we still have to do: const __filename = fileURLToPath(import.meta.url)

This is the result if you just use import.meta.url: Screenshot 2024-06-14 at 1 30 13 PM

That makes no sense. The string in the error message in your screenshot doesn't even look like a URL. The value of import.meta.url should start with file:///. Example:

$ cat test.mjs 
console.log(import.meta.url)
$ node
Welcome to Node.js v20.13.1.
Type ".help" for more information.
> await import('./test.mjs')
file:///Users/lpsinger/src/gcn.nasa.gov/test.mjs
[Module: null prototype] {  }

@lpsinger
Copy link
Member

I did figure out a way to get import.meta.url to work, even though they are the same when you print them, if you instantiate a new URL it does work. So it either gets fileURLToPath(import.meta.url) or new URL(import.meta.url)

Ah, that makes sense. I think that the optimal path is clear here: use new URL(import.meta.url) because import.meta is part of the ECMAScript standard whereas __filename is nonstandard and specific to Node.

@Courey
Copy link
Contributor Author

Courey commented Jun 14, 2024

typescript says no

string_arg

@lpsinger
Copy link
Member

Then the TypeScript type definitions are wrong. According to the Node.js documentation, child_process.fork has supported a URL argument since at least Node.js 16.x. Please submit an issue and/or fix at https://github.com/DefinitelyTyped/DefinitelyTyped.

@Courey
Copy link
Contributor Author

Courey commented Jun 14, 2024

ok, but I don't think this is something worth continuing to block docker development over. Can this get merged in as is and I can create a ticket to update it when it's updated in DefinitelyTyped?

@lpsinger
Copy link
Member

ok, but I don't think this is something worth continuing to block docker development over. Can this get merged in as is and I can create a ticket to update it when it's updated in DefinitelyTyped?

No, it won't block this PR. Please open an issue with DefinitelyTyped, add a FIXME comment to your PR that explains the bug and links to the issue, and work around it.

There are at least two possible workarounds:

  1. Convert the URL to a filesystem path, or
  2. Add a // @ts-expect-error comment to the affected line.

@Courey
Copy link
Contributor Author

Courey commented Jun 14, 2024

I have created a discussion in DefinitelyTyped. Their process is creating a discussion first when requesting a change in an existing package. Presumably if it is decided to be an issue an issue will then be created.

Screenshot 2024-06-14 at 3 45 59 PM

launchSearch.ts Outdated Show resolved Hide resolved
run.ts Outdated Show resolved Hide resolved
launchSearch.ts Show resolved Hide resolved
@Courey Courey requested a review from lpsinger June 24, 2024 16:25
@lpsinger lpsinger merged commit 32de9b7 into nasa-gcn:main Jun 24, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

troubleshoot 3rd party signal handling and child process termination
2 participants