-
-
Notifications
You must be signed in to change notification settings - Fork 539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix ESM node processes being unable to fork into other scripts #1814
Conversation
Codecov Report
|
61e47c5
to
3cc59ec
Compare
@cspotcode Just FYI: I re-pushed because I didn't realize older versions are tested as well. Switched the tests from the node next resolution to just The nightly node failures seem to also happen in |
Yeah the nightly node failure is due to some ESM changes they're making, you can ignore those. Also, please don't hesitate to ping me for guidance on any other test failures, to avoid frustration. Sometimes having such a big test matrix means it can be difficult for a new contributor to understand what's going on, and I might have some hard-earned occult intuition to help out. |
8d97835
to
ac0bcd1
Compare
Okay, I pushed again. Should fix the previous CI failures due the use of https://github.com/TypeStrong/ts-node/runs/7043357305?check_suite_focus=true#step:15:609 is a little confusing to me, but that might have been just flakiness. I tried with the same Node version and couldn't reproduce. |
Hey, just FYI that this PR is ready from my side now. No rush, please let me know if I can do anything. Thanks! |
@@ -48,7 +48,7 @@ export function main( | |||
const state: BootstrapState = { | |||
shouldUseChildProcess: false, | |||
isInChildProcess: false, | |||
entrypoint: __filename, | |||
tsNodeScript: __filename, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this renamed to indicate that this is the path to ts-node's bootstrapper, but not the user's desired entrypoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I renamed that mostly to avoid the ambiguity here (at least ambiguity to me). Happy to revert if you would want it the old way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming is good, makes sense to me.
* Stages before (phase 4) can and will be cached by the child process through the Brotli | ||
* configuration and entry-point information is only reliable in the final phase. More | ||
* details can be found in here: https://github.com/TypeStrong/ts-node/issues/1812. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding all these comments.
It makes sense to me that we must repeat resolution in phase 4. Are there any situations where we know, without any risk, that we can skip this duplicate resolution? It would be a performance optimization.
If we cannot safely do this, then we shouldn't. But if there's a safe way to do this, then I might try my hand at adding it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could re-use this when the argvResult
has not changed. We only need the rerun this function due to the patches in the child loader entry-point. We could indicate this change through a flag and request re-computation.
This feels like a micro optimization through, not worth it personally as this is just some simple logical expressions.
Thanks @devversion, I have a couple questions but otherwise this is excellent work. |
src/bin.ts
Outdated
|
||
/** Unresolved. May point to a symlink, not realpath. May be missing file extension */ | ||
const entryPointPath = executeEntrypoint | ||
? resolve(cwd, restArgs[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cwd
might be the wrong path, if the child process was forked with a different working directory than the parent. The cwd option has always been confusing, let me see if I can figure out why we're using it in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is an issue with the cwd
option. It's quite ambiguous: It doesn't actually control the working directory of the invoked process (kind of surprising). It's mainly used for the script to be resolved and e.g. the tsconfig
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cspotcode I added a new commit (which might be considered a breaking change? to me it sounds like a fix):
- The
--cwd
option now also sets the working directory properly - The working directory is properly preserved when we transition into the child process for ESM.
- Added test scenarios for non-ESM
--cwd
and also for the scenario you outlined in this comment (fork cwd)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have to double-check. The cwd option has never been great, IMO, and as far as I remember, is only there for backwards compatibility to support some legacy use-cases.
It makes ts-node's logic pretend to be in that cwd but without actually changing the process's cwd. So it affects how ts-node's --cwdMode
locates the tsconfig, and affects paths in typechecking errors. --cwdMode
isn't even the default anymore because it was confusing and the newer default is better. Just goes to show, I don't love the cwd option.
As confusing as it may be, pretty sure we should not be changing it to affect process.cwd()
. But I'll have to double-check the old PRs where I cleaned up this logic; it was a couple years ago. I'm not at my computer; will check later today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible for projects to intentionally launch child processes in a different cwd, right? Either by passing the cwd directly to fork
options, or by changing the parent's working directory before forking. We'll need to support that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Forking into different working directories should still work. I added a test for this in ESM, but should also make sure with CJS.
We are using the cwd option in a couple of places, and naturally I also expected it to change the initial working directory, similar to how e.g. yarn --cwd
works. This allows me to directly call TS node scripts e.g. in the package.json
without needing more code or an extra script.
I assume folks also expected the same here, because the option works that way in e.g. yarn and the docs kind of describe it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed this, and added a test. Regardless of what you decide, the additional tests can be helpful. Happy to revert the cwd changes. I believe the semantics should be that way, or the option should be removed altogether (even though it would actually be useful that way -- like yarn --cwd
). Proposed semantics:
--cwd
changes the working directory before the user entry-point is invoked- The working directory flag is omitted when forking into child processes. It's up to the user then/or defaults to
process.cwd()
(which might be the initial--cwd
if specified)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, I think removal makes most sense to reduce complexity. The option predates my involvement with the project, it's confusing, and node
nor tsc
has an equivalent --cwd
option. There's also programmatic registration scenarios, where ts-node is being registered into an existing process and probably shouldn't be changing the cwd. Could make the cwd option a CLI-only thing, but it's really just messy feature creep.
I make frequent use of git -C
so I understand the appeal of --cwd
in general. But in this case, it feels like a maintenance liability to keep it around.
Can't remove till the next major, though, cuz it'd be a breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it definitely makes some things more difficult. Not too bad IMO though.
Anyway though, I agree that the removal is a good consideration. What are the next steps for this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only thing left is: entrypoint resolution in the first child should use --cwd
, vs subsequent forked children should use process.cwd()
.
The first child process spawned by our --esm
flag needs to obey the CLI's --cwd
option for backwards compatibility. (ugh annoying) So if you are in /project
and you run ts-node --cwd foo bar.ts
then the entrypoint needs to be resolved to /project/foo/bar.ts
Subsequently forked children should behave like vanilla node and resolve their entrypoint relative to process.cwd()
If you process.chdir('/home/user')
and do fork('script.ts')
, the child's entrypoint needs to resolve to /home/user/script.ts
.
I think this means we need to pass a boolean to child processes telling them if they are the first child spawned by --esm
, or they are a subsequent child spawned by fork()
. child-entrypoint.ts
will need to modify the execArgv
array to pass this flag to subsequent children. One method is a new --is-first-child=true|false
flag, another method is to modify and re-compress the brotli payload.
Maybe there are other benefits to re-compressing the brotli payload: #1831
I'm not suggesting we implement #1831 in this PR, more thinking that re-compressing the brotli payload in this PR will have benefits for future work.
b10683d
to
65fb22d
Compare
src/bin.ts
Outdated
return callInChild(state); | ||
// Note: When transitioning into the child-process after `phase2`, | ||
// the updated working directory needs to be preserved. | ||
return callInChild(state, state.phase2Result.targetCwd); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This explicit cwd pass-through is not strictly needed since spawn would just use the working directory we switched to, but I think being explicit is helping with readability and also makes this safer
Currently, Node processes instantiated through the `--esm` flag result in a child process being created so that the ESM loader can be registered. This works fine and is reasonable. The child process approach to register ESM hooks currently prevents the NodeJS `fork` method from being used because the `execArgv` propagated into forked processes causes `ts-node` (which is also propagated as child exec script -- this is good because it allows nested type resolution to work) to always execute the original entry-point, causing potential infinite loops because the designated fork module script is not executed as expected. This commit fixes this by not encoding the entry-point information into the state that is captured as part of the `execArgv`. Instead the entry-point information is always retrieved from the parsed rest command line arguments in the final stage (`phase4`). Fixes TypeStrong#1812.
…hild process Currently the `--esm` option does not necessarily do what the documentation suggests. i.e. the script does not run as if the working directory is the specified directory. This commit fixes this, so that the option is useful for TSConfig resolution, as well as for controlling the script working directory. Also fixes that the CWD encoded in the bootstrap brotli state for the ESM child process messes with the entry-point resolution, if e.g. the entry-point in `child_process.fork` is relative to a specified `cwd`.
65fb22d
to
d78e437
Compare
@cspotcode Ah thanks for helping out here. I was actually working quite some time today and yesterday on always using the brotli payload (even without esm). I have something for this pretty soon, but I'm not sure if we want to land it as part of this PR. It's a breaking change as you realized as well because it will always use the same preloaded project. I think that is worth doing though. |
do you want me to push that here or how should we proceed? The cwd option will behave like before (like you suggested) |
Oh that's great, probably it should be a separate PR tied to #1831 to be shipped in the next major. I think as soon as this PR lands and ships, I should switch gears fully to powering through all the breaking changes for the next major. That'll also drop support for ancient node and TS versions, so it's simplify the development process. Sorry if I was stepping on your toes a bit, TBH sometimes PRs get dropped when people get tired of my notes about backwards-compat, so I just carry them over the finish line myself. :) My changes: I removed the sanitizeArgv logic in favor of #1834 and/or #1831 If the tests pass, then I think this is ready to merge and publish. But please do let me know if I missed anything. |
Oops, I forgot to update the tests since they use |
src/test/esm-loader.spec.ts
Outdated
}); | ||
|
||
test.suite( | ||
'with NodeNext TypeScript resolution and `.mts` extension', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this .mts
variant still required? I see that the previous test case used regexp matching to avoid import.meta.url
. But if we can avoid that issue via transpileOnly
, then I think this test case is identical to the previous one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely yes. Although these tests are still a little different:
- One loads the worker through an absolute path
- One loads the worker relatively
To avoid duplicated effort: I've started fixing the tests, and I'll push what I have soon. |
process.exitCode = 1; | ||
|
||
const projectDir = dirname(fileURLToPath(import.meta.url)); | ||
const workerProcess = fork(join(projectDir, 'worker.mts'), [], { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this test case use an absolute path for the worker's entrypoint? Is it testing something that hasn't occurred to me? I figure it should be same as the other tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See response above. One test, with .mts
I used import.meta.url
to construct an absolute path to the worker. The other tests were not using import.meta
(and couldn't --without transpile-only) and use a relative path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok cool, sounds like all the things we need to test are:
- entrypoint script runs with correct
process.cwd()
fork()
entrypoint resolves with absolute pathfork()
entrypoint resolves with relative path relative toprocess.cwd()
, not parent'sts-node --cwd
flag
...for both CJS and ESM, with 3. disabled in the non-ESM case since that bug will be fixed by other tickets.
I'll update the PR now.
… resolution but not process.cwd()
disable non --esm test with comment about known bug and link to tickets make tests set cwd for fork() call, to be sure it is respected and not overridden by --cwd
…nding import.meta.url syntax
…ion in code review)
@devversion I think this is ready to merge and publish. I removed the As soon as this is merged, I can publish 10.9.0 (https://github.com/TypeStrong/ts-node/milestone/16), and then I think I can start merging breaking changes for 11.0.0 (will combine the https://github.com/TypeStrong/ts-node/milestone/13 and https://github.com/TypeStrong/ts-node/milestone/17 milestones) |
@cspotcode Thanks for fixing up the rest. I was working on it already but you came first. Much appreciated. |
This PR contains the following updates: | Package | Type | Update | Change | |---|---|---|---| | [ts-node](https://typestrong.org/ts-node) ([source](https://github.com/TypeStrong/ts-node)) | devDependencies | minor | [`10.8.2` -> `10.9.1`](https://renovatebot.com/diffs/npm/ts-node/10.8.2/10.9.1) | --- ### Release Notes <details> <summary>TypeStrong/ts-node</summary> ### [`v10.9.1`](https://github.com/TypeStrong/ts-node/releases/tag/v10.9.1) [Compare Source](TypeStrong/ts-node@v10.9.0...v10.9.1) **Fixed** - Workaround nodejs bug introduced in 18.6.0 ([#​1838](TypeStrong/ts-node#1838)) [@​cspotcode](https://github.com/cspotcode) - Only affects projects on node >=18.6.0 using `--esm` - Older versions of node and projects without `--esm` are unaffected https://github.com/TypeStrong/ts-node/milestone/18?closed=1 ### [`v10.9.0`](https://github.com/TypeStrong/ts-node/releases/tag/v10.9.0) [Compare Source](TypeStrong/ts-node@v10.8.2...v10.9.0) **Added** - `--project` accepts path to a directory containing a `tsconfig.json` ([#​1829](TypeStrong/ts-node#1829), [#​1830](TypeStrong/ts-node#1830)) [@​cspotcode](https://github.com/cspotcode) - previously it required an explicit filename - Added helpful error message when swc version is too old to support our configuration ([#​1802](TypeStrong/ts-node#1802)) [@​cspotcode](https://github.com/cspotcode) - Added `experimentalTsImportSpecifiers` option which allows using voluntary `.ts` file extensions in import specifiers (undocumented except for [API docs](https://typestrong.org/ts-node/api/interfaces/CreateOptions.html#experimentalTsImportSpecifiers)) ([#​1815](TypeStrong/ts-node#1815)) [@​cspotcode](https://github.com/cspotcode) **Fixed** - Fixed bug where `child_process.fork()` would erroneously execute the parent's entrypoint script, not the intended child script ([#​1812](TypeStrong/ts-node#1812), [#​1814](TypeStrong/ts-node#1814)) [@​devversion](https://github.com/devversion) - Fixed support for jsx modes `"react-jsx"` and `"react-jsxdev"` in swc transpiler ([#​1800](TypeStrong/ts-node#1800), [#​1802](TypeStrong/ts-node#1802)) [@​cspotcode](https://github.com/cspotcode) - Fixed support for import assertions in swc transpiler ([#​1817](TypeStrong/ts-node#1817), [#​1802](TypeStrong/ts-node#1802)) [@​cspotcode](https://github.com/cspotcode) - Fixed bug where calling `repl.evalCode()` with code not ending in a newline would not update the typechecker accordingly ([#​1764](TypeStrong/ts-node#1764), [#​1824](TypeStrong/ts-node#1824)) [@​cspotcode](https://github.com/cspotcode) https://github.com/TypeStrong/ts-node/milestone/16?closed=1 </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, click this checkbox. --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzMi4xMTEuMSIsInVwZGF0ZWRJblZlciI6IjMyLjExMi4wIn0=--> Co-authored-by: cabr2-bot <cabr2.help@gmail.com> Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1461 Reviewed-by: Epsilon_02 <epsilon_02@noreply.codeberg.org> Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org> Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Currently, Node processes instantiated through the
--esm
flag resultin a child process being created so that the ESM loader can be
registered. This works fine and is reasonable.
The child process approach to register ESM hooks currently prevents
the NodeJS
fork
method from being used because theexecArgv
propagated into forked processes causes
ts-node
(which is alsopropagated as child exec script -- this is good because it allows nested
type resolution to work) to always execute the original entry-point,
causing potential infinite loops because the designated fork module
script is not executed as expected.
This commit fixes this by not encoding the entry-point information into
the state that is captured as part of the
execArgv
. Instead theentry-point information is always retrieved from the parsed rest command
line arguments in the final stage (
phase4
).Fixes #1812.