Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: add support for nodeEntryPointConfig.loaders in package.json #43973

Closed
wants to merge 2 commits into from

Conversation

aduh95
Copy link
Contributor

@aduh95 aduh95 commented Jul 24, 2022

Currently, starting a .mjs or .cjs file allow to skip the lookup and the parsing of a package.json file, I'm reluctant to change that but I'm curious what others think.

I had to make the default resolve function sync – but we don't use anything async inside anyway, I assume it's fine. If we move the loaders off-thread (or even to make the loading of loaders itself go through non-default loaders), we might need to think of an async way of resolving those loaders, not sure.

Opening as draft because it's just at the PoC stage, and I would like to get some feedback on the initial implementation before going further.

/cc @nodejs/loaders

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/modules

@nodejs-github-bot nodejs-github-bot added esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run. labels Jul 24, 2022
@guybedford
Copy link
Contributor

Thanks for picking this problem up. Agreed having it not apply to .mjs and .cjs would be odd, and that needs to be addressed.

I do still think there might be confusion in the package.json if it will apply when loading a dependency. I tend to think a separate Node.js environment configuration manifest might make more sense.

Another possibility could be a special comment in the entry point file only. Similarly to what we have in the test files. That would then also solve the issue of standalone executions. That Node.js itself needs it for tests demonstrates it's a relatively useful feature.

@GeoffreyBooth
Copy link
Member

Currently, starting a .mjs or .cjs file allow to skip the lookup and the parsing of a package.json file, I’m reluctant to change that but I’m curious what others think.

Yes I think we need to change this. I assume the prior behavior was for performance? Doing a single stat (or stat lookup chain) for the entrypoint seems like a very minor hit.

I had to make the default resolve function sync – but we don’t use anything async inside anyway, I assume it’s fine.

Not opposed, but what made this necessary? This makes me think we should land the “move off thread” PR before we tackle this part.

Opening as draft because it’s just at the PoC stage, and I would like to get some feedback on the initial implementation before going further.

I feel like we should support all of Node’s configuration (or at least, as much of it as possible) rather than just a field for loaders. Within the configuration file we also need a way to define different configurations per environment, like production vs development configuration (perhaps aligning with --conditions?) since it’s very likely that users will have different loaders and other settings between prod and dev.

@GeoffreyBooth GeoffreyBooth added the loaders Issues and PRs related to ES module loaders label Jul 25, 2022
@aduh95
Copy link
Contributor Author

aduh95 commented Jul 25, 2022

@guybedford

I do still think there might be confusion in the package.json if it will apply when loading a dependency.

Note that the package.json field uses the wording "node entry point" to try to avoid that confusion. Do you think that's good enough?

I tend to think a separate Node.js environment configuration manifest might make more sense.

One reason to use package.json is that we have to parse it anyway, and it seems easier to explain that it follows the same rules as "type" field if it's in the same file.

Another possibility could be a special comment in the entry point file only. Similarly to what we have in the test files. That would then also solve the issue of standalone executions. That Node.js itself needs it for tests demonstrates it's a relatively useful feature.

That's smart, that would not work for entry point files that do not follow the same comment convention as JS though (e.g. a WASM file).


@GeoffreyBooth

I feel like we should support all of Node’s configuration (or at least, as much of it as possible) rather than just a field for loaders.

To be clear, that's very much not a goal for this PR, I intend to focus on resolving the loaders use-case only. Adding gradually more options should be possible though, and can happen in follow up PRs.

Within the configuration file we also need a way to define different configurations per environment, like production vs development configuration (perhaps aligning with --conditions?) since it’s very likely that users will have different loaders and other settings between prod and dev.

It's already possible to do the following:

{
  "name": "pkg",
  "imports": {
    "#loader": {
      "development": "typescript-loader/with-type-checks",
      "default": "typescript-loader/no-type-checks"
    }
  },
  "nodeEntryPointConfig": {
    "loaders": ["#loader"]
  }
}

I'm tempted to say it's good enough, wdyt?

Yes I think we need to change this. I assume the prior behavior was for performance? Doing a single stat (or stat lookup chain) for the entrypoint seems like a very minor hit.

It's more than a stat, we also have to parse that JSON, so the time it takes depends on the size of the package.json. But yeah, that might be fine in the end, users can still opt-out of that by using a command-line loader.

@GeoffreyBooth
Copy link
Member

GeoffreyBooth commented Jul 25, 2022

I’m tempted to say it’s good enough, wdyt?

No, that only works for --loader and other flags that expect a file path as an argument. Once we support defining --inspect in configuration, and we want that changed by environment, how would that be handled?

I think this API needs to be designed to eventually support all of Node’s config options (or at least, all the ones that make sense to define in a config file; probably all the ones supported by NODE_OPTIONS). It also needs to support sets of configurations such as per environment. We don’t need to support all 100+ options right away, but the API needs to be designed such that all the others can be added without causing breaking changes.

I think this should also be an experimental API, possibly even needing its own flag to enable. There’s no versioning in package.json, so we need to be very careful what we start telling users that we support in there.

I also think that we need to look at what flags get parsed by C++ on startup, long before we get to module loading. I think users will consider this feature broken if it doesn’t support most of the options they expect (such as everything in NODE_OPTIONS) and so we might need to move the “find the controlling package.json for the entry point” logic to a much earlier part of the bootstrap than the module loader. We should probably figure that part out first before implementing a solution that relies on the module loader and eventually needs to get removed when we add support for the other flags.

@bnb
Copy link
Contributor

bnb commented Jul 28, 2022

I'm -1 on the specific name nodeEntryPointConfig as it's significantly too verbose.

This seems to be a part of a bigger/longer-term trend of Node.js begnning to use package.json for configuration more. If we're going down that path I'd strongly prefer we do something like node or nodeOptions or noderc in package.json and then having entrypoint with the rest of the API under that.

@zackschuster
Copy link

If we're going down that path I'd strongly prefer we do something like node or nodeOptions or noderc in package.json and then having entrypoint with the rest of the API under that.

i wonder what number of people would expect, based on one or more of those names, that this configuration object would support nested configurations like exports. if it did, that would allow you to specify different environment variables for different sections of the program. that either sounds to me like a really cool or a really terrible idea.

Copy link
Member

@JakobJingleheimer JakobJingleheimer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this moving!

Seconding Tierney: I think we should create a top-level field in package.json such as node and nest this and others within it following the same environment conditional conventions as packageJSON.exports:

{
  "node": {
    "development": {
      "entryPoint¹": {
        "loaders": [""]
      },

      "//eventually": "others like below",

      "imports": [""],
      "policy": "./policy.dev.json",
      "//policy": {
        "integrity": "",
        "manifest": "./policy.dev.json"
      }
    }
  }
}

This is already supported by many, many packages, and I find it works very well. So if it ain't broke.

Also, yes: we decided to "wait and see" about converting resolve to sync until after loaders are moved off-thread. So if that's needed, I think this should be blocked from landing til then.

--

¹ not sure whether entrypoint or entryPoint is more common/correct

lib/internal/modules/cjs/loader.js Show resolved Hide resolved
@aduh95
Copy link
Contributor Author

aduh95 commented Sep 15, 2022

I'm not sure what to do with this, I'm not happy with the solution I came up, and it doesn't look like any other solution make consensus so far. Also, I feel trying to come up with a JSON representation that pleases everyone of any value that could configure Node.js is a doomed project, it doesn't seem realistic to set this goal.

I'm thinking a better approach would be to use .env file (it's already a widely use syntax, users are more likely to understand them without much effort from us, there are a ton of information online about it already, it is useful for more than just node config for the end user, etc.), the problem is that we don't have a parser to parse the value and use them, and I assume this would need to be parsed before V8 has started to allow its configuration, therefor happen in C++ land, which is outside of my capacity.

Anyway closing this for now, if someone else to give it a shot, you'll be very welcome. If you are looking for a solution to do that and don't care about Windows support, create your own executable:

#!/bin/sh
exec node --loader '#ts-loader' "${@}"

@aduh95 aduh95 closed this Sep 15, 2022
@aduh95 aduh95 deleted the nodeEntryPointConfig.loaders branch September 15, 2022 12:18
@GeoffreyBooth
Copy link
Member

I’m thinking a better approach would be to use .env file

I’m leaning toward this as well. For comparison, Bun supports this approach:

bun automatically reads configuration from .env.local, .env.development and .env (in that order)

In our case, a .env file (or .env.local etc. file) would map to process.env, and so Node flags would be specified via NODE_OPTIONS='--whatever'. I feel like this should be relatively simple to implement, as it’s just breaking the string of the file into lines and then breaking the line into before/after the =, then adding the key and value to the environment. If we do this before the C++ code that reads NODE_OPTIONS, then I’d think we should be done?

We can always support .env and configuration files, as Bun does, so this doesn’t foreclose other solutions in the future.

@bnb
Copy link
Contributor

bnb commented Sep 17, 2022

I mean, if .env support is on the table, I'm wholly +1 to that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. loaders Issues and PRs related to ES module loaders needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants