This repository has been archived by the owner on Apr 16, 2020. It is now read-only.

Entry points proposal spec and implementation #32

Merged
2 commits merged on Feb 14, 2019

guybedford
Contributor

@guybedford guybedford commented Feb 11, 2019

This implements a top-level --type flag for --experimental-modules with the following behaviours.

In addition, "type": "esm" in package.json is renamed to "type": "module", and a new .cjs extension is supported.

These behaviours are as defined in https://github.com/GeoffreyBooth/node-esm-entry-points-proposal.

  • node --experimental-modules ./x.unknown --type=module will always execute the top-level file as an ES module, unless it ends in .cjs, in which case an error is thrown.
  • node --experimental-modules ./x.unknown --type=commonjs will always execute the top-level file as a CommonJS module.
  • node --experimental-modules now supports --check, --print, -e and stdin input, all of which are compatible with --type.
  • New error types are defined to handle the failure cases on these new flags.

With this PR, we get both the package.json boundary as the module format hint and the ability to override that hint with an explicitly specified format.

As with all previous PRs there are two commits - one with just the spec/doc changes, and another with the implementation.

The following features in the entry points proposal are not yet implemented in this PR and should be implemented as follow-up work:

  • --type=auto for syntax detection
  • symlink handling at the top-level boundary
  • -m alias for --type=module

@guybedford guybedford changed the title Entry points proposal: top-level --type, -m flags, spec updates Entry points proposal spec and implementation Feb 11, 2019
@ljharb
Member

ljharb commented Feb 11, 2019

I think a --type option on node is excellent, since it can only ever map to one unit of source code (stdin, the repl, or a single file). However, can you clarify:

> With this PR, we get both the package.json boundary as the module format hint

@guybedford
Contributor Author

@ljharb glad to hear that! When no --type is specified, the default behaviour is to use the hint from the package boundary.
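As a rough illustration of "the hint from the package boundary", here is a minimal sketch that walks up from a directory to the nearest package.json and reads its "type" field. The helper name `readPackageTypeHint` is hypothetical, and stopping at the first package.json found (even when it has no "type" field) is an assumption about how the boundary is defined, not a description of the code in this PR.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper (not from this PR): return the "type" field of the
// nearest package.json at or above `startDir`, or undefined if none is found.
function readPackageTypeHint(startDir) {
  let dir = path.resolve(startDir);
  for (;;) {
    const candidate = path.join(dir, 'package.json');
    if (fs.existsSync(candidate)) {
      // Assumption: the nearest package.json is the boundary, so the search
      // stops here even when the file has no "type" field.
      const pkg = JSON.parse(fs.readFileSync(candidate, 'utf8'));
      return pkg.type;
    }
    const parent = path.dirname(dir);
    if (parent === dir) return undefined; // reached the filesystem root
    dir = parent;
  }
}
```

Calling `readPackageTypeHint(path.dirname(entryFile))` for an entry file would then yield "module", "commonjs", or undefined (no hint).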

@GeoffreyBooth
Member

> @ljharb glad to hear that! When no --type is specified, the default behaviour is to use the hint from the package boundary.

The file extension has the highest precedence. If the filename ends in .mjs or .cjs, that’s all we need: Node parses it as ESM or CommonJS respectively. Nothing can override that, and node --type=commonjs file.mjs (or the inverse) would throw.

If the file extension is ambiguous (.js) or missing (--eval etc.) then --type takes the next highest precedence.

If --type is unspecified, the package.json hint is used.

If there is no package.json hint, we parse as CommonJS for backward compatibility.
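The precedence order above can be sketched as a single function. This is only an illustration: `resolveEntryFormat` and its parameters are hypothetical names, the error is a plain `Error` rather than one of the new error types the PR defines, and every extension other than .mjs/.cjs is treated as ambiguous.

```javascript
const path = require('node:path');

// Hypothetical sketch of the precedence described above.
//   filename    - entry file, or undefined for --eval / stdin
//   typeFlag    - value of --type, or undefined if not passed
//   packageType - "type" from the nearest package.json, or undefined
function resolveEntryFormat(filename, typeFlag, packageType) {
  const ext = filename ? path.extname(filename) : '';

  // 1. .mjs and .cjs are never ambiguous; a contradicting --type throws.
  if (ext === '.mjs' || ext === '.cjs') {
    const format = ext === '.mjs' ? 'module' : 'commonjs';
    if (typeFlag !== undefined && typeFlag !== format) {
      throw new Error(
        `--type=${typeFlag} conflicts with the ${ext} extension of ${filename}`
      );
    }
    return format;
  }

  // 2. Ambiguous or missing extension: --type wins if given.
  if (typeFlag !== undefined) return typeFlag;

  // 3. Otherwise fall back to the package.json hint.
  if (packageType !== undefined) return packageType;

  // 4. No hint at all: CommonJS, for backward compatibility.
  return 'commonjs';
}
```

For example, `resolveEntryFormat('file.js', 'module', 'commonjs')` returns `'module'` under this sketch, which is exactly the flag-over-package.json case discussed later in the thread.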

@devsnek
Member

devsnek commented Feb 11, 2019

can it please be renamed to source-text-module so it's correct and we can add more types later? also, we use cjs instead of commonjs throughout the rest of the codebase as a goal key. it's fine if we change it but they should all be the same thing.

@GeoffreyBooth
Member

GeoffreyBooth commented Feb 11, 2019

> can it please be renamed to source-text-module so it's correct and we can add more types later?

No. The point is to match <script type="module">. Users already associate "module" with ES modules, both from <script type="module"> and the package.json "module" key that many users have already been using for years.

> also, we use cjs instead of commonjs throughout the rest of the codebase as a goal key. it's fine if we change it but they should all be the same thing.

We considered --type=cjs, but with the .cjs file extension we don’t want users to be confused that type is referring to a file extension (like cjs) as opposed to a module type. If we had --type=cjs then users might also expect --type=mjs to work, and node --type=mjs file.js is nonsensical. Within the Node codebase, sure, let’s rename it to commonjs everywhere if that’s fine with everyone, that might also help disambiguate from the file extension there too.

Again, Stage 4 is designated for UX review and feedback. What we name things now doesn’t matter, everything is temporary until we get broader exposure and feedback from a wider audience of users.

@devsnek
Member

devsnek commented Feb 11, 2019

> The point is to match....

kinda unfortunate design 🤷 i guess we can discuss it later though

> [cjs stuff]

like i said, its fine to be "commonjs", it just needs to be updated in all the other places too.

> --type

personal reminder for --entry-type

@bmeck
Member

bmeck commented Feb 11, 2019

What is the expected behavior of having colliding CLI flag and package.json?

package.json#type:commonjs
bin/entry.js
node -m bin/entry

The spec makes it look like package.json wins. The docs/this PR make it look like the CLI wins. I would heavily prefer either throwing on collision or not allowing the CLI to alter the format of a resource if it is already unambiguous.

@GeoffreyBooth
Member

> What is the expected behavior of having colliding CLI flag and package.json?

The spec says that the flag will “tell Node to parse as ESM entry points that would otherwise be ambiguous.” Hence the flag throws on .mjs or .cjs, since they’re never ambiguous. A package in general is ambiguous. It can always contain either CommonJS or ESM files, regardless of what gets put in package.json. A package with a package.json of {} can have .mjs files, or a package with a package.json of {"type": "module"} can have .cjs files. A package can be dual-mode, etc.

I would be wary of throwing in the example you listed above, because the user’s intent is clear. They need to tell Node the module type to use for the entry point either via a flag or package.json; they don’t necessarily need to set both. What would the error message even be for that case? “Error: bin/entry.js cannot be executed in module mode because its nearest parent package.json lacks "type": "module"”?

We can always add more error-checking later on to increase the cases under which we throw errors; we don’t need to throw for this case. Even though we could, it’s probably much simpler UX to just tell users that --type is always applicable for .js files, regardless of what package.json may say; especially now, since at least for a while there will be lots and lots of packages and projects out there with ESM in .js files and no "type": "module" in their package.jsons.

@GeoffreyBooth
Member

On further thought, I guess this is a case of user intent versus author intent. Currently author intent overrides user intent for file extensions but nothing else (I think). I could also see a UX case for user intent always taking priority.

@bmeck
Member

bmeck commented Feb 11, 2019

> The spec says that the flag will “tell Node to parse as ESM entry points that would otherwise be ambiguous.” Hence the flag throws on .mjs or .cjs, since they’re never ambiguous. A package in general is ambiguous. It can always contain either CommonJS or ESM files, regardless of what gets put in package.json. A package with a package.json of {} can have .mjs files, or a package with a package.json of {"type": "module"} can have .cjs files. A package can be dual-mode, etc.

I don't understand this statement. A package is unrelated to my question except that it is used as a configuration point. Only files have specific formats, and the configuration points in this PR seem to conflict about which format a file has. The question isn't whether we can have dual-mode packages, but rather what happens when a resource has conflicting configuration.

> I would be wary of throwing in the example you listed above, because the user’s intent is clear. They need to tell Node the module type to use for the entry point either via a flag or package.json; they don’t necessarily need to set both. What would the error message even be for that case? “Error: bin/entry.js cannot be executed in module mode because its nearest parent package.json lacks "type": "module"”?

I don't understand this either. The user has two configuration points, the CLI and package.json, and they conflict in my example. Since we have this conflict, why not just state that the conflict itself is an error, "Cannot treat bin/entry.js as ESM via --type because a package.json has marked it as CJS" or somesuch? That would cause the user to resolve the conflict.

To clarify, the design of this PR currently means that files do not have a well-defined format statically, because the format can change depending on how they are consumed. Importing bin/entry.js loads it as CJS, but you can use the CLI to make it load as ESM. Therefore, given the behaviors of this PR, we have no way of knowing how bin/entry.js is interpreted, because the conflict creates an "it depends" scenario. Once bin/entry.js has been loaded, it will permanently be in one format for the purposes of the Module Map. That adds a wrinkle: even though importing it would normally give a CJS format, if it is loaded via the CLI as in the example above it will always be loaded as ESM, even though we are using import:

node --type=module bin/entry.js
// bin/entry.js
// package.json denotes this file as CJS
// CLI denotes this file as ESM

// works if loaded via CLI with --type=module
// fails if loaded via static information from package.json
import('./entry.js');
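
To make the Module Map point concrete, here is a toy sketch (not Node's actual loader; `ToyModuleMap` is a hypothetical name) in which the first format recorded for a URL is reused for every later load of that URL:

```javascript
// Toy module map: the format chosen on first load sticks for that URL.
class ToyModuleMap {
  constructor() {
    this.entries = new Map();
  }

  // `decideFormat` is only consulted the first time a URL is seen.
  load(url, decideFormat) {
    if (!this.entries.has(url)) {
      this.entries.set(url, decideFormat());
    }
    return this.entries.get(url);
  }
}

const map = new ToyModuleMap();
const url = 'file:///pkg/bin/entry.js';

// Entry point loaded through the CLI with --type=module.
const viaCli = map.load(url, () => 'module');

// A later import() would statically resolve to CommonJS via package.json,
// but the cached entry is returned instead.
const viaImport = map.load(url, () => 'commonjs');

console.log(viaCli, viaImport); // prints "module module"
```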

> We can always add more error-checking later on to increase the cases under which we throw errors; we don’t need to throw for this case. Even though we could, it’s probably much simpler UX to just tell users that --type is always applicable for .js files, regardless of what package.json may say; especially now, since at least for a while there will be lots and lots of packages and projects out there with ESM in .js files and no "type": "module" in their package.jsons.

We don't need to do anything really since we can mandate any design decision we can agree upon and we can write up rationalizations afterwards. The concern I have is that we have multiple configuration points and they are contradicting each other.
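
The "throw on collision" option raised in this comment could look roughly like the sketch below. The function name `assertNoFormatConflict` is hypothetical, the message is adapted from the wording suggested above, and only an explicit package.json "type" that differs from the flag is treated as a conflict (a missing field is assumed not to conflict, which is a design choice, not something the PR specifies).

```javascript
// Short labels used only for the error message.
const FORMAT_LABEL = { module: 'ESM', commonjs: 'CJS' };

// Hypothetical check: reject a --type flag that contradicts an explicit
// package.json "type" for the same file.
function assertNoFormatConflict(file, typeFlag, packageType) {
  // Assumption: with only one configuration point set there is no conflict.
  if (typeFlag === undefined || packageType === undefined) return;

  if (typeFlag !== packageType) {
    throw new Error(
      `Cannot treat ${file} as ${FORMAT_LABEL[typeFlag]} via --type because ` +
      `a package.json has marked it as ${FORMAT_LABEL[packageType]}`
    );
  }
}
```

With the example from this thread, `assertNoFormatConflict('bin/entry.js', 'module', 'commonjs')` throws, while omitting either argument passes.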

@GeoffreyBooth
Member

@bmeck Sure, we can error if the package scope and --type are in conflict. Perhaps that makes the most sense.

For the purposes of development, though, I think that that can be a separate PR, if @guybedford doesn’t have time to add that before this week’s meeting.

@bmeck
Member

bmeck commented Feb 11, 2019

Seems fine as a diff PR

@GeoffreyBooth
Member

> Seems fine as a diff PR

The other remaining thing we need to work out is how exactly dual-mode packages will work, since we would want to throw only if the flag conflicts with either way that a dual-mode package allows its files to be run. So maybe this error handling can be part of figuring out the details of how dual-mode packages can be specified by users and run by Node.

Member

@bmeck bmeck left a comment

LGTM

Member

@jdalton jdalton left a comment

LGTM

Contributor

@MylesBorins MylesBorins left a comment

Rubber stamp LGTM

@MylesBorins
Contributor

MylesBorins commented Feb 13, 2019

please run CI before landing and make commit messages follow upstream guidelines

@SMotaal

SMotaal commented Feb 13, 2019

Finally… this is an amazing step in my books 🙏


@SMotaal SMotaal left a comment

LGTM

Alhadis added a commit to file-icons/atom that referenced this pull request Feb 14, 2019
@GeoffreyBooth
Member

☝️ someone’s following along closely

@guybedford
Contributor Author

@guybedford guybedford merged commit 35127a5 into nodejs:modules-lkgr Feb 14, 2019
@guybedford
Contributor Author

CI all green, merged.

@thysultan

How would libraries (in node_modules) provide a path to the module, considering that "pkg.main" already points to, for example, a "umd" source file? Will there be a "pkg.src" field in symmetry with <script type=module src=url>?

@GeoffreyBooth
Member

GeoffreyBooth commented Feb 15, 2019

> How would libraries (in node_modules) provide a path to the module, considering that "pkg.main" already points

Currently in this implementation main points to the entry point for the package whether it's ESM or CommonJS. This precludes dual ESM/CommonJS packages, though, so we've been discussing creating a new field to handle defining separate entry points for each. That's another proposal yet to come.

@ljharb
Member

ljharb commented Feb 15, 2019

It doesn't preclude it if we do extension look up - you'd omit the extension (which is a reigning best practice anyways with cjs) and then have one extension for ESM and another for CJS.
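
A minimal sketch of that extension lookup, assuming an extensionless "main" and one set of candidate extensions per mode. The name `resolveMainByMode` and the exact extension lists are assumptions for illustration; `exists` is injected so the sketch never touches the filesystem.

```javascript
// Assumed candidate extensions per mode, tried in order.
const EXTENSIONS_BY_MODE = {
  module: ['.mjs'],
  commonjs: ['.cjs', '.js'],
};

// Hypothetical resolver: pick the entry file for an extensionless "main".
// `exists` is a predicate telling whether a candidate path is present.
function resolveMainByMode(main, mode, exists) {
  for (const ext of EXTENSIONS_BY_MODE[mode]) {
    const candidate = main + ext;
    if (exists(candidate)) return candidate;
  }
  return undefined; // nothing matched for this mode
}

// Usage with a package that ships both builds side by side.
const files = new Set(['./index.js', './index.mjs']);
const has = (file) => files.has(file);
console.log(resolveMainByMode('./index', 'module', has));   // prints "./index.mjs"
console.log(resolveMainByMode('./index', 'commonjs', has)); // prints "./index.js"
```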

@Alhadis
Contributor

Alhadis commented Feb 15, 2019

> you'd omit the extension (which is a reigning best practice anyways with cjs)

That's what I've been doing with Roff.js (and a few other projects published to NPM), with both .mjs and .js versions of the package entry point:

/* package.json */
{
	
	"main": "./pan-and-zoom",
	
}

With package contents like this:

├── Makefile
├── README.md
├── package.json
├── pan-and-zoom.js
└── pan-and-zoom.mjs

Are you saying this will stop working at some point?

@GeoffreyBooth
Member

Automatic extension resolution contradicts our goal of browser equivalence, so I assume it won't be enabled by default; though I expect it would be possible via opt-in configuration or a package-level loader.

In ESM mode .js files are treated as ESM, so at the very least it would be confusing to rely on extensions for disambiguation. A better UX is probably a way to explicitly define each entry point. Such a new field would also allow specifications of other types of entry points, like for browsers (see the commonly used browser field today) and future potential package types.

