Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

module: change default resolver to not throw on unknown scheme #47824

Merged
143 changes: 86 additions & 57 deletions doc/api/esm.md
Original file line number Diff line number Diff line change
Expand Up @@ -143,8 +143,9 @@ There are three types of specifiers:
* _Absolute specifiers_ like `'file:///opt/nodejs/config.js'`. They refer
directly and explicitly to a full path.

Bare specifier resolutions are handled by the [Node.js module resolution
algorithm][]. All other specifier resolutions are always only resolved with
Bare specifier resolutions are handled by the [Node.js module
resolution and loading algorithm][].
All other specifier resolutions are always only resolved with
the standard relative [URL][] resolution semantics.

Like in CommonJS, module files within packages can be accessed by appending a
Expand Down Expand Up @@ -1029,28 +1030,6 @@ and there is no security.
// https-loader.mjs
import { get } from 'node:https';

export function resolve(specifier, context, nextResolve) {
const { parentURL = null } = context;

// Normally Node.js would error on specifiers starting with 'https://', so
// this hook intercepts them and converts them into absolute URLs to be
// passed along to the later hooks below.
if (specifier.startsWith('https://')) {
return {
shortCircuit: true,
url: specifier,
};
} else if (parentURL && parentURL.startsWith('https://')) {
return {
shortCircuit: true,
url: new URL(specifier, parentURL).href,
};
}

// Let Node.js handle all other specifiers.
return nextResolve(specifier);
}

export function load(url, context, nextLoad) {
// For JavaScript to be loaded over the network, we need to fetch and
// return it.
Expand Down Expand Up @@ -1091,9 +1070,7 @@ prints the current version of CoffeeScript per the module at the URL in
#### Transpiler loader

Sources that are in formats Node.js doesn't understand can be converted into
JavaScript using the [`load` hook][load hook]. Before that hook gets called,
however, a [`resolve` hook][resolve hook] needs to tell Node.js not to
throw an error on unknown file types.
JavaScript using the [`load` hook][load hook].

This is less performant than transpiling source files before running
Node.js; a transpiler loader should only be used for development and testing
Expand All @@ -1109,25 +1086,6 @@ import CoffeeScript from 'coffeescript';

const baseURL = pathToFileURL(`${cwd()}/`).href;

// CoffeeScript files end in .coffee, .litcoffee, or .coffee.md.
const extensionsRegex = /\.coffee$|\.litcoffee$|\.coffee\.md$/;

export function resolve(specifier, context, nextResolve) {
if (extensionsRegex.test(specifier)) {
const { parentURL = baseURL } = context;

// Node.js normally errors on unknown file extensions, so return a URL for
// specifiers ending in the CoffeeScript file extensions.
return {
shortCircuit: true,
url: new URL(specifier, parentURL).href,
};
}

// Let Node.js handle all other specifiers.
return nextResolve(specifier);
}

export async function load(url, context, nextLoad) {
if (extensionsRegex.test(url)) {
// Now that we patched resolve to let CoffeeScript URLs through, we need to
Expand Down Expand Up @@ -1220,27 +1178,99 @@ loaded from disk but before Node.js executes it; and so on for any `.coffee`,
`.litcoffee` or `.coffee.md` files referenced via `import` statements of any
loaded file.

## Resolution algorithm
#### "import map" loader

The previous two loaders defined `load` hooks. This is an example of a loader
that does its work via the `resolve` hook. This loader reads an
`import-map.json` file that specifies which specifiers to override to another
URL (this is a very simplistic implemenation of a small subset of the
"import maps" specification).

```js
// import-map-loader.js
import fs from 'node:fs/promises';

const { imports } = JSON.parse(await fs.readFile('import-map.json'));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const { imports } = JSON.parse(await fs.readFile('import-map.json'));
const { imports } = JSON.parse(await fs.readFile(new URL('./import-map.json', import.meta.url)));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import-map.json should be given by the user of the loader, so it should be taken from current working directory and not next to the source code of the loader. So I would prefer leaving it like this and not accepting your suggestion. Having said that, I don't feel strongly about it, as this is demo code, so just say the word and I'll commit the suggestion.

(obviously, cwd is naive and there should be an env variable or some such, but this is naive code for demo purpose only)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just make the example the esm.sh one then? That’s much simpler and involves only one file, so the overall example would also be shorter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you wrap it in path.resolve and/or add a comment? Without that, it’s a bit confusing IMO.

Why not just make the example the esm.sh one then? That’s much simpler and involves only one file, so the overall example would also be shorter.

We would have trouble with Windows compatibility, right?

Copy link
Member

@GeoffreyBooth GeoffreyBooth May 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just make the example the esm.sh one then? That’s much simpler and involves only one file, so the overall example would also be shorter.

We would have trouble with Windows compatibility, right?

That example I had in mind was to rewrite all bare specifiers like lodash to https://esm.sh/lodash, and that’s it. Add a comment that load would fail without --experimental-network-imports, or without chaining the example HTTPS loader, and call it a day. It should be like a five-line example I’d think.

It’s still a bit contrived as presumably you wouldn’t always want to load the latest version of every dependency, but for an example I think it gets the idea across and then real-world implementations can fill in details.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems unwise for node docs to bless an arbitrary site in the ecosystem by including it in examples.


export async function resolve(specifier, context, nextResolve) {
if (Object.hasOwn(imports, specifier)) {
return nextResolve(imports[specifier], context);
}

return nextResolve(specifier, context);
}
```

Let's assume we have these files:

```js
// main.js
import 'a-module';
```

```json
// import-map.json
{
"imports": {
"a-module": "./some-module.js"
}
}
```

```js
// some-module.js
console.log('some module!');
```

If you run `node --experimental-loader ./import-map-loader.js main.js`
the output will be `some module!`.

## Resolution and loading algorithm

### Features

The resolver has the following properties:
The default resolver has the following properties:

* FileURL-based resolution as is used by ES modules
* Support for builtin module loading
* Relative and absolute URL resolution
* No default extensions
* No folder mains
* Bare specifier package resolution lookup through node\_modules
* Does not fail on unknown extensions or protocols
guybedford marked this conversation as resolved.
Show resolved Hide resolved
* Can optionally provide a hint of the format to the loading phase

The default loader has the following properties

### Resolver algorithm
* Support for builtin module loading via `node:` URLs
* Support for "inline" module loading via `data:` URLs
* Support for `file:` module loading
* Fails on any other URL protocol
* Fails on unknown extensions for `file:` loading
(supports only `.cjs`, `.js`, and `.mjs`)

### Resolution algorithm

The algorithm to load an ES module specifier is given through the
**ESM\_RESOLVE** method below. It returns the resolved URL for a
module specifier relative to a parentURL.

The resolution algorithm determines the full resolved URL for a module
load, along with its suggested module format. The resolution algorithm
does not determine whether the resolved URL protocol can be loaded,
or whether the file extensions are permitted, instead these validations
are applied by Node.js during the load phase
(for example, if it was asked to load a URL that has a protocol that is
not `file:`, `data:`, `node:`, or if `--experimental-network-imports`
is enabled, `https:`).

The algorithm also tries to determine the format of the file based
on the extension (see `ESM_FILE_FORMAT` algorithm below). If it does
not recognize the file extension (eg if it is not `.mjs`, `.cjs`, or
`.json`), then a format of `undefined` is returned,
which will throw during the load phase.

The algorithm to determine the module format of a resolved URL is
provided by **ESM\_FORMAT**, which returns the unique module
provided by **ESM\_FILE\_FORMAT**, which returns the unique module
format for any file. The _"module"_ format is returned for an ECMAScript
Module, while the _"commonjs"_ format is used to indicate loading through the
legacy CommonJS loader. Additional formats such as _"addon"_ can be extended in
Expand All @@ -1267,7 +1297,7 @@ The resolver can throw the following errors:
* _Unsupported Directory Import_: The resolved path corresponds to a directory,
which is not a supported target for module imports.

### Resolver Algorithm Specification
### Resolution Algorithm Specification

**ESM\_RESOLVE**(_specifier_, _parentURL_)

Expand Down Expand Up @@ -1301,7 +1331,7 @@ The resolver can throw the following errors:
> 8. Otherwise,
> 1. Set _format_ the module format of the content type associated with the
> URL _resolved_.
> 9. Load _resolved_ as module format, _format_.
> 9. Return _format_ and _resolved_ to the loading phase

**PACKAGE\_RESOLVE**(_packageSpecifier_, _parentURL_)

Expand Down Expand Up @@ -1506,9 +1536,9 @@ _isImports_, _conditions_)
> 7. If _pjson?.type_ exists and is _"module"_, then
> 1. If _url_ ends in _".js"_, then
> 1. Return _"module"_.
> 2. Throw an _Unsupported File Extension_ error.
> 2. Return **undefined**.
> 8. Otherwise,
> 1. Throw an _Unsupported File Extension_ error.
> 1. Return **undefined**.

**LOOKUP\_PACKAGE\_SCOPE**(_url_)

Expand Down Expand Up @@ -1552,7 +1582,7 @@ for ESM specifiers is [commonjs-extension-resolution-loader][].
[Import Assertions proposal]: https://github.com/tc39/proposal-import-assertions
[JSON modules]: #json-modules
[Loaders API]: #loaders
[Node.js Module Resolution Algorithm]: #resolver-algorithm-specification
[Node.js Module Resolution And Loading Algorithm]: #resolution-algorithm-specification
[Terminology]: #terminology
[URL]: https://url.spec.whatwg.org/
[`"exports"`]: packages.md#exports
Expand Down Expand Up @@ -1581,7 +1611,6 @@ for ESM specifiers is [commonjs-extension-resolution-loader][].
[custom https loader]: #https-loader
[load hook]: #loadurl-context-nextload
[percent-encoded]: url.md#percent-encoding-in-urls
[resolve hook]: #resolvespecifier-context-nextresolve
[special scheme]: https://url.spec.whatwg.org/#special-scheme
[status code]: process.md#exit-codes
[the official standard format]: https://tc39.github.io/ecma262/#sec-modules
Expand Down
31 changes: 31 additions & 0 deletions lib/internal/modules/esm/load.js
Original file line number Diff line number Diff line change
Expand Up @@ -79,6 +79,8 @@ async function defaultLoad(url, context = kEmptyObject) {
source,
} = context;

throwIfUnsupportedURLScheme(new URL(url), experimentalNetworkImports);

if (format == null) {
format = await defaultGetFormat(url, context);
}
Expand All @@ -102,6 +104,35 @@ async function defaultLoad(url, context = kEmptyObject) {
};
}

/**
* throws an error if the protocol is not one of the protocols
* that can be loaded in the default loader
* @param {URL} parsed
* @param {boolean} experimentalNetworkImports
*/
function throwIfUnsupportedURLScheme(parsed, experimentalNetworkImports) {
// Avoid accessing the `protocol` property due to the lazy getters.
const protocol = parsed?.protocol;
if (
protocol &&
protocol !== 'file:' &&
protocol !== 'data:' &&
protocol !== 'node:' &&
(
!experimentalNetworkImports ||
(
protocol !== 'https:' &&
protocol !== 'http:'
)
)
) {
const schemes = ['file', 'data', 'node'];
if (experimentalNetworkImports) {
ArrayPrototypePush(schemes, 'https', 'http');
}
throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(parsed, schemes);
}
}

/**
* For a falsy `format` returned from `load`, throw an error.
Expand Down
36 changes: 0 additions & 36 deletions lib/internal/modules/esm/resolve.js
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
const {
ArrayIsArray,
ArrayPrototypeJoin,
ArrayPrototypePush,
ArrayPrototypeShift,
JSONStringify,
ObjectGetOwnPropertyNames,
Expand Down Expand Up @@ -51,7 +50,6 @@ const {
ERR_PACKAGE_PATH_NOT_EXPORTED,
ERR_UNSUPPORTED_DIR_IMPORT,
ERR_NETWORK_IMPORT_DISALLOWED,
ERR_UNSUPPORTED_ESM_URL_SCHEME,
} = require('internal/errors').codes;

const { Module: CJSModule } = require('internal/modules/cjs/loader');
Expand Down Expand Up @@ -941,37 +939,6 @@ function throwIfInvalidParentURL(parentURL) {
}
}

function throwIfUnsupportedURLProtocol(url) {
// Avoid accessing the `protocol` property due to the lazy getters.
const protocol = url.protocol;
if (protocol !== 'file:' && protocol !== 'data:' &&
protocol !== 'node:') {
throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(url);
}
}

function throwIfUnsupportedURLScheme(parsed, experimentalNetworkImports) {
// Avoid accessing the `protocol` property due to the lazy getters.
const protocol = parsed?.protocol;
if (
protocol &&
protocol !== 'file:' &&
protocol !== 'data:' &&
(
!experimentalNetworkImports ||
(
protocol !== 'https:' &&
protocol !== 'http:'
)
)
) {
const schemes = ['file', 'data'];
if (experimentalNetworkImports) {
ArrayPrototypePush(schemes, 'https', 'http');
}
throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(parsed, schemes);
}
}

function defaultResolve(specifier, context = {}) {
let { parentURL, conditions } = context;
Expand Down Expand Up @@ -1048,7 +1015,6 @@ function defaultResolve(specifier, context = {}) {
// This must come after checkIfDisallowedImport
if (parsed && parsed.protocol === 'node:') return { __proto__: null, url: specifier };

throwIfUnsupportedURLScheme(parsed, experimentalNetworkImports);

const isMain = parentURL === undefined;
if (isMain) {
Expand Down Expand Up @@ -1095,8 +1061,6 @@ function defaultResolve(specifier, context = {}) {
throw error;
}

throwIfUnsupportedURLProtocol(url);

return {
__proto__: null,
// Do NOT cast `url` to a string: that will work even when there are real
Expand Down
2 changes: 2 additions & 0 deletions test/es-module/test-esm-import-meta-resolve.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ assert.strictEqual(
code: 'ERR_INVALID_ARG_TYPE',
})
);
assert.strictEqual(import.meta.resolve('http://some-absolute/url'), 'http://some-absolute/url');
assert.strictEqual(import.meta.resolve('some://weird/protocol'), 'some://weird/protocol');
assert.strictEqual(import.meta.resolve('baz/', fixtures),
fixtures + 'node_modules/baz/');

Expand Down
Loading