esm: align sync and async load implementations #49152

Merged (1 commit) on Aug 20, 2023
31 changes: 20 additions & 11 deletions lib/internal/modules/esm/load.js
@@ -70,25 +70,30 @@ async function getSource(url, context) {
   return { __proto__: null, responseURL, source };
 }

+/**
+ * @param {URL} url URL to the module
+ * @param {ESModuleContext} context used to decorate error messages
+ * @returns {{ responseURL: string, source: string | BufferView }}
+ */
 function getSourceSync(url, context) {
Comment on lines +74 to 78
Member

Suggested change
- * @param {URL} url URL to the module
- * @param {ESModuleContext} context used to decorate error messages
- * @returns {{ responseURL: string, source: string | BufferView }}
- */
-function getSourceSync(url, context) {
+ * @param {URL} urlInstance URL to the module
+ * @param {ESModuleContext} context used to decorate error messages
+ * @returns {{ responseURL: string, source: string | BufferView }}
+ */
+function getSourceSync(urlInstance, context) {

Member

nit: I don't think that's necessary, because the doc comment already identifies it as type URL. Aside from mistakenly constructing a new instance, the difference likely wouldn't matter, because a URL is castable to a string.

That said, I do sometimes do similar things, like fooRgx, so I'm not opposed to it (just pointing out that info is already available).
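
To make the "castable to a string" point concrete, here is a minimal sketch that uses only public Node.js APIs. The temporary file and the variable names are made up for the demo and are not part of the PR.

```javascript
const { mkdtempSync, readFileSync, writeFileSync } = require('node:fs');
const { tmpdir } = require('node:os');
const { join } = require('node:path');
const { pathToFileURL } = require('node:url');

// Hypothetical module file, created only for this demo.
const dir = mkdtempSync(join(tmpdir(), 'esm-load-demo-'));
const file = join(dir, 'mod.mjs');
writeFileSync(file, 'export default 42;\n');

const urlInstance = pathToFileURL(file);
const urlString = String(urlInstance); // same value as urlInstance.href

// Passing a URL to `new URL()` stringifies it and parses it again,
// which yields an equal but distinct object.
const reparsed = new URL(urlInstance);
const sameHref = reparsed.href === urlInstance.href;
const sameObject = reparsed === urlInstance;

// readFileSync accepts a file: URL object directly.
const source = readFileSync(urlInstance, 'utf8');

console.log({ sameHref, sameObject, source });
```

Re-parsing is harmless here, but it is extra work, which is one reason to make it obvious whether a parameter is already a URL instance.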

Contributor Author

Another thing to consider is that the async implementation also uses url, not urlInstance, as the parameter name. Since the goal of this PR is to align both implementations, that seems like another reason not to take the suggestion here.

Member

I don't really care; feel free to disregard. I suggested it because I think I saw this called urlInstance somewhere else in the file, so we might want to standardize at some point.

-  const parsed = new URL(url);
-  const responseURL = url;
+  const { protocol, href } = url;
+  const responseURL = href;
   let source;
-  if (parsed.protocol === 'file:') {
-    source = readFileSync(parsed);
-  } else if (parsed.protocol === 'data:') {
-    const match = RegExpPrototypeExec(DATA_URL_PATTERN, parsed.pathname);
+  if (protocol === 'file:') {
+    source = readFileSync(url);
+  } else if (protocol === 'data:') {
+    const match = RegExpPrototypeExec(DATA_URL_PATTERN, url.pathname);
Member

Worth pulling pathname off of url on line 79 as well, in case it's a getter, so that it's triggered unconditionally?

Member

+1

Contributor Author

That seems wasteful. I'm not sure I understand what the upside would be.

Member

Example benefits:

  • a getter that throws will do so at the top of the method, before expensive stuff happens
  • algorithm changes won't be as observable - it would otherwise be part of the API of this to only do a Get of "pathname" in this branch, and to only do one total, so refactors could become unintentional breaking changes
  • it becomes much clearer what properties are accessed since they're all accessed once, statically, at the top of the method
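
The difference between the two access patterns can be sketched with a URL-like object whose accessors record every read. `makeTracedURL`, `conditionalAccess`, and `upFrontAccess` are hypothetical names for illustration only; nothing like them exists in `load.js`.

```javascript
// Hypothetical URL-like object whose accessors record each property read.
function makeTracedURL(href) {
  const real = new URL(href);
  const reads = [];
  const traced = {
    get protocol() { reads.push('protocol'); return real.protocol; },
    get href() { reads.push('href'); return real.href; },
    get pathname() { reads.push('pathname'); return real.pathname; },
  };
  return { traced, reads };
}

// Conditional access: `pathname` is only read in the data: branch.
function conditionalAccess(url) {
  const { protocol, href } = url;
  if (protocol === 'data:') return url.pathname;
  return href;
}

// Up-front access: every property is read once, whichever branch runs.
function upFrontAccess(url) {
  const { protocol, href, pathname } = url;
  if (protocol === 'data:') return pathname;
  return href;
}

const a = makeTracedURL('file:///tmp/mod.mjs');
conditionalAccess(a.traced);
console.log(a.reads); // [ 'protocol', 'href' ]

const b = makeTracedURL('file:///tmp/mod.mjs');
upFrontAccess(b.traced);
console.log(b.reads); // [ 'protocol', 'href', 'pathname' ]
```

With conditional access, the set of observable reads depends on the protocol; with up-front access it is the same for every input, at the cost of one extra read on the `file:` path.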

Contributor Author

If we really care about this, I think we should instead adapt the regex to take the href rather than the pathname.

> a getter that throws will do so at the top of the method, before expensive stuff happens

It's still the first operation after the destructuring in that function.

> algorithm changes won't be as observable - it would otherwise be part of the API of this to only do a Get of "pathname" in this branch, and to only do one total, so refactors could become unintentional breaking changes

You mean if we start needing pathname for other protocols? But the same case could be made for the other properties of URL: what if we end up needing search or origin, should we get them as well just in case?
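
As a rough sketch of what matching against href instead of pathname could look like: both patterns below are assumptions written for this example (`PATHNAME_PATTERN` is only meant to resemble a `data:` URL pattern, and `HREF_PATTERN` is a hypothetical variant), not the actual `DATA_URL_PATTERN` from `load.js`.

```javascript
// Assumed pattern for the part of a data: URL after the scheme.
const PATHNAME_PATTERN = /^[^/]+\/[^,;]+(?:[^,]*?)(;base64)?,([\s\S]*)$/;
// Hypothetical variant anchored on the full href instead.
const HREF_PATTERN = /^data:[^/]+\/[^,;]+(?:[^,]*?)(;base64)?,([\s\S]*)$/;

// For a simple data: URL, both approaches capture the same body.
const plain = new URL('data:text/javascript,export%20default%2042');
const viaPathname = PATHNAME_PATTERN.exec(plain.pathname);
const viaHref = HREF_PATTERN.exec(plain.href);
console.log(viaPathname[2] === viaHref[2]); // true

// href also carries the search and hash components, which pathname
// excludes, so the captured body differs once a fragment is present.
const withHash = new URL('data:text/javascript,export%20default%2042#x');
const hashViaPathname = PATHNAME_PATTERN.exec(withHash.pathname)[2];
const hashViaHref = HREF_PATTERN.exec(withHash.href)[2];
console.log(hashViaPathname); // export%20default%2042
console.log(hashViaHref); // export%20default%2042#x
```

An href-based pattern would therefore need to decide how to treat search and hash, which the pathname-based match sidesteps.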

Member @ljharb, Aug 14, 2023

yep (to be clear, i don't mean get them now just in case, i mean as soon as they're used once in the function, get them unconditionally at the top)

     if (!match) {
-      throw new ERR_INVALID_URL(url);
+      throw new ERR_INVALID_URL(responseURL);
     }
     const { 1: base64, 2: body } = match;
     source = BufferFrom(decodeURIComponent(body), base64 ? 'base64' : 'utf8');
   } else {
     const supportedSchemes = ['file', 'data'];
-    throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(parsed, supportedSchemes);
+    throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(url, supportedSchemes);
   }
   if (policy?.manifest) {
-    policy.manifest.assertIntegrity(parsed, source);
+    policy.manifest.assertIntegrity(url, source);
   }
   return { __proto__: null, responseURL, source };
 }
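
For readers who want to try the resulting control flow outside of Node.js internals, the following is a simplified, standalone approximation of the new `getSourceSync`. It is a sketch under several assumptions: the regular expression is a stand-in for `DATA_URL_PATTERN`, plain `TypeError`/`Error` replace the internal error classes, the policy check is omitted, and `getSourceSyncSketch` is a made-up name.

```javascript
const { readFileSync } = require('node:fs');

// Assumed stand-in for the internal DATA_URL_PATTERN.
const DATA_URL_PATTERN = /^[^/]+\/[^,;]+(?:[^,]*?)(;base64)?,([\s\S]*)$/;

/**
 * Simplified approximation of getSourceSync for file: and data: URLs.
 * @param {URL} url an already-parsed URL instance
 * @returns {{ responseURL: string, source: Buffer }}
 */
function getSourceSyncSketch(url) {
  const { protocol, href } = url;
  let source;
  if (protocol === 'file:') {
    source = readFileSync(url);
  } else if (protocol === 'data:') {
    const match = DATA_URL_PATTERN.exec(url.pathname);
    if (!match) {
      // The real code throws ERR_INVALID_URL with the href.
      throw new TypeError(`Invalid URL: ${href}`);
    }
    const { 1: base64, 2: body } = match;
    source = Buffer.from(decodeURIComponent(body), base64 ? 'base64' : 'utf8');
  } else {
    // The real code throws ERR_UNSUPPORTED_ESM_URL_SCHEME.
    throw new Error(`Unsupported URL scheme: ${protocol}`);
  }
  return { __proto__: null, responseURL: href, source };
}

// Usage: build a base64 data: URL and read it back.
const encoded = Buffer.from('export default 42;').toString('base64');
const dataURL = new URL(`data:text/javascript;base64,${encoded}`);
const result = getSourceSyncSketch(dataURL);
console.log(result.source.toString()); // export default 42;
```

Note that the function receives a `URL` instance and never re-parses it, and that `responseURL` is the string `href`, matching the types documented in the JSDoc added by this PR.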
@@ -159,14 +164,18 @@ function defaultLoadSync(url, context = kEmptyObject) {
     source,
   } = context;

-  format ??= defaultGetFormat(new URL(url), context);
+  const urlInstance = new URL(url);
+
+  throwIfUnsupportedURLScheme(urlInstance, false);
+
+  format ??= defaultGetFormat(urlInstance, context);

   validateAssertions(url, format, importAssertions);

   if (format === 'builtin') {
     source = null;
   } else if (source == null) {
-    ({ responseURL, source } = getSourceSync(url, context));
+    ({ responseURL, source } = getSourceSync(urlInstance, context));
   }

   return {
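
The shape of this second hunk (parse the URL once, reject unsupported schemes before doing anything else, then hand the same instance to every helper) can be sketched in isolation. Everything below is hypothetical: `assertSupportedScheme` only imitates the role of `throwIfUnsupportedURLScheme` with an assumed scheme list, and `getFormat`/`getSource` are recording stubs, not the real helpers.

```javascript
// Hypothetical stand-in for throwIfUnsupportedURLScheme; the allowed
// scheme list is an assumption made for this sketch.
function assertSupportedScheme(urlInstance) {
  const { protocol } = urlInstance;
  if (protocol !== 'file:' && protocol !== 'data:' && protocol !== 'node:') {
    throw new Error(`Unsupported URL scheme: ${protocol}`);
  }
}

// Stub helpers that record the order in which they are called.
const calls = [];
function getFormat(urlInstance) {
  calls.push('getFormat');
  return urlInstance.protocol === 'node:' ? 'builtin' : 'module';
}
function getSource(urlInstance) {
  calls.push('getSource');
  return `// source of ${urlInstance.href}`;
}

function loadSyncSketch(url, { format, source } = {}) {
  // Parse once and reuse the instance for every later step.
  const urlInstance = new URL(url);

  // Fail fast, before format detection or any file access.
  assertSupportedScheme(urlInstance);

  format ??= getFormat(urlInstance);

  if (format === 'builtin') {
    source = null;
  } else if (source == null) {
    source = getSource(urlInstance);
  }
  return { format, source };
}

const loaded = loadSyncSketch('file:///tmp/mod.mjs');
console.log(loaded.format, calls); // module [ 'getFormat', 'getSource' ]
```

Because the scheme check runs first, an unsupported URL is rejected without calling either helper.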