Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behavior in module customization hooks / loader, for require #55808

Open
nomagick opened this issue Nov 10, 2024 · 5 comments
Open
Labels
loaders Issues and PRs related to ES module loaders

Comments

@nomagick
Copy link

nomagick commented Nov 10, 2024

It seems I am one of the early adopters of module hooks.
I liked the idea of the module customization hooks, and tried to use it in my project.
However, I believe the current behavior of require handling in the esm loader is buggy.

Relevant source code, here

const requireFn = function require(specifier) {
let importAttributes = kEmptyObject;
if (!StringPrototypeStartsWith(specifier, 'node:') && !BuiltinModule.normalizeRequirableId(specifier)) {
// TODO: do not depend on the monkey-patchable CJS loader here.
const path = CJSModule._resolveFilename(specifier, module);
switch (extname(path)) {
case '.json':
importAttributes = { __proto__: null, type: 'json' };
break;
case '.node':
return wrapModuleLoad(specifier, module);
default:
// fall through
}
specifier = `${pathToFileURL(path)}`;
}
const job = cascadedLoader.getModuleJobForRequireInImportedCJS(specifier, url, importAttributes);
job.runSync();
return cjsCache.get(job.url).exports;
};
setOwnProperty(requireFn, 'resolve', function resolve(specifier) {
if (!StringPrototypeStartsWith(specifier, 'node:')) {
const path = CJSModule._resolveFilename(specifier, module);
if (specifier !== path) {
specifier = `${pathToFileURL(path)}`;
}
}
const { url: resolvedURL } = cascadedLoader.resolveSync(specifier, url, kEmptyObject);
return urlToFilename(resolvedURL);
});

The resolving of the specifier is done in the main thread, using Module._resolveFilename, before being sent to the customization hooks.
The resolver hook is then receiving a resolved file:// URL, instead of the real specifier.

These are not expected according to the documentation of the module customization hooks.
And for my use case, it has made the hooks useless.

To make it clear there are two bugs:

  1. The cjs specifiers are being resolved in the wrong thread.
    The hooks, therefore, cannot really work, because if the "expected" file does not really exist on the hard drive. The cjs Module._resolveFilename will throw an error in the first place.
  2. The resolve hook is receiving a wrong specifier, if the first bug would not trigger.

The root of this issue might be the default resolver in customization hooks does not support resolving CJS modules.
An easy fix would be to move the Module._resolveFilename into the default resolver.
The loading of .node native add-ons may also get in the way.
But IMO it should also go through the hooks, and eventually be loaded as {type: 'commonjs', source: undefined, url: 'file://... .node'}

@nomagick
Copy link
Author

Maybe I should mention my use case here:

I'm trying to use the module hooks/loader with SEA, to distribute my project, in 1+n files.
Where the node.js binary will be injected with a custom hook/loader, and then load and run code in one or more ASAR archives.

This requires the hooks to resolve and load paths that do not really exist on the hard drive.

@GeoffreyBooth GeoffreyBooth added the loaders Issues and PRs related to ES module loaders label Nov 16, 2024
@GeoffreyBooth
Copy link
Member

@nodejs/loaders

@joyeecheung
Copy link
Member

I feel that the approach taken in #47999 is just broken and we should just go back to using Module._load to load import(cjs) instead of trying to re-invent a ESM-loader specific require and require.resolve. That approach just cannot cover all the corner cases and there will always be some holes (like what's mentioned in this issue, and random createRequire(), and the re-exports detection for the cjs-module-lexer). Going back means module.register() will not work for CJS, but then if it's already having all these edge cases in CJS, it's not really reliable anyway, and at least by going back we can make sure module.registerHooks() is always reliable.

@nomagick
Copy link
Author

@joyeecheung I generally liked the idea of having a dedicated loader thread.

I agree with you about reusing cjs loader code, but not Module._load directly. I believe the resolving can happen in the loader thread and module.register() can still work. It's just the code loading part, that needs to be left in the caller thread, similar to how internal modules work.

Mixing the cjs and module loader is probably inevitable because cjs and es modules may load each other in user code.

For this specific issue, it's just a bug-level thing, no need to overturn the whole approach.

@joyeecheung
Copy link
Member

joyeecheung commented Jan 27, 2025

For context the overturn has already been discussed before: I even have to push back against deprecating or just removing module.register() now when I was adding module.registerHooks() because the off-thread approach have many bugs and edge cases that have not been fixed and there's generally a lack of interest in fixing them. But I was just not in favor in removing the async hooks immediately due to not wanting to disrupt existing usage, not that I think it's a path worth pursuing further (I for one, am not interested in fixing the off-thread code because it just doesn't fit well internally IMO, and provides no good story for CJS monkey patching migration. I would not oppose against others trying to fix it, but you can probably see a lack of interest in fixing it from the radio silence of this issue).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
loaders Issues and PRs related to ES module loaders
Projects
None yet
Development

No branches or pull requests

3 participants