Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements packageIterator #205

Merged
merged 1 commit into from
Jan 22, 2020
Merged

Conversation

arcanis
Copy link
Contributor

@arcanis arcanis commented Jan 3, 2020

Offers a fix for the problem described in #199 (comment)

This PR offers a way to alter the default package path generation. The default implementation still uses path.join('/.../node_modules', packageName) (which cannot work for PnP when using the link protocol for the reasons described here), but the PnP resolver (automatically configured thanks to #174) will directly return the path to the package regardless of the dependency type, shortcutting the default logic.

One big advantage of this diff is that it's semver-minor, which avoids having to upgrade all third-party consumers (which I really don't have the bandwidth to do 😅).

lib/async.js Outdated
};

var defaultPackageIterator = function defaultPackageIterator(x, start, getPackageCandidates, opts) {
return getPackageCandidates();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this is really confusing. I saw a function with the same name defined right before this function, which takes 3 arguments. It took me a while to realize that this isn't calling that function, but the one passed as a parameter.

Maybe the one on line 39 could be renamed to getPackageNodeModulesCandidates?

Copy link
Member

@ljharb ljharb Jan 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return getPackageCandidates();
return defaultGetPackageCandidates();

lib/async.js Outdated
};

var defaultPackageIterator = function defaultPackageIterator(x, start, getPackageCandidates, opts) {
return getPackageCandidates();
Copy link
Member

@ljharb ljharb Jan 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return getPackageCandidates();
return defaultGetPackageCandidates();

lib/async.js Outdated
return dirs;
};

var defaultPackageIterator = function defaultPackageIterator(x, start, getPackageCandidates, opts) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var defaultPackageIterator = function defaultPackageIterator(x, start, getPackageCandidates, opts) {
var defaultPackageIterator = function defaultPackageIterator(x, start, defaultGetPackageCandidates, opts) {

lib/async.js Outdated Show resolved Hide resolved
lib/sync.js Outdated Show resolved Hide resolved
lib/sync.js Outdated Show resolved Hide resolved
lib/sync.js Show resolved Hide resolved
readme.markdown Show resolved Hide resolved
@nicolo-ribaudo
Copy link
Contributor

nicolo-ribaudo commented Jan 5, 2020

@arcanis Is packageIterator meant to return the possible package folders or the possible resolved path?

i.e. for resolve("foo/bar") should it return "/.../node_modules/foo/bar" or "/.../node_modules/foo"? I am asking because this PR does the first thing, while https://github.com/yarnpkg/berry/blob/3e324d5383de1f4143c0afc4829c33bd1b8e5f50/packages/plugin-compat/extra/resolve/normalize-options.js#L22 does the second one.

nicolo-ribaudo added a commit to nicolo-ribaudo/yarn-berry that referenced this pull request Jan 5, 2020
lib/async.js Outdated Show resolved Hide resolved
lib/async.js Outdated Show resolved Hide resolved
lib/async.js Outdated Show resolved Hide resolved
lib/sync.js Outdated Show resolved Hide resolved
@arcanis
Copy link
Contributor Author

arcanis commented Jan 5, 2020

Some context on my last few commits:

  • About isDirectory, I found out it got added in [Refactor] check existence of node_modules #190 for performance reasons. Since it doesn't have semantic meaning, checking the existence of the parent directory is good enough to preserve the original intent, so that's what I did (this avoids having to make packageIterator sync or async depending on the caller).

  • Regarding @nicolo-ribaudo's question I initially meant to return the package folder only (which isn't the case atm). However, after a closer inspection of the current architecture I feel like this would require to do extra work in the code itself (to split the subpath) for little reason (since it's already done later in the resolution). So I think keeping the current behavior is preferable (ie @scope/foo/bar will produce /n_m/@scope/foo/bar, not /n_m/@scope/foo).

  • The isCore change simply removes an unreachable call in the sync codepath (it could only be reached if the request was both an absolute path and a core module, which isn't possible) and moves the isCore check in the async codepath to be done before trying the n_m resolution, similar to what happens in the sync codepath. Due to the way the check was done it doesn't change anything, except that packageIterator won't be called if the package is a core module.

lib/async.js Outdated
var file = path.join(dir, x);
loadAsFile(file, opts.package, onfile);
}
isDirectory(path.dirname(dir), function (err, status) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the extra path.dirname call? shouldn't the dirs list already have that done?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dirs may point to files, for example if you do resolve something like foo/package.json then dirs will be ['/n_m/foo/package.json'] (that makes dir a misnomer, do you want me to rename it into entry or similar?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i mean, yes, but in that case foo should be the entry that's a directory and package.json should be the entry that's not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's why I call dirname: since dir is /n_m/foo/package.json, the isDir check has to be done against its parent. Even if it was /n_m/foo (which would happen if the request was just foo), the right behavior would be to check against /n_m (since foo is allowed to be a file per the Node resolution rules), so the dirname would still stand.

@ljharb ljharb force-pushed the mael/package-iterator branch from 847d93a to 4d138d7 Compare January 6, 2020 07:23
@ljharb
Copy link
Member

ljharb commented Jan 6, 2020

This is a lot of change in one PR; would you mind opening up a separate PR with all the bugfixes and refactors, and associated tests, so we can get this semver-minor PR as small as possible?

@arcanis arcanis mentioned this pull request Jan 6, 2020
@ljharb ljharb force-pushed the mael/package-iterator branch 3 times, most recently from 99aa42e to c8a6050 Compare January 7, 2020 02:07
@@ -265,19 +274,18 @@ module.exports = function resolve(x, options, callback) {
if (dirs.length === 0) return cb(null, undefined);
var dir = dirs[0];

isDirectory(dir, isdir);
isDirectory(path.dirname(dir), isdir);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when is the dirname not going to be a directory? the intention here, i believe, is to see if dir points to a file or not. It seems like hoisting the path.join prevents that from being determined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dirname isn't a directory when it doesn't exist. From what I gather from #190 and #116, this code was only meant to speed up the resolution by checking whether the n_m folder existed before doing the file stat calls.

Checking whether the parent of the subpath (ie /n_m/foo for foo/hello) exists has the same behaviours: if foo/hello can be resolved, then /n_m/foo will necessarily be a directory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words:

request partial resolution isDirectory (before) isDirectory (after)
foo /n_m/foo /n_m /n_m
foo/bar /n_m/foo/bar /n_m /n_m/foo
@foo/bar /n_m/@foo/bar /n_m /n_m/@foo

The effect will be the same before and after: the two stat calls that the code would have made before #190 (taking the first line as an example, one to check whether /n_m/foo.js exists and another to check whether /n_m/foo/index.js exists) won't be necessary because resolve is able to find out that the directory supposed to contain them doesn't even exist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Do you think that this perf improvement could also be split out into a separate PR (or even just a separate commit prior to your addition)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR it's not a perf improvement, it's necessary to make my PR work with the changes brought by #190 (because with my PR we don't have access to n_m paths anymore in loadNodeModules - instead we only have /n_m/<subpath>). I guess I could revert it and re-apply it, but I'm not sure it would be much clearer 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mainly concerned about an unintentional breakage, and my hope would be to release this particular change as a patch prior to releasing this PR as a minor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #190 has only been released in the 2.0.0-next.0 branch, so for the purpose of the patch release the path.dirname thing can be removed entirely. Do you want me to open a separate PR against the 1.x branch?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#192 was the PR into the 1.x branch.

@@ -38,6 +38,14 @@ var maybeUnwrapSymlink = function maybeUnwrapSymlink(x, opts) {
return x;
};

var getPackageCandidates = function getPackageCandidates(x, start, opts) {
var dirs = nodeModulesPaths(start, opts, x);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function is the same in sync.js, maybe it could be extracted to a separate file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but then it's a separate file that has to be supported for the rest of v1.

Copy link
Member

@ljharb ljharb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully the backport into 1.x won't be too tricky.

@ljharb ljharb force-pushed the mael/package-iterator branch from c8a6050 to 0f698c6 Compare January 22, 2020 06:33
@ljharb ljharb merged commit 0f698c6 into browserify:master Jan 22, 2020
ljharb pushed a commit that referenced this pull request Jan 22, 2020
ljharb added a commit that referenced this pull request Jan 22, 2020
 - [New] `sync`'/`async`: Implement `packageIterator` (#205)
 - [Refactor] `sync`: add function name
 - [Refactor] remove useless `exports` assignment
 - [Dev Deps] update `eslint`, `@ljharb/eslint-config`, `tape`
ljharb added a commit that referenced this pull request Jan 22, 2020
v1.15.0

 - [New] `sync`'/`async`: Implement `packageIterator` (#205)
 - [Refactor] `sync`: add function name
 - [Refactor] remove useless `exports` assignment
 - [Dev Deps] update `eslint`, `@ljharb/eslint-config`, `tape`
ljharb added a commit that referenced this pull request Jan 22, 2020
Changes since v2.0.0-next.0:

 - [Breaking] add ESM wrapper to "exports"; change paths to match
 - [Breaking] add "exports" to `package.json`

Including all changes in v1.14.2 and v1.15.0:

 - [New] `sync`'/`async`: Implement `packageIterator` (#205)
 - [Fix] `sync`/`async`: Fixes isCore check (#206)
 - [Refactor] `sync`: add function name
 - [Refactor] remove useless `exports` assignment
 - [Dev Deps] update `eslint`, `@ljharb/eslint-config`, `tape`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants