Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow Jest to preprocess non-root node_modules #607

Closed
modernserf opened this issue Sep 8, 2016 · 32 comments
Closed

Allow Jest to preprocess non-root node_modules #607

modernserf opened this issue Sep 8, 2016 · 32 comments
Milestone

Comments

@modernserf
Copy link

modernserf commented Sep 8, 2016

Update: @modernserf is working on a PR to this issue. Please don't submit other PRs.

Currently, create-react-app allows you to structure your app like this:

node_modules/
└─ react/
src/
├─ __tests__/
├─ index.js
└─ node_modules/
   └─ @myapp/
      ├─ foo/
      └─ bar/

So you can import 3rd party modules with import "react" or local modules with import "@myapp/foo". Webpack will preprocess the files in src/node_modules, es6 features (including imports) are converted with babel. I imagine this behavior is intentional, because typical babel-loader configurations ignore all node_modules directories.

However, although Webpack will preprocess these files through Babel, Jest will not; if your tests imports one of these local modules, Jest will choke on the un-transpiled import statements.

I think all one needs to do to enable this is add preprocessorIgnorePatterns: ["<rootDir>/node_modules"] to the Jest config -- this will continue to not preprocess regular node_modules, but will preprocess internal node_modules, so it should have no impact on existing apps.

Update: @modernserf is working on a PR to this issue. Please don't submit other PRs.

@gaearon
Copy link
Contributor

gaearon commented Sep 8, 2016

Webpack will preprocess the files in src/node_modules, es6 features (including imports) are converted with babel. I imagine this behavior is intentional, because typical babel-loader configurations ignore all node_modules directories.

This is not intentional, no. We include instead of excluding so that npm link still works. I didn’t however realize somebody would create another node_modules inside src.

Can you please explain your reasoning for doing it like this? Are you trying to avoid relative paths by doing so?

@modernserf
Copy link
Author

Avoiding relative paths is the immediate goal; the long-term goal is to allow apps to be structured more like monorepos.

@gaearon gaearon added this to the 1.0.0 milestone Sep 8, 2016
@gaearon
Copy link
Contributor

gaearon commented Sep 8, 2016

This is interesting. I’m not sure this is the approach we would take but we need to provide support for something like this.

@gaearon
Copy link
Contributor

gaearon commented Sep 30, 2016

This will be a part of #741.

@gaearon gaearon closed this as completed Sep 30, 2016
@cecilemuller
Copy link

cecilemuller commented Sep 30, 2016

It still would be so useful if Jest had the option to not ignore non-root node_modules instead of having to use a workaround (given projects that use local modules aren't likely to change structure just to use Jest).

@gaearon
Copy link
Contributor

gaearon commented Sep 30, 2016

Oh, to clarify, I’m happy to review a PR that adds preprocessorIgnorePatterns: ["<rootDir>/node_modules"] to our Jest configuration.

@cecilemuller
Copy link

cecilemuller commented Sep 30, 2016

The trouble is, testPathIgnorePatterns seems to add to the defaults list instead of replacing the value?

For example, even with the following config, tests insides /src/node_modules aren't called:

"jest": {
    "testPathDirs": ["<rootDir>/src"],
    "testPathIgnorePatterns": ["<rootDir>/node_modules"]
}

But that's an issue with Jest itself, not CRA.

@gaearon
Copy link
Contributor

gaearon commented Sep 30, 2016

Please file it in Jest repo, and we'll discuss it there.

@gaearon gaearon reopened this Nov 20, 2016
@gaearon
Copy link
Contributor

gaearon commented Nov 20, 2016

I've changed my mind. Let's just make src/node_modules work well and document it. People can always use symlinks to make it prettier.

@modernserf
Copy link
Author

I'll take this!

@ericclemmons
Copy link
Contributor

Update: @modernserf is working on a PR to this issue. Please don't submit other PRs.
...
@modernserf commented on Nov 21, 2016

Is this being solved in Jest or CRA?

@gaearon
Copy link
Contributor

gaearon commented Feb 4, 2017

I think these are loose threads here:

#1081 (comment)
#1081 (comment)
jestjs/jest#2318 (comment)
jestjs/jest#2796

@modernserf
Copy link
Author

To be perfectly clear: I do not know how to make this work without breaking Facebook's specific use cases. Even getting Jest to work as it is currently documented -- that is, with testPathIgnorePatterns not automatically including node_modules -- appears to break something in haste.

@Reanmachine
Copy link

Reanmachine commented Feb 6, 2017

@modernserf It really comes down to the general concept of Jest wanting to "just work" by default. The default behavior is going to be to ignore node_modules/ folders.

I could submit a PR to Jest to completely change that assumption and get testPathIgnorePatterns to work out of the box as expected but that changes the responsibility of the developer using Jest to having to specify node_modules/ in every project they create to keep it fast.

If we built another system on top of it, like say a whitelist configuration then what we have is a double negative, "ignore this ignore" so to speak, and it becomes complicated. That's why in my facebook/jest#2796 PR I made the case that having the developer make the conscious decision to abandon the default jest behavior and then rely on their own ignore list was probably better.

If the testPathIgnorePatterns is configured correctly then the performance of haste/jest will still be optimal because the _ignore predicate for the haste map will not look at / process anything from the root node modules and jest will still be fast.

Currently jest does support something we can use for create-react-app:

{
  "jest": {
    "haste": {
      "providesModuleNodeModules": [
        "@my-module",
        "@my-utils"
      ]
    }
  }
}

This works by setting up a whitelist for node_modules/(@my-module|@my-utils) that allows slices of node_modules/ to be "ignored ignored" (double negative) but this won't work in CRA because we need the pre-processing to run on the modules when running jest, simply allowing tests in these folders to run won't work.

I think the underlying problem is that Jest's project-level assumptions are kicking this whole concept in the teeth, and because that assumption & jest's mantra are so well entrenched I don't see this changing any time soon.

I imagine @cpojer will have issues with any PR that changes big assumptions about how jest views your project without large discussions and maybe even a major version bump.

@gaearon CRA is has some pre-processing set up when running jest I think to allow jest to test ES6 import/export files. What are our limitations in the react-scripts side for having that process based on jest configuration?

@gaearon
Copy link
Contributor

gaearon commented Feb 24, 2017

Do I understand correctly that jestjs/jest#2796 is where we’re stuck right now, and we need to get feedback on it?

@Reanmachine
Copy link

Reanmachine commented Feb 24, 2017

@gaearon

Basically the way I understand Jest is that they aim for safe defaults that should work mostly out of the box. Jest seems to assume that all node_modules/ are delivered dependencies and blacklists them by default.

See Relevant Jest Source

To get testPathIgnorePatterns to work as documented out of the box we have 2 approaches:

  1. Abandon the assumption that node_modules/ is automatically bad entirely so the testPathIgnorePatterns is working like a blacklist model
  2. Add the inverse of the testPathIgnorePatterns to the whitelist that lets certain groups through the block.

I didn't feel comfortable doing Option 1 because that is a huge breaking change that Jest authors would probably throw out in a heartbeat. And for Option 2 it feels like a recipe for disaster as we now have a negated black list whitelisting a black list.

This is why I opted for the retainAllFiles option to let the user (or react-scripts) decide to turn off the safe defaults because it was going to set up custom defaults with testPathIgnorePatterns. retainAllFiles is an existing behavior of jest-haste-map and all we really are changing is the ability to use that option via configuration.

It feels dirty and I haven't gotten any feedback from the jest team on that PR. (It's also out of date, i need to rebase it on the new master) but without architectural decisions from the Jest team I think we're stuck.

TL;DR - Yes, for the reasons above it looks like we're stuck.

@Reanmachine
Copy link

Well @gaearon we're back to square one on this one, understandably the jest team wasn't comfortable making the neccesary large changes to support this, additionally it seems they're not comfortable exposing jest-haste-map configuration to the package.json configuration to allow the user to configure the behavior of jest's processing surface.

This kind of puts a decent sized snag in local node_modules packages and probably means #1065 is dead in the water.

@Timer
Copy link
Contributor

Timer commented May 11, 2017

From my understanding Jest 19+ fixed this with testMatch.

@Reanmachine
Copy link

Reanmachine commented May 11, 2017

@Timer are you referring to jestjs/jest#3538 ? I'd have to test but in my initial investigations into this it was a matter of how jest's haste map builds up the list of test file candidates by actively ignoring node_modules folders referenced here: https://github.com/facebook/jest/blob/master/packages/jest-haste-map/src/index.js#L701-L711

It basically would actively ignore it because it resides inside a node_modules folder. Haste Map has the ability to disable this, but that configuration isn't exposed by jest itself (not passed from jest -> jest haste map).

The testMatch cannot ever match these tests because the virtual file system built with jest-haste-map will have excluded it before that check is made.

@cpojer
Copy link
Contributor

cpojer commented May 11, 2017

Not completely opposed to it, we can make it work, but all the PRs were outdated and didn't do all the things that were necessary. We are a bit resource constrained with Jest atm but it should be possible to make this change properly, if somebody is thorough enough.

@Reanmachine
Copy link

@cpojer I think what's not clear is the direction the Jest team would like to take on this. As I outlined in my PR, it has to do with changing assumptions of the jest-haste-map or exposing existing functionality to the jest user so they can configure the behavior as they need.

it should be possible to make this change properly

That's incredibly ambiguous, what is "proper" in the Jest team's eyes? Obviously a less hacky solution that passing retailAllFiles is ideal, but as I noted in my PR (which was outdated? I updated it less than a month ago and it sat with no comments since its inception) without changing assumptions or the configuration / process by-which jest-haste-map operates it cannot move forward.

I recall another PR on this issue that did something similar to that, but was rejected because the changes broke internal Facebook dependencies that the author had no way of taking into account.

I'm not saying that the retainAllFiles solution is the way I'd like to solve this, but it's the only way I can see of solving this without the Jest Team acknowledging the assumption and communicating what's open to change and what's not, how can a contributor go about designing a solution.

You say you're resource constrained, we get that. That's why we're here, willing to submit PRs (I've counted at least 3 so far) and help move this forward. Hell I've already moved on in my own project and accepted that this probably wont see the light of day, at least not for a LOOOOONG time (it's already gone through 3-4 major jest versions and a couple CRA versions without any traction or discussion being completely ignored).

But tell me this, why would we spend hours of time coming up with a solution, when there's a decent chance it's just going to be shot down, with no discussion, over changes to the project that aren't compatible to your unspoken goals, or your undisclosed dependencies.

This is why I proposed the retainAllFiles solution, it's there already, its unobtrusive, all the defaults stay the same as they always were, the changes wont break internal Facebook dependencies, and it would allow the already heavily discussed solution to the local-modules problem in CRA to be addressed and properly documented until the time that the Jest team has time to weigh in.

Is it the best? No
Is it prone to performance problems if not applied correctly? Possibly

You say you're open to making this change if someone is thorough enough, in what regards? In magically navigating all the unspoken issues the Jest team will have with changing the jest-haste-map's behavior regarding node_modules they're not willing / don't have time to have a discussion on? Or was this an issue of not having a "Complete" enough PR?

@cpojer
Copy link
Contributor

cpojer commented May 12, 2017

@Reanmachine hey! That's a bit strong. Please keep in mind that right at this moment, there is no official Jest team at Facebook (this is changing currently, yay!). It's just me spending all my free time coding and being burned out on it, a bit. Because I rewrote large parts of Jest recently, I went through the PR list to close older pull requests that haven't had activity in a while and had merge conflicts.

The problem with retainAllFiles is that it actually means that Jest will go and read and parse every single JS file in node_modules as well as retaining metadata about them. For project with large dependency trees like CRA, this might add seconds to the startup time.

For most people, this kind of slowdown is probably not acceptable and will make people want to not use Jest. This issue has also not moved much in the last three months, telling me that it doesn't seem like something that many people are running into.

To sum up the current state: the solution with retainAllFiles that we came up with clearly falls short as it makes Jest do much more work than necessary for all projects. If we'd like to seriously fix this, the solution probably has to go quite a bit deeper. Since I'm not really able to spend any time on this, the ideal solution would come from an open source contribution and it would not alter the performance characteristics of jest-haste-map significantly.

I'm wondering if the solution could be to use module directories and a separate folder name that clearly make a distinction between first-party and third-party code. Most tools (webpack, Jest) support a configuration for that.

@Reanmachine
Copy link

Reanmachine commented May 12, 2017

@cpojer Sorry if I came off strong, it's just really frustrating to try and help when the other end is often-times a black box. It's good to hear jest is getting a more serious commitment from FB in the future.

I suppose where the big disconnect lies is between the testing layer & the indexing layer. Correct me if I'm wrong but the purpose of the jest-haste-map is to represent the file system that jest works inside in a performant way. The disconnect in the configuration (and documentation) is that there's a lot of 1st-class configuration options around determining what things to use/ignore/etc. (Eg. testPathIgnorePatterns is set up to default to ["node_modules"]).

The problem is that the assumptions that jest-haste-map makes when it scans the filesystem is that node modules folders are never included in the first place, so that ignore pattern is mostly un-used because there'll never be a test in the virtual file system inside that folder to begin with.

All of the config options in jest seem to be centered around safe/smart defaults. You look at coveragePathIgnorePatterns, testPathIgnorePatterns, transformIgnorePatterns they all default to /node_modules/ and can be combined with <rootDir> in the configuration system for some explicit behavior.

I think the ideal way for the user would be that if testPathIgnorePatterns behaved as documented, but right now the assumption that jest-haste-map makes about ALL node_modules paths is what runs counter to this.

Right now we can configure the whitelist that overrides this behavior with jest.haste.providesModuleNodeModules but this is done on a module-by-module basis, and it's about configuring jest-haste-map not jest. This isn't very discoverable for the user, and isn't going to be realistic/performant for jest to scan & load module names that don't match it's blacklists into this whitelist.

I think that the goal would be to allow a jest user to go "hey, I only have external modules in frontend/node_modules and backend/node_modules so I'd like jest to treat only these folders as dangerous. (Without having to understand jest-haste-map's virtual filesystem and the difference between it and their testPathIgnorePatterns

We could accomplish this by using something similar to moduleDirectories, the problem with moduleDirectories itself is that it doesn't follow the conventions other configuration does, and wouldn't allow us to say ["<rootDir>/frontend/node_modules", "<rootDir>/backend/node_modules"] (etc...) to allow /frontend/src/node_modules to be treated like what it is --- a module folder for node, but not managed by npm necessarily.

What lies after this is making sure that jest can configure jest-haste-map's _ignore() method sufficiently to allow it to index the correct folders. This is where my frustration comes (and the strong nature of my last post).

This is all stuff I'm willing to tackle but I'm left with a dillema:

How do I go about modifying jest-haste-map to support that behavior without breaking the internal Facebook things that rely on jest-haste-map that I've seen other PRs closed over. Since they're not OSS projects I cannot make patches to them or test that the changes made do not impact them.

I'd like to hear your comments on all of this @cpojer, I'm willing to make the OSS contribution but I'd rather not waste the time if undisclosed non-oss projects on FB will kibosh any attempt to improve this situation.

@cpojer
Copy link
Contributor

cpojer commented May 12, 2017

If you'd like to change jest-haste-map to be extended this way and the default will not generate larger amounts of unnecessary files to be kept track of by Jest, I think we can make this work. I dislike exposing new configuration options for something like this but if that's the only way to go and there are many people that would benefit from this, I'm not opposed to shipping it.

This isn't so much an FB concern as it is a concern with the memory usage (and disk cache) as well as performance of jest-haste-map. It's a critical piece of infra for Jest and react-native-packager, so altering its performance characteristics by default isn't good.

@gaearon gaearon modified the milestones: 0.11.0, 0.10.0 May 16, 2017
@gaearon
Copy link
Contributor

gaearon commented May 16, 2017

I’m pushing this back to 0.11 as we’re cutting 0.10 very soon, and there’s no solution yet.

@trungdq88
Copy link
Contributor

trungdq88 commented Jun 13, 2017

This is not going to be resolved soon, I guess.
Is there any workaround available? (without ejecting)

@trungdq88
Copy link
Contributor

Ah.. never mind, I found an acceptable workaround: Add NODE_PATH=src/alias and then create alias directory with bunch of symlinks in it.

@Reanmachine
Copy link

@trungdq88 the easiest workaround is to make this folder structure:

  • src/modules
  • src/modules/my-module-1
  • src/modules/my-module-2
  • src/node_modules -> ../modules (symlink)

By leveraging the node_modules directory, all tooling is compatible.

@mlesk
Copy link

mlesk commented Jun 17, 2017

Given the desire to continue ignoring node_modules unless it is a node_modules folder contained directly beneath src, I have gotten this to work by modifying createJestConfig.js in react-scripts-js by adding these patterns:

testPathIgnorePatterns: [
      '^(?!.*([/\\\\]src[/\\\\]node_modules[/\\\\])).*([/\\\\]node_modules[/\\\\])'
],
transformIgnorePatterns: [
      '^(?!.*([/\\\\]src[/\\\\]node_modules[/\\\\])).*([/\\\\]node_modules[/\\\\])'
],

The goal is to continue to filter out /node_modules/ unless it is contained within a part of the path that exactly matches /src/node_modules/ or \src\node_modules\

Now I am able to maintain absolute referenceable imports that are directly contained within the src/node_modules folder or are simply symlinked into the src/node_modules folder.

Regex patterns use negative lookahead rather than negative lookback ((?<!([/\\]src))[/\\]node_modules[/\\]) because javascript does not support negative lookback.

This seems like a straightforward solution that works for this specific project structure without requiring any changes to Jest or the need to expose any further configuration within CRA. Is there a particular scenario in which this approach will break. From my usage so far it appears to work well with CRA and with VSCode.

@modernserf
Copy link
Author

@mlesk did you verify that this works? When I was working on this, it didn't matter what you put in testPathIgnorePatterns; jest always ignored node_modules.

@mlesk
Copy link

mlesk commented Jun 17, 2017

@modernserf Ok, jumped the gun a bit after coming back to this a few times I got excited when it finally worked,

There are two scenarios:

  1. Symlink addressable folders into the /src/node_modules/ directory. With this approach the suggested changes work since in that case Jest does not need to actually find them in the src/node_modules directory.

  2. Placing addressable folder directly into src/node_modules directory. With this approach the suggested changes do not work unless you also add this to the jest config:

"haste": {
      "providesModuleNodeModules": [
        ".*"
      ]
    }, 

With the haste configuration jest picks up the files in src/node_modules. However, I suspect this is may not be an acceptable general solution since it causes Haste to spit out some duplicate file name warnings before the tests run. Once the warnings fly by the tests do all run.

The funny thing is that Haste does not always produce the warnings, only occasionally. Having absolutely no insight into how haste works it is hard for me to speculate exactly what this haste configuration does. This was an exercise in trial and error not first principles.

@gaearon
Copy link
Contributor

gaearon commented Jan 8, 2018

Closing in favor of #1333.

@gaearon gaearon closed this as completed Jan 8, 2018
@lock lock bot locked and limited conversation to collaborators Jan 20, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants