
Enable non-relative imports with more node_modules granularity #19

Open
enriched opened this issue Aug 27, 2018 · 6 comments

@enriched
Contributor

Currently these rules recreate the same folder structure as the workspace for the build and symlink to an installed node_modules folder. To be able to refer to modules in a non-relative way (e.g. import { greet } from "greeter" rather than import { greet } from "../../../libs/shared-package/greeter"), we can create a node_modules folder containing multiple hardlinks, similar to how pnpm does it.
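
For illustration (an assumed layout based on this proposal, not current behavior), the generated node_modules for a target could look something like:

node_modules/
  lodash/     (hardlink into the installed node_modules)
  greeter/    (hardlink to the compiled output of the greeter target)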

Moving forward, this could also enable finer-grained control of the npm modules that are available to a build, e.g.:

ts_library(
    name = "index",
    srcs = glob([
        "**/*.ts",
        "**/*.tsx",
    ]),
    tsconfig = "//:tsconfig.json",
    deps = [
        "@npm//lodash",
        "//packages/my-shared-lib",
    ],
)

To implement this, the existing NpmPackagesInfo provider would be handled by hardlinking all folders in installed_dir that don't start with @, plus all subfolders of folders that do start with @ (i.e. scoped packages). Any dep with a JsLibraryInfo provider would have its compiled_javascript_dir hardlinked into the node_modules directory if it has a new module_name field (the hardlink would be named module_name), as in the sketch below.
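
A rough Starlark sketch of that logic (a sketch only: _assemble_node_modules is a hypothetical helper name, and the cp -al hardlinking commands are illustrative rather than a final implementation):

def _assemble_node_modules(ctx):
    node_modules = ctx.actions.declare_directory("node_modules")
    inputs = []
    commands = ["mkdir -p " + node_modules.path]
    for dep in ctx.attr.deps:
        if NpmPackagesInfo in dep:
            # Hardlink every top-level package from the installed directory;
            # scoped (@foo/bar) packages come along one level deeper.
            installed = dep[NpmPackagesInfo].installed_dir
            inputs.append(installed)
            commands.append("cp -al %s/. %s/" % (installed.path, node_modules.path))
        elif JsLibraryInfo in dep and dep[JsLibraryInfo].module_name:
            # The hardlinked folder is named after the proposed module_name field.
            lib = dep[JsLibraryInfo]
            inputs.append(lib.compiled_javascript_dir)
            commands.append("cp -al %s %s/%s" % (
                lib.compiled_javascript_dir.path,
                node_modules.path,
                lib.module_name,
            ))
    ctx.actions.run_shell(
        outputs = [node_modules],
        inputs = inputs,
        command = " && ".join(commands),
    )
    return node_modules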

@fwouts does this proposal fit with these rules?

@fwouts
Owner

fwouts commented Aug 27, 2018

I'll take this opportunity to document the history of the decisions that led to the current approach, before discussing your proposal :)

Approach #1: no package.json

I actually started this whole project because I thought that depending on a shared node_modules directory, which rules_nodejs encourages, was a strange approach in Bazel land. The fact that a build target can access any package, just because it happens to be listed in package.json, seemed wrong.

The first idea was to declare each external NPM module as its own Bazel target. There was no package.json at all. This worked, but there were two issues:

  1. It made package management very manual, and therefore would make it hard to convince engineers used to the comfort of NPM/Yarn to move over to Bazel.
  2. There was no lockfile. By definition, a lockfile would have to be shared across all build targets. This was at odds with the idea of each build target being defined separately from the rest.

Approach #2: shared package.json with requires = [...]

Once I saw that I'd have to introduce a lockfile, I decided that you might as well leverage NPM/Yarn and have a shared package.json. Reluctantly, I moved back to an approach similar to rules_nodejs, with one difference: each build target had to declare explicitly all the modules it required. I used the TypeScript parser to parse each JavaScript/TypeScript source file, extract the names of the modules it imported, and check whether they were in the requires = [...] list. This was approach #2.

Approach #2 created a separate node_modules directory for each build target, trying to only add the modules it required, as well as modules its dependencies required.
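
A reconstructed example of what an approach #2 target looked like (illustrative; the module names are made up):

ts_library(
    name = "app",
    srcs = ["app.ts"],
    tsconfig = "//:tsconfig.json",
    requires = [
        "lodash",
        "react",
    ],
)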

You can see the move from #1 to #2 in this commit: 32f5699 (left-hand side is #1, right-hand side is #2).

Approach #3: change of compiled tree structure

With approaches #1 and #2, other build targets were depended upon through an artificial directory in node_modules. For example, if a.js had a statement like import * from "../b.js", this would be transformed into something like import * from "__path__to__b/b.js" (and an artificial __path__to__b node module was created from its compiled build).

As I was trying to apply Bazel to existing codebases, this got a bit in the way. For example, tools like jest automatically ignore anything in the node_modules directory, making it hard to have a single test target (in hindsight, this may actually not have been a real issue).

So instead of //abc/def:ghi compiling down to just:

ghi.js
node_modules/
  ... other deps

The output was changed to:

abc/
  def/
    ghi.js
... other deps (same tree)

Here is the commit that made this change: fb285d4

Approach #4: shared package.json without requires

I then tried to migrate a few existing codebases to Bazel. I ran into a few issues:

  • one JavaScript codebase used Webpack aliases extensively, and these were incorrectly detected as dependencies on external NPM modules
  • each TypeScript build target required a lot of @types/abc modules in its requires statements, but that couldn't be detected automatically unless we went and looked into the module abc's code itself to figure out whether it needed a separate @types/abc module on top.

The whole "magic" to figure out whether a package was forgotten from requires was getting out of hand and I saw it getting in the way of developer productivity (having a failed build every time you add an import statement isn't a pleasant experience), so I decided to scrap it entirely.

This was done here: fa2a1a1

Tweak #5: Remove support for TypeScript aliases

In an effort to simplify the rules, I temporarily scrapped support for TypeScript aliases, which until then were automatically extracted from tsconfig.json. This seems relevant to your proposal, since part of your concern is relative imports.

The commit that removed it is d81c2e3#diff-0b90128bc225943469137bbd70c6f0c1.

This helped improve compilation errors, which often pointed to the wrong line because the TypeScript file compiled by Bazel was post-processed with the import changes, so line numbers didn't match. This could have been fixed differently, e.g. by not using the TypeScript library for the rewriting and using Babel instead.

Tweak #6: Support for Webpack aliases

Finally, in an effort to get a large codebase from one of my clients to compile, I added support for Webpack aliases with a new type of build target called js_module. This came as a bit of an afterthought, to be perfectly honest. I don't think it's a good solution with its current implementation (not TypeScript friendly). See fbae1b4.

Side project: Automatically generated Bazel configuration

A lot of the issues I tried to address as I went through the different approaches related to the manual effort of maintaining constantly-changing BUILD files.

Since then, I've spent some time building an automatic BUILD file generator, which would address some of these concerns if we spent more time strengthening it (e.g. automatically adding dependencies on NPM modules). You can find the code at https://github.com/zenclabs/js-deps/tree/master/src/bazel.

Your proposal

I think what you're proposing is somewhat similar (though not exactly the same) to approach #2. I actually don't mind going back to something like this, now that there's a way to automatically generate BUILD files for large codebases.

One question: how would you write import { greet } from "greeter" in your code? How would the IDE understand this? This is quite important in a TypeScript codebase, as you want to make sure all the right autocompletions trigger.

Depending on how you declare the packages themselves, if you went with something like approach #1, my concern would be maintainability (how do engineers update packages to a new version without conflicts?). Not sure whether this is in the scope of this discussion or not.

PS: I hadn't heard about pnpm and now I'm itching to try it :)

@enriched
Contributor Author

Thank you for that great rundown on the history of JavaScript dependencies!

Here are some more notes on how to make this work for a project, and the reasoning for why it would be a good idea to have more control over the node_modules directory.

IDE Comprehension via path mapping

Adding path mapping to tsconfig.json in the source would enable IDEs to resolve the module types correctly. Path mapping lets TypeScript check a list of paths when it encounters a non-relative import and resolve the typings from there. Unlike webpack's resolve.modules, it does not transpile the import into a relative import. People get pretty salty (as shown by this issue) about the lack of transpilation to relative imports in the TypeScript compiler, because it results in output that might not be able to resolve the modules it was compiled with.
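
For example, a mapping along these lines in tsconfig.json would let the IDE resolve the greeter import from earlier (illustrative paths; the actual entries depend on the workspace layout):

{
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "greeter": ["libs/shared-package/greeter"]
    }
  }
}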

Adding path mapping to the tsconfig.json appears to carry through, so the TypeScript compilation will work with these rules. But if you try to run the transpiled code, it will fail without something like tsconfig-paths to assist with module resolution at runtime, or symlinks to the modules being added in node_modules.

Consistency with popular nodejs tools

Manipulating symlinks in the node_modules directory is how many of the popular nodejs tools manipulate dependencies locally.

Examples of tools that manipulate symlinks in the node_modules directory include npm link, yarn link, and pnpm.

So having finer-grained control of the symlinks in the node_modules directory would enable more compatibility with the situations people run into when using other tools.

It is also worth noting that npm can install from a local tarball as well.

Conclusion

Currently, I agree that the build directory should maintain the same structure as the source directory, but having more control over what is in node_modules is a good idea. If a complete NpmPackagesInfo provider is added to the deps then all of the modules within it would be added. Additional modules can be added to node_modules by adding targets with a JsModuleInfo provider.

I'm not sure I understand the issues that are introduced when an engineer updates a package to a new version. If there are multiple JsModuleInfo providers that expose the same module_name, the last one wins, so that a target can override dependencies based on its needs.

Please let me know if I am not making any sense :) and thanks for pointing me in the direction of that JsModuleInfo provider. That seems like a good way to break out subpaths from the transpiled directory.

@fwouts
Owner

fwouts commented Aug 28, 2018

When I mentioned the issue about an engineer updating a package, I had in mind cases like upgrading react to 16, where you also need to upgrade react-dom to a matching version. Reconciling versions and letting you know when there's a conflict is something that NPM and Yarn do, which I'd rather not re-implement. It doesn't seem to be an issue with the approach you're proposing though, since we'd still have a single package.json.

If I understand the plan correctly, I think it sounds good. As I've said with #6, I'm not too happy with the current approach of JsModuleInfo (especially only using it at the Webpack alias level). Feel free to suggest more radical changes as part of this.

In terms of handling conflicts of module names, I'd rather we start by making it an error and then see whether it's actually necessary to allow overriding. I have a feeling that when someone needs to override, it could be a sign that something else is wrong with how the codebase is structured.
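
Concretely, something along these lines while collecting deps (just a sketch; it assumes JsLibraryInfo deps carry the proposed module_name field):

seen = {}
for dep in ctx.attr.deps:
    if JsLibraryInfo in dep and dep[JsLibraryInfo].module_name:
        name = dep[JsLibraryInfo].module_name
        if name in seen:
            # Duplicate module names are an error rather than "last one wins".
            fail("Module name '%s' is provided by both %s and %s" % (
                name,
                seen[name],
                dep.label,
            ))
        seen[name] = dep.label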

In terms of implementation, are you comfortable getting your hands dirty with this? I'm happy to provide guidance and help lay out the foundations that are required. I also want to make sure you don't spend a lot of time implementing an approach that wouldn't end up being approved, so feel free to send out early WIP drafts my way.

@enriched
Contributor Author

Yeah, the engineer should still use a package manager to manage the dependencies on their end. That also makes sure the node_modules directory is created in the workspace where the IDE expects it.

I will share some WIP once I get going on this :)

@enriched
Contributor Author

Another possible option that I have seen recently is Yarn's plug'n'play. I don't know if it's ready for prime time, but it could be efficient because we would not need to symlink the node modules at all. The plug'n'play design would allow generating a .pnp.js that resolves modules directly from the node_modules directory.

@fwouts
Owner

fwouts commented Sep 28, 2018

@enriched That sounds great, actually. I think that feature has a very good chance of becoming the de facto standard in the future, so leveraging it doesn't seem like a big risk.
