Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RRFC] npm install --from-lockfile #415

Open
danShumway opened this issue Jul 15, 2021 · 11 comments
Open

[RRFC] npm install --from-lockfile #415

danShumway opened this issue Jul 15, 2021 · 11 comments

Comments

@danShumway
Copy link

danShumway commented Jul 15, 2021

Motivation ("The Why")

Based on the discussion on npm/cli#564, npm/feedback/discussions/340 and a few other issues.

The purpose of this proposal is to offer a middle ground between npm install and npm ci, providing the speed benefits of partial installs (npm install with an existing node_modules folder) alongside (some of) the deterministic guarantees of npm ci (exact package numbers and dependency numbers from package-lock.json).

Example

  1. At my current workplace, we run continuous deployment on a server for testing purposes. We have a large dependency tree with multiple dependencies that require node-gyp compilation.

    We do not use npm ci; the clean install process takes longer than we would like. We would like to push changes and very quickly get results back from the server. Additionally, because we are not building production builds, we are not concerned with doing a clean, completely deterministic install every time. We don't need that level of accuracy for a test server.

    However, we are interested in forcing our test server to respect package-lock.json. We have run into multiple failed builds that were the result of minor/bugfix package updates that automatically happened during a push. So ideally, we would be able to keep the speed of npm install, while respecting package-lock.json during the install.

  2. We are onboarding a contractor to look over the codebase, or we have an Open Source project that we are writing compile instructions for. The contractor/contributor should use npm ci after they pull our repo, so we know they're getting all of the correct versions. However, even when writing documentation about how to download or install a repo, a contractor/contributor might forget to run npm ci. They might not even know the command exists. In the case of a contributor, they might turn around a file an issue or complain that the project isn't compiling.

    We'd like to be able to specify that npm install should use --from-lockfile in our project's .npmrc, which would completely mitigate the above issue.

  3. One way we might decide to approach the above situations is by specifying exact versions of each package in our package.json. However, this would block us from being able to use wildcards and fuzzy package numbers to help with guiding the upgrading of packages. Additionally, contributors would need to remember to only install exact versions or to update package.json with exact numbers whenever they install a new package. In my experience, they often forget to do so.

    We'd like to be able to get the general behavior of using exact package numbers in our package.json without losing all of the benefits of fuzzy matching during package upgrades, and without complicating package installation.

How

Current Behaviour

There are two basic ways to install packages in NPM: npm install and npm ci:

A) npm install reads packages from package.json, which allows packages to be updated behind the scenes.
B) npm ci reads from package-lock.json, which allows much greater confidence that packages will not change behind someone else's back.
C) npm install allows partial installs: if a package is already up to date in node_modules, it will not be re-downloaded/compiled.
D) npm ci deletes the node_modules folder every run, forcing a clean install. This means every package will run its pre-install scripts, get copied over to node_modules, etc...

This comment on the original issue goes into a bit more detail, but the basic gist is that a full clean install is too slow for many non-production settings as soon as large packages with complicated pre/post-install scripts come into play. However, allowing non-production builds to grab the latest versions of dependencies is not reliable enough behavior for most environments.

Desired Behaviour

npm install --from-lockfile behaves according to B) and C) in "Current Behavior" above, while avoiding behavior A) and D).

  • from-lockfile reads from package-lock.json or another shrinkwrap for exact package numbers.
  • from-lockfile does not remove node_modules.
  • from-lockfile is ignored when installing new packages.
  • from-lockfile will error out if a package-lock.json or other shrinkwrap is not present.
  • npm install --from-lockfile is not a clean install. If an existing installed module is corrupted or edited beyond the point where a normal npm install would catch it, it will not be cleared out and reinstalled. It does not offer the same deterministic guarantees that npm ci offers.

This specific solution/behavior is loosely based off of @jdussouillez's comment and @DanielRuf's comment that both encouraged adding a new flags to npm install instead of npm ci.

It deviates from their offered solutions in that it does not go all the way and add a --clean option to npm install, nor does it try to alias npm ci to npm install --from-lockfile --clean. I think those are reasonable solutions, but I'm just trying to make the smallest change possible that will make all of the desired behaviors possible without messing with npm ci at all.

I think it's a much safer, smaller, more predictable change to add a single flag to npm install rather than to try and mess with existing behavior in a way that would possibly be a breaking change.

References

I have a basic implementation in progress at danShumway/cli/commit/bc8ef2.... Assuming no other changes, I still need to add a few more unit tests and I still need to add documentation. But of course, that's all subject to whether or not the implementation/proposal itself still requires work.

From my very basic testing, this seems to be working and comes with a pretty substantial performance increase in the scenarios I describe above.

@isaacs
Copy link
Contributor

isaacs commented Aug 4, 2021

I'm curious about the situations you're finding where a lockfile exists, and npm install does something different. That sounds like it's just a bug. The only time when install will change the lockfile is if you're adding/changing a dependency (eg, by doing npm install <pkg>) or something is missing or invalid (for example, if the lockfile was generated without peerDeps).

The only difference is:

  • npm ci will fail if there is no lockfile (npm install will use the existing node_modules tree as a starting point if no lockfile is present, or build the idealTree up based on the root dependencies if no lockfile or node_modules is present)
  • npm ci will delete an existing node_modules folder (npm install modifies the node_modules tree in-place)
  • npm ci will force --save=false to ensure it doesn't change anything (npm install can change the deps, so it does save by default)
  • npm ci does not take arguments to add/change deps (npm install does)

Here's what npm ci does:

    const opts = {
      ...this.npm.flatOptions,
      path: where,
      log: this.npm.log,
      save: false, // npm ci should never modify the lockfile or package.json
      workspaces: this.workspaceNames,
    }

    const arb = new Arborist(opts)
    await Promise.all([
      arb.loadVirtual().catch(er => {
        log.verbose('loadVirtual', er.stack)
        const msg =
          'The `npm ci` command can only install with an existing package-lock.json or\n' +
          'npm-shrinkwrap.json with lockfileVersion >= 1. Run an install with npm@5 or\n' +
          'later to generate a package-lock.json file, then try again.'
        throw new Error(msg)
      }),
      removeNodeModules(where),
    ])
    await arb.reify(opts)

And here's what npm install does:

    const opts = {
      ...this.npm.flatOptions,
      log: this.npm.log,
      auditLevel: null,
      path: where,
      add: args,
      workspaces: this.workspaceNames,
    }
    const arb = new Arborist(opts)
    await arb.reify(opts)

@nmm-shumway
Copy link

nmm-shumway commented Aug 4, 2021

Will need to wait until later tonight before I can run some tests and respond in more detail, but just so I understand the intended behavior, if the following steps happen:

  • I have a package.json specifying a dependency of "foo": "^1.1.0".
  • I run npm install and npm resolves foo to 1.1.3. It creates package-lock.json with the dependency tree.
  • The upstream author releases foo version 1.1.4 onto npm.
  • I run npm install again.

Are you expecting that the package-lock.json shouldn't change, or that it should?

I'll reconfirm later, but I think current behavior is to regenerate package-lock.json based off of foo@1.1.4. The --from-lockfile flag (assuming I haven't messed anything up in my implementation) is intended to make it so that in the above scenario, the package-lock.json does not get updated, and npm install still downloads version 1.1.3.

My understanding (again, I could be misreading how the code works) is that when I call arb.loadVirtual(), I'm essentially ignoring the package.json. Mostly I'm only worried about the example I listed above where dependencies might change invisibly, but my expectation is that npm install --from-lockfile would not regenerate package-lock.json even if I have deleted or edited the dependencies inside the package.json file.

@Aeolun
Copy link

Aeolun commented Oct 8, 2021

The fix to a problem I've had with npm for years, is literally 10 lines of code.

It boggles my mind that nobody working on the project did this before.

@danShumway
Copy link
Author

danShumway commented Oct 8, 2021

@Aeolun I'm still interested in finishing this pull request and merging it since I think that fully ignoring package.json is still a valid use-case for npm install. However, it's lower in my priority list because since the time that I opened this pull request, the newer versions of npm have several (bugfixes? changes?) to how it interprets package-lock.json v2 that mitigate many of the same problems this patch is trying to address.

See npm/cli#564 (comment) for more details.

Unfortunately, I don't see any reason to believe that this will get backported to npm v6, so people who are still on that version are probably just out of luck. But if you're tired of waiting, my recommendation is to (if possible) upgrade to npm 7 and then change your lockfiles over to be v2 and re-commit them. I think that will solve the problem.

I share the frustration that v7 is the first version that seems to address this problem, even though people in the linked issue have been fighting for years to even get acknowledgement that npm install was at that time ignoring pinned versions. And more than that I share the frustration that this isn't really documented anywhere and there doesn't seem to be any official policy about what's supposed to happen in these edge cases. I can't say with certainty that v2 is following some kind of formal spec about when package.json does or doesn't take precedent over the lockfile and that it will stay consistent in the future. I don't know if the previous behavior indicates v1 lockfiles were bugged, or if v2 lockfiles are supposed to work differently and it's a breaking change, or if the whole thing is just a happy accident. But regardless, there is at least a partial fix now.

I will likely at some point circle back around to this pull request, or open a new one at that point. There still should be an option for npm install to solely use the lockfile, there are still scenarios where package.json will override it. But it's lower priority for me since v2 lockfiles seem to for now mitigate the majority of issues I'm seeing at my current job.

@danShumway
Copy link
Author

danShumway commented Oct 8, 2021

Most users installing Node from the website will still have this issue. I guess it's recommended to them that they upgrade, but... v6 to v7 is a major release with breaking changes, so fixes should probably be backported to the LTS. If it's not a breaking change/version difference, then users probably shouldn't be shipped a version of npm that still exhibits incorrect(?) behavior.

That being said, Node 14 will go into maintenance on October 19th, and at that point Node 16 will swap to being the default download on the website. So I'm doubtful that there's going to be much motivation from anyone (myself included) to try and fix npm 6 before that date.

@ljharb
Copy link
Contributor

ljharb commented Oct 8, 2021

@danShumway npm doesn’t have an LTS version or policy.

@nmm-shumway
Copy link

nmm-shumway commented Oct 8, 2021

@lijharb Appologies, this is my fault; I sometimes have a problem phrasing things clearly in these comments and I tend to gloss over details.

npm does not have an LTS policy, but Node does, and Node is the default way that most users (especially enterprise users) will get npm. The current Node website recommends Node 14, which ships with npm 6. As a result, bugfixes in npm 7 effectively don't exist for users who are following Node's LTS release schedule.

Screen Shot 2021-10-08 at 9 32 48 AM

I suspect that there's a disconnect between the Node team and the npm team on this issue. My understanding is that npm maintainers recommend that users update to the most recent version of npm that can work with their Node version. It does not seem like this recommendation has ever been embraced by the Node team, Node does not upgrade users to the most recent version of npm, and does not recommend to users that they should upgrade their npm install.

This leads to situations like the current one, where users are (reasonably) asking about issues they're seeing in supported versions of Node, and npm maintainers are suggesting that the issues either don't exist or have already been fixed, even though for most Node users the issues won't be resolved until the next LTS Node release (which could be a half a year away or even longer if they're using a maintenance version of Node).

npm does not have an official LTS policy, but in practice that doesn't really matter, because the majority of npm users will not be upgraded outside of Node's release schedule.

Screen Shot 2021-10-08 at 9 33 32 AM

In this specific case, Node's active LTS for v14 is scheduled to end on October 19th, which is why there is likely to be little motivation to try and fix issues in npm 6. I assume that whenever npm 8 is released, we'll run into the same situation with npm 7, where it's the predominantly used version for most users, but bugs and triaging will all be based off of npm 8's behavior instead.


My feeling is that at some point this is probably something the npm team and the Node team should work out together. If npm's policy is that users should use the most up-to-date version, then that should be communicated to the Node team so that they can stop distributing outdated versions of npm. If the policy is that LTS Node versions shouldn't uptake any major version changes for npm, then in effect you kind of do have an LTS policy whether you like it or not, because most users will be using the version of npm that ships with Node, and many of them will be filing bugs against that version.

At the least, it seems like this would be good to document somewhere? I had to figure out myself that npm 6 and npm 7 have different behaviors with how they treat lockfiles. Even now, I don't know which behavior is "correct"; there is no formal spec I can find anywhere for when npm install should respect the pinned version of a dependency and when it should overwrite it. Is this a bug in npm 6, or a breaking change between v6 and v7? I don't know. There's no documentation anywhere I can find to indicate that they even have different behaviors at all.

That leads to issues like the linked bug, where at some point it seems like the npm maintainers and users are just talking past each other. If at some point in the linked bug someone from the npm team had said, "The version of npm installed with Node LTS is outdated and handled package-locks differently than npm 7, and after a new Node LTS install you should always run npm install -g npm@latest" -- that would have cleared a lot of the confusion very quickly.

In a broader sense, it seems like some clarity and agreement between Node and npm about which version of npm users are supposed to be running, and which versions of npm are supported if there are bugs, would probably go a long way towards avoiding these issues in the first place. It seems less than ideal that Node's official recommendation is that users install a version of npm that npm maintainers/contributors are telling me is unsupported.

@zkrising
Copy link

My problems with the current behaviour are slightly different to the original proposals. I had not read much documentation on how npm install works, but I had assumed that if a package-lock.json file was present, npm install would honor it and install exactly that.

Since that's not true, it seems to me like there's some counterintuitive behaviour. A lot of people do run npm install even on their CI servers, because it does just enough to be intuitively correct, but is still wrong.

Infact, the command npm ci isn't mentioned very frequently in places that tell people how to install dependencies. Almost every project I've seen has suggested you run npm install to get the dependencies for the project, when this is technically incorrect.


I would like to extend the proposal that npm install, if a package-lock.json file is present, should honor the dependencies specified in that file.


Since there are commands like npm update which more explicitly update your dependencies (and is clearer in what it does), this should not affect usage much (But this is still a breaking change).

I don't see many scenarios in which people are using npm install and depending on the behaviour that the package they recieve isn't the one they've specified in their package-lock. If they are depending on that behaviour, what they were already doing was broken/questionable.

My proposal is more extreme than the authors, but I think it makes more sense intuitively. npm install is what is documented and used in a lot of places to install dependencies, and people (myself included) definitely get the wrong idea that npm install is the way to install the dependencies in the lockfile.

My motivation for extending this proposal rather than having this behaviour behind a flag is that, if behind a flag, people are significantly less likely to use this behaviour, when it is more likely to be what they want. I think it is less likely that users want the current behaviour than the proposed one, so it would make more sense to put that behaviour behind a flag, i.e. npm install --from-lockfile=false.

@mikaelboff
Copy link

is it release yet?

@danShumway
Copy link
Author

danShumway commented Jan 21, 2022

@mikaelboff no, but if you're on a more recent version of npm (specifically using v2 lockfiles) the behavior is closer to expected, so should dodge the majority of issues people were seeing.

That being said, Node 14 will go into maintenance on October 19th, and at that point Node 16 will swap to being the default download on the website. So I'm doubtful that there's going to be much motivation from anyone (myself included) to try and fix npm 6 before that date.

The current LTS versions of Node are bundled with npm versions that use v2 lockfiles. This doesn't really change anything for anyone who is still using older versions, but I guess they'd be unlikely to get a patch anyway.

TLDR if possible update your version of Node and you should see this issue less often.

I'm leaving this open very temporarily while I try to figure out if there's still any value in trying to make the behavior more explicit or more reliable, but it should likely be closed. v1 lockfiles were basically unreliable for anything other than npm ci. v2 seems to be a lot better, but I have not sat down and meticulously tested it.

@mikaelboff
Copy link

mikaelboff commented Jan 22, 2022 via email

stevematney added a commit to WTW-IM/es-components that referenced this issue Oct 30, 2023
…odules

According to [this comment](npm/rfcs#415 (comment))
more recent lockfiles will be much more accurate when using regular
install as opposed to the behavior of previous lockfiles, where a
regular install would cause random unexpected updates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants