bug: wrong node version installed when cache is corrupt #541

Closed
2 of 5 tasks
ben-styling opened this issue Jul 12, 2022 · 8 comments
Labels: bug (Something isn't working)

Comments

@ben-styling

Description:

I'm unsure of the cause, but somehow a version of node has been cached in the wrong directory.
The only way to fix this problem is to delete the cache manually if you have access to the runner.
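For readers who do have access to the runner, the manual cleanup mentioned above might look like the sketch below. The function name `remove_cache_entry` is hypothetical, and the layout (`<root>/<tool>/<version>/<arch>` with a sibling `<arch>.complete` marker file) is an assumption based on the cache path shown in the logs later in this thread, so check the actual directory before deleting anything.

```shell
# Remove one entry from a runner tool cache so the next run re-downloads it.
# Assumed layout: <root>/<tool>/<version>/<arch> plus a sibling "<arch>.complete" marker.
remove_cache_entry() {
  local root="$1" tool="$2" version="$3" arch="$4"
  # Refuse to run with any empty argument, so a parent directory is never deleted.
  if [ -z "$root" ] || [ -z "$tool" ] || [ -z "$version" ] || [ -z "$arch" ]; then
    echo "usage: remove_cache_entry <root> <tool> <version> <arch>" >&2
    return 2
  fi
  local entry="$root/$tool/$version/$arch"
  rm -rf -- "$entry" "$entry.complete"
}

# Example (run on the runner itself):
#   remove_cache_entry /opt/hostedtoolcache node 16.15.1 x64
```

Only the named entry is removed; other cached versions are left in place.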

Action version:
v3

Platform:

  • Ubuntu
  • macOS
  • Windows

Runner type:

  • Hosted
  • Self-hosted

Tools version:
Requesting v16.15.1 gave v14.16.1 (requested cache path was correct, but the version of node inside was wrong)

Repro steps:
I do not know how to reproduce this with normal usage, but to test on a self hosted runner, install a version of node into the wrong directory.

Expected behavior:
The requested version of node should be installed

Actual behavior:
A corrupted cached version is provided

ben-styling added the bug (Something isn't working) and needs triage labels Jul 12, 2022
@dmitry-shibanov
Contributor

Hello @ben-styling. Thank you for your report. Could you please provide simple steps to reproduce the issue, or debug logs? Which version of the runner do you use?

@ben-styling
Author

Hi @dmitry-shibanov, thank you for your reply!

I'm very sorry, but I don't know how to replicate the issue we're having without manually putting the wrong binary in the wrong cache directory.

There are no reproduction steps for normal usage, but while working on a PR for this, I ran the action on a self-hosted runner to install v16.16.0.
Then, in _work/_tool/node, I copied 16.16.0 to 16.15.1: cp -Ra ./16.16.0 ./16.15.1
When I then ran the action requesting v16.15.1, the actual version was 16.16.0.
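The manual corruption described above can be wrapped in a small helper for testing on a disposable self-hosted runner. This is only a sketch: `clone_cache_entry` is a hypothetical name, and the guard against overwriting an existing entry is an addition that was not part of the original test.

```shell
# Duplicate one tool-cache entry under another version's name,
# mirroring the manual "cp -Ra ./16.16.0 ./16.15.1" test.
clone_cache_entry() {
  local tool_dir="$1" from="$2" to="$3"
  if [ ! -d "$tool_dir/$from" ]; then
    echo "no such cache entry: $tool_dir/$from" >&2
    return 1
  fi
  if [ -e "$tool_dir/$to" ]; then
    echo "refusing to overwrite: $tool_dir/$to" >&2
    return 1
  fi
  # -R copies recursively; -a preserves modes, timestamps and symlinks.
  # Anything stored inside the version directory is copied along with it.
  cp -Ra "$tool_dir/$from" "$tool_dir/$to"
}

# Example, from the runner's work directory:
#   clone_cache_entry _work/_tool/node 16.16.0 16.15.1
```

After this, a request for 16.15.1 finds a directory with the right name but 16.16.0 binaries inside.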

I'm not sure what you mean when asking which version of runner we use. I know that our runners are x64 linux, but I think once the cache is corrupt, the behaviour is the same on all platforms.

I think these are the logs you're asking for, let me know if not!

##[debug]Evaluating condition for step: 'Setup node.js'
##[debug]Evaluating: success()
##[debug]Evaluating success:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Setup node.js
##[debug]Register post job cleanup for action: actions/setup-node@v3
##[debug]Loading inputs
##[debug]Evaluating: github.token
##[debug]Evaluating Index:
##[debug]..Evaluating github:
##[debug]..=> Object
##[debug]..Evaluating String:
##[debug]..=> 'token'
##[debug]=> '***'
##[debug]Result: '***'
##[debug]Loading env
Run actions/setup-node@v3
  with:
    node-version-file: .nvmrc
    check-latest: true
    registry-url: https://npm.pkg.github.com/
    scope: ***
    cache: npm
    cache-dependency-path: **/package-lock.json
    always-auth: false
    token: ***
  env:
    GITHUB_PKG_TOKEN: ***
Resolved .nvmrc as 16.15.1
Attempt to resolve the latest version from manifest...
##[debug]No manifest cached
##[debug]Getting manifest from actions/node-versions@main
##[debug]set auth
##[debug]check 18.5.0 satisfies 16.15.1
##[debug]check 18.4.0 satisfies 16.15.1
##[debug]check 18.3.0 satisfies 16.15.1
##[debug]check 18.2.0 satisfies 16.15.1
##[debug]check 18.1.0 satisfies 16.15.1
##[debug]check 18.0.0 satisfies 16.15.1
##[debug]check 16.16.0 satisfies 16.15.1
##[debug]check 16.15.1 satisfies 16.15.1
##[debug]x64===x64 && darwin===linux
##[debug]x64===x64 && linux===linux
##[debug]matched 16.15.1
Resolved as '16.15.1'
##[debug]isExplicit: 16.15.1
##[debug]explicit? true
##[debug]checking cache: /opt/hostedtoolcache/node/16.15.1/x64
##[debug]Found tool in cache node 16.15.1 x64
Found in cache @ /opt/hostedtoolcache/node/16.15.1/x64
/usr/bin/node --version
v14.16.1
::set-output name=node-version::v14.16.1%0A
##[debug]='v14.16.1
##[debug]'

---- truncated ----

##[debug]Evaluating condition for step: 'Debug version'
##[debug]Evaluating: success()
##[debug]Evaluating success:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Debug version
##[debug]Loading inputs
##[debug]Loading env
Run node -v && npm -v
##[debug]/bin/bash -e /tmp/runners/runner0/_temp/bca74845-3a89-444c-87ac-b3c968d6edca.sh
v14.16.1
6.14.12
##[debug]Finishing: Debug version

@dmitry-shibanov
Contributor

Could you please confirm that the version was installed in the wrong directory? It is expected behaviour that if the binaries of one node version are placed in another version's directory, the action will use whatever is in that directory and won't throw an error. The toolkit/toolcache package does not run the tool to check its version; it relies on the SemVer version in the directory name.

If I understand correctly, the initial issue is that the action pre-cached a node version to the wrong directory.
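The lookup behaviour described above can be modelled in a few lines. This is a simplified illustration, not the toolkit's actual code: `find_in_cache` is a hypothetical name, and the `.complete` marker check is an assumption about the cache layout.

```shell
# Simplified model of a name-only cache lookup.
# Prints the entry path and returns 0 on a hit; returns 1 on a miss.
find_in_cache() {
  local root="$1" tool="$2" version="$3" arch="$4"
  local entry="$root/$tool/$version/$arch"
  # A hit needs only the directory and its ".complete" marker (assumed layout).
  # Nothing here executes the tool, so wrong contents go unnoticed.
  if [ -d "$entry" ] && [ -f "$entry.complete" ]; then
    printf '%s\n' "$entry"
    return 0
  fi
  return 1
}

# Example:
#   find_in_cache /opt/hostedtoolcache node 16.15.1 x64
```

Because the decision rests on path names alone, a directory named 16.15.1 that holds node 14 still counts as a valid hit for 16.15.1.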

dmitry-shibanov self-assigned this Jul 12, 2022
@ben-styling
Author

I can confirm that node14 was installed in a node16 directory.

Yes, the initial issue is that node14 was somehow cached to the wrong directory.
I do not know if it was this action, or another action using the same toolcache that caused this error.

@dmitry-shibanov
Contributor

Hello @ben-styling. I've tried to reproduce the issue, but everything works as expected. Could you please provide the YAML workflow in which the issue occurred?

@dmitry-shibanov
Contributor

Hello @ben-styling, just a gentle ping.

@ben-styling
Author

ben-styling commented Jul 20, 2022

Hi there, apologies for my delayed response.

Thank you again for looking into this issue!

Here's what the YAML looked like initially:

name: PR

on: pull_request

jobs:
  build:
    name: Lint and unit-test

    runs-on: **

    env:
      GITHUB_PKG_TOKEN: **

    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Setup node.js
        uses: actions/setup-node@v2
        with:
          node-version: 14
          registry-url: https://npm.pkg.github.com/
          scope: **

      - name: Cache dependencies
        id: cache
        uses: actions/cache@v2
        with:
          path: ./node_modules
          key: npm-${{ hashFiles('package-lock.json') }}

      - name: Debug version
        run: node -v && npm -v

      - name: Install Dependencies
        if: steps.cache.outputs.cache-hit != 'true'
        run: npm ci

      - name: Lint
        run: npm run lint

      - name: Test
        run: npm run test

We then changed node-version to 16.15.1, which worked.

Then, 12 days ago (July 8th), the same YAML file (the one above, but with node-version: 16.15.1) started giving us node 14.16.1.
That is when we investigated and found that the action was choosing the correct path, /opt/hostedtoolcache/node/16.15.1/x64, but that path somehow had node 14 inside.

We have since updated our yaml to the following:

name: PR

on: pull_request

jobs:
  build:
    name: Lint and unit-test

    runs-on: **

    env:
      GITHUB_PKG_TOKEN: **
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Setup node.js
        uses: actions/setup-node@v3
        with:
          node-version-file: '.nvmrc'
          check-latest: true
          registry-url: https://npm.pkg.github.com/
          scope: **
          cache: 'npm'
          cache-dependency-path: '**/package-lock.json'

      - name: Debug version
        run: node -v && npm -v

      - name: Install Dependencies
        run: npm ci
        env:
          NODE_AUTH_TOKEN: **

      - name: Lint
        run: npm run lint

      - name: Test
        run: npm run test

This didn't fix the issue, so as a workaround we bumped the node version in .nvmrc to 16.16.0.

As this runner is used on several projects across multiple teams, I can't be sure if it was this action that caused a version of node to be placed in the wrong directory.

I'm suggesting that this action should check the version of node after a cached version is found and try at least once to correct the cache.
As far as I know there's no way to invalidate this cache unless you have access to the runner, and it seems the cache doesn't have a TTL.
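The check suggested above could look roughly like the sketch below. It is not something the action does today: `cached_node_matches` and `verify_or_evict` are hypothetical names, the binary location `<entry>/bin/node` assumes a Linux or macOS layout, and the sibling `.complete` marker is an assumption about the cache layout.

```shell
# Hypothetical post-lookup check: does the cached binary report the expected version?
cached_node_matches() {
  local entry="$1" expected="$2"
  local node_bin="$entry/bin/node"
  [ -x "$node_bin" ] || return 1
  local actual
  actual="$("$node_bin" --version 2>/dev/null)" || return 1
  # node prints its version with a leading "v", e.g. v16.15.1.
  [ "$actual" = "v$expected" ]
}

# One corrective attempt: evict a mismatching entry so the caller can re-download it.
verify_or_evict() {
  local entry="$1" expected="$2"
  # Never operate on an empty path.
  [ -n "$entry" ] || return 2
  if cached_node_matches "$entry" "$expected"; then
    return 0
  fi
  echo "cache entry $entry does not contain node v$expected; removing it" >&2
  rm -rf -- "$entry" "$entry.complete"
  return 1
}

# Example:
#   verify_or_evict /opt/hostedtoolcache/node/16.15.1/x64 16.15.1 || echo "re-download needed"
```

A non-zero return tells the caller to treat the lookup as a cache miss and download the requested version once more, which gives the cache a way to recover without anyone logging in to the runner.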

edit:

A colleague of mine has suggested that a runner that's running out of disk space, or a network error could have caused this to happen. I'm struggling to recreate those conditions, though.

@ben-styling
Author

Hi @dmitry-shibanov, thank you again for your help with this. After speaking with @vsafonkin, I no longer think this issue is within scope of this project. This is likely an external issue.

Closing this issue.
