Flake: Getting Started (registry/yarn) #9325

Closed
Chris-Hibbert opened this issue May 6, 2024 · 9 comments · Fixed by #9550, #9740, #9799 or #10295
Labels: bug (Something isn't working), flake (flakey test)

Comments

@Chris-Hibbert
Contributor

Chris-Hibbert commented May 6, 2024

Describe the bug

The Getting Started (registry/yarn) CI test is flaky.

To Reproduce

The test failed, as shown in
https://github.com/Agoric/agoric-sdk/actions/runs/8973265730/job/24643072843?pr=9283

Error: rpc error: code = Unknown desc = rpc error: code = Unknown desc = account sequence mismatch, expected 60, got 59: incorrect account sequence [agoric-labs/cosmos-sdk@v0.46.16-alpha.agoric.2.1/x/auth/ante/sigverify.go:269] With gas wanted: '18446744073709551615' and gas used: '38588' : unknown request

Hitting "Rerun failed tests" caused it to pass.

Other instances

Expected behavior

Tests should not be flakey.

Platform Environment

Running in CI.

Chris-Hibbert added the bug and flake labels May 6, 2024
@turadg
Member

turadg commented May 9, 2024

I've been running into this lately too. The latest failure was with registry/npm:

lerna ERR! E503 one of the uplinks is down, refuse to publish
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

mergify bot closed this as completed in #9550 Jun 21, 2024
mergify bot added a commit that referenced this issue Jun 21, 2024
closes: #9325


## Description

Auto-retry the flaky test.

### Security Considerations
none

### Scaling Considerations
none

### Documentation Considerations
none

### Testing Considerations
The problem it's fixing is intermittent. If it passes CI once, let's let it in and see over the following days whether CI failure is down. (And reopen the issue that landing this closes)

### Upgrade Considerations
none
mhofman pushed a commit that referenced this issue Jun 22, 2024
@turadg
Member

turadg commented Jul 11, 2024

#9550 set up a retry, but it doesn't always work.

CI log

workflow

yarn start:contract works

Difference (- actual, + expected):

- 2
+ 0

› gettingStartedWorkflowTest (packages/agoric-cli/tools/getting-started.js:124:7)

1 test failed
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
Warning: Attempt 1 failed. Reason: Child_process exited with error code 1
yarn run v1.22.22
$ yarn run create-agoric-cli /home/runner/bin/agoric
$ node ./scripts/create-agoric-cli.cjs /home/runner/bin/agoric
Script directory /home/runner/bin does not appear in $PATH
(You may want to export PATH=$PATH:/home/runner/bin' to add it to your PATH environment variable)
ensuring /home/runner/bin exists
creating /home/runner/bin/agoric
Error: /home/runner/bin/agoric must not already exist; you should use a fresh path.
at Object.<anonymous> (/home/runner/work/agoric-sdk/agoric-sdk/scripts/create-agoric-cli.cjs:45:11)
at Module._compile (node:internal/modules/cjs/loader:1256:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1310:10)
at Module.load (node:internal/modules/cjs/loader:1119:32)
at Module._load (node:internal/modules/cjs/loader:960:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:86:12)
at node:internal/main/run_main_module:23:47
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
error Command failed with exit code 1.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
Warning: Attempt 2 failed. Reason: Child_process exited with error code 1
yarn run v1.22.22
$ yarn run create-agoric-cli /home/runner/bin/agoric
$ node ./scripts/create-agoric-cli.cjs /home/runner/bin/agoric
Script directory /home/runner/bin does not appear in $PATH
(You may want to export PATH=$PATH:/home/runner/bin' to add it to your PATH environment variable)
ensuring /home/runner/bin exists
creating /home/runner/bin/agoric
Error: /home/runner/bin/agoric must not already exist; you should use a fresh path.
at Object.<anonymous> (/home/runner/work/agoric-sdk/agoric-sdk/scripts/create-agoric-cli.cjs:45:11)
at Module._compile (node:internal/modules/cjs/loader:1256:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1310:10)
at Module.load (node:internal/modules/cjs/loader:1119:32)
at Module._load (node:internal/modules/cjs/loader:960:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:86:12)
at node:internal/main/run_main_module:23:47
error Command failed with exit code 1.

turadg reopened this Jul 11, 2024
@turadg
Member

turadg commented Jul 17, 2024

@frazarshad to get this working I suggest making a PR in which each of the getting-started tests is run twice. The errors we're seeing seem to be about the jobs not being able to be repeated.

Once we solve that, the "retry upon failure" mechanism should solve the flakiness.
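
As a rough illustration of the point above (a retry only helps if each attempt can start from a clean state), here is a minimal sketch. `retryWithCleanup` and `createFresh` are hypothetical names, not code from agoric-sdk or from the retry action used in CI; `createFresh` merely imitates a setup step that refuses to overwrite an existing path, like the error in the log above.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: run `attempt` up to `maxAttempts` times, calling
// `cleanup` before every attempt so leftovers from a failed run cannot
// break the next one.
function retryWithCleanup(attempt, cleanup, maxAttempts = 3) {
  let lastError;
  for (let i = 1; i <= maxAttempts; i += 1) {
    cleanup();
    try {
      return attempt(i);
    } catch (err) {
      lastError = err;
      console.warn(`Attempt ${i} failed. Reason: ${err.message}`);
    }
  }
  throw lastError;
}

// Stand-in (assumption) for a setup step that insists on a fresh path.
function createFresh(target) {
  if (fs.existsSync(target)) {
    throw new Error(`${target} must not already exist; you should use a fresh path.`);
  }
  fs.writeFileSync(target, '#!/bin/sh\n');
}

const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'retry-demo-'));
const target = path.join(workDir, 'agoric');

// The first attempt creates the file and then fails, simulating a flake.
// Without the cleanup callback, the second attempt would hit the
// "must not already exist" error instead of succeeding.
const attemptsUsed = retryWithCleanup(
  n => {
    createFresh(target);
    if (n === 1) throw new Error('simulated flaky failure');
    return n;
  },
  () => fs.rmSync(target, { force: true }),
);
console.log(`succeeded on attempt ${attemptsUsed}`);
```

Replacing the cleanup callback with a no-op reproduces the kind of failure seen in the log: every attempt after the first trips over the leftover file, so the retry never gets a chance to mask the original flake.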

@frazarshad
Contributor

@turadg I worked on a solution for this, but apparently a similar fix was made recently.

@turadg
Member

turadg commented Jul 22, 2024

@michaelfig's fix from 2 days ago (after your PR) does fix the leftover file problem. I'd approve changing it to #9740 because I think it's clearer and more maintainable.

I'm not certain that's the only problem with retries. I'd still like to see a run inducing (unnecessary) repetition to confirm. But I'm okay with closing this and reopening it if the flake is encountered again.

@michaelfig
Member

> I'd approve changing it to #9740

So would I. My fix was expedient, but I'd be happy to see it structured better (and more idiomatically) as #9740 does.

@frazarshad
Contributor

@turadg I made #9740 ready for review.

turadg reopened this Jul 23, 2024
mergify bot closed this as completed in #9799 Aug 5, 2024
mergify bot added a commit that referenced this issue Aug 5, 2024
closes: #9325, closes: #9710

## Description
This is a minor fix to make sure the `registry/yarn` getting-started CI test doesn't fail. It accepts that `agoric install $DISTTAG` is not currently usable within the default [Agoric/dapp-offer-up](https://github.com/Agoric/dapp-offer-up) because of the long list of `packageJson.resolutions` that `dapp-offer-up` uses.

Thus, `getting-started.js` has `AGORIC_INSTALL_DISTTAG` set to `false`, which ensures that `agoric install $DISTTAG` does not execute. A plain `yarn install` runs in its place.
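
The switch described above can be pictured roughly as follows. This is only a sketch, not the code in `getting-started.js`: `installCommand` is a hypothetical helper, and both the string comparison against `'false'` and the behavior when the variable is unset are assumptions made for illustration.

```javascript
// Hypothetical helper: choose the install command for the test dapp.
// Assumption: only the literal string 'false' disables the dist-tag install.
function installCommand(env, distTag) {
  const useDistTag = env.AGORIC_INSTALL_DISTTAG !== 'false';
  return useDistTag
    ? ['agoric', 'install', distTag] // install SDK packages at the given dist-tag
    : ['yarn', 'install']; // plain install, honoring the dapp's own resolutions
}

// With the flag set to 'false', the plain install is selected.
console.log(installCommand({ AGORIC_INSTALL_DISTTAG: 'false' }, 'dev').join(' '));
```

Returning the command as an array keeps it ready to hand to a process spawner without shell quoting concerns.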
mujahidkay reopened this Oct 18, 2024
@mujahidkay
Member
