Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add retry for docker pull to deal with occasional errors from ghcr #111

Merged
merged 6 commits into from
Oct 15, 2021

Conversation

peterbroadhurst
Copy link
Contributor

Split docker-compose pull into a separate step with retries, as a simple way to address hyperledger/firefly#244

Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
@peterbroadhurst peterbroadhurst changed the title Add retry for docker pull to deal with ocassional errors from ghcr Add retry for docker pull to deal with occasional errors from ghcr Oct 12, 2021
@nguyer
Copy link
Contributor

nguyer commented Oct 12, 2021

Closing and re-opening to trigger the DCO check again

@nguyer nguyer closed this Oct 12, 2021
@nguyer nguyer reopened this Oct 12, 2021
@peterbroadhurst peterbroadhurst marked this pull request as draft October 13, 2021 01:31
@peterbroadhurst peterbroadhurst marked this pull request as ready for review October 13, 2021 01:31
@peterbroadhurst
Copy link
Contributor Author

Tried draft/un-draft too. Think we might need a comment based confirmation of DCO in this case.

if err := docker.RunDockerComposeCommandWithRetry(workingDir, verbose, verbose, "pull"); err != nil {
return err
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already do a pull (if --no-pull is not set) here:

if !options.NoPull {
s.Log.Info("pulling latest versions")
if err := docker.RunDockerComposeCommand(workingDir, verbose, verbose, "pull"); err != nil {
return err
}
}

Should we just change that existing call to RunDockerComposeCommandWithRetry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, this is a great point… but what it points at (which I totally missed) is the problem is the e2e test is doing a no-pull 🤦‍♂️

So 👍 to moving the retry on the existing pull, but it means this won’t actually fix the issue.

We might need to punt this to after your manifest work, and have a new behavior for the e2e test where it no long specifies no-pull, but instead uses a different (local) image name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the change on this PR, and am up for working on the follow-on change to generate a special manifest in the e2e test code once the manifest work is ready

Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
@peterbroadhurst
Copy link
Contributor Author

My earlier assertion that the repo part of the docker image would be enough to stop docker-compose pulling it, seems to be proven false:
docker/compose#3660

This leaves us in a bit of a quandary (as many others on that thread are).
We can:

  • Use --ignore-pull-failures and still do a pull to hopefully work around occasional ghcr failures
  • Move the retry to the whole docker-compose start

The latter worries me, because it could cause us to miss actual startup error conditions like hyperledger/firefly#254

Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
@peterbroadhurst
Copy link
Contributor Author

Ok - think this latest approach does what we need

  • Added a ff pull <STACK> --retires [X] command for e2e to use
  • Added a local: true option to the manifest, to allow a particular image to be skipped pulling
  • Automatically add local: true for the case the CLI auto-adds FireFly core today

Then a small tweak to the E2E is to do a pull -retires 3 before a start

Signed-off-by: Peter Broadhurst <peter.broadhurst@kaleido.io>
cmd/pull.go Outdated Show resolved Hide resolved
Copy link
Contributor

@nguyer nguyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for this!

Copy link
Contributor

@nguyer nguyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this version of the CLI with the corresponding core PR hyperledger/firefly#251 and all E2E tests passed locally! 😃

@nguyer nguyer merged commit 258c705 into hyperledger:main Oct 15, 2021
@nguyer nguyer deleted the fix-244 branch October 15, 2021 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants