Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(cd): improve Docker and gcloud usage without Cloud Build #3431

Merged
merged 60 commits into from
Feb 9, 2022

Conversation

gustavovalverde
Copy link
Member

@gustavovalverde gustavovalverde commented Jan 28, 2022

Motivation

We had a Dockerfile that was good enough but could be improved to be used on multiple scenarios, and caching leverage could also be improved (it's changed on this PR, but caching is not used

We've also experienced multiple issues with Cloud Build, from increasing timeouts at build time, not inheriting branch names, to not having a way to check logs directly from GitHub. This motivated using GHA as the default pipeline for all our requirements.

Pipelines could also be simpler, by leveraging the use of variables, secrets and removing repeated or not needed steps, or even merging manual and automated deployments on a single file.

Solution

Docker

  • Use a configurable Dockerfile through ARG and ENV
  • Use a specific rust version to reduce unexpected behaviors from upgrades (and ensure support up to a specific version)
  • Use bullseye instead of buster for Debian to reduce vulnerabilities and use more up to date libraries
  • Implement cargo-chef for docker layers caching
  • Move zebrad executable to /usr/local/bin to be able to use it without pointing to a custom path
  • Compile tests without running them, to be able to run them after deployment

Pipeline

  • Remove cloud build dependency
  • Move Sentry endpoint to secrets
  • Use a single yml for auto & manual deploy

GCP

Fixes: #2418 #3447
Partially: #3432

Review

I've already tested single deployments (manual). Automated deployment was not particularly changed in the way it gets deployed. It's just using the same Dockerfile as the manual deployment (it already was a dockerized deployment).

@dconnolly @conradoplg @teor2345

Reviewer Checklist

  • Review Dockerfile commands and order of cargo execution
  • Confirm merged actions design
  • Confirm Cloud Build removal as a sane option

Follow Up Work

- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
- Use a more ENV configurable Dockerfile
- Remove cloudbuild dependency
- Use compute optimized machine types
- Use SSD instead of normal hard drives
- Move Sentry endpoint to secrets
- Use a single yml for auto & manual deploy
- Migrate to Google Artifact Registry
@gustavovalverde gustavovalverde force-pushed the docker-actions-refactor branch from 5f812db to 3824e7d Compare January 28, 2022 03:01
@codecov
Copy link

codecov bot commented Jan 28, 2022

Codecov Report

Merging #3431 (ef82699) into main (499ae89) will increase coverage by 0.05%.
The diff coverage is 75.08%.

@@            Coverage Diff             @@
##             main    #3431      +/-   ##
==========================================
+ Coverage   78.34%   78.39%   +0.05%     
==========================================
  Files         267      273       +6     
  Lines       31526    31855     +329     
==========================================
+ Hits        24698    24972     +274     
- Misses       6828     6883      +55     

Caching from the latest image is one of the main reasons to add this extra tag. Before this commit, the inline cache was not being used.
The inline cache exporter only supports `min` cache mode. To enable `max` cache mode, push the image and the cache separately by using the registry cache exporter.

This also allows for smaller release images.
We're leveraging the registry to cache the actions, instead of using the 10GB limits from Github Actions cache storage
Use interactive shells for manual and test deployments. This allow greater flexibility if troubleshooting is needed inside the machines
@gustavovalverde gustavovalverde marked this pull request as ready for review January 31, 2022 03:15
@teor2345

This comment has been minimized.

docker/Dockerfile.build Outdated Show resolved Hide resolved
.github/workflows/cd.yml Outdated Show resolved Hide resolved
.github/workflows/cd.yml Outdated Show resolved Hide resolved
.github/workflows/cd.yml Outdated Show resolved Hide resolved
.github/workflows/cd.yml Outdated Show resolved Hide resolved
docker/Dockerfile.build Outdated Show resolved Hide resolved
docker/Dockerfile.build Outdated Show resolved Hide resolved
Copy link
Contributor

@dconnolly dconnolly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great, some small questions to resolve before we merge!

@gustavovalverde
Copy link
Member Author

I think we're done.

I've implemented the parameter caching with a specific action to trigger a docker build for this specific use: https://github.com/ZcashFoundation/zebra/blob/docker-actions-refactor/.github/workflows/zcash-params.yml

And then use the resulting artifact in our day-to-day Dockerfile:
https://github.com/ZcashFoundation/zebra/blob/docker-actions-refactor/docker/Dockerfile.build#L55-L58

Copy link
Contributor

@teor2345 teor2345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, let's get it merged!

(I'd like to merge separately from the test PR, so we can bisect any failures and revert if needed.)

@teor2345
Copy link
Contributor

teor2345 commented Feb 8, 2022

I opened #3489 for the test failures, they seem unrelated.
(They have also happened in another PR recently.)

@gustavovalverde gustavovalverde merged commit a872f17 into main Feb 9, 2022
@gustavovalverde gustavovalverde deleted the docker-actions-refactor branch February 9, 2022 00:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Google Cloud Build failure on some branch names
4 participants