airbyte-ci: run on github hosted runners #34316

Merged

alafanechere
Contributor

@alafanechere alafanechere commented Jan 17, 2024

What

Closes #33914
We want the CI workflows that run airbyte-ci commands to run on GitHub-hosted runners instead of self-hosted runners.
Tech spec

How

  • Manually create new GitHub-hosted runners, one per use case.
  • Make the CI workflows that use airbyte-ci run on these new runners.
  • Rename run-dagger-pipeline action to run-airbyte-ci and make this action:
    • Optionally connect the runner's dockerd to a registry mirror if provided as an input.
    • Pull the dagger engine docker image according to the output of airbyte-ci --ci-requirements.
    • Cache the Dagger engine docker image to avoid pulling it on fresh runners.
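The optional registry-mirror step can be sketched as follows. This is a minimal illustration in Python, not the action's actual code (the action does this in shell); the function name `docker_daemon_config` and the example host are hypothetical. It builds the `daemon.json` content only when a mirror URL is provided, which is what makes the step optional.

```python
import json
from typing import Optional


def docker_daemon_config(mirror_url: Optional[str]) -> Optional[str]:
    """Return daemon.json content for a registry mirror, or None when no mirror is set.

    Hypothetical helper: the mirror is assumed to be passed as a bare host,
    so https:// is prepended here.
    """
    if not mirror_url:
        # No mirror provided: leave dockerd's configuration untouched.
        return None
    return json.dumps({"registry-mirrors": [f"https://{mirror_url}"]})


print(docker_daemon_config("mirror.example.com"))
print(docker_daemon_config(""))
```

The first call prints the JSON to write to `/etc/docker/daemon.json`; the second prints `None`, meaning the runner keeps its default Docker configuration.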

Recommended reading order

GitHub Actions changes

Refactor the run-dagger-pipeline action into smaller and reusable actions.

  1. .github/actions/run-airbyte-ci/action.yml
  2. .github/actions/install-airbyte-ci/action.yml
  3. .github/actions/get-dagger-engine-image/action.yml

GitHub workflows changes

All workflows using airbyte-ci in .github/*

  • Remove the get_ci_runner jobs. We now call airbyte-ci --ci-requirements to pull the correct dagger-engine docker image according to the dagger SDK version. This is done within `.github/actions/install-airbyte-ci/action.yml`.
  • Set runs-on to the new, manually provisioned GitHub-hosted runners
  • Use a new Dagger Cloud token secret

airbyte-ci changes:

  • Remove the unused TAILSCALE_AUTH_KEY and INFRA_SUPPORTED_DAGGER_VERSION constants
  • Expose the --ci-requirements flag at the airbyte-ci root command group level
  • Make --ci-requirements output the Dagger engine image to use.
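To illustrate the idea of deriving the engine image from the SDK version, here is a minimal sketch. Everything in it is an assumption made for illustration: the `CIRequirements` class, its JSON keys, the `registry.dagger.io/engine:v<version>` naming, and the example version `0.9.6`. The real output format of `--ci-requirements` is not shown in this PR description.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CIRequirements:
    """Hypothetical description of what a runner must provide before airbyte-ci runs."""

    dagger_version: str

    @property
    def dagger_engine_image(self) -> str:
        # Assumption: the engine image tag is the SDK version with a "v" prefix.
        return f"registry.dagger.io/engine:v{self.dagger_version}"

    def to_json(self) -> str:
        # Assumed output shape; a workflow step could parse this and pull the image.
        return json.dumps(
            {
                "dagger_version": self.dagger_version,
                "dagger_engine_image": self.dagger_engine_image,
            }
        )


requirements = CIRequirements(dagger_version="0.9.6")
print(requirements.to_json())
```

With a machine-readable output like this, a workflow no longer needs a separate job to decide which engine version the infrastructure supports; it can read the image name and pull or cache it directly.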

🚨 Performance benchmark 🚨

| Job | Self-hosted CPUs | GitHub-hosted CPUs | Self-hosted duration | GitHub-hosted duration | Speed difference |
|---|---|---|---|---|---|
| format | 8 | 2 | 3mn53 | 3mn38s | -15s |
| test source-faker + source-postgres | 16 | 16 | 9mn43 | 12mn32 | +2mn49 |
| publish source-pokeapi | 16 | 16 | 2mn58s | 4mn35 | +1mn37s |
| Nightly build | 32 | 32 | 4h31 | 4h54 | +23mn |

Note on speed differences:

  • Interestingly, on a smaller runner with only 2 CPUs (ubuntu-latest) the format job is slightly faster.
  • On similar instance types (16 CPUs), test and publish are slower. The main reason is that on GitHub-hosted runners a final step gracefully stops the Dagger engine, and the engine shutdown can take up to 5mn while it uploads the Dagger cache to Dagger Cloud. This longer run duration does not impact DX because commit statuses and Slack messages are reported before the engine shuts down, so the speed difference is not impeding DX.
  • I'm not alarmed by the speed difference for nightly builds: I ran the self-hosted and GitHub-hosted workflows sequentially. The nightly build consumes API rate limits, so the second run (GitHub-hosted) can easily be slower. Moreover, the GitHub-hosted runners did not have their Dagger cache seeded.
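As a quick check of the arithmetic in the benchmark table, the differences can be recomputed from the reported durations. This is only a sketch: the durations are copied from the table as (hours, minutes, seconds) tuples, and the helper names are hypothetical.

```python
# (job, self-hosted duration, GitHub-hosted duration), each as (hours, minutes, seconds)
BENCHMARK = [
    ("format", (0, 3, 53), (0, 3, 38)),
    ("test source-faker + source-postgres", (0, 9, 43), (0, 12, 32)),
    ("publish source-pokeapi", (0, 2, 58), (0, 4, 35)),
    ("Nightly build", (4, 31, 0), (4, 54, 0)),
]


def to_seconds(hms: tuple) -> int:
    hours, minutes, seconds = hms
    return hours * 3600 + minutes * 60 + seconds


def signed_delta(self_hosted: tuple, github_hosted: tuple) -> str:
    """Format (GitHub-hosted - self-hosted) as a signed minutes/seconds string."""
    delta = to_seconds(github_hosted) - to_seconds(self_hosted)
    sign = "-" if delta < 0 else "+"
    minutes, seconds = divmod(abs(delta), 60)
    return f"{sign}{minutes}mn{seconds:02d}s"


for job, self_hosted, github_hosted in BENCHMARK:
    print(f"{job}: {signed_delta(self_hosted, github_hosted)}")
# format: -0mn15s
# test source-faker + source-postgres: +2mn49s
# publish source-pokeapi: +1mn37s
# Nightly build: +23mn00s
```

The recomputed values match the "speed difference" column of the table.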

vercel bot commented Jan 17, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Jan 29, 2024 9:24am

@alafanechere alafanechere force-pushed the augustin/01-17-airbyte-ci_run_on_github_hosted_runners branch 14 times, most recently from 2ab4ff0 to f533ff8 Compare January 24, 2024 08:16
@alafanechere alafanechere force-pushed the augustin/01-17-airbyte-ci_run_on_github_hosted_runners branch 14 times, most recently from 7fb9c53 to 83bdcd5 Compare January 24, 2024 10:29
@alafanechere alafanechere force-pushed the augustin/01-17-airbyte-ci_run_on_github_hosted_runners branch from 199f740 to 4f45659 Compare January 25, 2024 11:26
Contributor

@perangel perangel left a comment

This all seems reasonable to me. 👍🏼

TAILSCALE_AUTH_KEY: ${{ inputs.tailscale_auth_key }}
DOCKER_REGISTRY_MIRROR_URL: ${{ inputs.docker_registry_mirror_url }}
PYTHON_REGISTRY_TOKEN: ${{ inputs.python_registry_token }}
# give the Dagger Engine more time to push cache data to Dagger Cloud
Contributor

Nice - and thanks for explaining 👍🏻

Comment on lines 110 to 112
sudo systemctl stop docker
echo '{"registry-mirrors": ["https://${{inputs.docker_registry_mirror_url}}"]}' | sudo tee /etc/docker/daemon.json
sudo systemctl start docker
Contributor

Not sure how long this takes/if it is an issue at all.

But looks like you can force the docker daemon to reload its configuration without restarting:

Restart the Docker daemon. On Linux, you can avoid a restart (and avoid any downtime for your containers) by reloading the Docker daemon. If you use systemd, then use the command systemctl reload docker. Otherwise, send a SIGHUP signal to the dockerd process.

(from here)

Contributor Author

Thanks for the suggestion, I did not notice any related slowness, but if I do I'll consider it!

@alafanechere alafanechere force-pushed the augustin/01-17-airbyte-ci_run_on_github_hosted_runners branch from 4f45659 to cc09474 Compare January 25, 2024 17:08
@alafanechere alafanechere force-pushed the augustin/01-17-airbyte-ci_run_on_github_hosted_runners branch 3 times, most recently from eeb7751 to 47039ff Compare January 26, 2024 09:44
@alafanechere alafanechere force-pushed the augustin/01-17-airbyte-ci_run_on_github_hosted_runners branch 5 times, most recently from 6d43576 to 72c1211 Compare January 29, 2024 09:03
@alafanechere alafanechere force-pushed the augustin/01-17-airbyte-ci_run_on_github_hosted_runners branch from 72c1211 to 1b1709a Compare January 29, 2024 09:24
@alafanechere alafanechere merged commit 57b43a4 into master Jan 29, 2024
21 checks passed
@alafanechere alafanechere deleted the augustin/01-17-airbyte-ci_run_on_github_hosted_runners branch January 29, 2024 09:42

sentry-io bot commented Jan 30, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 21, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
Development

Successfully merging this pull request may close these issues.

Decide on path forward for our CI infrastructure
3 participants