GitHub actions and workflows #12085
What is the purpose of moving from CircleCI to GitHub Actions? |
Meta open source projects are consolidating on GitHub Actions over the next year |
@bigfootjon Thanks for replying, I was not aware of that.
This is a Continuous Benchmark system we configured for RocksDB that uses a Custom CircleCI runner on dedicated hardware, are you guys aware of that? I am not sure how easy it will be to port this to GitHub Actions yet...
This is a Linux Docker container, so it doesn't need anything particularly special from the host, we have run this just fine under Ubuntu, CentOS, and macOS. |
@robandpdx Why don't I see any results for the proposed new configuration? What are we supposed to do to debug if we can't see any results? |
I would recommend either creating a fork or creating a new repo in this org and adding it as an origin to your clone. Then merge this branch to |
OK I dug deeper and found that the pull request trigger on our CircleCI jobs was not migrated. The draft PR here only had a trigger on pushes to main. I've fixed that, but now I'm getting weird failures on most jobs with no diagnostics @robandpdx : |
These jobs require runner groups as described above...
|
Apparently the diagnostics are on the "summary" page, where you have to scroll down with your pointer NOT over the main content (where scrolling does nothing so seems to indicate there is nothing to scroll to). Clicking individual failures does not take you to the diagnostics. And "summary" seems like the worst place to put failure details when you have produced failure pages for each job. |
@adamretter @robandpdx It looks like evolvedbinary's docker image doesn't work with any reasonable version of the checkout action, as seen here: https://github.com/facebook/rocksdb/actions/runs/7227817623/job/19697909322?pr=12085 Whether it's using node16 or node20, both appear to be bound to GLIBC versions (>= 2.14) that are unavailable in the image, which indicates it is from CentOS 6 (glibc 2.12). I don't see any GitHub Actions documentation about these kinds of limitations on docker images. Am I missing it somewhere? And I can't even ssh in to debug, because (a) that would only get me into the docker environment, and (b) the ssh action also fails: https://github.com/facebook/rocksdb/actions/runs/7228263740/job/19697647417 What's our next step? |
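The mismatch described in the comment above (a runtime that appears to need GLIBC >= 2.14 inside an image that ships glibc 2.12) can be checked with a small script run inside the container. This is only a minimal sketch: the helper names `parse_version` and `glibc_satisfies` are made up for illustration, and the 2.14 threshold is taken from the comment rather than from any GitHub Actions documentation.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.12' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(available, required):
    """Return True when the available glibc is at least the required one."""
    return parse_version(available) >= parse_version(required)


if __name__ == "__main__":
    # platform.libc_ver() returns ('', '') when the C library cannot be
    # identified, for example on a non-glibc system.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        # "2.14" is the threshold quoted in the thread (an assumption here).
        ok = glibc_satisfies(version, "2.14")
        print(f"glibc {version}: {'new enough' if ok else 'too old'} for 2.14")
    else:
        print("could not determine the glibc version")
```

Comparing integer tuples rather than strings matters here, because a plain string comparison would rank "2.9" above "2.12".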
@pdillinger You could run the container locally to debug. Another option is to get a reverse shell into the container running on the actions runner using the method I have published here. |
Hmm, this does not make sense to me. Obviously running the commands we want to run in the container works as expected, based on the CircleCI results. The problem is GHA trying to run what it wants to run to get things set up in the container. How do I debug that locally? I haven't found any GHA documentation that seems relevant. Best I can tell, the only way to test GHA workflows is guess-and-test (on the server side).
I don't think the debug output from the failed jobs provides sufficient context to know how to reproduce the error seen. For example, the last command seen before the failure is |
@pdillinger I'll see if I can get some help internally to figure out a way forward here. I agree, troubleshooting github actions and workflows is less than ideal. |
There is nothing special about our Docker Image that I am aware of. Yes the Image very intentionally packages an older version of CentOS so that we can build a version of RocksDB that has a wide glibc compatibility. We do not expect anything to be executed inside the container apart from the script When running the Docker container locally, it expects the source code to be mounted from the Host via a volume bind. That is not possible in CircleCI or GitHub Actions. Which the For GitHub Actions, I think it should be fine to remove the |
@adamretter I think I was able to get past the checkout issue by installing |
@robandpdx I pushed new Docker images under the same tag that have |
@robandpdx Looks like still failing with a re-run https://github.com/facebook/rocksdb/actions/runs/7264657871/job/19798246275?pr=12085 |
Summary: Largely based on facebook#12085 but grouped into one large workflow because of bad GHA UI design (see comments). Test Plan: TODO
@adamretter It looks like you broke the CircleCI job 🤷♂️ https://app.circleci.com/pipelines/github/facebook/rocksdb/35846/workflows/26a89122-6299-4788-b79e-630098bc15d3/jobs/718616 |
@pdillinger So I just checked, previously as there was no
The Docker images now have
It looks to me that
So I think we just need to send a PR to fix the current CircleCI config to be compatible with the newer Docker Images in a similar manner to what @robandpdx has done for GitHub Actions. Would you like me to prepare such a PR @pdillinger ? |
The call to git is not in our CircleCI config. I'm pretty sure it's built into CircleCI's checkout step, so we'd have to roll our own to get it to work. If you can get it to work, go for it! |
git 1.7.1 is super old and I'm seeing other issues when I try to fetch the remote ref. Any way to get a newer version of git on the docker image? |
@robandpdx That is the approach I have also been taking to fix the CircleCI builds. I have been working on it yesterday, last night, and today, I think I am almost there. It involves compiling quite a few things from source code as part of a Multi-stage Docker build. I hope to have more news shortly... |
@robandpdx @pdillinger Okay I was able to publish a new Docker Image for 'CentOS 6 RocksDB Build Environment x64' that now includes the latest version of Git (2.43.0) and its dependencies: curl, and nghttp2; all built from source code. I just re-ran the CircleCI job and it is passing again - https://app.circleci.com/pipelines/github/facebook/rocksdb/35889/workflows/05ee91a4-a4ff-46f8-9331-749319c99307/jobs/719117 So hopefully we now have something that is compatible with both CircleCI and GitHub Actions? |
Summary:
* Largely based on #12085 but grouped into one large workflow because of bad GHA UI design (see comments).
* Windows job details consolidated into an action file so that those jobs can easily move between per-pr-push and nightly.
* Simplify some handling of "CIRCLECI" environment and add "GITHUB_ACTIONS" in the same places
* For jobs that we want to go in pr-jobs or nightly there are disabled "candidate" workflows with draft versions of those jobs.
* ARM jobs are disabled waiting on full GHA support.
* build-linux-java-static needed some special attention to work, due to GLIBC compatibility issues (see comments).
Pull Request resolved: #12163
Test Plan: Nightly jobs can be seen passing between these two links:
https://github.com/facebook/rocksdb/actions/runs/7266835435/job/19799390061?pr=12163
https://github.com/facebook/rocksdb/actions/runs/7269697823/job/19807724471?pr=12163
And per-PR jobs of course passing on this PR.
Reviewed By: hx235
Differential Revision: D52335810
Pulled By: pdillinger
fbshipit-source-id: bbb95196f33eabad8cddf3c6b52f4413c80e034d
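The "CIRCLECI" / "GITHUB_ACTIONS" handling mentioned in the commit summary above refers to environment variables that the two CI systems set to "true" in their jobs. How RocksDB's scripts actually consume them is not shown in this thread, so the following is just a hypothetical sketch of that kind of check; the function name `detect_ci` and its return values are invented for illustration.

```python
import os


def detect_ci(env=None):
    """Guess which CI system is running, based on well-known env variables.

    Returns "github-actions", "circleci", or None when neither is detected.
    """
    env = os.environ if env is None else env
    # GitHub Actions sets GITHUB_ACTIONS=true in every workflow job.
    if env.get("GITHUB_ACTIONS") == "true":
        return "github-actions"
    # CircleCI sets CIRCLECI=true in every job.
    if env.get("CIRCLECI") == "true":
        return "circleci"
    return None


if __name__ == "__main__":
    print(f"detected CI system: {detect_ci()}")
```

Passing the environment in as a mapping keeps the check easy to exercise locally without exporting variables.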
Thanks @robandpdx and @adamretter . This is now obsolete with #12163 |
This pull request converts the CircleCI workflows to GitHub Actions workflows. GitHub Actions Importer was used to convert the workflows initially, then I edited them manually to correct errors in translation. Many of the CircleCI commands ended up as actions; some were no longer needed. For example, install-cmake-on-macos is no longer needed because cmake is preinstalled on macOS GitHub runners.
The GitHub actions workflows need runner groups with larger runners configured for the organization.
The following runner groups are needed:
For my testing, I have these runner groups populated with the following runners:
Issues
There are issues with some of the workflows that someone smarter than me needs to address.
This job fails because it cannot find the report.tsv file. It looks like the LOGs cannot be found also... Fixing may require edits to tools/benchmark_ci.py and/or tools/benchmark.sh.
This seems to be some issue with the docker container evolvedbinary/rocksjava:centos6_x64-be. Maybe the runner needs to be centos also, rather than ubuntu? I actually have no idea as this is super far outside of my wheelhouse.
facebook/rocksdb/jobs-linux-arm
Currently, GitHub hosted runners do not come in the ARM flavor. Perhaps they will in the future. For now, if we want these jobs to work, you'll need to create some ARM based self-hosted runners.
facebook/rocksdb/jobs-linux-other-checks -> build-linux-mini-crashtest
The job fails with the following message:
Running out of space seems crazy on a machine that has 300GB of disk. I did see a warning in the Makefile that "Parallel can fill your /dev/shm" so maybe that's what's happening. Again, way outside of my expertise here.
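To tell whether it is the main disk or /dev/shm that fills up, a job step could print the free space of both before and after the tests. A minimal sketch, assuming Python is available on the runner; the helper names `free_gib` and `report` and the choice of paths are illustrative and not part of the workflows in this PR.

```python
import shutil


def free_gib(path):
    """Return the free space of the filesystem holding `path`, in GiB."""
    return shutil.disk_usage(path).free / 2**30


def report(paths):
    """Build one human-readable line per path, tolerating missing paths."""
    lines = []
    for path in paths:
        try:
            lines.append(f"{path}: {free_gib(path):.1f} GiB free")
        except OSError as exc:
            # Raised, for example, when the path does not exist on this host.
            lines.append(f"{path}: unavailable ({exc.strerror})")
    return lines


if __name__ == "__main__":
    # /dev/shm is the location named in the Makefile warning quoted above.
    for line in report(["/", "/dev/shm", "/tmp"]):
        print(line)
```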
This job seems to fail with a failed unit test due to a disk full:
Yeah, filling up 1200 GB seems loco. See my comment above about the warning in the Makefile.
facebook/rocksdb/jobs-linux-run-tests -> build-linux-gcc-7-with-folly
Another disk full thing.
facebook/rocksdb/jobs-linux-run-tests -> build-linux-encrypted_env-no_compression
Another disk full thing.
facebook/rocksdb/jobs-linux-run-test-san
The 4 jobs in this workflow all seem to fail with the disk full issue.
facebook/rocksdb/jobs-macos -> build-macos-cmake even tests
This job seems to fail due to a test failure.
facebook/rocksdb/nightly -> build-format-compatible
I have no idea what is causing this failure.
facebook/rocksdb/nightly -> build-linux-arm-test-full
Need an ARM runner.
facebook/rocksdb/nightly -> build-linux-microbench
Segmentation fault.
facebook/rocksdb/nightly -> build-linux-clang-13-asan-ubsan-with-folly
Disk full.
facebook/rocksdb/nightly -> build-linux-valgrind.
Disk full.
Other than all that, everything is working great!
https://fburl.com/workplace/f6mz6tmw