-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
macOS+rust e2e tests in CI are pinned to an over two year old version of Docker #145
Comments
IIRC it is pinged to older version because newer Docker Desktop doesn't install on CI machines, if we can address that it would be great. |
@lizan thx. to clarify the script I mention above does install, build images, run etc. Just we have some tests that are failing (in a bad way, just hanging). Hopefully the general cleanup work on tests can get us to a point where we can understand why and put the unpinned script back in (or some variant) |
I made a branch try-again off master, now that #130 is merged, then clicked in the actions pane to run the build against it (without a PR) The good news is that tests are running (on the unpinned OS/X docker). The bad news is it is less a run and more a crawl: the rust compilation is super super slow While we have work to do anyway on that topic (see #152), @mathetake had a hunch that writes to volumes could be slower on OS/x and there could be other topics to investigate also. Meanwhile, I'll leave the branch there in case someone wants to experiment. Hope this helps. |
the I'll bump the timeout on that branch to see the baseline of an unpinned docker+macos. |
I found that a plugin I used in the past supports OS/x with the same approach as here. Since we are not in the business of hosting shell scripts, I changed my branch to use this instead https://github.com/marketplace/actions/setup-docker I've hesitated on running against it until #156 as it takes a long time to run both extension languages sequentially. Running these isolated will allow us to better understand the extent of things such as cargo flags. |
the setup-docker action works, but there's a serious performance problem building rust on osx+docker at least in GH actions. I suggest we do anything we can to address perf issues on rust based "getenvoy extension build" and revisit when that occurs. Specifically, when not caching tinygo is 20m and rust fails after 2hrs |
#159 isolates the two year old docker install to only the rust+macOS cell. This should help future PRs test improvements as all they have to do is remove that special casing. |
I revised the description here because the facts have become more clear over the last week of investigation. |
macOS is a primary development environment and old docker is near impossible to install in real life. No one would. I found some developers have a linux workstation to avoid "macos problems" but that isn't something we can expect users to do. I spent a lot of time with a large amount of review help by @mathetake on getting CI to a place where rust might be able to work in CI on macOS like tinygo does. This included upgrading our docker image, rewrote the scripts to cache outside the volume, reviewed every dependency removed one and rewrote out another... many iterations on this topic all week and every weekend also. This displaced time from useful work, not just myself but other people's time reviewing it. I have no idea why this language is so vitally important and also ok to leave so slow it is broken on osx. This is one, but not the only reason I opened #189 However, it is fair to say this topic consumed more time than any sane person would spend and we are lucky I am insane or would have given up months ago. Here's the latest run which uses only the dependencies absolutely required: https://github.com/tetratelabs/getenvoy/runs/2397652908?check_suite_focus=true |
This removes the custom script for rust at expense of longer runtimes, last checked at 1hr for the e2e. Fixes #145 Signed-off-by: Adrian Cole <adrian@tetrate.io>
saw this pass once so raised a PR to close this topic, though it will be painful to wait an hour #190 |
This removes the custom script for rust at expense of longer runtimes, last checked at 1hr for the e2e. Fixes #145 Signed-off-by: Adrian Cole <adrian@tetrate.io>
macOS+rust e2e tests in CI are pinned to an over two year old version of Docker, handled in
./ci/e2e/macos/install_docker.sh
. The rest of the matrix uses latest Docker per #159.The impact is that we are blind to any issues that present only on recent versions of Docker + OS/x, such as #147. To mitigate this, people should manually run
make e2e
on a macbook on occasion.The underlying problem is not that tests break, it is that they take too long to complete. The rust performance problem is being tracked in #152 and this issue should be revisited on any significant progress. However, this isn't a sure thing. For example, special casing in #158 resulted in half the time spent in macOS+rust e2e (on a modern MacBook Pro). However, GH actions run of those same tests failed on timeout after an hour.
There means there could be some possibilities that only change CI. For example, the descending performance could be related to memory or other environment details that differ between CI and a macbook pro. In other words, even if work on the root problem is slow progressing, there could be some changes to make things "fast enough" (40m?) to use normal Docker again.
The text was updated successfully, but these errors were encountered: