RFC: Move i686 CI testing from Travis to CircleCI #18007
Nice! I think it will be worth running both in parallel for a little while, only shutting off the job in the travis matrix when we're happy that circle is working and will handle our load level on the free tier (which is only one concurrent worker, right?). Is there a timeout on the initial cache population run time that you've seen? I'd almost rather try to use a non ubuntu distro if we can, I thought Circle lets you run in an arbitrary docker image of your choice?
ARCH: "i686"
BUILDOPTS: "-j3 VERBOSE=1 FORCE_ASSERTIONS=1 LLVM_ASSERTIONS=1"
TESTSTORUN: "all"
JULIA_CPU_CORES: 2
do we know how many cores the circle ci vm's have available?
They have 2 vCPUs. But you can run up to 4 images in parallel in a single build and can manually split your tests across them. Not sure how helpful that will be for us given the current structure of the tests.
Actually, we might be able to set up the parallelism by using different TESTSTORUN for the different containers. So for example, one could be running the linear algebra tests while another does libgit2, etc. I'm not really familiar with how the choosetests thing works but that seems like it could be doable, right?
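A rough sketch of that idea, assuming CircleCI's CIRCLE_NODE_INDEX variable (the index of the container within a parallel build). The helper name tests_for_node and the group names are hypothetical and only for illustration; the groups that choosetests actually accepts may differ:

```shell
#!/bin/bash
# Hypothetical helper: map a container index to the test groups it should run.
# The group names below are illustrative, not the real split.
tests_for_node() {
    case "$1" in
        0) echo "linalg" ;;
        1) echo "libgit2" ;;
        *) echo "unknown container index: $1" >&2; return 1 ;;
    esac
}

# CIRCLE_NODE_INDEX is set by CircleCI in parallel builds; default to 0 locally.
TESTSTORUN="$(tests_for_node "${CIRCLE_NODE_INDEX:-0}")"
echo "Container ${CIRCLE_NODE_INDEX:-0} would run: ${TESTSTORUN}"
```

Each container would then pass its own TESTSTORUN value to the test runner instead of "all".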
the free plan only gives you one build worker to start right? it just has 2 cores inside that one job
So I'm a little confused as to what you actually get in the free plan. I was able to select up to 4x parallelism, and the way Circle does parallelism is by running completely separate containers, each with 2 vCPUs, in parallel, and keeping each step of the build and test in sync between containers. Now, in the settings they also make it sound like you trade parallel containers for concurrent jobs. I'm not clear on the specifics of that. I should throw a bunch of PRs from different branches at it (to avoid auto-canceling) and see what happens.
I did some science and if my analysis is correct, you get a total of 4 containers at any given time on a free account. So with 4x parallelism, there are no containers left for other jobs. With 2x, there are 2 available containers left, so you get 1 other job with 2x parallelism. With 1x, there are 4 containers available for 4 jobs. 1x seems to take too long for it to be a viable replacement for Travis.
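The arithmetic behind that observation can be sketched as follows, assuming the total of 4 containers reported above (an empirical finding from this thread, not a documented limit). The function name concurrent_jobs is hypothetical:

```shell
#!/bin/bash
# Assumption from the comment above: a free account has 4 containers in total.
TOTAL_CONTAINERS=4

# Number of jobs that can run at once when each job uses $1 containers.
concurrent_jobs() {
    echo $(( TOTAL_CONTAINERS / $1 ))
}

for parallelism in 1 2 4; do
    echo "${parallelism}x parallelism: $(concurrent_jobs "$parallelism") concurrent job(s)"
done
```

Under that assumption, 2x parallelism leaves room for two jobs at a time, while 4x uses every container for a single job.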
I actually couldn't find any information in the Circle docs as to the number of concurrent workers on the free plan but I think it's just one.
You mean a timeout for actually building the stuff in the cached directories? It doesn't seem like it; that part finished successfully in over an hour and a half.
Yes, theoretically you can set it up to use Docker. Buuuuut I don't know how to use Docker at all, so I stuck with their regular Ubuntu images, which are i686. Why would you rather avoid Ubuntu? For variety since Travis Linux is also Ubuntu?
Yeah, mostly for test coverage's sake, since it's a little too easy to do things that end up only working on debian-shaped distributions if you don't test otherwise. If this already works it's a good step. We have some CentOS buildbots that serve this purpose too, it's just slightly less visible than pre-merge CI.
I'm having some issues getting the YAML configured properly (CircleCI is really annoying with how they do directory changing) so until I get that sorted out I'll close this to avoid spamming Travis and AppVeyor.
Appveyor has an |
Ah crap, I just pushed a commit. :/ Edit: I still have the "Reopen pull request" button though, so maybe that's okay.
Nope, checked before closing this. Travis has |
Seems fine to me ;)
In any case, maybe just do the experiments on a different branch and leave this open just in case GitHub gets confused?
Good idea 👍
maybe it was force pushing after a rebase that causes the github problem
At any rate, I seem to have gotten it working now (see https://circleci.com/gh/ararslan/julia/27), though the tests run pretty slowly. Not sure how the speed compares to i686 on Travis.
I was able to get it down to about 40 minutes using 3x parallelism the other day. It's still failing a libgit2 test due to SSH weirdness (I'm hoping #18066 will help with that), but it's otherwise working well. For the interested: https://circleci.com/gh/ararslan/julia/37.
Okay, I think I have it just about as good as it's going to get. Some notes:
@tkelman Think we're ready to give this a go?
@@ -0,0 +1,15 @@
#!/bin/bash
# Balance the testing load between 2 CircleCI parallel containers
add set -e just in case things go horribly wrong?
The deed is done
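For reference, a minimal illustration of what set -e changes. This is a standalone sketch, not the script from this PR: it runs the same two commands in a child shell with and without set -e and records whether the second one was reached.

```shell
#!/bin/bash
# Run a failing command followed by an echo, first without and then with `set -e`.
# Without it the script carries on after the failure; with it the shell exits early.
without=$(bash -c 'false; echo reached')
with=$(bash -c 'set -e; false; echo reached' || true)

echo "without set -e: '${without}'"
echo "with set -e:    '${with}'"
```

In a CI script this means a failed step stops the build instead of letting later steps run against a broken state.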
@tkelman Looks like the Circle webhook is using my fork
That may be because I didn't have "Permissive building of fork pull requests" turned on when you pushed that last commit, and since you have circle enabled for your fork. Not sure, but I'll turn that setting on now.
I'll turn Circle off for my fork. Maybe that'll confuse it less.
test:
  override:
    - /tmp/julia/bin/julia --precompile=no -e 'true'
    - /tmp/julia/bin/julia-debug --precompile=no -e 'true'
--precompiled
Derp. Thanks.
should that have errored?
Once it got to that point it would have. I think you caught it before Circle got to the tests.
committed a day ago
It didn't run on your fork then?
Oh. Weird. No, I guess it worked fine. Does Julia silently ignore invalid arguments? It does not
It works for me locally as well.
hm, I thought we would check for that
Dangit!! (╯°□°)╯︵ ┻━┻ Container 1:
Containers 2-4:
The fact that it's different between containers is more than a little strange. I remember seeing the "no llvm-size" error on a VM once but I can't recall what I did to fix that. Any ideas, @tkelman? If not, would you mind just restarting the build and we'll see if it was a fluke?
nevermind, something failed to build in deps - not sure what
It was doing that before and was working when it was on my account though. Did something change in a makefile?
Ah you might be hitting the issue that I fixed on master with curl not being able to find libssh. Does circle not build the merge commit for PRs?
No clue. Speaking of Circle, the webhook seems to be MIA...
Ah.
So I guess disabling "Only build pull requests" actually means don't build pull requests at all if you have a branch whitelist?
Circle is still upset about not being able to find llvm-size. I remember I had that problem once when I was building Julia on ElementaryOS but I can't for the life of me remember what I did to fix it... I thought it was Edit: Oh, now that we've cleaned up what gets shown in the log, I've found this:
That's... hmm. I think I was getting that too at one point on eOS.
That may have been an issue with #18164 that #18194 fixed? Not positive. There was a complaint about the red status so I've disabled the webhook for now, so will have to go back to trying on your fork? I couldn't find anywhere in their web UI to manually clear the cache, which may be making results from your fork not 100% representative.
Yeah, Circle being angry on PRs that don't have Circle tests is understandably pretty annoying. It's weird though, I wasn't getting that error on my fork. 😕 I'll rebase and try on the fork again and see what happens.
This will hopefully lessen the Travis CI queue by moving the i686 Linux tests from Travis to CircleCI.
CircleCI will need to be manually enabled and configured by a JuliaLang owner, but much of the build configuration is in the circle.yml file added in this PR. I've been testing this on my fork of Julia and it seems to work okay, though the first successful run for me took about 2.5 hours because it had to build dependencies. CircleCI enables cached directories just as Travis does, so subsequent builds are shorter.
I should note that I'm not sure how to configure fast fails for queued commits on the same PR, though that should be possible. Figured it out, it's just a project setting in CircleCI.
CircleCI gives you the option to use Ubuntu 12.04 or 14.04. I opted for the latter in my fork after reading their documentation about the difference. I can't guarantee that the YAML I set up here will work with Circle's 12.04 but I don't know why it wouldn't.
cc @tkelman