
Evaluate moving to Circle CI #2319

Closed
kailuowang opened this issue Jul 9, 2018 · 30 comments

@kailuowang
Contributor

Travis's memory issues are becoming a bit too much, and our build there now takes more than 3 hours.

@tpolecat
Member

tpolecat commented Jul 9, 2018

Might also look at BuildKite if we can get people to donate hardware. It's easy enough for me to set up, which means it's easy. An agent running on @larsrh's 3874629847653-core machine would be 🔥

@larsrh
Contributor

larsrh commented Jul 9, 2018 via email

@ghost

ghost commented Aug 13, 2018

@djspiewak
Member

@kailuowang I'm a bit out of the loop, I think. The build takes 3 hours? How long does it take locally? What is it doing? Thanks to working at SlamData, I have a remarkably vast swath of experience debugging slow Travis builds. I'd be happy to take a look if you want.

@ghost

ghost commented Aug 13, 2018

The 2-3 hours is the combined total across all of the build's jobs; each individual job typically takes 20-30 minutes - https://travis-ci.org/typelevel/cats/builds/415542808

And a lot of that time is coverage testing, tut testing, doc testing, site building and so on.

@djspiewak
Member

OK, taking a quick look at things, these are literally the first things that occur to me:

  • Oh god, you're using a separated build script… I hate that convention.
  • Why is sudo: required? I'm relatively certain those VMs are slower. Is it just for codecov? See below.
  • The travis-publish.sh script goes to great lengths to push things all into a single SBT instance. In my experience, this is exactly the opposite of what you want to do when you have a slow build. Separate SBT processes, sequentially invoked, give you better memory characteristics and are better understood by Travis (especially if you don't split the build script out of .travis.yml). (A sketch of this follows at the end of this comment.)
  • .jvmopts uses -Xmx6g. This is problematic because Travis doesn't have that much memory! You should strongly consider dropping that option altogether and allowing it to be the default (ditto with -Xms), which will be scaled off of the reported system memory.
  • We should have a discussion about whether or not code coverage is actually worth anything. Frankly, I've never seen it provide any value whatsoever, and it doubles the duration of the JVM build.
  • Why is the Ivy cache not being sanitized prior to publication? This is resulting in re-caching quite often.
  • Random best-practice: consider commenting on each of the secure variables so we know which one is which.

I didn't look at SBT itself. Looks like a lot of the logic is in tasks, so that may also contribute.
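To make the "separate sbt processes" idea concrete, here's a rough sketch (the module names below are illustrative, not necessarily cats' real ones): expose each stage as a command alias so that .travis.yml can call each one in its own short-lived sbt invocation.

```scala
// Sketch only - the module names (catsJS, catsJVM, docs) are hypothetical.
// Each alias is intended to be run as its own CI step, e.g.
//   script:
//     - sbt ++$TRAVIS_SCALA_VERSION validateJS
//     - sbt ++$TRAVIS_SCALA_VERSION validateJVM
// so every stage gets a fresh JVM instead of sharing one long-lived sbt.
addCommandAlias("validateJS", ";catsJS/test")
addCommandAlias("validateJVM", ";catsJVM/test ;docs/makeSite")
```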

@ghost

ghost commented Aug 13, 2018

The build script actually does invoke sbt multiple times, but for the JVM we could split even further, as per the JS build - though the JVM issue normally happens relatively early in the build.

As for sudo - that means a slower startup, but you get the 7.5 GB of memory; we could try a lower setting. Ref: https://docs.travis-ci.com/user/reference/overview/

@rossabaker
Member

> Why is sudo: required? I'm relatively certain those VMs are slower.

sudo: required gets 7.5 GB as opposed to 4 GB. http4s adopted it because the IO was untenable on the container builds, but that should be far less of a factor in cats.

> We should have a discussion about whether or not code coverage is actually worth anything. Frankly, I've never seen it provide any value whatsoever, and it doubles the duration of the JVM build.

👍

@ghost

ghost commented Aug 13, 2018

My main concern before moving would be to make sure that it really isn't our build that's at fault! One simple option is to add parallelExecution := false to the JVM settings; it's already in the JS settings.
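For reference, a minimal sketch of that setting as it would sit in the shared JVM settings (the JS settings already carry the equivalent):

```scala
// Minimal sketch: run the JVM test suites sequentially to lower peak memory use.
// (On older sbt the same setting reads `parallelExecution in Test := false`.)
Test / parallelExecution := false
```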

@ghost

ghost commented Aug 13, 2018

Re: scoverage times... be careful here. The scoverage run also executes the ScalaCheck tests, but with larger parameters than the JS build. And after a successful coverage run, the code is just rebuilt, not re-tested.

So whilst coverage will always be slower, I doubt it's causing any issues. What we might want to do is try running scoverage with very low parameters (just to get coverage) and then run the full ScalaCheck tests without scoverage.

IMHO, keeping/ditching coverage is best discussed as a separate issue.
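As a rough illustration of the "low parameters under coverage" idea (the cats.coverage property below is made up, not an existing flag):

```scala
// Sketch only: shrink the ScalaCheck parameters when a hypothetical
// -Dcats.coverage=true flag is set, so the instrumented run stays cheap,
// while the plain (uninstrumented) test run keeps the full parameters.
import org.scalacheck.Test.Parameters

object CoverageAwareParams {
  // 5 iterations under the (made-up) coverage flag, 50 otherwise.
  val minSuccessful: Int =
    if (sys.props.get("cats.coverage").contains("true")) 5 else 50

  val checkParams: Parameters =
    Parameters.default.withMinSuccessfulTests(minSuccessful)
}
```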

@kailuowang
Contributor Author

kailuowang commented Aug 13, 2018

@djspiewak thanks so much for helping. And @BennyHill thanks for answering some of the questions.

To answer your questions above:

  • I'm not a fan of the separate build script either. Maybe we can replace it, but it hasn't bothered me enough to spend time on it.
  • That is, errr, a way to tell Travis to use a different VM (see @BennyHill's answer above). I don't believe sudo is actually needed for the build to run. We added it at least a year ago, the last time we had memory issues with Travis. It might be worth trying to remove it if we can squeeze into the smaller VM.
  • +1 on dropping -Xmx6g, especially if we can use a different VM.
  • Code coverage combined with the Codecov Chrome extension made it very easy to identify uncovered code in PRs. I agree that the overall coverage number for a PR isn't that critical. We could probably improve the build by limiting code coverage to a single Scala 2.12 JVM build job (see the sketch after this list); right now it's performed on both the Scala 2.11 and 2.12 JVM jobs.
  • No idea. Worth a try.
  • Also +1 on adopting that best practice. I think the two we have are the Sonatype credentials.
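A rough sketch of what limiting coverage to the 2.12 JVM job could look like from the build side, assuming Travis exports TRAVIS_SCALA_VERSION to the build (this is an illustration, not our current config):

```scala
// Hypothetical: wrap the test run in coverage only on the Scala 2.12 JVM job.
val coverageJob: Boolean =
  sys.env.get("TRAVIS_SCALA_VERSION").exists(_.startsWith("2.12"))

addCommandAlias(
  "testWithCoverage",
  if (coverageJob) ";coverage ;test ;coverageReport" else ";test"
)
```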

@ghost

ghost commented Aug 13, 2018

Re: the parallelExecution := false idea - this came up the other day on the Scala Native channel: https://gitter.im/scala-native/scala-native?at=5b6d631fa6af14730b170260

@ghost

ghost commented Aug 13, 2018

Finally, re: the "separate build script" - this was originally done as per the CI docs.

But of course, that was a while back, so perhaps we can revisit it.

@ghost

ghost commented Aug 13, 2018

And finally, finally... one small advantage of the separate build script is that it's far easier to "run" from the command line without having a local Travis setup - see https://github.com/typelevel/cats/blob/master/scripts/travis-publish.sh#L17-L18

@DavidGregory084
Member

If you drop sudo: required, it would be a good idea to add -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap to ensure that the heap size is set according to the container's memory limits.
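If the tests end up being forked, the same flags could also be passed to the test JVMs through sbt; a sketch, assuming forked tests (.jvmopts only governs the sbt process itself):

```scala
// Sketch only: pass the cgroup-aware heap flags to forked test JVMs.
Test / fork := true
Test / javaOptions ++= Seq(
  "-XX:+UnlockExperimentalVMOptions",
  "-XX:+UseCGroupMemoryLimitForHeap"
)
```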

@softinio
Member

If a decision is made to move to CircleCI, let me know, as I would gladly help. I have used CircleCI exclusively for the last few years.

Are there any other alternatives being considered?

@DavidGregory084
Member

I have also heard really good things about Semaphore and BuildKite, although BuildKite requires its own infrastructure (I can highly recommend packet.com for that), and Semaphore's OSS policy seems to have mysteriously become a "Please email us if you are an OSS project" policy.

@DavidGregory084
Member

I had a look at this a few days ago, and my overwhelming impression was that it's hard to define a build matrix in a nice way in any of the hosted services other than Travis. It's possible in Circle CI, but it relies on YAML dictionary operations rather than being a construct in its own right.

@DavidGregory084
Member

Whether that's a problem or not depends on how much faster (if at all) the builds run on those services IMO 😀

@kailuowang
Contributor Author

kailuowang commented Apr 26, 2019

Thanks, guys. We haven't seriously looked at any of the alternatives yet, but we probably should soon, given the elevated uncertainty about Travis's future and its suboptimal reliability lately. An easier migration from Travis would be a nice-to-have; the reason being that if we have to switch yet again, we're slightly more likely to find another service that somewhat conforms to the Travis way. How easy is it to set up a trial on Circle CI?

@DavidGregory084
Member

I'd be glad to give a few different services a go and report back @kailuowang?

@kailuowang
Contributor Author

@DavidGregory084 that would be amazing. Thanks!

@DavidGregory084
Member

There's something interesting about testing new CI systems that brings out all the weird bugs 😄 (screenshots of the failures omitted)

This was referenced Apr 30, 2019
@DavidGregory084
Member

DavidGregory084 commented Apr 30, 2019

Guys, I've opened a few PRs which demonstrate the config required to use different services.

I evaluated CircleCI too, but I found that the container memory limit of 4 GB was just not enough to run cats builds reliably. The configuration is also quite verbose, and I had issues where the config validation in the CircleCI CLI disagreed with the service itself, so my build didn't run despite passing validation locally.

These services do experience intermittent build failures, but those failures all seem to be caused by a single flaky test (ApplicativeTests.monoid.combineAll).

I think we should focus on fixing that whatever we decide to do about CI in the future.

So far my instinct is that Drone.io is probably the best option as it is free for open source, easy to configure and super fast.

Semaphore has a very unclear open source policy and although Buildkite is very nice, I think that managing hardware in addition to the build itself could become a bit of a chore.

@kailuowang
Contributor Author

Thanks, @DavidGregory084, that's a lot of work. I will check out their configs in your PRs and take a stab at ApplicativeTests.monoid.combineAll.

@softinio
Member

@DavidGregory084 Out of curiosity, what specific memory-related issues did you hit with CircleCI? Were you leveraging any of Circle's parallel processing features?

@DavidGregory084
Member

@softinio you can see the config I used here. I tried using the cgroup memory limit detection (-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap), which didn't work correctly on CircleCI and resulted in the JVM allocating way too much memory. I also tried reducing the JVM memory allocation to 3.5G but I was still getting multiple jobs on each build killed by the CircleCI infra (Exited with code 137). You can see some example runs here.

@DavidGregory084
Member

DavidGregory084 commented May 21, 2019

@softinio it seems like exceeding 4GB of available memory requires using a paid plan; as an open source project we could probably use the resource_class: large if we contacted CircleCI support.

@kailuowang
Contributor Author

Update on this: Semaphore would like to donate 8 bare-metal performance agents for Cats CI. In my tests, that cuts Cats' build time in half. I think we should consider migrating to Semaphore, the main reason being that we have so many TL projects on Travis all sharing 6 slow agents; it would be nice to have some more powerful CI resources.

@larsrh
Contributor

larsrh commented Oct 20, 2020

Since nobody has worked on this for quite a while, I'm closing all old CI-related PRs.

@larsrh larsrh closed this as completed Oct 20, 2020