Elm compilation is incredibly slow on CI platforms #1473
Comments
Thanks for the issue! Make sure it satisfies this checklist. My human colleagues will appreciate it! Here is what to expect next, and if anyone wants to comment, keep these things in mind. |
@obmarg, can you figure out if it's slow to download packages or to actually build things. I'd expect it to be the former, and it'd be great to know for sure. I think @OvermindDL1 is talking about something else, so I'm getting rid of those comments. If you have an SSCCE of your thing, and it's not fixed by elm-lang/elm-make@46ec85c then open a separate issue on an appropriate repo. |
@evancz |
That sounds right, but if you are doing any weird caching of Basically, if you want the compiler to go faster under odd conditions, I need as much detailed information about what's going wrong as possible. Maybe they are throttling processes? Maybe they report they have multiple cores, but it's actually one? I have no idea without you telling me. |
Another question to ask, are you running |
I agree, a precise diagnosis would be a great idea. The commands that are being run are:
There's nothing odd being done in between these commands as far as I'm aware. I would like to explain what the odd conditions causing this are, but I'm not too sure myself. I've been using these CI services for years, and this is the first time I've run into a serious performance issue like this. Is there any way to enable more logging in the Elm compiler, or anything else that would help diagnose? |
It could be the case that travis & circle are reporting way more cores than are actually usable. I just checked |
This line tells Is there some way to try to make sure that that line is reporting two? Or trick it into reporting two and seeing if that resolves things? I have some logging stuff for myself, but not a public flag yet. So you could get this information if you build from source. It breaks down how much time is spent in different parts of |
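One way to see what a CI machine claims to have is to print the processor counts it reports. The following is a minimal sketch, assuming a Linux image where `getconf` and `nproc` are available; the function names are hypothetical, and whether the compiler's runtime consults either of these numbers is not confirmed by this thread.

```shell
#!/bin/sh
# Hypothetical helpers: print the CPU counts the environment reports.

# Number of processors the OS reports as online.
reported_cpus() {
  getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown
}

# Number of processors this process is allowed to run on (CPU affinity).
affinity_cpus() {
  nproc 2>/dev/null || echo unknown
}

echo "online processors reported by the OS: $(reported_cpus)"
echo "processors usable by this process:    $(affinity_cpus)"
```

If the first number is much larger than what the CI plan advertises, that is at least consistent with the over-reporting theory above.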
I've been doing a bit of work to try and confirm that this false number of CPUs is actually causing this problem. I had a look into how the I then built & ran that on my CI environment:
$ rm -R elm-stuff/build-artifacts/*
$ time sysconfcpus -n 1 elm-make
Success! Compiled 47 modules.
real 0m2.215s
user 0m2.195s
sys 0m0.024s
$ rm -R elm-stuff/build-artifacts/*
$ time elm-make
Success! Compiled 47 modules.
real 9m21.660s
user 15m38.880s
sys 2m47.578s
So it does look like the CPU count detection is the problem. Seems like a command line option (or similar) might be a reasonable idea? For anyone trying to use libsysconfcpus themselves, I ran into a couple of compiler issues. My fixed version is here. |
Awesome @obmarg, shared that trick with NoRedInk, I think it'll help them too! Folks raised the idea of having a I recommend folks having this problem use @obmarg's trick for now. I'd like to talk to more people who are seeing this problem in practice to figure out a solution that does not allow bad outcomes in any cases, so no need for PRs at the moment. Code is always the easy part. |
I think a flag like So you can say @obmarg, do you like that approach? Can you think of ways to make sure anyone using CI knows to use that? Maybe we should just have official CI recipes for testing? I will talk to NRI people about this next week and get their feedback as well. |
Official CI recipes could also be useful, though there's a bunch of different ways to integrate elm into your build system. A recipe for running elm-make on CI might not help someone who uses brunch or webpack to run elm-make, for example. Though at least it could be a place to explain the issue, that people could refer to. Don't know if this is something that you'd be interested in adding a warning to the compiler for? Though it's probably quite hard to get right... |
FWIW, here is a concrete Travis recipe I arrived at that does work around this issue:
cache:
  directories:
    - sysconfcpus
install:
  - |
    if [ ! -d sysconfcpus/bin ];
    then
      git clone https://github.com/obmarg/libsysconfcpus.git;
      cd libsysconfcpus;
      ./configure --prefix=$TRAVIS_BUILD_DIR/sysconfcpus;
      make && make install;
      cd ..;
    fi
and then wherever there is a call to As a thing to note about the |
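For the "wherever there is a call" step, a small wrapper can keep build scripts working both on CI and on machines where sysconfcpus was never built. This is a sketch only: `with_cpu_limit` and `SYSCONFCPUS_BIN` are hypothetical names, and the default path simply assumes the `--prefix` used in the recipe above. The `sysconfcpus -n <count> <command>` form is the one shown earlier in this thread.

```shell
#!/bin/sh
# Assumed install location, matching the --prefix in the Travis recipe above.
SYSCONFCPUS_BIN="${SYSCONFCPUS_BIN:-${TRAVIS_BUILD_DIR:-.}/sysconfcpus/bin/sysconfcpus}"

# Hypothetical helper: run a command under sysconfcpus when it is installed,
# otherwise run the command unchanged.
with_cpu_limit() {
  cpus="$1"
  shift
  if [ -x "$SYSCONFCPUS_BIN" ]; then
    "$SYSCONFCPUS_BIN" -n "$cpus" "$@"
  else
    "$@"
  fi
}

# Stand-in command so the sketch runs anywhere; on CI this would be
# something like: with_cpu_limit 2 elm-make
with_cpu_limit 2 echo "compiler would run here"
```

Falling back to the plain command keeps local builds unaffected, at the cost of silently skipping the limit if the cached directory is missing.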
This workaround cuts my elm-package + elm-make time in Travis CI from almost 10 minutes down to 5 seconds. Nice work. Thank you. |
Would an environment variable make sense for this? |
This looks like a promising work-around. My team is presently building our elm modules in a Docker container. I will try this out and report back. If anyone has already done this (with Docker) please respond with your results and possibly save us some time. 😄 |
Btw: |
For those using Basically this is a drop-in replacement that makes |
@rtfeldman Is this something that can help |
@evancz AFAIK, the line from the comment #1473 (comment) could be removed, since the default value of GHC.Conc.numCapabilities should be the number of processors, or can be controlled via the runtime options |
@francesco-bracchi, I think you are wrong. Simply leaving that line out will change how the compiler behaves. Namely, it will not use concurrency anymore then. See https://downloads.haskell.org/~ghc/master/users-guide/using-concurrent.html. |
See elm/compiler#1473 for details.
I've run into this same class of problems when writing Clojure + Java 8, running on CircleCI. The machine had oodles of RAM, and my JVM thought it could take more of it than it was allowed to use. Manually setting memory limits fixed the issue. The root cause from my perspective is that the system (JVM/elm-make) is not correctly interpreting the hints that the environment is giving it about what resources are available to it. Java 9 and 10 have improvements to running under Docker containers. In Java 10, the container can look at its runtime to see what constraints it is running under. In theory, it seems like it would be possible for the Elm compiler + associated machinery to take a similar approach. Automatically detecting the number of CPU cores available would fix this without requiring any configuration from users. Something like nproc looks like one approach you could take for detecting the number of allowed CPUs to use. Apologies if I've misunderstood the issue, I didn't really see anyone directly suggesting that elm should detect how many CPU cores it is actually allowed to use. |
We've looked into that approach. As it turns out, Haskell's concurrency library only knows how to detect "number of physical cores," not "number of available cores." Node.js is the same way. Rust's |
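To make the "physical" versus "available" distinction concrete, the sketch below derives an allowed CPU count from a Linux cgroup CPU quota, which is one mechanism containers use to restrict CPU time. It is an illustration under assumptions, not something the Elm compiler does: the function names are hypothetical, the file paths are the conventional cgroup v2 and v1 mount points and may differ on a given CI image, and the thread does not establish that Travis or Circle limit CPUs this way.

```shell
#!/bin/sh
# Convert a CFS quota/period pair into a whole number of CPUs (rounded up).
# "max" (cgroup v2) or a non-positive quota (cgroup v1 uses -1) means
# "no limit", in which case this falls back to nproc.
cpus_from_quota() {
  quota="$1"
  period="$2"
  if [ "$quota" = "max" ] || [ "$quota" -le 0 ]; then
    nproc 2>/dev/null || echo 1
  else
    echo $(( (quota + period - 1) / period ))
  fi
}

# Read the limit from whichever cgroup layout is present.
allowed_cpus() {
  if [ -r /sys/fs/cgroup/cpu.max ]; then
    # cgroup v2: a single line of the form "<quota|max> <period>"
    read -r quota period < /sys/fs/cgroup/cpu.max
  elif [ -r /sys/fs/cgroup/cpu/cpu.cfs_quota_us ]; then
    # cgroup v1: quota and period live in separate files
    quota=$(cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us)
    period=$(cat /sys/fs/cgroup/cpu/cpu.cfs_period_us)
  else
    quota=max
    period=100000
  fi
  cpus_from_quota "$quota" "$period"
}

echo "CPUs this container may use: $(allowed_cpus)"
```

Note that `nproc` on its own reflects CPU affinity, not a quota, which is why the quota files are consulted first.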
As described in this elm compiler issue: elm/compiler#1473
May I ask if this issue is in any way impacted by Elm 0.19? |
I'm not sure. Anyway, in my opinion, people are still blaming the wrong thing. The problem is not necessarily the detection of CPU cores. The issue is that, no matter what, compilation gets slower as the number of threads increases. An environment in which more threads are used only makes the symptoms more noticeable; CPU detection isn't the issue by itself. Maybe it's a secondary issue that makes sense to fix once the primary issue is fixed: the fact that the compiler gets slower with an increasing number of threads even though HW resources are available. |
Yeah - they are separate issues; fixing one but not the other would not solve the problem completely. |
My personal opinion is that:
As an affected user I'm happy I don't have to find a new workaround every month after some bad patch for this is released. |
elm/compiler#1473 Signed-off-by: Elliot Murphy <elliot@elliotmurphy.com>
On elm 0.19 a test suite of 5 tests that runs in less than 2 seconds with If more specific information would somehow help address this issue, I would be happy to provide it. |
@davcamer This is an |
With 0.19 folks are able to say things like:
elm make src/Main.elm --optimize +RTS -N4
The things after @rtfeldman also documented the root problem in GHC that led to this here. Given that there are workarounds in Elm, and the root issue is in GHC, I think it makes sense to close this issue. If folks are still having problems, please open a new issue explaining your particular scenario, with an SSCCE if possible! |
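Building on the `+RTS -N4` form above, a CI script could cap the thread count at a small fixed number instead of hard-coding it. This is a sketch only: `capped_cores` is a hypothetical helper, the cap of 2 is an arbitrary assumption, and the final line prints the command so the example runs without Elm installed.

```shell
#!/bin/sh
# Hypothetical helper: the processor count reported by nproc, capped at $1.
# Falls back to 1 when nproc is unavailable.
capped_cores() {
  max="$1"
  n=$(nproc 2>/dev/null || echo 1)
  if [ "$n" -gt "$max" ]; then
    n="$max"
  fi
  echo "$n"
}

# On a machine with Elm 0.19 installed, drop the echo to run the build.
echo "would run: elm make src/Main.elm --optimize +RTS -N$(capped_cores 2)"
```

A cap chosen this way still trusts the reported count as an upper bound, so it only helps when the reported number is too high, which is the situation described in this issue.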
Noting that elm-test-rs solved this issue ( |
I believe this was fixed in Elm 0.19.1. |
It has been fixed in elm 0.19.1! (As cool a project as elm-test-rs is it doesn't do anything special with respect to invoking the elm compiler: if you don't need sysconfcpus for elm-test-rs, you don't need it for elm-test either). |
I do apologise, and thank you for correcting me @turboMaCk and @harrysarson. I jumped to the wrong conclusion, and can confirm that elm compilation is not the issue, and neither does [I don't know if it's to be expected that |
I would be interested to hear more about this! If you have the time, ping me, @harrysarson, on the elm-test Slack or open an issue at https://github.com/rtfeldman/node-test-runner. |
Projects that compile in seconds on my local machine take an unreasonably long time when run on CI.
This issue produced a stopgap fix here #1473 (comment). Use that for now.
Additional Details
I've put together an example project using a sample from the elm-guide.
On my local machine this takes 2.6 seconds to build. The Travis CI build here takes 234 seconds to do the same build. My dev machine may be slightly better than the CI machines in question, but certainly not by enough to explain this difference in build times.
I've seen this behaviour on both travis CI & circle CI, and it only seems to get worse with larger projects. Another project of mine (a few hundred lines of elm, nothing major) struggles to build within 10 minutes.
I see there's a workaround for this here: https://8thlight.com/blog/rob-looby/2016/04/07/caching-elm-builds-on-travis-ci.html