Increasing available memory for release builds [Pinocchio] #232
Certainly blacklisting 32-bit builds will stop the current failure. I'll note, though, that the 4.11 GB peak for the build will be a blocker for 32-bit systems in any case, since it exceeds the maximum addressable memory in 32-bit space. We don't have the ability to add more memory for a specific job, and in general compilation units of this size are a problem for many users.

I would recommend separating your system into somewhat smaller compilation units. If a system has enough resources, it can compile them in parallel; on smaller systems they can be built one at a time. With the current large build, even with -j1, there is no way for someone to use it on a smaller platform. We had similar problems with PCL early in its development: it was regularly causing crashes on developer machines due to going OOM. It too is highly templated, but by reviewing the include orders we were able to significantly reduce the memory usage and keep it from overwhelming systems.

We run 8GB VMs, but they run 4 jobs at a time, so our primary specification is 2GB per job. We don't currently enforce it, but I would ask that you try to respect it. If, for example, your releases for 2 different platforms ended up on the same executor at the same time, it would likely run out of memory, as both platforms typically peak at the same time, and there are potentially 2 other jobs as well as our system overhead. We've seen this sort of simultaneous peaking take our executors offline in the past: ros-infrastructure/ros_buildfarm#265
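To illustrate the kind of split being suggested: for template-heavy C++ code, one common way to keep individual compilation units small is to declare the expensive instantiations as `extern template` in the header and compile each explicit instantiation in its own source file. The sketch below uses hypothetical names (`Model`, `computeDynamics`); it is not Pinocchio's actual API, just the general pattern.

```cpp
// dynamics.hpp -- hypothetical header (general pattern, not Pinocchio's API)
#pragma once

template <typename Scalar>
struct Model {
  Scalar mass;
};

template <typename Scalar>
Scalar computeDynamics(const Model<Scalar>& model) {
  return model.mass * Scalar(9.81);  // stand-in for the expensive templated code
}

// Tell every includer NOT to instantiate these specializations;
// they are compiled once, in dynamics_double.cpp.
extern template struct Model<double>;
extern template double computeDynamics<double>(const Model<double>&);
```

```cpp
// dynamics_double.cpp -- the only translation unit that pays the
// instantiation cost (and memory) for the double specialization.
#include "dynamics.hpp"

template struct Model<double>;
template double computeDynamics<double>(const Model<double>&);
```

Each such source file then compiles independently, so a resource-constrained builder can compile them one at a time with -j1, while larger machines compile them in parallel.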
Our amd64 build agents have been going offline due to memory exhaustion when building Pinocchio.
I was hoping that, as in ROS1, the builds would eventually finish one by one after a series of failures. This does not seem to be the case :-(. I am okay with reverting the eloquent/foxy releases. @azeey, what is the current memory limit for the amd64 buildfarm for ROS2?

I saw that some builds hung in the tests, but I thought I had explicitly disabled tests via a patch, as in ROS1 (since we know they exhaust memory). Is there another patch we could use to disable tests?

Regarding reducing memory requirements: when speaking to @jcarpent, I seem to recall that reducing the memory required to compile the Python bindings (expose-aba-derivatives etc.) was not straightforward, or not possible. I don't have any insights here, sorry.

cc: @Rascof, the ROS2 releases for Pinocchio are likely to be reverted. Could you perhaps look into the possibility of reducing the memory requirements? The alternative would be to release without Python bindings, but that would be quite limiting.
Thanks for creating the PRs.
The build agents have 8GB, but as @tfoote mentioned, each agent runs 4 jobs in parallel, so a limit of 2GB per job is recommended.
Unfortunately, I'm not familiar enough with the release process to answer that question.
@wxmerkt What I've found in the past is that having a lot of code in a single compilation unit tends to be the culprit for excessive memory usage. Thus, the solution usually revolves around splitting the code up into multiple compilation units and, if necessary, running the whole build serially (i.e. -j1).
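As a concrete illustration of that split for the Python bindings: each heavy expose function can live in its own translation unit, with a thin module file that only calls them. This is a minimal sketch assuming Boost.Python (which Pinocchio's bindings use via eigenpy); the file and function names are hypothetical, not Pinocchio's actual layout.

```cpp
// expose-fwd.hpp -- forward declarations only; cheap to include (hypothetical names)
#pragma once
namespace bindings {
void exposeABA();
// additional expose functions (derivatives, etc.) would be declared here
// and defined in their own .cpp files, following the same pattern
}
```

```cpp
// expose-aba.cpp -- one heavy feature per translation unit, so only this
// file's template instantiations are held in memory during its compilation.
#include <boost/python.hpp>
#include "expose-fwd.hpp"

namespace bindings {
namespace {
double aba_stub(double q) { return 2.0 * q; }  // stand-in for the real templated call
}
void exposeABA() { boost::python::def("aba", &aba_stub); }
}
```

```cpp
// module.cpp -- thin module entry point; compiles quickly.
#include <boost/python.hpp>
#include "expose-fwd.hpp"

BOOST_PYTHON_MODULE(example_pywrap) {
  bindings::exposeABA();
}
```

With the bindings laid out this way, a memory-constrained builder compiles the expose-*.cpp files one at a time, while larger machines parallelize them, which matches the trade-off described above.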
I will be looking for a solution, but I don't know if I can be of much help since I don't know the Pinocchio package in detail.
Moving this discussion from email to a GitHub issue.
Background: Some release builds of Pinocchio have recently started failing due to
virtual memory exhausted
(cf. e.g. here). This is due to the template-heavy nature of the project (stack-of-tasks/pinocchio#1074). We have taken steps to decrease the memory required to compile (stack-of-tasks/pinocchio#1079, stack-of-tasks/pinocchio#1077). However, the builds are still failing due to memory exhaustion on 32-bit platforms (stack-of-tasks/pinocchio#1096).

Current situation:
On Ubuntu 18.04 (64-bit), with an i7-9850H CPU @ 2.60GHz and 16GB of memory, compiling Pinocchio (commit: stack-of-tasks/pinocchio@8303d3b) with a single job, this is my peak usage as measured by
/usr/bin/time -v catkin build -j1
(suggesting a 4.11 GB peak).

We would like to get input on what we can do to alleviate this issue so that we can continue to release via the ROS buildfarm (we do not need pull request testing).
@tfoote indicated that the current limit per VM is 8GB, but that it may decrease to 2GB in the future. I assume the make jobs are run with a single job
-j1
?

cc: @tfoote @jcarpent