Reduce rviz_default_plugins memory usage #532

Closed

@sloretz sloretz commented Apr 28, 2020

Opening this PR because I need to ask for help. The ros2 linux packaging job is failing to build rviz_default_plugins with an error that looks like the agent ran out of memory.

07:37:10 [ 22%] Building CXX object CMakeFiles/rviz_default_plugins.dir/src/rviz_default_plugins/displays/map/swatch.cpp.o
07:37:10 [ 23%] Building CXX object CMakeFiles/rviz_default_plugins.dir/src/rviz_default_plugins/displays/marker/markers/arrow_marker.cpp.o
07:37:10 [ 24%] Building CXX object CMakeFiles/rviz_default_plugins.dir/src/rviz_default_plugins/displays/marker/markers/line_list_marker.cpp.o
07:37:10 c++: fatal error: Killed signal terminated program cc1plus
07:37:10 compilation terminated.
07:37:10 make[2]: *** [CMakeFiles/rviz_default_plugins.dir/build.make:63: CMakeFiles/rviz_default_plugins.dir/rviz_default_plugins_autogen/mocs_compilation.cpp.o] Error 1
07:37:10 make[2]: *** Waiting for unfinished jobs....
07:37:10 make[1]: *** [CMakeFiles/Makefile2:79: CMakeFiles/rviz_default_plugins.dir/all] Error 2
07:37:10 make: *** [Makefile:141: all] Error 2

This PR tested whether compiling separate object libraries and then linking them into the big rviz_default_plugins library would reduce the peak memory usage. It didn't.

Here's a script I wrote to watch the CPU and memory usage. It writes CSV to stdout, which can then be graphed.

import re
import sys
import time

import psutil


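def get_cumulative_usage(regex):
    """Sum CPU and memory percentages over processes whose name matches regex."""
    sum_cpu = 0.0
    sum_mem = 0.0
    for proc in psutil.process_iter(['name', 'cpu_percent', 'memory_percent']):
        try:
            # Fields psutil could not read come back as None, so fall back to
            # an empty name and zero usage instead of raising TypeError.
            if re.match(regex, proc.info['name'] or ''):
                sum_cpu += proc.info['cpu_percent'] or 0.0
                sum_mem += proc.info['memory_percent'] or 0.0
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            pass
    return sum_cpu, sum_mem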


if __name__ == '__main__':
    if 2 != len(sys.argv):
        sys.stderr.write(f'Usage: {sys.argv[0]} <regular_expression>\n')
        sys.exit(1)

    regex = re.compile(sys.argv[1])
    start = time.monotonic()
    now = start

    prev_cpu = None
    prev_mem = None

    while True:
        # get_cumulative_usage() returns (cpu, mem), so unpack in that order.
        cum_cpu, cum_mem = get_cumulative_usage(regex)
        if cum_mem != prev_mem or cum_cpu != prev_cpu:
            print(f'{now - start}, {cum_mem}, {cum_cpu}')

        prev_mem = cum_mem
        prev_cpu = cum_cpu
        time.sleep(0.1)
        now = time.monotonic()

I used it to watch the cumulative CPU and memory usage of all ld or cc1plus processes on my machine while doing a clean build of rviz_default_plugins.

python3 proc_usage.py "ld|cc1plus" > ~/rviz_mem_usage.csv
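
To compare two runs without a graphing tool, the peaks can be read straight from the CSV. This is a minimal sketch, assuming the three-column layout the script above prints (elapsed seconds, then the two usage columns in the order the print statement emits them). The function name `peak_usage` and the sample numbers are made up for illustration and are not measurements from this build.

```python
import csv
import io


def peak_usage(csv_text):
    """Return the rows holding the maximum of the 2nd and 3rd columns."""
    rows = [
        tuple(float(field) for field in row)
        for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True)
        if row  # skip blank lines
    ]
    peak_second = max(rows, key=lambda r: r[1])
    peak_third = max(rows, key=lambda r: r[2])
    return peak_second, peak_third


# Made-up sample in the same "time, value, value" layout (not real data).
sample = "0.0, 1.5, 20.0\n0.1, 4.0, 180.0\n0.2, 3.0, 395.5\n"
peaks = peak_usage(sample)
print(peaks)
```

For a real run, pass the contents of the captured file instead, for example `peak_usage(open(path).read())` with `path` pointing at the CSV.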

Here's the memory usage of building just rviz_default_plugins on the ros2 branch. (vertical axis % of system memory used, horizontal axis seconds)
Screenshot from 2020-04-28 13-57-04

Here's the memory usage using this branch
Screenshot from 2020-04-28 13-57-12

It does not appear to have reduced the peak. The only other idea I have is to break up the rviz_default_plugins library into smaller shared libraries.

Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>

sloretz commented Apr 28, 2020

Another option proposed by @clalancette is to build rviz_default_plugins with -j1. Not sure how to specify that so it happens automatically and only for that package.
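
For trying the -j1 idea by hand, one possible sketch is below. It does not answer the "automatically" part of the question. It assumes the Makefile generator (the log above shows make) and assumes that colcon passes a user-supplied MAKEFLAGS through to make rather than overriding it, which should be verified. The helper name `single_job_build` is hypothetical.

```python
import os
import subprocess


def single_job_build(package, base_env=None, run=False):
    """Build the command and environment for a one-job build of one package."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: make honours MAKEFLAGS and colcon leaves it untouched.
    env['MAKEFLAGS'] = '-j1'
    cmd = ['colcon', 'build', '--packages-select', package]
    if run:
        # Only invoked on request, so the sketch is safe to import and inspect.
        subprocess.run(cmd, env=env, check=True)
    return cmd, env


cmd, env = single_job_build('rviz_default_plugins', base_env={})
print(cmd, env)
```

Calling it with `run=True` from a sourced workspace would start the build. The other packages would have to be built in a separate invocation to keep their normal parallelism.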


sloretz commented Apr 28, 2020

Differences in repos used by Packaging 1841 and 1842

https://ci.ros2.org/view/packaging/job/packaging_linux/1841/

  micro-ROS/ros_tracing/ros2_tracing:
    type: git
    url: https://gitlab.com/micro-ROS/ros_tracing/ros2_tracing.git
    version: c3ea84b4b98a99d0a7ba3ed94846e1aaec8a8d13
  ros2/rclcpp:
    type: git
    url: https://github.com/ros2/rclcpp.git
    version: 04f3c33de5f35b6aec38da71777bf39f47e7c7a0
  ros2/rmw:
    type: git
    url: https://github.com/ros2/rmw.git
    version: 2d020b993a983849748ededec95a5a5d45905898
  ros2/system_tests:
    type: git
    url: https://github.com/ros2/system_tests.git
    version: f9dd335393b32ee9985037cd37cd80cb084bf79a

https://ci.ros2.org/view/packaging/job/packaging_linux/1842/

  micro-ROS/ros_tracing/ros2_tracing:
    type: git
    url: https://gitlab.com/micro-ROS/ros_tracing/ros2_tracing.git
    version: 17573951dd5976e365fb93311507752746364d10
  ros2/rclcpp:
    type: git
    url: https://github.com/ros2/rclcpp.git
    version: e0bf4a9c206753cfee067d265c12f47c4eb471fc
  ros2/rmw:
    type: git
    url: https://github.com/ros2/rmw.git
    version: 8ecc9531226d23b3bd24652306887256e26394d6
  ros2/system_tests:
    type: git
    url: https://github.com/ros2/system_tests.git
    version: 2f86e4239cb07c2be5b166fc781bec7a35e727d3

Of those, only rclcpp seems like a likely candidate for having increased the memory usage here.
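
A comparison like the one above can be produced mechanically once each build's repository list is loaded into a mapping of name to version. The sketch below assumes that loading has already happened. The `changed_versions` helper and the `example/unchanged` entry are hypothetical, while the rclcpp and rmw hashes are the ones listed above.

```python
def changed_versions(before, after):
    """Map each repo whose version differs to its (before, after) pair."""
    return {
        name: (before.get(name), after.get(name))
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }


build_1841 = {
    'ros2/rclcpp': '04f3c33de5f35b6aec38da71777bf39f47e7c7a0',
    'ros2/rmw': '2d020b993a983849748ededec95a5a5d45905898',
    'example/unchanged': 'hypothetical-version',  # illustration only
}
build_1842 = {
    'ros2/rclcpp': 'e0bf4a9c206753cfee067d265c12f47c4eb471fc',
    'ros2/rmw': '8ecc9531226d23b3bd24652306887256e26394d6',
    'example/unchanged': 'hypothetical-version',  # illustration only
}

diff = changed_versions(build_1841, build_1842)
for name, (old, new) in diff.items():
    print(f'{name}: {old} -> {new}')
```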


brawner commented Apr 28, 2020

This package has to build a ton of object files, so it's running a lot of cc1plus processes in parallel. I've also heard anecdotally (from @nuclearsandwich) that this package hogs a lot of memory when compiling. Are there other processes running on the host that might be taking up too much memory? The Jenkins Java agent might be holding on to a lot of memory.
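
One way to check for other memory-hungry processes on a host is to rank process names by their summed memory share. The sketch below is an assumption about how that could be done with psutil, in the same spirit as the script earlier in the thread. The names `top_consumers` and `snapshot` are hypothetical, and the example values are made up.

```python
def top_consumers(samples, count=5):
    """Rank process names by summed memory percentage, largest first."""
    totals = {}
    for name, mem_percent in samples:
        if mem_percent is None:  # value psutil could not read
            continue
        totals[name] = totals.get(name, 0.0) + mem_percent
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


def snapshot():
    """Collect (name, memory_percent) pairs for every visible process."""
    import psutil  # third-party; imported here so the ranking works without it
    return [
        (proc.info['name'], proc.info['memory_percent'])
        for proc in psutil.process_iter(['name', 'memory_percent'])
    ]


# Made-up values for illustration, not measurements from the CI host.
example = [('java', 12.5), ('cc1plus', 3.0), ('cc1plus', 4.5), ('sshd', None)]
ranking = top_consumers(example, count=2)
print(ranking)
```

On the build host itself, `top_consumers(snapshot())` would show whether something like the Jenkins agent is holding a large share while the compile runs.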

It looks like the last two failures occurred on the same host. Is there possibly something happening with the host? https://ci.ros2.org/computer/linux-d98d367e/

The PR you linked to (ros2/rclcpp#1095) just adds a header that doesn't depend on any new headers. rcl/types.h, rcl/allocator.h, rclcpp/visibility_control.h are already included in some form. I also don't advocate for removing it since it's a required header for the file it's being added to. It's possible it was the straw that broke the camel's back, but reverting that PR is not going to free up much if any memory.


sloretz commented May 5, 2020

This did not fix the issue. Closing and continuing the discussion in #539.

@sloretz sloretz closed this May 5, 2020
@clalancette clalancette deleted the sloretz/reduce_compile_memory_usage branch June 1, 2021 19:52