High CPU usage even at idle #1015

ruffsl · 2019-08-07T20:58:56Z

Bug report

The security working group has been creating a SROS2 demo targeting the turtlebot 3 with a running navigation stack. However our meager mortal laptops @mikaelarguedas and I have (~2017 Dell XPS i7 32GB) are struggling to run the tb3 demo while maintaining a usable desktop, as about about 40% of our CPU is under load even when the robot is not planning or moving:

I fear many participants in the workshop with lesser machines will struggle to run the demo with added security cipher overhead or unavoidable virtual machines on mac/windows as workshops happen to do when linux is involved. Is there anything we can do to lighten this up, like rate limit spinning in dwb_controler, amcl, map_model, or patches we can pull into our .repo file targeting dashing?

https://github.com/ros-swg/turtlebot3_demo

Required Info:

Operating System:
- Ubuntu 18.04
Installation type:
- source
Version or commit hash:
- Patched from https://github.com/ros-swg/turtlebot3_demo/blob/master/docker/dependencies.repos
- starting around 9cbda50
DDS implementation:
- Fast-RTPS

Steps to reproduce issue

See current tutorial README.md as of writing:
https://github.com/ros-swg/turtlebot3_demo/tree/ccc63b8462a714f6f870db97ae8797d167522463

Expected behavior

Ideal load for the navigation stack when not moving would be low like in ROS1, mostly dominated by gzviewer and rviz if not running in headless mode.

Actual behavior

The processes for dwb_controler, amcl, map_model take the lion's share of CPU, even with the robot is localized and not moving. Note that this is even without security or gzclient enabled.

Additional information

The above is just from launching these two launch files

ros2 launch turtlebot3_gazebo turtlebot3_world.launch.py
ros2 launch turtlebot3_navigation2 navigation2.launch.py use_sim_time:=true

Where gzclient can be disabled in navigation2.launch.py via:

-            cmd=['gazebo', '--verbose', world, '-s', 'libgazebo_ros_init.so'],
+            cmd=['gzserver', '--verbose', world, '-s', 'libgazebo_ros_init.so'],

The text was updated successfully, but these errors were encountered:

SteveMacenski · 2019-08-07T21:09:16Z

@ruffsl are you using any custom config files or just the defaults?

If I'm reading this right: you're seeing 12% CPU usage for the world model (global costmap), DWB controller (which also is running a costmap), and AMCL (woah, that's shouldn't be even remotely that high). Frankly 12% CPU seems about right for the costmap items since that's spinning regardless if anyone's using it. AMCL shouldn't be 12% ever.

Any other errors/warnings like inability to transform laser scans? Particularly coming from AMCL?

Also is this on dashing or bleed?

Is this running with the encryption going or just normal?

mikaelarguedas · 2019-08-07T21:34:00Z

Also is this on dashing or bleed?

We're currently targeting dashing (with some tiny config changes to avoid errors in RViz dashing-devel...mikaelarguedas:mikael/dashing-sync).

Any other errors/warnings like inability to transform laser scans? Particularly coming from AMCL?

I am seeing a bunch of dropped message and tf errors still:

[amcl-3] [INFO] [amcl_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 67.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 71.601 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.001 timeout was 1.

Is it a common set of error messages or are we doing something bad somewhere?

Currently looking at ros2/geometry2#146 and ros2/rviz#375 to see if these improve things.

Is this running with the encryption going or just normal?

I believe these results are just normal, with encryption CPU loads gets significantly higher on my machine

are you using any custom config files or just the defaults?

We're using these parameters: https://github.com/ROBOTIS-GIT/turtlebot3/blob/ros2/turtlebot3_navigation2/param/burger_params.yaml

The diff from the ones from navigation2@master:

$ diff burger_params.yaml nav2_params.yaml 
60c60
<     max_vel_x: 0.22
---
>     max_vel_x: 0.26
64c64
<     max_speed_xy: 0.22
---
>     max_speed_xy: 0.26
86,87c86,87
<     PathAlign.scale: 32.0
<     GoalAlign.scale: 24.0
---
>     PathAlign.scale: 0.0
>     GoalAlign.scale: 0.0
103c103
<       robot_radius: 0.105
---
>       robot_radius: 0.22
125c125
<       robot_radius: 0.105
---
>       robot_radius: 0.22
145c145
<     yaml_filename: "map.yaml"
---
>     yaml_filename: "turtlebot3_world.yaml"

ruffsl · 2019-08-07T21:38:01Z

@ruffsl are you using any custom config files or just the defaults?

You can see our repos file we are using in the Version or commit hash bullet above, as we have a few commits optop of 9cbda50 for changing use_sim_time, unreliable transport, and tweaking the rviz config view. Currently: https://github.com/mikaelarguedas/navigation2/commits/mikael/dashing-sync

Frankly 12% CPU seems about right for the costmap items since that's spinning regardless if anyone's using it.

I'd still argue even that's aggressive, could that be toned down as well by shrinking the local cost map area?

Any other errors/warnings like inability to transform laser scans? Particularly coming from AMCL?

We see lot of info messages from world_model:

[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 179.203 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 179.402 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 179.603 for reason(0)

Here is an capture from navigation2.launch.py, while successfully localizing and navigating:

[INFO] [launch]: All log files can be found below /home/ruffsl/.ros/log/2019-08-07-14-33-32-485199-dox-5447
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [lifecycle_manager-1]: process started with pid [5466]
[INFO] [map_server-2]: process started with pid [5467]
[INFO] [amcl-3]: process started with pid [5468]
[INFO] [world_model-4]: process started with pid [5469]
[INFO] [dwb_controller-5]: process started with pid [5470]
[INFO] [navfn_planner-6]: process started with pid [5477]
[INFO] [recoveries_node-7]: process started with pid [5484]
[INFO] [bt_navigator-8]: process started with pid [5499]
[INFO] [rviz2-9]: process started with pid [5511]
[lifecycle_manager-1] [INFO] [lifecycle_manager]: Creating
[map_server-2] [INFO] [map_server]: Creating
[world_model-4] [INFO] [world_model]: Creating
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mStarting the system bringup...�[0m�[0m
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mCreating and initializing lifecycle service clients�[0m�[0m
[amcl-3] [INFO] [amcl]: Creating
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mConfiguring and activating map_server�[0m�[0m
[dwb_controller-5] [INFO] [dwb_controller]: Creating
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Creating
[world_model-4] [INFO] [global_costmap.global_costmap]: Creating
[recoveries_node-7] [INFO] [recoveries]: Configuring Spin
[recoveries_node-7] [WARN] [rcl.logging_rosout]: Publisher already registered for provided node name. If this is due to multiple nodes with the same name then all logs for that logger name will go out over the existing publisher. As soon as any node with that name is destructed it will unregister the publisher, preventing any further logs for that name from being published on the rosout topic.
[recoveries_node-7] [INFO] [recoveries]: Configuring BackUp
[navfn_planner-6] [INFO] [navfn_planner]: Creating
[bt_navigator-8] [INFO] [bt_navigator]: Creating
[map_server-2] [INFO] [map_server]: Configuring
[map_server-2] [INFO] [map_server]: OccGridLoader: Creating
[map_server-2] [INFO] [map_server]: OccGridLoader: Configuring
[map_server-2] [WARN] [map_server]: Mode parameter not set, using default value (trinary): bad conversion
[map_server-2] [INFO] [map_server]: Activating
[map_server-2] [INFO] [map_server]: OccGridLoader: Activating
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mConfiguring and activating amcl�[0m�[0m
[amcl-3] [INFO] [amcl]: Configuring
[amcl-3] [INFO] [amcl]: initTransforms
[amcl-3] [WARN] [rcl.logging_rosout]: Publisher already registered for provided node name. If this is due to multiple nodes with the same name then all logs for that logger name will go out over the existing publisher. As soon as any node with that name is destructed it will unregister the publisher, preventing any further logs for that name from being published on the rosout topic.
[amcl-3] [INFO] [amcl]: initPubSub
[amcl-3] [INFO] [amcl]: Subscribed to map topic.
[amcl-3] [INFO] [amcl]: Activating
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mConfiguring and activating world_model�[0m�[0m
[world_model-4] [INFO] [world_model]: Configuring
[world_model-4] [INFO] [global_costmap.global_costmap]: Configuring
[world_model-4] [INFO] [global_costmap.global_costmap]: Using plugin "static_layer"
[world_model-4] [INFO] [global_costmap.global_costmap]: Using plugin "obstacle_layer"
[world_model-4] [INFO] [global_costmap.global_costmap]: Subscribed to Topics: scan
[world_model-4] [INFO] [global_costmap.global_costmap]: Using plugin "inflation_layer"
[world_model-4] [INFO] [global_costmap.global_costmap]: StaticLayer: Resizing costmap to 384 X 384 at 0.050000 m/pix
[world_model-4] [INFO] [world_model]: Activating
[world_model-4] [INFO] [global_costmap.global_costmap]: Activating
[world_model-4] [INFO] [global_costmap.global_costmap]: Checking transform
[rviz2-9] [INFO] [rviz2]: Stereo is NOT SUPPORTED
[rviz2-9] [INFO] [rviz2]: OpenGl version: 4.6 (GLSL 4.6)
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.048 timeout was 1.
[rviz2-9] [INFO] [rviz2]: Stereo is NOT SUPPORTED
[rviz2-9] [WARN] [rcl.logging_rosout]: Publisher already registered for provided node name. If this is due to multiple nodes with the same name then all logs for that logger name will go out over the existing publisher. As soon as any node with that name is destructed it will unregister the publisher, preventing any further logs for that name from being published on the rosout topic.
[rviz2-9] Parsing robot urdf xml string.
[amcl-3] [WARN] [amcl]: Waiting for map....
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.089 timeout was 1.
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.003 timeout was 1.
[rviz2-9] [INFO] [rviz2]: Creating 1 swatches_
[rviz2-9] [ERROR] [rviz2]: InvalidParametersException: Stream size does not match calculated image size in Image::loadRawData at /tmp/binarydeb/ros-dashing-rviz-ogre-vendor-6.1.2/obj-x86_64-linux-gnu/ogre-master-ca665a6-prefix/src/ogre-master-ca665a6/OgreMain/src/OgreImage.cpp (line 283)
[rviz2-9] [ERROR] [rviz2]: Failed to create 1 swatches_
[rviz2-9] [INFO] [rviz2]: Creating 2 swatches_
[rviz2-9] [ERROR] [rviz2]: InvalidParametersException: Stream size does not match calculated image size in Image::loadRawData at /tmp/binarydeb/ros-dashing-rviz-ogre-vendor-6.1.2/obj-x86_64-linux-gnu/ogre-master-ca665a6-prefix/src/ogre-master-ca665a6/OgreMain/src/OgreImage.cpp (line 283)
[rviz2-9] [ERROR] [rviz2]: Failed to create 2 swatches_
[rviz2-9] [INFO] [rviz2]: Creating 4 swatches_
[amcl-3] [WARN] [amcl]: Waiting for map....
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.003 timeout was 1.
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.002 timeout was 1.
[amcl-3] [WARN] [amcl]: Waiting for map....
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.005 timeout was 1.
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.003 timeout was 1.
[amcl-3] [WARN] [amcl]: Waiting for map....
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.005 timeout was 1.
[rviz2-9] [INFO] [rviz2]: Setting goal: Frame:map, Position(-2.01168, -0.475417, 0), Orientation(0, 0, 0.000398123, 1) = Angle: 0.000796245
[amcl-3] [INFO] [amcl]: initialPoseReceived
[amcl-3] [INFO] [amcl]: Setting pose (52.589000): -2.012 -0.475 0.001
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: . canTransform returned after 1.004 timeout was 1.
[amcl-3] [INFO] [amcl]: Received a 384 X 384 map @ 0.050 m/pix
[amcl-3] 
[amcl-3] [INFO] [amcl]: Setting pose (53.594000): -2.012 -0.475 0.001
[amcl-3] [INFO] [amcl]: createLaserObject
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 44.001 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap]: Timed out waiting for transform from base_link to map to become available, tf error: Lookup would require extrapolation into the past.  Requested time 54.09400 but the earliest data is at time 54.80200, when looking up transform from frame [base_link] to frame [map]. canTransform returned after 1.006 timeout was 1.
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 44.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 44.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 44.601 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 44.801 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap]: start
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mConfiguring and activating dwb_controller�[0m�[0m
[dwb_controller-5] [INFO] [dwb_controller]: Configuring
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Configuring
[rviz2-9] [INFO] [rviz2]: Creating 1 swatches_
[rviz2-9] [ERROR] [rviz2]: InvalidParametersException: Stream size does not match calculated image size in Image::loadRawData at /tmp/binarydeb/ros-dashing-rviz-ogre-vendor-6.1.2/obj-x86_64-linux-gnu/ogre-master-ca665a6-prefix/src/ogre-master-ca665a6/OgreMain/src/OgreImage.cpp (line 283)
[rviz2-9] [ERROR] [rviz2]: Failed to create 1 swatches_
[rviz2-9] [INFO] [rviz2]: Creating 2 swatches_
[rviz2-9] [ERROR] [rviz2]: InvalidParametersException: Stream size does not match calculated image size in Image::loadRawData at /tmp/binarydeb/ros-dashing-rviz-ogre-vendor-6.1.2/obj-x86_64-linux-gnu/ogre-master-ca665a6-prefix/src/ogre-master-ca665a6/OgreMain/src/OgreImage.cpp (line 283)
[rviz2-9] [ERROR] [rviz2]: Failed to create 2 swatches_
[rviz2-9] [INFO] [rviz2]: Creating 4 swatches_
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Using plugin "obstacle_layer"
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Subscribed to Topics: scan
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Using plugin "inflation_layer"
[dwb_controller-5] [INFO] [dwb_controller]: Setting transform_tolerance to 0.200000
[dwb_controller-5] [INFO] [dwb_controller]: Using critic "RotateToGoal" (dwb_critics::RotateToGoalCritic)
[dwb_controller-5] [INFO] [dwb_controller]: Critic plugin initialized
[dwb_controller-5] [INFO] [dwb_controller]: Using critic "Oscillation" (dwb_critics::OscillationCritic)
[dwb_controller-5] [INFO] [dwb_controller]: Critic plugin initialized
[dwb_controller-5] [INFO] [dwb_controller]: Using critic "BaseObstacle" (dwb_critics::BaseObstacleCritic)
[dwb_controller-5] [INFO] [dwb_controller]: Critic plugin initialized
[dwb_controller-5] [INFO] [dwb_controller]: Using critic "GoalAlign" (dwb_critics::GoalAlignCritic)
[dwb_controller-5] [INFO] [dwb_controller]: Critic plugin initialized
[dwb_controller-5] [INFO] [dwb_controller]: Using critic "PathAlign" (dwb_critics::PathAlignCritic)
[dwb_controller-5] [INFO] [dwb_controller]: Critic plugin initialized
[dwb_controller-5] [INFO] [dwb_controller]: Using critic "PathDist" (dwb_critics::PathDistCritic)
[dwb_controller-5] [INFO] [dwb_controller]: Critic plugin initialized
[dwb_controller-5] [INFO] [dwb_controller]: Using critic "GoalDist" (dwb_critics::GoalDistCritic)
[dwb_controller-5] [INFO] [dwb_controller]: Critic plugin initialized
[dwb_controller-5] [INFO] [dwb_controller]: Controller frequency set to 20.0000Hz
[dwb_controller-5] [INFO] [dwb_controller]: Activating
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Activating
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Checking transform
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 45.002 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 45.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 45.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 45.602 for reason(0)
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: start
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mConfiguring and activating navfn_planner�[0m�[0m
[navfn_planner-6] [INFO] [navfn_planner]: Configuring
[navfn_planner-6] [INFO] [navfn_planner]: Activating
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mConfiguring and activating bt_navigator�[0m�[0m
[bt_navigator-8] [INFO] [bt_navigator]: Configuring
[rviz2-9] [INFO] [rviz2]: Creating 1 swatches_
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 45.802 for reason(0)
[bt_navigator-8] [INFO] [bt_navigator_client_node]: Waiting for "ComputePathToPose" action server
[bt_navigator-8] [INFO] [bt_navigator_client_node]: "ComputePathToPose" BtActionNode initialized
[bt_navigator-8] [INFO] [bt_navigator_client_node]: Waiting for "FollowPath" action server
[bt_navigator-8] [INFO] [bt_navigator_client_node]: "FollowPath" BtActionNode initialized
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 46.002 for reason(0)
[bt_navigator-8] [INFO] [bt_navigator_client_node]: Waiting for "Spin" action server
[bt_navigator-8] [INFO] [bt_navigator_client_node]: "Spin" BtActionNode initialized
[bt_navigator-8] [INFO] [bt_navigator]: Activating
[lifecycle_manager-1] [INFO] [lifecycle_manager]: �[34m�[1mThe system is active�[0m�[0m
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 46.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 46.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 46.602 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 46.802 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 47.002 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 47.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 47.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 47.601 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 47.801 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 48.001 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 48.202 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 48.402 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 48.601 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 48.801 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 49.001 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 49.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 49.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 49.601 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 49.802 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 50.001 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 50.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 50.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 50.601 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 50.801 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 51.001 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 51.202 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 51.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 51.601 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 51.801 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 52.001 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 52.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 52.402 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 52.601 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 52.802 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 53.001 for reason(0)
[bt_navigator-8] [INFO] [bt_navigator]: Begin navigating from current location to (1.81, -0.41)
[dwb_controller-5] [INFO] [dwb_controller]: Received a goal, begin following path
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 53.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 53.401 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 53.602 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 53.802 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 54.002 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 54.201 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 54.402 for reason(0)
[world_model-4] [INFO] [global_costmap.global_costmap_rclcpp_node]: [signalFailure] Drop message: frame 'base_scan' at time 54.602 for reason(0)
[dwb_controller-5] [INFO] [dwb_controller]: Reached the goal!
[bt_navigator-8] [INFO] [bt_navigator]: Navigation succeeded

SteveMacenski · 2019-08-07T22:00:52Z

To be frank, I've been mostly focused on capability and not yet performance, but it looks like I may refocus some of that effort to performance. If you're running in simulation you can drop the controller rate to 5hz and the global costmap update rate to 1hz. Set those *Align configs to 0. That should help a bit.

The CPU 12% for the costmaps seems about right* given all the problems (TF dropping messages, processing a costmap by itself was never a super light weight thing, is there a depth camera involved there or just laser?), but I'm not saying that there isn't something wrong, just from hand wavey background knowledge of getting really deep into costmap 2D in ROS1 navigation 12% is like the lowest I've ever had especially when publishing that costmap at rate, which right now the Nav2 version doesnt have subscriber detection, but I havent run with only a laser.

AMCL and the costmaps all running at 12% seem suspect, given they're all laser scan based and there's so many TF things happening.

orduno · 2019-08-08T16:09:57Z

@mjeronimo do these numbers match what you've found so far from your performance analysis?

orduno · 2019-08-08T16:44:10Z

In addition to what has been proposed, consider reducing the global costmap resolution and the local costmap window size. This will certainly have some impact on navigation quality, but it's worth experimenting. We want to try that at some point.

Also, for simulation you can use nav2_simulation_launch with the argument simulator:=gzserver, this will launch everything you need on a single command.

ros2 launch nav2_bringup nav2_simulation_launch.py simulator:=gzserver

SteveMacenski · 2019-08-08T17:06:24Z

I think the curent global costmap resolution is 5cm, that's a pretty good number, maintaining it shouldn't be that hard if the ranges are setup for the laser to be small. The RPlidar is pretty short range and low resolution, I don't know that lowering the global costmap resolution will have much impact given the small amount of data being processed.

I'm more suspicious that potentially the global costmap is being updated at 5hz in the default configs (will look today), that would add some extra "umph" when our replanning rate I think its 1hz default.

I'm generally suspect of AMCL if those numbers can be reproduced. These items are on my queue to think about tomorrow or next week depending on time availability.

SteveMacenski · 2019-08-08T17:06:50Z

I'm going to assign myself to start on this, but help from other folks is appreciated.

crdelsey · 2019-08-08T17:32:47Z

My first guess as to the problem would be ros2/geometry2#119. As in, you need this PR integrated into whatever you are running. I think it is integrated in master now. Probably not dashing.

All the nodes mentioned have TF listeners, so the above would be a problem for them.

mjeronimo · 2019-08-08T22:39:55Z

@orduno In my recent performance analysis, on the NUC I've been using I see about 3% CPU to spin a node. After running a navigation, the nodes should fall back to this level. One can compare a sample ROS2 application, such as a simple pub/sub example, for reference.

Since the three modules mentioned are the ones that have the TF listener, as Carl mentions, I would suspect that the additional CPU utilization is coming from there. There have been recent changes TF listener and the latest ROS2 code should have addressed this.

SteveMacenski · 2019-08-08T22:52:21Z

3% a node

jeeze. AMCL used to take ~3% running the particle filter

SteveMacenski · 2019-08-09T20:51:43Z

Another thing I just recalled is that we dont use the costmap updates topics yet since RVIZ doesn't support them as of today. Sending the full costmap every time isn't a cheap operation.

SteveMacenski · 2019-08-09T22:54:56Z

I'm seeing crazy costmap update rates while debugging another issue

[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Map update time: 0.000919024
[dwb_controller-5] [INFO] [local_costmap.local_costmap]: Cycle: 0.000919024

crdelsey · 2019-08-26T23:10:55Z

@SteveMacenski Are you still seeing "crazy" costmap updates? If so, can you open a separate issue for that?

@ruffsl Were you able to retest with the fix in ros2/geometry2#119?

ruffsl · 2019-08-28T03:11:33Z

@ruffsl Were you able to retest with the fix in ros2/geometry2#119?

I just changed our dockerfile to build FROM rosplanning/navigation2:master.release and used rocker to spin up and it looks a lot better. However, rviz is glitching out in rendering the map and nav markers.
Here is a snapshot as the robot is planning/moving on its own across the map:

crdelsey · 2019-08-28T19:33:05Z

The rviz thing has been reported upstream ros2/rviz#453. Looks like a workaround is to keep adding the map topic display till it finally looks right.

ruffsl · 2019-09-06T04:03:40Z

I've updated our demo for https://github.com/ros-swg/turtlebot3_demo , so you can check if you're able to reproduce the same performance usage changes. As a side note, perhaps maintainers here could help us triage a QoS/tf issue I think we might be having with navigation + sros2 . ros-swg/turtlebot3_demo#12

crdelsey · 2019-10-07T22:10:20Z

@ruffsl Is this still an issue?

ruffsl · 2019-10-07T22:15:38Z

I'll check back again after the improvements to fast rtps QoS are sync'ed into dashing:
https://discourse.ros.org/t/preparing-for-dashing-sync-and-patch-release-2019-10-10/10898

ruffsl · 2019-10-18T22:39:24Z

I tested this again using the latest recent dashing sync, and everything seems to be much better. The lion's share of the CPU usage remaining seems to be just from the gazebo and rviz GUIs:

@crdelsey , feel free to close this, however we are still having issues enabling security for navigation2:
See latest comment here: ros-swg/turtlebot3_demo#12 (comment)

SteveMacenski · 2019-10-18T22:43:23Z

@ruffsl dear god, what computer are you running on? Gazebo takes 3% CPU?

If there's some action items for the security stuff feel free to file tickets and we'll take a look

ruffsl · 2019-10-18T23:05:26Z

what computer are you running on? Gazebo takes 3% CPU?

It's about a 3yo CPU in a workstation. I was thinking perhaps the GPU ray tracing is working in my favor, but it doesn't seem like the turtlebot3 is using gpu_ray , so perhaps its my CPU core count.

Linux ubuntu 4.20.4-042004-generic #201901222207 SMP Tue Jan 22 22:09:25 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

SteveMacenski · 2019-10-18T23:15:40Z

I need to get off this peewee laptop game then. My computer nearly catches fire when I run gazebo with a depth sensor for longer than 20 minutes

SteveMacenski self-assigned this Aug 8, 2019

crdelsey added the 1 - High High Priority label Aug 8, 2019

crdelsey added this to the Aug 2019 milestone Aug 8, 2019

mlherd added the performance This is a performance problem or enhancement label Sep 5, 2019

ruffsl mentioned this issue Sep 6, 2019

QoS issues over scan topic with security enabled ros-swg/turtlebot3_demo#12

Open

crdelsey modified the milestones: Aug 2019, September Sep 9, 2019

SteveMacenski closed this as completed Oct 18, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

High CPU usage even at idle #1015

High CPU usage even at idle #1015

ruffsl commented Aug 7, 2019

SteveMacenski commented Aug 7, 2019 •

edited

Loading

mikaelarguedas commented Aug 7, 2019

ruffsl commented Aug 7, 2019 •

edited

Loading

SteveMacenski commented Aug 7, 2019 •

edited

Loading

orduno commented Aug 8, 2019

orduno commented Aug 8, 2019 •

edited

Loading

SteveMacenski commented Aug 8, 2019

SteveMacenski commented Aug 8, 2019

crdelsey commented Aug 8, 2019 •

edited

Loading

mjeronimo commented Aug 8, 2019

SteveMacenski commented Aug 8, 2019

SteveMacenski commented Aug 9, 2019

SteveMacenski commented Aug 9, 2019

crdelsey commented Aug 26, 2019

ruffsl commented Aug 28, 2019 •

edited

Loading

crdelsey commented Aug 28, 2019

ruffsl commented Sep 6, 2019

crdelsey commented Oct 7, 2019

ruffsl commented Oct 7, 2019

ruffsl commented Oct 18, 2019

SteveMacenski commented Oct 18, 2019 •

edited

Loading

ruffsl commented Oct 18, 2019 •

edited

Loading

SteveMacenski commented Oct 18, 2019

High CPU usage even at idle #1015

High CPU usage even at idle #1015

Comments

ruffsl commented Aug 7, 2019

Bug report

Steps to reproduce issue

Expected behavior

Actual behavior

Additional information

SteveMacenski commented Aug 7, 2019 • edited Loading

mikaelarguedas commented Aug 7, 2019

ruffsl commented Aug 7, 2019 • edited Loading

SteveMacenski commented Aug 7, 2019 • edited Loading

orduno commented Aug 8, 2019

orduno commented Aug 8, 2019 • edited Loading

SteveMacenski commented Aug 8, 2019

SteveMacenski commented Aug 8, 2019

crdelsey commented Aug 8, 2019 • edited Loading

mjeronimo commented Aug 8, 2019

SteveMacenski commented Aug 8, 2019

SteveMacenski commented Aug 9, 2019

SteveMacenski commented Aug 9, 2019

crdelsey commented Aug 26, 2019

ruffsl commented Aug 28, 2019 • edited Loading

crdelsey commented Aug 28, 2019

ruffsl commented Sep 6, 2019

crdelsey commented Oct 7, 2019

ruffsl commented Oct 7, 2019

ruffsl commented Oct 18, 2019

SteveMacenski commented Oct 18, 2019 • edited Loading

ruffsl commented Oct 18, 2019 • edited Loading

SteveMacenski commented Oct 18, 2019

SteveMacenski commented Aug 7, 2019 •

edited

Loading

ruffsl commented Aug 7, 2019 •

edited

Loading

SteveMacenski commented Aug 7, 2019 •

edited

Loading

orduno commented Aug 8, 2019 •

edited

Loading

crdelsey commented Aug 8, 2019 •

edited

Loading

ruffsl commented Aug 28, 2019 •

edited

Loading

SteveMacenski commented Oct 18, 2019 •

edited

Loading

ruffsl commented Oct 18, 2019 •

edited

Loading