[ros2topic] fix pub rate reduced issue #141

yechun1 · 2018-08-29T02:10:11Z

when rate is 1000, time.sleep() in main loop will cost about 150ms every 1s, use time comparison instead of time.sleep(), save sleep cycle time lossing

Signed-off-by: Chris Ye chris.ye@intel.com

wjwwood

This is not a good solution because it will use 100% of a CPU core waiting for the time to be expired. If you set the publish rate to 1 Hz, it will spend too much time just checking the time.

I would not merge this as-is.

dirk-thomas · 2018-08-29T21:13:05Z

Please use a Timer instead - see here for an example.

yechun1 · 2018-08-30T02:53:14Z

I tried to use create_timer, the time_callback will also cost extra time exhaust. -r 1000 -p 1000, real time is about 1.280s.

I have no idea to fix the issue in pub command. Since it just call APIs to publish messages. Compare with ROS, it used rospy.Rate.sleep(). ref to timer.py. There is no rate loss. Could ROS2 rclpy be possible to support timer with Rate.sleep() function?

dirk-thomas · 2018-08-30T03:33:12Z

I tried to use create_timer, the time_callback will also cost extra time exhaust. -r 1000 -p 1000, real time is about 1.280s.

Can you share that code somehow to review it?

yechun1 · 2018-08-30T03:51:04Z

Here is code: https://github.com/yechun1/ros2cli/blob/fix_ros2topic_pub_issue2/ros2topic/ros2topic/verb/pub.py#L130

$ ros2 topic pub /image sensor_msgs/Image -r 1000 -p 1000
publisher: beginning loop
publishing #1000: sensor_msgs.msg.Image(header=std_msgs.msg.Header(stamp=builtin_interfaces.msg.Time(sec=0, nanosec=0), frame_id=''), height=0, width=0, encoding='', is_bigendian=0, step=0, data=[])

interval = 1.280345
publishing #2000: sensor_msgs.msg.Image(header=std_msgs.msg.Header(stamp=builtin_interfaces.msg.Time(sec=0, nanosec=0), frame_id=''), height=0, width=0, encoding='', is_bigendian=0, step=0, data=[])

interval = 1.302808

yechun1 · 2018-08-30T05:34:44Z

Compared with ROS1 which works very well, so I referenced to the code rospy.Rate.sleep(), since there is no rclpy.Rate.sleep() on ros2 currently, I implemented the code inside the pub.py, updated the patch, please help review, thanks.

dirk-thomas · 2018-08-30T23:15:04Z

There are two separate issues here:

The time each invocation of the publish method takes. That is a problem in / below rclpy (actually in C code below which performs the conversion to a C struct and then publishes the data through the RMW layer).
The current logic in the loop doesn't maintain the desired rate but it adds the time the publish call takes to each cycle.

The second problem can be address - as suggested and recommended above - by using a timer. This works as long as the duration of the publish() call doesn't exceed the cycle time.

The solution using a timer will look like this:

import time
import rclpy

rclpy.init()
node = rclpy.create_node('test_timer')

def timer_callback():
    print(time.time())
    time.sleep(0.5)

timer = node.create_timer(1.0, timer_callback)

rclpy.spin(node)

node.destroy_timer(timer)
node.destroy_node()
rclpy.shutdown()

Please update this PR accordingly to avoid the manual calculation of the time to sleep.

The first problem will need to be addressed on a lower level. You can easily reproducing the problem with my code snippet by changing the sleep() call to e.g. 2.0s.

yechun1 · 2018-08-31T07:03:23Z

I have tried timer but the result is not good:
1. the timer is good for 1hz, but not good for 1000hz. Because timer call cycle will exhaust time.
2. pub and pub call will cost time, but not major rate loss in this scenario (pub -r 1000 -p 1000)

Here is sample code

Below is result print every 0.001s * 1000. All the data has been added up on each 1ms cycle.
real_1s_interval: each expected 1s, then print real time.
cb_exec_time: callback execution time in 1s. (include pub_call_time)
timer_call_time: time to trigger callback, it is outside the callback.

real_1s_interval = 1.313601
cb_exec_time + timer_call_time = 1.313601
cb_exec_time:0.105953
timer_call_time:1.207648
pub_call_time:0.102473

Below is result that disabled pub(msg) code, timer_call_time cost much more time.
( #pub.publish(msg))

real_1s_interval = 1.300832
cb_exec_time + timer_call_time = 1.300832
cb_exec_time:0.003738
timer_call_time:1.297094
pub_call_time:0.000399

Should we report a bug on timer with 1ms 1000hz test? Or adopt manual calculation that adopt the sleep mechanism from ROS topic pub code?

dirk-thomas · 2018-08-31T17:52:32Z

the timer is good for 1hz, but not good for 1000hz. Because timer call cycle will exhaust time

Can you clarify what you mean with this. Why is it "not good"? What does "exhaust time" mean?

On my system a callback triggered by a timer is on average about 0.2ms "late". For a rate of 1000 Hz that adds a significant delay. The current implementation of timers doesn't produce a "fixed cycle time" but accumulates these delays over time which for 1000Hz is pretty significant. I created ros2/rcl#289 to address this in the timer directly.

Therefore I would suggest to update this patch to use a timer which can then also be used to validate the future fix of the timer itself. I don't think implementing custom timing logic in this package as a temporary workaround is a good idea.

time.sleep will add the time the publish call takes to each cycle. Use a timer to avoid pub rate loss. Signed-off-by: Chris Ye <chris.ye@intel.com>

yechun1 · 2018-09-01T11:41:26Z

understand, updated the code, please help review.
I have tested by using a timer the real period of 1ms count to 1000 is 1.3s. The expect 1000 pub rate is reduce to about 769 Hz (1000/1.3). Most time cost in timer delay. Maybe we should address the issue in the timer.

yechun1 · 2018-09-01T13:07:13Z

Verified the timer patch based on (ros2/rcl#289]):

before change (timer without ros2/rcl#289):
$ ros2 topic pub /image sensor_msgs/Image -r 1000 -p 1000

publishing #307000: sensor_msgs.msg.Image(header=std_msgs.msg.Header(stamp=builtin_interfaces.msg.Time(sec=0, nanosec=0), frame_id=''), height=0, width=0, encoding='', is_bigendian=0, step=0, data=[])

$ ros2 topic hz /image (ref to #133)

average rate: 833.743
min: 0.000s max: 0.004s std dev: 0.00008s window: 10000

# after change: (timer based on [ros2/rcl#289])
ros2 topic pub /image sensor_msgs/Image -r 1000 -p 1000

publishing #103000: sensor_msgs.msg.Image(header=std_msgs.msg.Header(stamp=builtin_interfaces.msg.Time(sec=0, nanosec=0), frame_id=''), height=0, width=0, encoding='', is_bigendian=0, step=0, data=[])

ros2 topic hz /image

average rate: 1000.008
min: 0.000s max: 0.003s std dev: 0.00007s window: 10000
average rate: 999.991
min: 0.000s max: 0.003s std dev: 0.00007s window: 10000
average rate: 999.993

This is a significant improvement, the timer call delay issue should be fixed.
And one other issue is when pub and sub(by topic hz) for a while, the pub rate will randomly decrease, cpu consumption is then more than 100%. I have checked the publish(msg) cost much time, not related with timer call, the original code with time.sleep loop also has the same issue.

Updated patch

dirk-thomas · 2018-09-01T18:26:38Z

@yechun1 Thank you for iterating with me on the patch.

…e event (ros2#141) * Add 'handle_once' property to EventHandler This is useful for implementing "one shot" events. The property can be set via the constructor or setter. If 'handle_once' is set to True, then the EventHandler unregisters itself from the context after the first time it is handled. All subclasses of EventHandler pass it kwargs so they can use the new property. Tests included. * Add a common interface for describing an EventHandler A common pattern for describing event handlers has been moved to the parent class. Subclasses should implement the 'handler_description' and 'matcher_description' properties rather than implementing a describe method.

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>

tfoote added the in review Waiting for review (Kanban column) label Aug 29, 2018

yechun1 mentioned this pull request Aug 29, 2018

[ros2topic] pub frequency lower then expect #140

Closed

wjwwood previously requested changes Aug 29, 2018

View reviewed changes

dirk-thomas added requires-changes bug Something isn't working labels Aug 29, 2018

yechun1 force-pushed the fix_ros2topic_pub_issue branch from 04767a3 to 96a3d86 Compare August 30, 2018 05:26

yechun1 force-pushed the fix_ros2topic_pub_issue branch 2 times, most recently from 5c4808c to 4938813 Compare August 30, 2018 15:38

dirk-thomas mentioned this pull request Aug 31, 2018

timer accumulating delay ros2/rcl#289

Closed

yechun1 force-pushed the fix_ros2topic_pub_issue branch from 4938813 to 2a20391 Compare September 1, 2018 10:54

[ros2topic] use a timer instead of time.sleep

2a20391

time.sleep will add the time the publish call takes to each cycle. Use a timer to avoid pub rate loss. Signed-off-by: Chris Ye <chris.ye@intel.com>

dirk-thomas removed the requires-changes label Sep 1, 2018

dirk-thomas merged commit 62ee790 into ros2:master Sep 1, 2018

dirk-thomas removed the in review Waiting for review (Kanban column) label Sep 1, 2018

yechun1 deleted the fix_ros2topic_pub_issue branch September 2, 2018 14:28

esteve pushed a commit to esteve/ros2cli that referenced this pull request Dec 16, 2022

avoid deprecation warning, use from_parent (ros2#141)

e07c9f3

Signed-off-by: Dirk Thomas <dirk-thomas@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ros2topic] fix pub rate reduced issue #141

[ros2topic] fix pub rate reduced issue #141

yechun1 commented Aug 29, 2018

wjwwood left a comment

dirk-thomas commented Aug 29, 2018

yechun1 commented Aug 30, 2018 •

edited

Loading

dirk-thomas commented Aug 30, 2018

yechun1 commented Aug 30, 2018

yechun1 commented Aug 30, 2018 •

edited

Loading

dirk-thomas commented Aug 30, 2018 •

edited

Loading

yechun1 commented Aug 31, 2018 •

edited

Loading

dirk-thomas commented Aug 31, 2018

yechun1 commented Sep 1, 2018

yechun1 commented Sep 1, 2018 •

edited

Loading

dirk-thomas commented Sep 1, 2018

[ros2topic] fix pub rate reduced issue #141

[ros2topic] fix pub rate reduced issue #141

Conversation

yechun1 commented Aug 29, 2018

wjwwood left a comment

Choose a reason for hiding this comment

dirk-thomas commented Aug 29, 2018

yechun1 commented Aug 30, 2018 • edited Loading

dirk-thomas commented Aug 30, 2018

yechun1 commented Aug 30, 2018

yechun1 commented Aug 30, 2018 • edited Loading

dirk-thomas commented Aug 30, 2018 • edited Loading

yechun1 commented Aug 31, 2018 • edited Loading

dirk-thomas commented Aug 31, 2018

yechun1 commented Sep 1, 2018

yechun1 commented Sep 1, 2018 • edited Loading

dirk-thomas commented Sep 1, 2018

yechun1 commented Aug 30, 2018 •

edited

Loading

yechun1 commented Aug 30, 2018 •

edited

Loading

dirk-thomas commented Aug 30, 2018 •

edited

Loading

yechun1 commented Aug 31, 2018 •

edited

Loading

yechun1 commented Sep 1, 2018 •

edited

Loading