Make `test_executor.spin_some_max_duration` more reliable #430

ivanpauno · 2020-05-01T19:32:05Z

Fixes https://ci.ros2.org/view/nightly/job/nightly_win_rel/1537/testReport/junit/test_rclcpp/test_executor__rmw_cyclonedds_cpp/spin_some_max_duration/.

Given that the Windows scheduler time slice is 20ms, the test's timing expectations are too optimistic (the same applies to macOS and Linux, though it was only failing on Windows).

I increased all durations so that the OS time slice is no longer a large fraction of any of them.
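
As a rough illustration of the reasoning above, the sketch below compares one time slice to a requested duration. It takes the 20ms figure from this description at face value; the helper name `slice_fraction` is hypothetical and is not part of the test.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of test_executor.cpp): ratio between one
// scheduler time slice and a requested duration.
double slice_fraction(
  std::chrono::milliseconds time_slice,
  std::chrono::milliseconds requested)
{
  assert(requested.count() > 0);
  return static_cast<double>(time_slice.count()) /
         static_cast<double>(requested.count());
}
```

With a 20ms slice, `slice_fraction` gives 20 for a 1ms sleep and 0.2 for a 100ms sleep, so a single missed slice matters far less after the change.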

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>

clalancette

One small change to add a bit of documentation, but I'll approve anyway.

clalancette · 2020-05-01T20:10:23Z

test_rclcpp/test/test_executor.cpp

@@ -61,7 +61,7 @@ TEST(CLASSNAME(test_executor, RMW_IMPLEMENTATION), spin_some_max_duration) {
  rclcpp::executors::SingleThreadedExecutor executor;
  auto node = rclcpp::Node::make_shared("spin_some_max_duration");
  auto lambda = []() {
-      std::this_thread::sleep_for(1ms);
+      std::this_thread::sleep_for(100ms);


This is a good change, and keeps the current semantics of the test; basically, that 20 threads of N ms sleep take < 20*N ms to complete. However, those semantics are a bit hard to discern here. Would you mind adding a comment explaining that?

There was already a comment, but I forgot to update it.
See 2d0e91c.
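
For readers following the semantics discussed in this thread, here is a minimal sketch of the idea, assuming a simplified stand-in for `spin_some(max_duration)`. It is not rclcpp's implementation: `spin_some_sketch`, the `Millis` alias, and the injected `now` clock are all hypothetical names introduced for illustration. The clock is injected so the behavior can be checked without real sleeps.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

using Millis = std::chrono::milliseconds;

// Simplified stand-in for spin_some(max_duration); NOT rclcpp's code.
// Runs ready callbacks in order and stops starting new ones once the
// time budget has been used up. Returns how many callbacks ran.
std::size_t spin_some_sketch(
  const std::vector<std::function<void()>> & callbacks,
  Millis max_duration,
  const std::function<Millis()> & now)
{
  assert(max_duration.count() >= 0);
  const Millis start = now();
  std::size_t executed = 0;
  for (const auto & callback : callbacks) {
    if (now() - start >= max_duration) {
      break;  // budget spent: leave the remaining callbacks for a later spin
    }
    callback();
    ++executed;
  }
  return executed;
}
```

Driven by a fake clock where each of 20 callbacks advances time by 100ms, and with an assumed budget of 1000ms (an illustrative value, not taken from the test), this sketch runs only part of the callbacks and finishes well before the 20 * 100ms it would take to run them all.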

ivanpauno · 2020-05-01T20:45:07Z

CI, running only the test_rclcpp test_executor test.
Using --retest-until-fail 10 to check whether this actually makes the test more reliable.

Linux
Linux-aarch64
macOS
Windows

ivanpauno · 2020-05-05T21:34:15Z

The test seemed to be "less" flaky in the above run, but it still failed after a bunch of repetitions.
I increased the time tolerance to see whether it still fails:

Windows

Based on internet speculation (as Windows doesn't officially clarify anything), it seems that Windows Server distributions have a bigger time slice than desktop versions. That's (maybe) why time-based tests started failing more frequently when we switched to containerized builds.
In the Windows performance options, there's even a "Processor scheduling" option, where you can choose between "adjust performance for Programs" and "adjust performance for Background services".
The default in Windows Server installations is the latter, while on desktops it is the former.
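
One way to check this kind of speculation on a given machine is to measure how far `std::this_thread::sleep_for` overshoots the requested duration. The following is a minimal sketch using only the standard library; the helper name `measure_sleep_overshoot` is hypothetical, and the numbers it produces depend entirely on the OS, its configuration, and the current load.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: for each sample, record how much longer than
// `requested` a sleep_for call actually took, measured with steady_clock.
std::vector<std::chrono::nanoseconds> measure_sleep_overshoot(
  std::chrono::milliseconds requested, std::size_t samples)
{
  assert(samples > 0);
  std::vector<std::chrono::nanoseconds> overshoots;
  overshoots.reserve(samples);
  for (std::size_t i = 0; i < samples; ++i) {
    const auto start = std::chrono::steady_clock::now();
    std::this_thread::sleep_for(requested);
    const auto elapsed = std::chrono::steady_clock::now() - start;
    overshoots.push_back(
      std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed - requested));
  }
  return overshoots;
}
```

Calling it with a short request such as 1ms and comparing the results between a desktop and a server installation would show whether the overshoot really differs, without relying on undocumented scheduler details.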

ivanpauno · 2020-05-06T13:12:41Z

@clalancette let me know if you think this is ready.
I can further increase time tolerance to 500ms, as timing isn't really the point of the test.

clalancette

> @clalancette let me know if you think this is ready.
> I can further increase time tolerance to 500ms, as timing isn't really the point of the test.

Yeah, I think that would be a good idea. As long as it completes in less than 2 seconds, I think we've proven what the test is trying to prove. I'll still approve.
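
The 2 second figure follows from the numbers in this thread: 20 callbacks sleeping 100ms each would take 20 * 100ms = 2000ms if all of them ran. A small sketch of that bound check is below. The `max_duration` value is not shown in this thread, so any value passed for it is an assumption; `full_run_time` and `upper_bound_still_meaningful` are hypothetical names.

```cpp
#include <cassert>
#include <chrono>

using Millis = std::chrono::milliseconds;

// Time needed if every callback ran back to back.
Millis full_run_time(int callback_count, Millis per_callback)
{
  assert(callback_count > 0);
  return per_callback * callback_count;
}

// True when the asserted upper bound (max_duration + tolerance) is still
// shorter than running all callbacks, so the test can still tell
// "stopped early" apart from "ran everything".
bool upper_bound_still_meaningful(
  Millis max_duration, Millis tolerance, int callback_count, Millis per_callback)
{
  return max_duration + tolerance < full_run_time(callback_count, per_callback);
}
```

For example, with an assumed `max_duration` of 1000ms, a 500ms tolerance keeps the bound at 1500ms, below the 2000ms full run, whereas a 1000ms tolerance would no longer distinguish the two cases.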

ivanpauno · 2020-05-06T14:43:42Z

CI:

Windows

Make test_executor.spin_some_max_duration more reliable

bd4e975

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>

ivanpauno added the in review Waiting for review (Kanban column) label May 1, 2020

ivanpauno requested a review from sloretz May 1, 2020 19:32

ivanpauno self-assigned this May 1, 2020

clalancette approved these changes May 1, 2020

ivanpauno mentioned this pull request May 1, 2020

Make test_two_timers_ready_before_timeout more reliable ros2/rcl#640

Merged

Update comment

2d0e91c

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>

Use bigger tolerance

2184dbc

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>

ivanpauno requested a review from clalancette May 6, 2020 13:11

clalancette approved these changes May 6, 2020

Use bigger time tolerance

7d5cdaf

Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>

ivanpauno merged commit 2f157db into master May 6, 2020

delete-merged-branch bot deleted the ivanpauno/make-spin-some-max-duration-more-reliable branch May 6, 2020 16:01

ivanpauno mentioned this pull request May 6, 2020

Skip flaky timer test on windows ros2/rclpy#554

Merged
