Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[events executor] - Fix Behavior with Timer Cancel #2375

Merged
merged 13 commits into from
Jan 30, 2024

Conversation

mwcondino
Copy link
Contributor

@mwcondino mwcondino commented Nov 28, 2023

Background

It appears that with the experimental events executor, some usages of the rclcpp API cause broken behavior. Specifically, if multiple timers are configured in a node, and during runtime one of the timers is cancelled via the TimerBase::cancel method, this will result in all other timers failing to execute.

A minimal working example of the bug is found here --
example_package.tar.gz

To reproduce, build the package using colcon build, then run USE_EVENTS_EXECUTOR=1 ros2 run example_package timer_node. It is observable that both timers get cancelled after the first timer cancels itself:

root@753c2035d766:/app/src# USE_EVENTS_EXECUTOR=1 ros2 run example_package timer_node
[INFO] [1701211973.238745755] [timer_node]: Using EventsExecutor
[INFO] [1701211974.239147118] [timer_node]: Timer 1!
[INFO] [1701211974.239268496] [timer_node]: Timer 2!
[INFO] [1701211975.239086741] [timer_node]: Timer 1!
[INFO] [1701211975.239251090] [timer_node]: Timer 2!
[INFO] [1701211976.239241288] [timer_node]: Timer 1!
[INFO] [1701211976.239417539] [timer_node]: Timer 2!
[INFO] [1701211977.239107783] [timer_node]: Timer 1!
[INFO] [1701211977.239260069] [timer_node]: Timer 2!
[INFO] [1701211978.239079416] [timer_node]: Timer 1!
[INFO] [1701211978.239232915] [timer_node]: Timer 2!
[INFO] [1701211979.239045079] [timer_node]: Timer 1!
[INFO] [1701211979.239198617] [timer_node]: Timer cancelling itself!
[INFO] [1701211979.239225057] [timer_node]: Timer 2!

Myself and @gusbrigantino have tested this on latest rolling as well as iron.

Solution

The root cause is that there’s an edgecase within the TimersManager class which causes the thread to hang indefinitely if the ‘head’ timer in an internal data structure gets cancelled. As a result, the method will block until timers are reset (the timers_updated_ flag is set, which only occurs if the timers are reset. The underlying issue is that the TimersManager class is not properly checking for if a timer to be executed has actually been cancelled.

The implemented fix adds logic to the comparison used for the heap such that cancelled timers are always at the bottom of the heap. Furthermore, get_head_timeout_unsafe is modified to return whether or not the current timer is invalid. This bool is then used to break out of the loop in run_timers.

@mwcondino
Copy link
Contributor Author

Looks like the build for rolling may be unhappy? I'm seeing a similar build failure for the latest merge into rolling - link

@mjcarroll
Copy link
Member

Looks like the build for rolling may be unhappy? I'm seeing a similar build failure for the latest merge into rolling - link

You can probably disregard Rpr jobs right now, we will run full CI on it.

@mjcarroll
Copy link
Member

Is there any way that you can add your reproducible example as a test?

@mwcondino
Copy link
Contributor Author

Is there any way that you can add your reproducible example as a test?

Sure, I can try to get it in the form of a unit test.

@mwcondino
Copy link
Contributor Author

OK I ended up adding two tests:

  1. check_one_timer_cancel_doesnt_affect_other_timers in test_timers_manager.cpp validates the behavior in the timers_manager object. Interestingly enough, this test does not fail on latest iron, which I did not expect given the example package's behavior
  2. testOneTimerCancelledWithExecutorSpin in test_executors.cpp validates the behavior when using a rclcpp::Node that adds and then cancels it's own timers. This example uses rclcpp::spin, and does reproduce the behavior. On this branch, this test should be passing.

I can't figure out why there would be a difference in behavior between the two tests - as far as I can tell, when spin is called on the EventsExecutor, it makes a call to start it's own timers_manager object, so I would think that the behavior should be the same as when the timers_manager object is manually started and then the same sequence of waiting + cancelling is followed.

Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

// No need to wait if a timer is already available
if (time_to_sleep > std::chrono::nanoseconds::zero()) {
if (time_to_sleep != std::chrono::nanoseconds::max()) {
// Wait until timeout or notification that timers have been updated
timers_cv_.wait_for(lock, time_to_sleep, [this]() {return timers_updated_;});
} else {
}
else if (head_was_cancelled) {
Copy link
Collaborator

@alsora alsora Nov 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there's only 1 timer which is cancelled? Will we heap keep entering here at every iteration, without ever sleeping?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent point - as-is, this would result in a busy loop as you suggest!

One possible fix would be to explicitly check the size of the heap and if it is 1, we can do a similar wait for timers_updated_. Any thoughts there? To me, that would result in a slightly messier run_timers method, but I don't know how else we would cleanly handle this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the single timer handling, but run_timers is getting a bit messy.
Happy to refactor/reorganize it for clarity if you would like, just let me know

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the fix works.
What if there are more than 1 timers but they are all cancelled?

I think that the correct approach would likely require to:
a) remove cancelled timers from the heap
b) have a notification to put them back in when they are re-activated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, you are correct, this approach was brittle - sorry about that.

Will implement what you suggested here, thank you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alsora sorry for the delay here - had a busy couple of weeks.

@gusbrigantino and I refactored the code, and we came up with an implementation of run_timers that does not require storing cancelled timers and removing them from the heap. Rather, the approach is to try to re-heap if the head timer was cancelled, then re-organize some logic from that point on. If there are non-cancelled timers in the heap, then the re-heap will force them to be at the front and thus head is now valid and the logic remains the same. However, if all timers in the heap are cancelled, the behavior now matches the previous behavior with no timers in the heap -- we wait until timers_updated is true.

We added a few more test cases to validate a couple of different scenarios.

Please let us know if that makes sense and works for you!

mwcondino and others added 4 commits November 30, 2023 14:15
Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>
Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Matt Condino <mwcondino@gmail.com>
@mwcondino
Copy link
Contributor Author

@alsora @mjcarroll could someone please review whenever convenient? Thanks!

@fujitatomoya
Copy link
Collaborator

@mwcondino FYI, besides unresolved comments, there are several failures you can find https://build.ros2.org/job/Rpr__rclcpp__ubuntu_jammy_amd64/1272/testReport/

Signed-off-by: Matt Condino <mwcondino@gmail.com>
@mwcondino
Copy link
Contributor Author

mwcondino commented Jan 10, 2024

@mwcondino FYI, besides unresolved comments, there are several failures you can find https://build.ros2.org/job/Rpr__rclcpp__ubuntu_jammy_amd64/1272/testReport/

Thank you, appreciate the heads up! I am new to contributing to ROS repos, so I was unsure of which failures were important (I think the main build is still messed up, but that seems unrelated to this PR).

Linting should be fixed now.

// Re-heap to (possibly) move cancelled timer from head of heap. If
// entire heap is cancelled, this will still result in a nullopt.
TimersHeap locked_heap = weak_timers_heap_.validate_and_lock();
locked_heap.heapify();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The queue is sorted according to the return a->time_until_trigger() > b->time_until_trigger(); comparison.
What does time_until_trigger returns for a cancelled timer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon a cancelled timer, time_until_trigger returns std::chrono::nanoseconds::max(), so resorting should force cancelled timers to be at the bottom of the heap. In a previous iteration of the PR, we had updated the comparison function for the heap to explicitly check for canceled timers and ensure they were on the bottom of the heap, but that was removed since it's obsolete if std::chrono::nanoseconds::max() is always returned for a cancelled timer.

@alsora
Copy link
Collaborator

alsora commented Jan 12, 2024

CI run

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Copy link
Collaborator

@alsora alsora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me.
I would like @mauropasse to take a look as well before merging it.

@mauropasse
Copy link
Collaborator

The fix and unit tests look good to me!

@fujitatomoya
Copy link
Collaborator

windows are failing https://ci.ros2.org/job/ci_windows/20793/, i think that is because of timer accuracy. (e.g https://ci.ros2.org/job/ci_windows/20793/testReport/rclcpp/TestTimerCancelBehavior_SingleThreadedExecutor/testHeadTimerCancelThenResetBehavior/ timer does not get called as expected.)

Signed-off-by: Matt Condino <mwcondino@gmail.com>
@mwcondino
Copy link
Contributor Author

windows are failing https://ci.ros2.org/job/ci_windows/20793/, i think that is because of timer accuracy. (e.g https://ci.ros2.org/job/ci_windows/20793/testReport/rclcpp/TestTimerCancelBehavior_SingleThreadedExecutor/testHeadTimerCancelThenResetBehavior/ timer does not get called as expected.)

Thanks for the callout - I relaxed the timing constraints, hopefully that will fix things

@clalancette
Copy link
Contributor

Thanks for the callout - I relaxed the timing constraints, hopefully that will fix things

They should be a lot more relaxed than that, otherwise we are going to end up with flaky tests.

In general, on Windows, you should expect things to take 4x as long as on Linux. We shouldn't even consider having timing tests with constraints of less than 200ms or so.

Signed-off-by: Matt Condino <mwcondino@gmail.com>
@mwcondino
Copy link
Contributor Author

Thanks for the callout - I relaxed the timing constraints, hopefully that will fix things

They should be a lot more relaxed than that, otherwise we are going to end up with flaky tests.

In general, on Windows, you should expect things to take 4x as long as on Linux. We shouldn't even consider having timing tests with constraints of less than 200ms or so.

Thanks for the heads up @clalancette, I was unaware that Windows requires such a different order of precision for this type of test. I have relaxed the tests much further and validated that they still work as expected and pass on this branch while failing on current iron. I settled on 300ms waits, since I didn't want to increase the time required to run these tests too much more; already the full test_executors executable takes 18s to run on my machine with these changes.

@alsora
Copy link
Collaborator

alsora commented Jan 18, 2024

New build after changes in unit-tests:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

It's probably for a separate discussion, but I really feel unease with having so many unit-tests using wall-time clock.
IMO unit-tests should always use a simulated time, to be robust to latency that happens in the server.

@clalancette
Copy link
Contributor

It's probably for a separate discussion, but I really feel unease with having so many unit-tests using wall-time clock.

I agree to an extent. We probably should have more tests that use some kind of simulated time. The main reason we don't is that those are harder tests to write.

But I also think that we are always going to have to have tests that use system time. They show problems that a simulated time won't, and they are what most of our users are using.

Signed-off-by: Matt Condino <mwcondino@gmail.com>
@mwcondino
Copy link
Contributor Author

Updated the test to use a simulated clock. For now, that is at .25 realtime, but we can update as necessary if the tests are still flaky and require a slower rate.

@mwcondino
Copy link
Contributor Author

@alsora could you please trigger a full CI run whenever convenient?

@alsora
Copy link
Collaborator

alsora commented Jan 25, 2024

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@mwcondino
Copy link
Contributor Author

Looks like there's still some windows failures, but they appear unrelated to this PR (unless I'm missing something):
projectroot.test_n_nodes__rmw_fastrtps_cpp
test_rclcpp.TestNNodes.test_10_nodes
test_rclcpp.TestNNodesAfterShutdown.test_10_nodes

@clalancette
Copy link
Contributor

Looks like there's still some windows failures, but they appear unrelated to this PR (unless I'm missing something):

While I generally agree with you (tests will only be using the events executor if they are explicitly configured to do so), it is kind of weird that these are failing. We don't see those failures in the nightlies: https://ci.ros2.org/view/nightly/job/nightly_win_rel/ .

I think we should re-run Windows, and if it comes back green this time we can go ahead with it.

@mwcondino
Copy link
Contributor Author

I think we should re-run Windows, and if it comes back green this time we can go ahead with it.

Sounds good - is it possible for me to kick off the jobs? I feel bad bugging folks to start the CI, but I can't see any buttons on https://ci.ros2.org/job/ci_windows/ to start things, and I'm wondering if I'm missing anything (admittedly I have not used Jenkins in the past so this could definiely be a noob question)

@alsora
Copy link
Collaborator

alsora commented Jan 29, 2024

Currently only maintainers can trigger a job, and don't hesitate to ping if we forget to do it!

Windowns-only job:

  • Windows Build Status

@alsora
Copy link
Collaborator

alsora commented Jan 30, 2024

Windows CI is green, indicating that the previous failure is unrelated.
Merging this PR

@alsora alsora merged commit 99f0fc9 into ros2:rolling Jan 30, 2024
3 checks passed
@mwcondino mwcondino deleted the events-executor-timer-cancel branch January 30, 2024 17:22
CursedRock17 pushed a commit to CursedRock17/rclcpp that referenced this pull request Apr 2, 2024
Signed-off-by: methylDragon <methylDragon@gmail.com>

Fix a format-security warning when building with clang (ros2#2171)

In particular, you should never have a "bare" string in a
printf-like call; that could potentially access uninitialized
memory. Instead, make sure to format the string with %s.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Add missing stdexcept include (ros2#2186)

Signed-off-by: Øystein Sture <os@skarvtech.com>

Fix race condition in events-executor (ros2#2177)

The initial implementation of the events-executor contained a bug where the executor
would end up in an inconsistent state and stop processing interrupt/shutdown notifications.
Manually adding a node to the executor results in a) producing a notify waitable event
and b) refreshing the executor collections.
The inconsistent state would happen if the event was processed before the collections were
finished to be refreshed: the executor would pick up the event but be unable to process it.
This would leave the `notify_waitable_event_pushed_` flag to true, preventing additional
notify waitable events to be pushed.
The behavior is observable only under heavy load.

Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>

Changelog

Signed-off-by: Yadunund <yadunund@openrobotics.org>

21.1.1

Declare rclcpp callbacks before the rcl entities (ros2#2024)

This is to ensure callbacks are destroyed last
on entities destruction, avoiding the gap in time
in which rmw entities hold a reference to a
destroyed function.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>
Co-authored-by: Mauro Passerino <mpasserino@irobot.com>

add mutex to protect events_executor current entity collection (ros2#2187)

* add mutex to protect events_executor current entity collection and unit-test

Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>

* be more precise with mutex locks; make stress test less stressfull

Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>

* fix uncrustify error

Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>

---------

Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>

Feature/available capacity of ipm (ros2#2173)

* added available_capacity to get the lowest number of free capacity for intra-process communication for a publisher

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

* added unit tests for available_capacity

Signed-off-by: Joshua Hampp <j.hampp@eu.denso.com>
Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

* fixed typos in comments

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

* Updated warning

Co-authored-by: Alberto Soragna <alberto.soragna@gmail.com>
Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

* returning 0 if ipm is disabled in lowest_available_ipm_capacity

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

* return 0 if no subscribers are present in lowest_available_capacity

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

* updated unit test

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

* update unit test

Signed-off-by: Joshua Hampp <j.hampp@eu.denso.com>

* moved available_capacity to a lambda function to be able to handle subscriptions which went out of scope

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

* updated unit test to check subscriptions which went out of scope

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>

---------

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>
Signed-off-by: Joshua Hampp <j.hampp@eu.denso.com>
Co-authored-by: Joshua Hampp <j.hampp@denso-adas.de>
Co-authored-by: Joshua Hampp <j.hampp@eu.denso.com>
Co-authored-by: Alberto Soragna <alberto.soragna@gmail.com>

remove nolint since ament_cpplint updated for the c++17 header (ros2#2198)

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

21.2.0

Use TRACETOOLS_ prefix for tracepoint-related macros (ros2#2162)

Signed-off-by: Christophe Bedard <bedard.christophe@gmail.com>

Remove flaky stressAddRemoveNode test (ros2#2206)

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

Fix up misspellings of "receive". (ros2#2208)

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

21.3.0

Enable callback group tests for connextdds (ros2#2182)

* Enable callback group tests for connextdds

* Enable executors and event executor tests for connextdds

* Enable qos events tests for connextdds

* Less flaky qos_event tests

Signed-off-by: Christopher Wecht <cwecht@mailbox.org>

Modifies timers API to select autostart state (ros2#2005)

* Modifies timers API to select autostart state

* Removes unnecessary variables

* Adds autostart documentation and expands some timer test

Signed-off-by: Voldivh <eloyabmfcv@gmail.com>

warning: comparison of integer expressions of different signedness (ros2#2219)

  ros2#2167 (comment)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Revamp the test_subscription.cpp tests. (ros2#2227)

The original motiviation to do this was a crash during
teardown when using a newer version of gtest.  But while
I was in here, I did a small overall cleanup, including:

1.  Moving code closer to where it is actually used.
2.  Getting rid of unused 'using' statements.
3.  Adding in missing includes.
4.  Properly tearing down and recreating the rclcpp
    context during test teardown (this fixed the actual
    bug).
5.  Making class members private where possible.
6.  Renaming class methods to our usual conventions.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Move always_false_v to detail namespace (ros2#2232)

Since this is a common idiom, especially under this name, we should
define the `always_false_v` template within a namespace to avoid
conflict with other libraries and user code. This could either be
`rclcpp::detail` if it's intended only for internal use or just `rclcpp`
if it's intended as a public helper. In this PR, I've initially chosen
the former.

Signed-off-by: Nathan Wiebe Neufeldt <nwiebeneufeldt@clearpath.ai>

Add new node interface TypeDescriptionsInterface to provide GetTypeDescription service (ros2#2224)

* TypeDescriptions interface with readonly param configuration

* Add parameter descriptor, to make read only

* example of spinning in thread for get_type_description service

* Add a basic test for the new interface

* Fix tests with new parameter

* Add comments about builtin parameters

Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Signed-off-by: William Woodall <william@osrfoundation.org>

Switch lifecycle to use the RCLCPP macros. (ros2#2233)

This ensures that they'll go out to /rosout and the disk.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Implement get_node_type_descriptions_interface for lifecyclenode and add smoke test for it (ros2#2237)

Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

22.0.0

Stop using constref signature of benchmark DoNotOptimize. (ros2#2238)

* Stop using constref signature of benchmark DoNotOptimize.

Newer versions of google benchmark (1.8.2 in my case) warn
that the compiler may optimize away the DoNotOptimize calls
when using the constref version.  Get away from that here
by explicitly *not* calling the constref version, casting
where necessary.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Instrument loaned message publication code path (ros2#2240)

Signed-off-by: Christophe Bedard <christophe.bedard@apex.ai>

associated clocks should be protected by mutex. (ros2#2255)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Change associated clocks storage to unordered_set (ros2#2257)

Signed-off-by: Luca Della Vedova <lucadv@intrinsic.ai>

Adding Missing Group Exceptions (ros2#2256)

* Adding Missing Group Exceptions

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>
Signed-off-by: Chris Lalancette <clalancette@gmail.com>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Add spin_all shortcut (ros2#2246)

Signed-off-by: Tony Najjar <tony.najjar@logivations.com>

Remove an unused variable from the events executor tests. (ros2#2270)

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Add a pimpl inside rclcpp::Node for future distro backports (ros2#2228)

* Add a pimpl inside rclcpp::Node for future distro backports

Signed-off-by: Emerson Knapp <emerson.b.knapp@gmail.com>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>

Adding Custom Unknown Type Error (ros2#2272)

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>
Co-authored-by: Christophe Bedard <bedard.christophe@gmail.com>

Do not crash Executor when send_response fails due to client failure. (ros2#2276)

* Do not crash Executor when send_response fails due to client failure.

Related to ros2/ros2#1253

It is not sane that a faulty client can crash our service Executor, as
discussed in the referred issue, if the client is not setup properly,
send_response may return RCL_RET_TIMEOUT, we should not throw an error
in this case.

Signed-off-by: Zang MingJie <zealot0630@gmail.com>

* Update rclcpp/include/rclcpp/service.hpp

Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Signed-off-by: Zang MingJie <zealot0630@gmail.com>

* address review comments.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Zang MingJie <zealot0630@gmail.com>
Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>
Co-authored-by: Zang MingJie <zealot0630@gmail.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

22.1.0

doc fix: call `canceled` only after goal state is in canceling. (ros2#2266)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Fix a typo in a comment. (ros2#2283)

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Revamp list_parameters to be more efficient and easier to read. (ros2#2282)

1. Use constref for the loop variable.
2. Do more work outside of the loop.
3. Skip doing unnecessary work where we can inside the loop.

With this in place, I measured about a 7% performance
improvement over the previous implementation.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Add rcl_logging_interface as an explicit dependency. (ros2#2284)

It is depended on by rclcpp/src/rclcpp/logger.cpp, but
the dependency was not explicitly declared (it was
being inherited from rcl, I believe).  Fix that here.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

add logger level service to lifecycle node. (ros2#2277)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Remove unnecessary lambda captures in the tests. (ros2#2289)

* Remove unnecessary lambda captures in the tests.

This was pointed out by compiling with clang.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Correct the position of a comment. (ros2#2290)

Signed-off-by: Jiaqi Li <ljq0831@qq.com>

Make Rate to select the clock to work with (ros2#2123)

* Make Rate to select the clock to work with
Add ROSRate respective with ROS time

* Make GenericRate class to be deprecated

* Adjust test cases for new rates

is_steady() to be deprecated

Signed-off-by: Alexey Merzlyakov <alexey.merzlyakov@samsung.com>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>

Fix C++20 allocator construct deprecation (ros2#2292)

Signed-off-by: Guilherme Rodrigues <guilherme.rodrigues@ait.ac.at>

Topic correct typeadapter deduction (ros2#2294)

* fix TypeAdapter deduction

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

22.2.0

Cleanup flaky timers_manager tests. (ros2#2299)

* Cleanup flaky timers_manager tests.

The timers_manager tests were relying too heavily on
specific timings; this caused them to be flaky on the
buildfarm, particularly on Windows.

Here, we increase the timeouts, and remove one test which
just relies too heavily on specific timeouts.  This should
make this test much less flaky on the buildfarm.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

fix(ClientGoalHandle): Made mutex recursive to prevent deadlocks (ros2#2267)

* fix(ClientGoalHandle): Made mutex recursive to prevent deadlocks

This prevents deadlocks in cases, were e.g. get_status() would
be called on the handle in a callback of the handle.

* test(rclcpp_action): Added test for deadlocks during access of a goal handle

This test checks, if the code deadlocks, if methods on the goal handle are
called from the callbacks.

Signed-off-by: Janosch Machowinski <J.Machowinski@cellumation.com>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Update API docs links in package READMEs (ros2#2302)

Signed-off-by: Christophe Bedard <christophe.bedard@apex.ai>

Fix the return type of Rate::period. (ros2#2301)

In a recent commit (bc43577),
we reworked how the Rate class worked so it could be
used with ROS time as well.  Unfortunately, we also
accidentally broke the API of it by changing the return
type of Rate::period to a Duration instead of a
std::chrono::nanoseconds .  Put this back to a std::chrono::nanoseconds;
if we want to change it to a Duration, we'll have to
add a new method and deprecate this one.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

23.0.0

fix the depth to relative in list_parameters (ros2#2300)

* fix the depth to relative in list_parameters

Signed-off-by: leeminju531 <dlalswn531@naver.com>

Decouple rosout publisher init from node init. (ros2#2174)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Documentation for list_parameters  (ros2#2315)

* list_parameters documentation

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>

Removing Old Connext Tests (ros2#2313)

* Removing Old Connext Tests

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>

Update SignalHandler get_global_signal_handler to avoid complex types in static memory (ros2#2316)

* Update SignalHandler get_global_signal_handler to avoid complex types in static memory

This was flagged by msan as a problem.

There's a description of why this is a potential problem here: https://google.github.io/styleguide/cppguide.html#Static_and_Global_Variables

Signed-off-by: Tully Foote <tullyfoote@intrinsic.ai>
Co-authored-by: William Woodall <william+github@osrfoundation.org>

Add locking to protect the TimeSource::NodeState::node_base_ (ros2#2320)

We need this because it is possible for one thread to
be handling the on_parameter_event callback while another
one is detaching the node.  This lock will protect that
from happening.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Add missing header required by the rclcpp::NodeOptions type (ros2#2324)

Signed-off-by: Ignacio Vizzo <ignacio@dexory.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

23.1.0

Adding API to copy all parameters from one node to another (ros2#2304)

Signed-off-by: stevemacenski <stevenmacenski@gmail.com>

remove invalid sized allocation test for SerializedMessage. (ros2#2330)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

add clients & services count (ros2#2072)

* add clients & services count

* add count clients,services tests

Signed-off-by: leeminju531 <dlalswn531@naver.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

23.2.0

Fix rclcpp_lifecycle inclusion on Windows. (ros2#2331)

The comment in the commit explains this clearly, but
on Windows ERROR is a macro.  The reuse of it, even
as an enum, causes compilation errors on downstream
users.  Push the macro and undefine it so downstream
consumers can freely include it.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Remove useless ROSRate class (ros2#2326)

Signed-off-by: Alexey Merzlyakov <alexey.merzlyakov@samsung.com>

Fixes pointed out by the clang analyzer. (ros2#2339)

1. Remove the default Logger copy constructor without copy
assignment (rule of three -> rule of zero).
2. Remove an unnecessary capture in a lambda.
3. Mark a variable unused.
4. Mark a method as override.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

address rate related flaky tests. (ros2#2329)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Adjust rclcpp usage of type description service (ros2#2344)

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

Add missing 'enable_rosout' comments (ros2#2345)

Signed-off-by: Jiaqi Li <ljq0831@qq.com>

Use message_info in SubscriptionTopicStatistics instead of typed message (ros2#2337)

* Use message_info in SubscriptionTopicStatistics instead of typed message

- Untemplatize the rclcpp::topic_statistics::SubscriptionTopicStatistics
class. Now we will be using message_info instead of typed deserialized
messages in the handle_message callbacks.

* Fix test_receive_stats_include_window_reset by using publisher emulator

- Emulate publishing messages by directly calling
rclcpp::Subscription::handle_message(msg_shared_ptr, message_info) and
settling up needed message_info.source_timestamp

Signed-off-by: Michael Orlov <michael.orlov@apex.ai>
Co-authored-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Disable the loaned messages inside the executor. (ros2#2335)

* Disable the loaned messages inside the executor.

They are currently unsafe to use; see the comment in the
commit for more information.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Add a custom deleter when constructing rcl_service_t (ros2#2351)

* Add a custom deleter when constructing rcl_service_t

In the type description service construction, we were previously passing
the shared_ptr to the rcl_service_t with the assumption that
rclcpp::Service would do the clean up.  This was an incorrect
assumption, and so I have added a custom deleter to fini the service and
delete when the shared_ptr is cleaned up.

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

Serialized Messages with Topic Statistics (ros2#2274)

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>

rclcpp::Time::max() clock type support. (ros2#2352)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Updates to not use std::move in some places. (ros2#2353)

gcc 13.1.1 complains that these uses inhibit copy elision.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

fix (signal_handler.hpp): spelling (ros2#2356)

Signed-off-by: Zard-C <patrick.zhang5233@gmail.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

24.0.0

fix(rclcpp_components): increase the service queue sizes in component_container (ros2#2363)

* fix(rclcpp_components): increase the service queue sizes in component_container

Signed-off-by: M. Fatih Cırıt <mfc@leodrive.ai>

Support users holding onto shared pointers in the message memory pool (ros2#2336)

* Support users holding onto shared pointers in the message memory pool

Before this commit, the MessageMemoryPool would actually
reuse messages in the pool, even if the user had taken
additional shared_ptr copies.

This commit fixes things so that we properly handle that situation.  In particular,
we allocate memory during class initialization, and delete
it during destruction.  We then run the constructor when
we hand the pointer out, and the destructor (only) when
we return it to the pool.  This keeps things consistent.
We also add in locks, since in a multi-threaded scenario we need
to protect against multiple threads accessing the pool
at the same time.

With this in place, things work as expected when users hold
shared_ptr copies.  We also add in a test for this situation.

One note about performance: this update preserves the
"no-allocations-at-runtime" aspect of the MessagePool.  However,
there are some tradeoffs with CPU time here, particularly with
very large message pools.  This could probably be optimized
further to do less work when trying to add items back to the
free_list, but I view that as a further enhancement.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Fix data race in EventHandlerBase (ros2#2349)

Both the EventHandler and its associated pubs/subs share
the same underlying rmw event listener.
When a pub/sub is destroyed, the listener is destroyed.
There is a data race when the ~EventHandlerBase wants
to access the listener after it has been destroyed.

The EventHandler stores a shared_ptr of its associated pub/sub.
But since we were clearing the listener event callbacks on the
base class destructor ~EventHandlerBase, the pub/sub was
already destroyed, which means the rmw event listener was also
destroyed, thus causing a segfault when trying to obtain it
to clear the callbacks.

Clearing the callbacks on ~EventHandler instead of
~EventHandlerBase avoids the race, since the pub/sub are still valid.

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

aligh with rcl that a rosout publisher of a node might not exist (ros2#2357)

* aligh with rcl

* keep same behavior with rclpy

1. to not throw exception durning rcl_logging_rosout_remove_sublogger
2. reset error message for RCL_RET_NOT_FOUND

* test for a node with rosout disabled

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

feat(rclcpp_components): support events executor in node main template (ros2#2366)

Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>

Switch to target_link_libraries. (ros2#2374)

That way we can hide more of the implementation by using
the PRIVATE keyword.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Adding QoS to subscription options (ros2#2323)

* Adding QoS to subscription options

Signed-off-by: CursedRock17 <mtglucas1@gmail.com>

make type support helper supported for service (ros2#2209)

* make type support helper supported for service and action as well

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* not to use template and only add the necessary service type currently

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* update comment

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

* add deprecated cycle for `get_typesupport_handle`

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

---------

Signed-off-by: Chen Lihui <lihui.chen@sony.com>

Updated GenericSubscription to AnySubscriptionCallback (ros2#1928)

* added rclcpp::SerializedMessage support for AnySubscriptionCallback

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>
Signed-off-by: Joshua Hampp <j.hampp@eu.denso.com>

* using AnySubscription callback for generic subscriptiion

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>
Signed-off-by: Joshua Hampp <j.hampp@eu.denso.com>

* updated tests

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>
Signed-off-by: Joshua Hampp <j.hampp@eu.denso.com>

* Remove comment

Signed-off-by: Joshua Hampp <j.hampp@eu.denso.com>

---------

Signed-off-by: Joshua Hampp <j.hampp@denso-adas.de>
Signed-off-by: Joshua Hampp <j.hampp@eu.denso.com>
Co-authored-by: Joshua Hampp <j.hampp@denso-adas.de>
Co-authored-by: Jacob Perron <jacob@openrobotics.org>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

25.0.0

Increase timeout for rclcpp_lifecycle to 360 (ros2#2395)

* Increase timeout for rclcpp_lifecycle to 360

Signed-off-by: Jorge Perez <jjperez@ekumenlabs.com>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>

Stop storing the context in the guard condition. (ros2#2400)

* Stop storing the context in the guard condition.

This was creating a circular reference between GuardCondition
and Context, so that Context would never be cleaned up.
Since we never really need the GuardCondition to know
about its own Context, remove that part of the circular
reference.

While we are in here, we also change the get_context()
lambda to a straight weak_ptr; there is no reason for the
indirection since the context for the guard condition
cannot change at runtime.

We also remove the deprecated version of the
get_notify_guard_condition().  That's because there is
no way to properly implement it in the new scheme, and
it seems to be unused outside of rclcpp.

Finally, we add in a test that guarantees the use_count
is what we expect when inside and leaving a scope, ensuring
that contexts will properly be destroyed.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Add transient local durability support to publisher and subscriptions when using intra-process communication (ros2#2303)

* Add intra process transient local durability support to publisher and subscription

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

* Remove durability_is_transient_local_ from publisher_base
Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

* Design changes that move most transient local publish functionalities out of
intra process manager into intra process manager

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

* Move transient local publish to a separate function

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

* Remove publisher buffer weak ptr from intra process manager when it associated publisher is removed.

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

* Remove incorrectly placed RCLCPP_PUBLIC

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

* Add missing RCLCPP_PUBLIC

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

* Expand RingBufferImplementation beyond shared_ptr and unique_ptr

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

* Comment and format fix

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

---------

Signed-off-by: Jeffery Hsu <jefferyyjhsu@gmail.com>

Increase the cppcheck timeout to 600 seconds. (ros2#2409)

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

26.0.0

Make sure to mark RingBuffer methods as 'override'. (ros2#2410)

This gets rid of a warning when building under clang.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Removed deprecated header (ros2#2413)

Signed-off-by: Alejandro Hernández Cordero <ahcorde@gmail.com>

[events executor] - Fix Behavior with Timer Cancel (ros2#2375)

* fix

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* add timer cancel tests

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* cleanup header include

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* reverting change to timer_greater function

Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

* use std::optional, and handle edgecase of 1 cancelled timer

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* clean up run_timers func

Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

* some fixes and added tests for cancel then reset of timers.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* refactor and clean up. remove cancelled timer tracking.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* remove unused method for size()

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* linting

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* relax timing constraints in tests

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* further relax timing constraints to ensure windows tests are not flaky.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* use sim clock for tests, pub clock at .25 realtime rate.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

---------

Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>
Co-authored-by: Gus Brigantino <gusbrig97@gmail.com>

Split test_executors up into smaller chunks. (ros2#2421)

The original reason is that on Windows Debug, we were
failing to compile because the compilation unit was
too large.  But also this file was way too large (1200
lines), so it makes sense to split it up.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Changelog.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

27.0.0

feat: add/minus for msg::Time and rclcpp::Duration (ros2#2419)

* feat: add/minus for msg::Time and rclcpp::Duration

Signed-off-by: HuaTsai <huatsai.eed07g@nctu.edu.tw>

crash on no class found (ros2#2415)

* crash on no class found

* error on no class found instead of no callback groups

Signed-off-by: Adam Aposhian <adam.aposhian@fireflyautomatix.com>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>

Update quality declaration documents (ros2#2427)

Signed-off-by: Christophe Bedard <christophe.bedard@apex.ai>

Set hints to find the python version we actually want. (ros2#2426)

The comment in the commit explains the reasoning behind it.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

fix doxygen syntax for NodeInterfaces (ros2#2428)

Signed-off-by: Jonas Otto <jonas@jonasotto.com>

Remove the set_deprecated signatures in any_subscription_callback. (ros2#2431)

These have been deprecated since April 2021, so it is safe
to remove them now.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Add EXECUTOR docs (ros2#2440)

Signed-off-by: Ruddick Lawrence <679360+mrjogo@users.noreply.github.com>

Various cleanups to deal with uncrustify 0.78. (ros2#2439)

These should also work with uncrustify 0.72.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Rule of five: implement move operators (ros2#2425)

Signed-off-by: Tim Clephas <tim.clephas@nobleo.nl>

Implement generic client (ros2#2358)

* Implement generic client

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Fix the incorrect parameter declaration

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Deleted copy constructor and assignment for FutureAndRequestId

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Update codes after rebase

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Address review comments

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Address review comments from iuhilnehc-ynos

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Correct an error in a description

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Fix window build errors

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Address review comments from William

Signed-off-by: Barry Xu <barry.xu@sony.com>

* Add doc strings to create_generic_client

Signed-off-by: Barry Xu <barry.xu@sony.com>

---------

Signed-off-by: Barry Xu <barry.xu@sony.com>

Fix TypeAdapted publishing with large messages. (ros2#2443)

Mostly by ensuring we aren't attempting to store
large messages on the stack.  Also add in tests.
I verified that before these changes, the tests failed,
while after them they succeed.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

Modify rclcpp_action::GoalUUID hashing algorithm (ros2#2441)

* Add unit tests for hashing rclcpp_action::GoalUUID's

* Use the FNV-1a hash algorithm for Goal UUID

Signed-off-by: Mauro Passerino <mpasserino@irobot.com>

relax the test simulation rate for timer canceling tests. (ros2#2453)

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Revert "relax the test simulation rate for timer canceling tests. (ros2#2453)" (ros2#2456)

This reverts commit 1c350d0.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

enable simulation clock for timer canceling test. (ros2#2458)

* enable simulation clock for timer canceling test.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* move MainExecutorTypes to test_executors_timer_cancel_behavior.cpp.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

refactor and improve the parameterized spin_some tests for executors (ros2#2460)

* refactor and improve the spin_some parameterized tests for executors

Signed-off-by: William Woodall <william@osrfoundation.org>

* disable spin_some_max_duration for the StaticSingleThreadedExecutor and EventsExecutor

Signed-off-by: William Woodall <william@osrfoundation.org>

* fixup and clarify the docstring for Executor::spin_some()

Signed-off-by: William Woodall <william@osrfoundation.org>

* style

Signed-off-by: William Woodall <william@osrfoundation.org>

* review comments

Signed-off-by: William Woodall <william@osrfoundation.org>

---------

Signed-off-by: William Woodall <william@osrfoundation.org>

fix spin_some_max_duration unit-test for events-executor (ros2#2465)

Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>

Do not generate the exception when action service response timeout. (ros2#2464)

* Do not generate the exception when action service response timeout.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

* address review comment.

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

---------

Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com>

Changelog.

Signed-off-by: Marco A. Gutierrez <marcogg@marcogg.com>

28.0.0

Signed-off-by: Marco A. Gutierrez <marcogg@marcogg.com>

fix flakiness in TestTimersManager unit-test (ros2#2468)

the previous version of the test was relying on the assumption that a timer with 1ms period gets called at least 6 times if the main thread waits 15ms. this is true most of the times, but it's not guaranteed, especially when running the test on windows CI servers. the new version of the test makes no assumptions on how much time it takes for the timers manager to invoke the timers, but rather focuses on ensuring that they are called the right amount of times, which is what's important for the purpose of the test

Signed-off-by: Alberto Soragna <alberto.soragna@gmail.com>

Utilize rclcpp::WaitSet as part of the executors (ros2#2142)

* Deprecate callback_group call taking context

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Add base executor objects that can be used by implementors

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Template common operations

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Address reviewer feedback:

* Add callback to EntitiesCollector constructor
* Make function to check automatically added callback groups take a list

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Lint

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Address reviewer feedback and fix templates

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Lint and docs

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Make executor own the notify waitable

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Add pending queue to collector, remove from waitable

Also change node's get_guard_condition to return shared_ptr

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Change interrupt guard condition to shared_ptr

Check if guard condition is valid before adding it to the waitable

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Lint and docs

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Utilize rclcpp::WaitSet as part of the executors

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Don't exchange atomic twice

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Fix add_node and add more tests

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Make get_notify_guard_condition follow API tick-tock

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Improve callback group tick-tocking

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Don't lock twice

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Address reviewer feedback

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Add thread safety annotations and make locks consistent

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* @wip

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Reset callback groups for multithreaded executor

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Avoid many small function calls when building executables

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Re-trigger guard condition if buffer has data

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Address reviewer feedback

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Trace points

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Remove tracepoints

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Reducing diff

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Reduce diff

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Uncrustify

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Restore tests

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Back to weak_ptr and reduce test time

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* reduce diff and lint

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Restore static single threaded tests that weren't working before

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Restore more tests

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Fix multithreaded test

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Fix assert

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Fix constructor test

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Change ready_executables signature back

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Don't enforce removing callback groups before nodes

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Remove the "add_valid_node" API

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Only notify if the trigger condition is valid

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Only trigger if valid and needed

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Fix spin_some/spin_all implementation

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Restore single threaded executor

Signed-off-by: Michael Carroll <michael@openrobotics.org>

* Picking ABI-incompatible executor changes

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Add PIMPL

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Additional waitset prune

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Fix bad merge

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Expand test timeout

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Introduce method to clear expired entities from a collection

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Make sure to call remove_expired_entities().

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

* Prune queued work when callback group is removed

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Prune subscriptions from dynamic storage

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Styles fixes.

Signed-off-by: Chris Lalancette <clalancette@gmail.com>

* Re-trigger guard conditions

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Condense to just use watiable.take_data

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Lint

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Address reviewer comments (nits)

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Lock mutex when copying

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Refactors to static single threaded based on reviewers

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* More small refactoring

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Lint

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Lint

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Add ready executable accessors to WaitResult

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Make use of accessors from wait_set

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Fix tests

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Fix more tests

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Tidy up single threaded executor implementation

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* Don't null out timer, rely on call

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>

* change how timers are checked from wait result in executors

Signed-off-by: William Woodall <william@osrfoundation.org>

* peak -> peek

Signed-off-by: William Woodall <william@osrfoundation.org>

* fix bug in next_waitable logic

Signed-off-by: William Woodall <william@osrfoundation.org>

* fix bug in StaticSTE that broke the add callback groups to executor tests

Signed-off-by: William Woodall <william@osrfoundation.org>

* style

Signed-off-by: William Woodall <william@osrfoundation.org>

---------

Signed-off-by: Michael Carroll <mjcarroll@intrinsic.ai>
Signed-off-by: Michael Carroll <michael@openrobotics.org>
Signed-off-by: Chris Lalancette <clalancette@gmail.com>
Signed-off-by: William Woodall <william@osrfoundation.org>
Co-authored-by: Chris Lalancette <clalancette@gmail.com>
Co-authored-by: William Woodall <william@osrfoundation.org>

update rclcpp::Waitable API to use references and const (ros2#2467)

Signed-off-by: William Woodall <william@osrfoundation.org>

Add tracepoint for generic publisher/subscriber (ros2#2448)

Signed-off-by: h-suzuki <h-suzuki@isp.co.jp>
jplapp pushed a commit to pixel-robotics/rclcpp that referenced this pull request May 17, 2024
* fix

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* add timer cancel tests

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* cleanup header include

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* reverting change to timer_greater function

Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

* use std::optional, and handle edgecase of 1 cancelled timer

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* clean up run_timers func

Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

* some fixes and added tests for cancel then reset of timers.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* refactor and clean up. remove cancelled timer tracking.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* remove unused method for size()

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* linting

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* relax timing constraints in tests

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* further relax timing constraints to ensure windows tests are not flaky.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* use sim clock for tests, pub clock at .25 realtime rate.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

---------

Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>
Co-authored-by: Gus Brigantino <gusbrig97@gmail.com>
(cherry picked from commit 99f0fc9)
jplapp pushed a commit to pixel-robotics/rclcpp that referenced this pull request May 17, 2024
* fix

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* add timer cancel tests

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* cleanup header include

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* reverting change to timer_greater function

Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

* use std::optional, and handle edgecase of 1 cancelled timer

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* clean up run_timers func

Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

* some fixes and added tests for cancel then reset of timers.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* refactor and clean up. remove cancelled timer tracking.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* remove unused method for size()

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* linting

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* relax timing constraints in tests

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* further relax timing constraints to ensure windows tests are not flaky.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* use sim clock for tests, pub clock at .25 realtime rate.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

---------

Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>
Co-authored-by: Gus Brigantino <gusbrig97@gmail.com>
(cherry picked from commit 99f0fc9)
apojomovsky pushed a commit to apojomovsky/rclcpp that referenced this pull request Jul 8, 2024
* fix

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* add timer cancel tests

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* cleanup header include

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* reverting change to timer_greater function

Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

* use std::optional, and handle edgecase of 1 cancelled timer

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* clean up run_timers func

Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>

* some fixes and added tests for cancel then reset of timers.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* refactor and clean up. remove cancelled timer tracking.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* remove unused method for size()

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* linting

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* relax timing constraints in tests

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* further relax timing constraints to ensure windows tests are not flaky.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

* use sim clock for tests, pub clock at .25 realtime rate.

Signed-off-by: Matt Condino <mwcondino@gmail.com>

---------

Signed-off-by: Matt Condino <mwcondino@gmail.com>
Signed-off-by: Gus Brigantino <gusbrig97@gmail.com>
Co-authored-by: Gus Brigantino <gusbrig97@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants