Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't insert a TF frame if one of the same timestamp already exists, instead just overwrite it. #426

Closed
wants to merge 9 commits into from

Conversation

pbeeson
Copy link
Contributor

@pbeeson pbeeson commented Oct 19, 2019

So, we ran into an issue with ROS on a project (still on kinetic due to legacy robot issues, but this applies to melodic too, so here's a melodic PR).

Generally, if a node is broadcasting TF data based on other incoming frames and/or other information, and something goes wrong, it will stop publishing data. But we ran into a scenario where something went wrong in a node, and it kept broadcasting the same TF data with the same timestamp, forever. When this happened, we noticed that all nodes with a tf listener started immediately consuming more and more memory.

In looking into this, it is clear that when storing an incoming TF in the tf2 listener code, that if a frame between parent and child exists with the same timestamp, that the incoming data is not ignored, nor does it overwrite the previous frame, but it gets inserted next to the existing frame with the same timestamp, so that now there are multiple transforms between the same two frames with the same timestamp.

I added some debug console outputs and a small publisher node and a node that just makes a tf listener and spins. This proved that the storage for tf listener does indeed grow infinitely so long as no newer data comes in that is more than max_duration (10 seconds) to force these to be pruned.

The fix is VERY simple, and has been tested with the stand alone programs. where previously the size of storage_ grew without bound when sending the exact same message in a loop, but now remains size 1 for that parent/child transform.

@pbeeson pbeeson changed the title Don't insert a TF frame is one of the same timestamp already exists, instead just overwrite it. Don't insert a TF frame if one of the same timestamp already exists, instead just overwrite it. Oct 19, 2019
Copy link
Member

@tfoote tfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch, I'd like to switch it to drop instead of overwriting. But otherwise it looks good.

Adding a quick test would also be great to make sure that it keeps working.

@@ -261,7 +261,10 @@ bool TimeCache::insertData(const TransformStorage& new_data)
break;
storage_it++;
}
storage_.insert(storage_it, new_data);
if (storage_it != storage_.end() && storage_it->stamp_ == new_data.stamp_)
*storage_it = new_data;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in #414 I'd rather see the newer data dropped instead of overwriting the older data. With the current semantics /tf data is effectively considered immutable. And without the ability to rewrite sections of data consistently you could easily end up with data from two datasources being improperly interpolated between.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I changed the patch to return immediately when a duplicate timestamp is seen instead of overwriting data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In simply changing the assignment to use the old value, one of the tests in test_tf2 was failing. Further analysis showed that it was not anchoring the forward and inverse transforms to the first frame in the list but the second one.

So, the new code for rejecting redundant transforms actually caught a bug in the tests. 🍰

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thanks!

Patrick Beeson and others added 3 commits October 21, 2019 21:37
@tfoote
Copy link
Member

tfoote commented Oct 23, 2019

Great thanks for the update it looks great. I'd like to add one more element which is to add a warning that the message is being dropped. Also I'd like to check if we should be returning false from the insertData call if we don't insert. As I think that makes sense since the data wasn't inserted like the other false return case. Maybe the error can be caught one layer higher, but likely we'll have to just print a warning to console.

@pbeeson
Copy link
Contributor Author

pbeeson commented Oct 24, 2019 via email

@pbeeson
Copy link
Contributor Author

pbeeson commented Oct 24, 2019

OK @tfoote. I pushed a change that does accomplish both an error message AND returns false. Because there are 2 ways false can happen now, and there are 2 different error messages, the easiest was to borrow what other functions do here and pass back an error_message to the higher level.

@pbeeson pbeeson requested a review from tfoote November 1, 2019 17:20
@tfoote
Copy link
Member

tfoote commented Nov 6, 2019

This pattern looks good for the long term with passing the, but at first pass it looks like adding the new argument is an ABI breaking change that I'd like to avoid backporting to the existing distros to avoid breaking people.

I can setup a new noetic-devel branch to retarget this clean implementation and then we can possibly reuse your previous implementation for the backports.

@pbeeson
Copy link
Contributor Author

pbeeson commented Nov 6, 2019 via email

@tfoote
Copy link
Member

tfoote commented May 13, 2020

Merged in #459

@flixr
Copy link

flixr commented May 13, 2020

We ran into the same problem...
It would be nice to have the "original" fix (without the error string) backported to melodic.

Earthwings added a commit to Earthwings/annotate that referenced this pull request May 10, 2021
In ROS Noetic the behavior has changed and transform updates
are ignored if the timestamp is the same, see
ros/geometry2#426 (comment)

This breaks the update logic in the previous approach
ooeygui pushed a commit to ms-iot/geometry2 that referenced this pull request Oct 12, 2022
* [tf2_geometry_msgs] pep8 formatting

* [tf2_geometry_msgs] fix copyright

* tf2_geometry_msgs header guard and copyright

* More flake8 stuff

Co-authored-by: Bjar Ne <gleichdick@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants