Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rosbag: start and end times are wrong (implies wrong duration) #952

Open
NikolausDemmel opened this issue Dec 20, 2016 · 17 comments
Open

rosbag: start and end times are wrong (implies wrong duration) #952

NikolausDemmel opened this issue Dec 20, 2016 · 17 comments
Labels

Comments

@NikolausDemmel
Copy link
Contributor

It seems there is something wrong with the logic that computes the start and end time of bags. This results in the wrong times as well as wrong duration. The issue is present for the rosbag info output as well as the get_start_time() and get_end_time() Python API.

The issue is here:

end_stamp = self._chunks[-1].end_time.to_sec()

    def get_end_time(self):
        """
        Returns the end time of the bag.
        @return: a timestamp of the end of the bag
        @rtype: float, timestamp in seconds, includes fractions of a second
        """
        
        if self._chunks:
            end_stamp = self._chunks[-1].end_time.to_sec()
        else:
            if not self._connection_indexes:
                raise ROSBagException('Bag contains no message')
            end_stamp = max([index[-1].time.to_sec() for index in self._connection_indexes.values()])
        
        return end_stamp

If there are _chunks, the line

end_stamp = self._chunks[-1].end_time.to_sec()

assumes that the last chunk contains the last message, which is not necesarily the case.

Similarly, the first chunk does not necessarily contain the first message.

I guess we need to either look at all chunks and take the min / max of their start / end times, or maybe can we somehow use the index to get to the first and last message more efficiently?

I'm not sure if some other place except the Bag class accesses the chunks directly, but at least for the Bag class the places that need to be updated are get_end_time, get_start_time, __str__, and _get_yaml_info.

@chaoflow
Copy link
Contributor

  • Do messages within a chunk need to be in chronological order?
  • Do chunks need to be in chronological order?

In http://wiki.ros.org/Bags/Format/2.0, I could not find information regarding this.

@wjwwood
Copy link
Member

wjwwood commented Dec 21, 2016

I don't have the citations handy, but based on my experience I would say:

  • Do messages within a chunk need to be in chronological order?

No. I don't think so because to require this would require rosbag to collect messages that will go into a chunk and before writing them sort them chronologically. I do not believe the C++ code does this.

Do chunks need to be in chronological order?

No. I don't think so because the way the C++ rosbag::View iterates over the messages, in my opinion, is that it considers all entries in the index in order to iterate in chronological order. If the chunks were know to be in order, then they would only need to consider the order within the current chunk.

This is a similar function in C++ which calculates the earliest time within a "View":

https://github.com/ros/ros_comm/blob/kinetic-devel/tools/rosbag_storage/src/view.cpp#L174-L187

Based on that, I think this Python function is just naive and not correct.

@wjwwood wjwwood added the bug label Dec 21, 2016
@wjwwood
Copy link
Member

wjwwood commented Dec 21, 2016

Well, on further reflection, my answer to the first question is a bit off. In normal use of rosbag record, messages should always be received in order (by receive time), but my logic still holds true if you're building the bag programmatically. There's nothing to stop you from inserting the messages out of chronological order using the C++ bag API.

@NikolausDemmel
Copy link
Contributor Author

@wjwwood What you say is in line with what I think it should be as well. However, I also don't have a citation to back this up. I agree that the C++ implementation probably handles this correctly, but I'm not sure about the Python implementation. I would assume it does (mostly). I.e. I hope that bag.read_messages(...) returns the bags in chronological order no matter how the chunks are sorted and how the messages are sorted within chunks.

To fix the issue of start and end time, I'm not sure what is the best way. It seems to me that the if / else is there for compatibility with previous formats that have _connection_indexes, but not _chunks.

Naively I would have said to use:

start_stamp = min([c.start_time.to_sec() for c in self._chunks])

However, there can be quite a lot of chunks while I would assume most of the time fewer connections and using the indices seems more reasonable:

start_stamp = min([index[0].time.to_sec() for index in self._connection_indexes.values()])

I'm just not sure if the _connection_indexes are correctly filled in the format 2.0 case with chunks.

I guess there was a reason why the if / else was introduced (I am assuming that that happened when bag format 2.0 was implemented).

@NikolausDemmel
Copy link
Contributor Author

P.S.: As you say the issue would likely not manifest or be quite subtle for bags that are recorded because the chunks will probably be mostly in chronological order.

One case where we notice this is when we record a bag, and then using the python API append one more message (containing metadata) with the bag's start time. That message will get a new chunk, meaning the the start and end time are now the same and the duration is reported as 0s.

@wjwwood
Copy link
Member

wjwwood commented Dec 22, 2016

However, there can be quite a lot of chunks while I would assume most of the time fewer connections and using the indices seems more reasonable:

Using the indices is definitely what we want, as you pointed out iterating over all the chunks would be quite expensive in some cases and the C++ code also uses the indices to answer this question.

One case where we notice this is when we record a bag, and then using the python API append one more message (containing metadata) with the bag's start time. That message will get a new chunk, meaning the the start and end time are now the same and the duration is reported as 0s.

Looks like someone just noted this in the wiki's rosbag cookbook:

http://wiki.ros.org/rosbag/Cookbook?action=diff&rev1=28&rev2=29

And updated the method for adding metadata:

http://wiki.ros.org/rosbag/Cookbook?action=diff&rev1=30&rev2=31

Is that one of you guys?

In any case I think that the behavior of bag.get_end_time() in Python is wrong and should be fixed.

@NikolausDemmel
Copy link
Contributor Author

Using the indices is definitely what we want, as you pointed out iterating over all the chunks would be quite expensive in some cases and the C++ code also uses the indices to answer this question.

So you are quite certain that the indices are properly set also for bag format 2.0? I.e. the fix would boild down to removing the if self._chunks: and just use the current else: branch?

Is that one of you guys?

Nope. I think this is a followup from hordurk/rosbag_metadata#3. At the time it did not investigate why the duration information was affected though...

@wjwwood
Copy link
Member

wjwwood commented Dec 22, 2016

@NikolausDemmel I'm not certain, but I do think it should be that way and I have experiences which would indicate to me that it should be always filled out correctly. However, that code might be in place for when the bag is unindexed, which can be the case when a bag file is not properly closed. You can emulate this by doing ctrl-\ while using rosbag record.

@chaoflow
Copy link
Contributor

Is the current order of self._chunks, i.e. order in the bag file, relevant or could we keep self._chunks chronologically ordered?

@NikolausDemmel
Copy link
Contributor Author

Is the current order of self._chunks, i.e. order in the bag file, relevant or could we keep self._chunks chronologically ordered?

I guess chronoglogical order of chunks is meaningless. The first and last message of a bag file might be in the same chunk.

@NikolausDemmel
Copy link
Contributor Author

The issue still remains. @wjwwood, are you inclined to merge a fix and also backport to at least kinetic if a PR is provided. I see that this has potential for a lot of breakage. What level of testing would be required?

@wjwwood
Copy link
Member

wjwwood commented Mar 5, 2018

I can help review a change for this, and based on the comparison with similar C++ code I'd be in favor of merging at least into melodic. So, targeting melodic I think it's likely to get merged, as for back porting to kinetic, I'd have to look at it in detail again to make a recommendation, and ultimately the decision is @dirk-thomas's.

What level of testing would be required?

I recommend an automated test which shows how the problem manifests (use a bag with out-of order message insertion and at least two chunks), and that will be enough for a melodic merge. For back porting, again I'd defer to @dirk-thomas, but I'd say we would need to communicate this with the community through discourse and give a period for feedback, since this is a behavior change (not necessarily an API change). It's the classic argument that "yes the behavior is wrong, but someone might be depending on that bad behavior", and then it becomes a judgement call on whether to favor a bug fix or stability in Kinetic.

@NikolausDemmel
Copy link
Contributor Author

Creating a test case is actually quite easy:

In [114]: b = Bag("foo.bag", 'w')
     ...: b.chunk_threshold = 0
     ...: b.write("foo", String("2"), Time(2))
     ...: b.write("foo", String("3"), Time(3))
     ...: b.write("foo", String("1"), Time(1))
     ...: b.close()
     ...: b = Bag("foo.bag", 'r')
     ...:

In [115]: b.get_start_time()
Out[115]: 2.0

In [116]: b.get_end_time()
Out[116]: 1.0

There is actually another issue, which is that read_messages does not return messages in chronological order in this case. I would say this is a bug as well.

In [117]: list(b.read_messages())
Out[117]:
[BagMessage(topic='foo', message=data: "2", timestamp=rospy.Time[2000000000]),
 BagMessage(topic='foo', message=data: "3", timestamp=rospy.Time[3000000000]),
 BagMessage(topic='foo', message=data: "1", timestamp=rospy.Time[1000000000])]

The C++ API seems to sort correctly. At least rosbag play publishes them in chronological order (whereas e.g. rostopic echo -b ... does not).

A quick investigation leads me to believe that the indices needs to be sorted, as the current read logic does not guarantee that if it is read from multiple chunks:

for chunk_info in self.bag._chunks:
self.bag._file.seek(chunk_info.pos)
_skip_record(self.bag._file)
self.bag._curr_chunk_info = chunk_info
for i in range(len(chunk_info.connection_counts)):
connection_id, index = self.read_connection_index_record()
self.bag._connection_indexes[connection_id].extend(index)

Later, the indices are used as if they were sorted:

for entry, _ in _mergesort(self._get_indexes(connections), key=lambda entry: entry.time):

@wjwwood
Copy link
Member

wjwwood commented Mar 6, 2018

Yeah, it sounds like this detail (bags can contain out of order messages) was just ignored by the Python implementation, and it's not surprising since recording bags almost always result in bags that happen in order.

@JzHuai0108
Copy link

Hi @NikolausDemmel, I have encountered the issue you mentioned several times.
Now I have an opposite opinion to yours (years ago though).

As you stated for ROS1 bags, the C++ implementation will sort the messages at a topic when reading a rosbag, and the python implementation will NOT sort the messages at a topic when reading the rosbag.
I think the python way is actually a good thing as the C++ implementation massages the original data by distorting their order.
Imagine this case: A rosbag has a sequences of msgs, with increasing timestamps, but at one point, a bad msg with an earlier timestamp arrives, and then more messages with proper timestamps.
In python way, we can easily get this bad msg and discard it by checking its timestamp to its neighbors.
But with the C++ API, we cannot do this easily as the msgs are sorted.

Stepping back, I've also been bitten by the inaccurate bag_end_time offered by the python implementation.
In that case, my program assumed that a rosbag with the same msgs would have the same bag_end_time. But in fact, the bag_end_time will change if you extract some msgs and then write them back to the rosbag.
That is exactly what I did and altered the bag_end_time.

Despite this small inconvenience, I still favor the python implementation.

@NikolausDemmel
Copy link
Contributor Author

I see your argument, but I'm not sure I'm convinced. This is about the recording timestamp of the message, not the timestamp in the message itself.

but at one point, a bad msg with an earlier timestamp arrives, and then more messages with proper timestamps.

IIRC, that's not how it works. In this case, the bad msg would be recorded with the current time as the recording timestamp, determined by the receiver (e.g. rosbag). If it's a message that contains a timestamp, set by the sender (not all message types do), you could later identify the bad message by comparing the recording timestamp with the message timestamp.

Note that I haven't worked with/on this in a long time, so take it with a grain of salt.

@JzHuai0108
Copy link

I dig deeper into the problem rosbag and find that the recording times were stripped and replaced by the sender times. There was a bad message from the sender with an earlier sender timestamp. These combined choked our program. In the original rosbag, the recording times increase monotonically. So you are right that we can find the bad message by checking the sender times given that the recording times are intact.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants