Adding clarifications about trip matching #15

Merged
merged 5 commits into from
Dec 11, 2015

Conversation

msbreadloaf
Contributor

I propose adding information on how to match trips in systems for which trip_ids are not unique or trip_ids are not stable/available.

@skinkie
Contributor

skinkie commented Nov 13, 2015

Could you describe the impact of this change? What are the risks for current implementations? Is there a producer/consumer pair that wants to implement it?

@magdalar
Contributor

This describes the rules that Google is currently applying.

We are thus one such consumer, and you (openov) are one of the producers :-D

As for current implementations, I don't think there's a risk, as this is
only adding clarity where there was none.


realtime feed at 10:01. By 10:05 we suddenly know that the trip will start not
at 10:10 but at 10:13. In our new realtime feed we can still identify this trip
as (T, 2015-05-25, 10:10:00) but provide a StopTimeUpdate with departure from
first stop at 10:13:00.
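
As a rough illustration of the example in this excerpt, the sketch below models the two feed snapshots with plain Python dataclasses. The class names mirror the GTFS-realtime message names, but they are simplified stand-ins written for this example, not the official bindings; in particular, the departure time is kept as a readable string here, which is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class TripDescriptor:
    """Simplified stand-in for the GTFS-realtime TripDescriptor."""
    trip_id: str
    start_date: str  # YYYYMMDD
    start_time: str  # HH:MM:SS, used as part of the trip's identity


@dataclass
class StopTimeUpdate:
    """Simplified stand-in; departure kept as a string for readability."""
    stop_sequence: int
    departure_time: str


@dataclass
class TripUpdate:
    trip: TripDescriptor
    stop_time_update: List[StopTimeUpdate] = field(default_factory=list)


# The identity of the trip stays (T, 2015-05-25, 10:10:00) in both feeds.
trip = TripDescriptor("T", "20150525", "10:10:00")

# Feed published at 10:01: no delay known yet.
feed_1001 = TripUpdate(trip)

# Feed published at 10:05: same identity, but the first stop now departs at 10:13:00.
feed_1005 = TripUpdate(trip, [StopTimeUpdate(1, "10:13:00")])

# A consumer can match the two updates because the descriptor is unchanged.
same_trip = feed_1001.trip == feed_1005.trip
```

The point of the sketch is that the delay lives in the stop time update, while the descriptor used for matching does not change between feeds.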
Collaborator

There is a corner case not covered here, previously raised by @kurtraschke in another thread. A transit agency may run two buses back-to-back in rush hour to double capacity. This is effectively the same trip_id, start_time, and start_date for both vehicles. In that case, you would also need vehicle_id to disambiguate.

@kurtraschke what agency was it that you said you observed this at?

The conversation @barbeau describes is in the comments on this document (along with some other points which may be relevant to this issue): https://docs.google.com/document/d/13pkZPVxWcphCLzSO-CpetUZ4onqZ0BN11oIpf78gNp4/edit#heading=h.fy8gzwwva0d4

I believe this is what WMATA does with some trips on the airport buses during the holidays.

Contributor

@barbeau This corner-case does not really exist. As it is written in the paragraph (start_date + start_time) does not have to correspond to the departure time from the first stop. For such back-to-back buses one can specify start_time differing by, say, 1 second, and still have them depart at the same moment from the first stop per stop_time_update.
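
A minimal sketch of the one-second offset idea, assuming a producer keeps a set of already issued (trip_id, start_date, start_time) keys. The helper names `to_seconds`, `to_hms`, and `assign_start_time` are hypothetical and exist only for this illustration.

```python
def to_seconds(hms: str) -> int:
    """Convert HH:MM:SS (hours may exceed 23) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total: int) -> str:
    """Convert seconds back to HH:MM:SS without wrapping at 24 hours."""
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def assign_start_time(trip_id: str, start_date: str, start_time: str, used: set) -> tuple:
    """Return a key that is unique in `used`, bumping start_time by 1 s on collision."""
    seconds = to_seconds(start_time)
    while (trip_id, start_date, to_hms(seconds)) in used:
        seconds += 1
    key = (trip_id, start_date, to_hms(seconds))
    used.add(key)
    return key


used_keys = set()
first_bus = assign_start_time("T", "20150525", "10:10:00", used_keys)
second_bus = assign_start_time("T", "20150525", "10:10:00", used_keys)
# first_bus keeps 10:10:00; second_bus is shifted to 10:10:01.
```

Both buses can still report the same departure from the first stop through their stop time updates; only the identifying start_time differs.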

Collaborator

@egorich239 True, you can require a unique start_time for each trip_id + start_date combination, and you no longer have the ambiguity issue. I'd argue, though, that this isn't an intuitive way to handle this issue from a producer's perspective.

For example, in the underlying data of our GTFS-rt feed, we see predictions start to be generated at a particular time X. In the loop fetching and formatting the data, we use the current time X as start_time. Given that start_time resolution is in seconds in GTFS-rt, it's very likely that two back-to-back buses would map to the same start_time. If you require a unique start_time, a producer would then need to increment X to a time in the future Y, and then check in future loops that X, Y, ... haven't already been used in a previous update. At this point you're treating start_time more like a generic unique immutable ID than an actual time.

IMHO, it's far more natural to simply output the actual start_time + vehicle_id, and require that combination be unique. You then don't need to deal with the potentially colliding start_time issue. And, start_time stays true to its name and is an actual time that corresponds in some way to that element.

In summary, I'd like to see duplicate start_times be allowed for the same start_date + trip_id combination IFF a unique vehicle_id exists for that element.
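
To make the proposed rule concrete, here is a hedged sketch of a check a consumer could run. It treats each element as a hypothetical tuple of (trip_id, start_date, start_time, vehicle_id) and flags a group as ambiguous unless every duplicate carries a distinct vehicle_id. The function name `find_ambiguous` and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict
from typing import Iterable, List, Optional, Tuple

Element = Tuple[str, str, str, Optional[str]]


def find_ambiguous(elements: Iterable[Element]) -> List[Tuple[str, str, str]]:
    """Return the (trip_id, start_date, start_time) keys that cannot be told apart."""
    groups = defaultdict(list)
    for trip_id, start_date, start_time, vehicle_id in elements:
        groups[(trip_id, start_date, start_time)].append(vehicle_id)

    ambiguous = []
    for key, vehicles in groups.items():
        if len(vehicles) == 1:
            continue  # a single element is always unambiguous
        # Duplicates are acceptable only if each has its own vehicle_id.
        if None in vehicles or len(set(vehicles)) != len(vehicles):
            ambiguous.append(key)
    return ambiguous


feed = [
    ("T", "20150525", "10:10:00", "bus_1"),
    ("T", "20150525", "10:10:00", "bus_2"),  # same key, distinct vehicle: allowed
    ("U", "20150525", "11:00:00", None),
    ("U", "20150525", "11:00:00", None),     # same key, no vehicle: ambiguous
]
problems = find_ambiguous(feed)
```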

Contributor

@barbeau @egorich239

I'm not particularly a fan of adopting vehicle_id as a part of the unique identifier here; first of all, it's not an attribute of TripDescriptor, and so would need to be added (or moved) there.

Secondly, I think that I agree with Ivan that offsetting the start_time by a second or so might be the better approach here: it's not actually feasible [in the real world] for the vehicles to have departed at the same second anyway. In essence however, the start_time here is less the actual start_time, but more the unique identifier of the trip within the given start_date+trip_id combination anyway.

Collaborator

> In essence however, the start_time here is less the actual start_time, but more the unique identifier of the trip within the given start_date+trip_id combination anyway.

That's really my main point - if we require start_time to be unique we're overloading the start_time field with something that isn't really a time.

> offsetting the start_time by a second or so might be the better approach here

If everyone else agrees, I'll go with the consensus.

@barbeau
Collaborator

barbeau commented Nov 13, 2015

An implementation of GTFS-rt support for frequency-based trips (exact_times=0) in OpenTripPlanner that one of our students worked on made similar assumptions, in terms of requiring the combination of trip_id and start_time to be unique in the GTFS-rt feed - see opentripplanner/OpenTripPlanner#1647. The plain text output of the GTFS-rt feed we implemented for frequency-based trips (exact_times=0) for our campus bus system is here. We didn't need start_date in our case because service doesn't span midnight, but I'm for including start_date as a requirement in the spec - others may not realize when exactly it's needed and when it's not.

I raised a corner case above in which vehicle_id is also needed, but generally speaking +1 for this.

It should be noted that some of this is also already outlined in the current definition of start_time.

@egorich239
Contributor

@barbeau Re: start_date + GTFS.

Fun fact: if your transit system has a bus starting at 00:30:00 local wall time, then the only proper way I know to encode such a bus in GTFS (static) is to define it as 24:30:00 of the previous day. It is otherwise impossible to encode this bus on a day when the clock is moved back by 1 hour (that is, the calendar day on which 13 hours pass between 00:00 wall time and 12:00 wall time). GTFS measures time from 12 hours before local noon, which corresponds to 1am wall time on that day, so you'd need a negative time to encode such a trip on such a day; GTFS 00:30:00 means wall time 01:30:00 on such a day (assuming a typical switch-over time between 2am and 4am). In the same way, on the short day (moving the clock forward), GTFS 00:30:00 corresponds to 23:30:00 wall time of the previous day. If I were to give a recommendation, I'd recommend always providing times in GTFS (static) feeds in the time window of 03:00:00-27:00:00 (or 02:00:00-26:00:00, depending on when the typical DST switch-over happens in the system), because these times correspond to the same wall times on every day of the year.
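
The arithmetic above can be checked with a small sketch that resolves a GTFS time as "local noon minus 12 hours, plus the offset". The function name `gtfs_to_wall` is hypothetical, and Europe/Amsterdam with its 2015 DST dates (29 March and 25 October) is only an illustrative choice of timezone; the sketch also assumes the system timezone database is available to `zoneinfo`.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def gtfs_to_wall(service_day: date, gtfs_time: str, tz_name: str) -> datetime:
    """Resolve a GTFS HH:MM:SS (hours may exceed 23) to local wall time."""
    tz = ZoneInfo(tz_name)
    hours, minutes, seconds = (int(part) for part in gtfs_time.split(":"))
    offset = timedelta(hours=hours, minutes=minutes, seconds=seconds)

    # Local noon of the service day, converted to UTC so that the
    # subtraction below counts elapsed hours rather than wall-clock hours.
    noon_utc = datetime.combine(service_day, time(12), tzinfo=tz).astimezone(timezone.utc)
    return (noon_utc - timedelta(hours=12) + offset).astimezone(tz)


def fmt(moment: datetime) -> str:
    return moment.strftime("%Y-%m-%d %H:%M")


zone = "Europe/Amsterdam"
normal_day = fmt(gtfs_to_wall(date(2015, 5, 25), "00:30:00", zone))     # ordinary day
long_day = fmt(gtfs_to_wall(date(2015, 10, 25), "00:30:00", zone))      # clock moved back
short_day = fmt(gtfs_to_wall(date(2015, 3, 29), "00:30:00", zone))      # clock moved forward
previous_day = fmt(gtfs_to_wall(date(2015, 10, 24), "24:30:00", zone))  # 24:30:00 of the day before
```

Under these assumptions, 00:30:00 on the long day resolves to 01:30 wall time, 00:30:00 on the short day resolves to 23:30 of the previous calendar day, and 24:30:00 on the preceding service day resolves to 00:30 wall time on the long day, which matches the recommendation above.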

Now, back to GTFS RT. Resolving start_time values of 24:00:00+ without start_date is not very fruitful (as in: possible, but may trigger ambiguity). Therefore I'd also encourage start_date to be specified properly in order to avoid potential ambiguity.

@magdalar
Contributor

magdalar commented Dec 8, 2015

Vote +1

@barbeau
Collaborator

barbeau commented Dec 8, 2015

Vote +1

@bboissin
Contributor

bboissin commented Dec 8, 2015

Vote +1

@skinkie
Contributor

skinkie commented Dec 8, 2015

Vote +1

msbreadloaf added a commit that referenced this pull request Dec 11, 2015
Adding clarifications about trip matching
@msbreadloaf msbreadloaf merged commit 01a9cfc into master Dec 11, 2015
@barbeau
Collaborator

barbeau commented Aug 23, 2016

Looks like the HTML docs for this PR haven't been updated either:

@bboissin
Contributor

FYI I'm trying to get the issue with the devsite@google resolved.

@barbeau
Collaborator

barbeau commented Aug 25, 2016

Thanks @bboissin!


8 participants