I am creating this issue to record something that we are aware of but have not yet decided how to handle; for now we are in a "monitoring" stance and tracking how prevalent it is.
There is a case that we encounter occasionally where an agency handles a feed transition like this, where the dates are in order (A, B, C, D, E = D + 1 day, F).
flowchart LR
subgraph f2[Feed 2, uploaded date C]
cal2[calendar.txt covers date E = D+1 to date F]
fi2[feed_info.txt has feed_start_date E = D+1 and feed_end_date F]
end
subgraph f1[Feed 1, uploaded date A]
cal1[calendar.txt covers date B to date D]
fi1[feed_info.txt has feed_start_date B and feed_end_date D]
end
So, feed 1 is deleted on the date that feed 2 is uploaded, even though feed 1 is not supposed to expire yet, from the agency's perspective.
GTFS Best Practices say: "At any time, the published GTFS dataset should be valid for at least the next 7 days, and ideally for as long as the operator is confident that the schedule will continue to be operated."
However, we are still left with a question of how to handle this scenario in our pipeline. At present, we mark feed 1 as deleted on date C (as soon as feed 2 is uploaded), and the agency will show as having no service between dates C and D, until feed 2 takes effect on date E.
I believe that this handling is defensible, but it can lead to our reports and tables displaying 0 service for an agency during a period where the agency believes that feed 1's coverage should have been persisted (based, perhaps, on feed_end_date in feed_info). We have been told that app consumers keep using the old feed until the new one takes effect.
There is currently no validation being produced when these situations occur, at least in the case of 273.0 (SacRT) for the month of March (feed uploaded 3/3/22 didn't take effect until 4/3/22). There is a validation for cases where it is less than 7 or 30 days before the current feed expires, but there is no validation when the feed has not yet taken effect.
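As a rough illustration of the missing check, the sketch below compares a feed's `feed_start_date` (GTFS dates use the `YYYYMMDD` format) with the date the feed was uploaded. The function names and the notice text are hypothetical and do not correspond to an existing validator rule; the dates are the ones from the SacRT example above.

```python
from datetime import date, datetime
from typing import Optional


def parse_gtfs_date(value: str) -> date:
    """Parse a GTFS date field, which uses the YYYYMMDD format."""
    return datetime.strptime(value, "%Y%m%d").date()


def not_yet_effective_notice(feed_start_date: str, uploaded_on: date) -> Optional[str]:
    """Return a notice if the feed's stated start date is after its upload date.

    Hypothetical check: the name and message are assumptions for illustration.
    """
    start = parse_gtfs_date(feed_start_date)
    days_until_effective = (start - uploaded_on).days
    if days_until_effective <= 0:
        # The feed is already in effect on the day it was uploaded.
        return None
    return (
        f"feed uploaded {uploaded_on.isoformat()} does not take effect until "
        f"{start.isoformat()} ({days_until_effective} days later)"
    )


# Dates from the SacRT example: uploaded 3/3/22, effective 4/3/22.
notice = not_yet_effective_notice("20220403", date(2022, 3, 3))
print(notice)
```

A check like this would only read `feed_info.txt`; a fuller version would probably also need to look at the first service date in `calendar.txt` / `calendar_dates.txt`, since `feed_start_date` is optional.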
A few considerations:
- Do we want to perhaps create two versions of stg_daily_service and fact_daily_service, one version that is "strict" and treats feed 1 as deleted and the other that interpolates service?
- We can probably create some logic to stop treating a feed as deleted if the one that replaces it takes effect exactly one day after the original feed, for example.
- We should assess prevalence for the different aspects of this -- for example, how precise are feeds about using calendar / calendar_dates + feed_info in exact alignment; do we have cases in the pipeline where there was a genuine pause in service that would have looked the same as this?
- I wonder if this approach is taken in particular by feeds that use some specific software or vendors... Perhaps there could be some flag for when we expect this behavior?
- We do already have an is_interpolated flag but it doesn't cover this case.
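To make the "strict" versus interpolated options concrete, here is a minimal sketch of how the choice of feed for a given day might differ between the two. Everything in it is hypothetical: `FeedVersion`, `is_clean_handoff`, and `feed_for_day` are illustrative names, not pipeline tables or functions, and the dates are placeholders chosen only to follow the A, B, C, D, E = D + 1 day, F pattern.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class FeedVersion:
    """Hypothetical summary of one feed version; not a pipeline table."""
    name: str
    uploaded: date  # date the version first appeared at the agency's URL
    start: date     # feed_start_date / first date covered by calendar.txt
    end: date       # feed_end_date / last date covered by calendar.txt


def is_clean_handoff(old: FeedVersion, new: FeedVersion) -> bool:
    """True when the new feed starts exactly one day after the old one ends."""
    return new.start == old.end + timedelta(days=1)


def feed_for_day(
    day: date, old: FeedVersion, new: FeedVersion, strict: bool
) -> Tuple[Optional[str], bool]:
    """Return (name of the feed supplying service on `day`, carried_forward).

    `carried_forward` is True only when the old feed is kept alive past the
    upload date of its replacement, which is the case this issue describes.
    """
    if day < new.uploaded:
        # Before the replacement is uploaded, only the old feed can apply.
        in_range = old.uploaded <= day and old.start <= day <= old.end
        return (old.name if in_range else None, False)
    if new.start <= day <= new.end:
        return (new.name, False)
    # The new feed has been uploaded but does not cover this day.
    if not strict and is_clean_handoff(old, new) and old.start <= day <= old.end:
        return (old.name, True)
    return (None, False)


# Placeholder dates following the A..F pattern (E = D + 1 day).
feed_1 = FeedVersion("feed 1", date(2023, 1, 1), date(2023, 1, 8), date(2023, 3, 31))
feed_2 = FeedVersion("feed 2", date(2023, 3, 1), date(2023, 4, 1), date(2023, 6, 30))

mid_gap = date(2023, 3, 15)  # between C and D
print(feed_for_day(mid_gap, feed_1, feed_2, strict=True))
print(feed_for_day(mid_gap, feed_1, feed_2, strict=False))
```

Requiring an exact one-day handoff is deliberately conservative: a genuine pause in service would leave a longer gap between the two feeds' date ranges and would still show as no service in both modes. The `carried_forward` value plays a role similar to the existing is_interpolated flag, though how the two would relate is an open question.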
I'm going to close this ticket. Per conversation just now with @e-lo and @o-ram, this situation is explicitly covered in the Cal-ITP FAQ. Our recommendation is to publish "future" service in parallel (at a separate link) to the current active feed.
@e-lo has submitted MobilityData/GTFS_Schedule_Best-Practices#48 to clarify the best practices and expectations around this case in general.
cc @edasmalchi @o-ram @Nkdiaz for awareness