Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change stops.txt presence because of demand responsive services #472

Merged

Conversation

tzujenchanmbd
Copy link
Collaborator

@tzujenchanmbd tzujenchanmbd commented Jun 4, 2024

Context

Before the adoption of GTFS-Flex in PR#433, stops.txtwas the sole file in GTFS defining the geographic locations (geometry: points) where riders board and alight. However, after its adoption, producers can define riders boarding and alighting geographic areas through locations.geojson (geometry: polygons) as well. In theory, an agency may exclusively offer zone-based demand-responsive services, in which case the feed may only contain locations.geojson without stops.txt.

Proposed change in this PR

Therefore, suggest changing the presence of stops.txt from Required to Conditionally required (required if locations.geojson doesn't exist, optional otherwise).

GTFS validator behavior

Once the spec is modified, the GTFS validator can make corresponding adjustments, such as:

  • Both stops.txt and locations.geojson exist - no error
  • Either stops.txt or locations.geojson exists - no error
  • Neither stops.txt nor locations.geojson exists - error (missing required file stops.txt)

Conditionally required vs Optional
If both stops.txt and locations.geojson are optional, the GTFS validator won't show an error when neither stops.txt nor locations.geojson exists. If there is no geographic information defining where riders boarding and alighting in the feed, the feed seems invalid to me.

Since stops.txt is one of the fundamental file in GTFS, we plan to open a vote on this change.
cc: @westontrillium @leonardehrenfried @tsherlockcraig @gcamp

@tzujenchanmbd
Copy link
Collaborator Author

@LeoFrachet would you like to share the reason you are leaning toward making it optional? (issue#452)

@tzujenchanmbd tzujenchanmbd added GTFS Schedule Issues and Pull Requests that focus on GTFS Schedule Extension: GTFS-Flex Issues and Pull Requests that focus on GTFS-Flex Extension Change: Clarification Revisions of the current specification to improve understanding. labels Jun 4, 2024
@tzujenchanmbd tzujenchanmbd changed the title Change stops.txt presence from "Required" to "Conditionally required" Change stops.txt presence because of demand responsive services Jun 4, 2024
@westontrillium
Copy link
Contributor

LGTM, I think stops.txt should be conditionally required for the reasons @tzujenchanmbd provided.

@leonardehrenfried
Copy link
Contributor

I agree. It should be conditionally required.

@westontrillium
Copy link
Contributor

Is anymore discussion needed for this PR? Can a vote be called?

@tzujenchanmbd
Copy link
Collaborator Author

This PR has been open for at least 7 calendar days. As per the Spec Amendment Process, I am opening a vote for this presence change.

Voting ends on 2024-08-05 at 23:59:59 UTC.

(cc @westontrillium, @leonardehrenfried, @gcamp)

@tzujenchanmbd tzujenchanmbd added the Status: Voting Pull Requests where the advocate has called for a vote as described in the changes.md label Jul 22, 2024
@westontrillium
Copy link
Contributor

+1 Trillium

Thanks @tzujenchanmbd!

@skinkie
Copy link
Contributor

skinkie commented Jul 22, 2024

This is (and in essence #433) a breaking change, because we cannot determine what kind of feed cannot be processed. Having an empty stops.txt also does not make sense for a flexible feed. In this specific example #433 also does not introduce anything that suggest this would be a GTFS feed with 'flexible properties'. This pull request is therefore incomplete, my suggestion here is to include an explicit mechanism that announces which files should be expected based on a specific GTFS feature set, and only then speak of something being conditionally required. Because the current wording would basically allow to specify a locations.geojson as being a replacement of stops.txt, which is something we do not have agreement on. Similar to #393

-1 OpenGeo

@LeoFrachet
Copy link
Contributor

LeoFrachet commented Jul 23, 2024

Thanks Stefan for your message.

As always alas, answering in a structure way to a dissenting view takes more time than expressing the dissenting view. So here we go.

About the definition of "breaking change" in the GTFS Guiding Principles, and therefore the question of backward compatibility

One of the official Guiding Principles of GTFS extension is indeed that "Changes to the spec should be backwards-compatible", which is define as the following in [CHANGES.md](https://github.com/google/transit/blob/master/gtfs/CHANGES.md#changes-to-the-spec-should-be-backwards-compatible) (emphasis is mine):

When adding features to the specification, we want to avoid making changes that will make existing feeds invalid. We don't want to create more work for existing feed publishers until they want to add capabilities to their feeds. Also, whenever possible, we want existing parsers to be able to continue to read the older parts of newer feeds.

Therefore it distinguishes two different types of backward compatibility:

  • Keeping the old datasets valid, which is flagged as the most important one. And on this point, this proposal is 100% backward compatible, since the old datasets will stay valid.
  • Allowing the existing parsers to read the older parts of newer feeds: This doesn't prevent the existing parsers to read older parts, this will only prevent parsers that do not support GTFS-Flex to parse... well... datasets with only GTFS-Flex data (since they are not stop).

The balance here is between:

  • [Option 1] Keeping the file required, making it odd for any new GTFS-Flex builder which has to generated the file even if empty (I know, I just coded one, and I had to add that exception).
  • [Option 2] Making the file optional, which will make the parsers which cannot read GTFS-Flex datasets unable to read GTFS-Flex-only datasets.

=> Therefore, I personally don't see backward compatibility issue here, and I don't see any "breaking change" as defined in the Guiding Principles.

About "includ[ing] an explicit mechanism that announces which files should be expected"

What would be the need (or the benefit) of such feature?

  • I does not add any new passenger information (see Guiding Principle "The spec is about passenger information")
  • It makes the spec harder to produce (see Guiding Principle "Feeds should be easy to create and edit")
  • GBFS (emphasis on the B) does have an auto-discovery file with [gbfs.json](https://gbfs.org/specification/reference/#gbfsjson). But here the need is to be able to query individually each file. Static GTFS datasets are already zipped together, so this need doesn't exist here.

The main feature that it would add is the same than a checksum. It confirms that the data that you have is "valid", because you can test consistency. But checksum are usually provided when there is a risk of data loss or corruption in the pipeline, which is not the need here.

Furthermore, some could argue it's creating redundancy of data, since having a GTFS-Flex feature already states that this is, well, a dataset containing GTFS-Flex features. And adding a flag can create inconsistency:

  • If the flag is set as "doesn't contain GTFS-Flex", but then there are Location Groups, what should we do? Virtually every data consumers that I know would just ignore the flag and consume the data. Basically every data consumers that I know are used to get almost invalid dataset and to deal with it.
  • If the flag is set as "does contain GTFS-Flex" but it doesn't contain any GTFS-Feature... does it make it an invalid dataset? Is an empty locations.geojson enough to allow that flag to be one? Or do we need (for example) at least one Location in the file?

=> Therefore, from my perspective, this would complexify the specification and its validation without bringing any benefits. I personally do not see any need nor value for such new field.

About the -1 votes

I can only thank you Stefan for his long lasting implication the GTFS community, and all the feedback you have provided throughout the years.

-1 votes are fundamental for the healthiness of the conversations around the community, and diversity of point of view is what allow GTFS to grow slowly and steadily.

-1 votes, since they are veto-ing any proposal, are also requiring a lot of work from the community to go through each point put forth by the person who voted -1, and are extremely time consuming. It's a pleasure for me to go through each of the points that you have provided (1. concern about breaking changes, 2. the idea to put a flag), but it's a lengthy process for me to write, and an even more costly process for the others to read.

Therefore, some possible suggestion of improvement for the future could be:

  • a clear real-life use case of what's the issue (here, a use-case of where the breaking change would be an issue. IMHO this would only make a parser which cannot read GTFS-Flex... reject a dataset with GTFS-Flex only. Aka a non-issue as far as I can understand)
  • maybe even an proposal amendment process: If somebody -1 the proposal, thinking there is a better way to do it, they can proposal an alternate version, the community vote on it, and if the community rejects the amendment and prefer the original draft, we roll back to it. Here that would mean that the amendment proposal that you are making (adding a flag to say if the dataset contains GTFS-Flex) would go through a vote, and be supported or not.

Finally, whoever you are, if you have read that whole message, please react with some 👀. That will let me know if writing those details answers are useful or if I should change their format. Thanks! 🙏

@skinkie
Copy link
Contributor

skinkie commented Jul 23, 2024

Please read carefully what the change is that is proposed: Stops where vehicles pick up or drop off riders. Also defines stations and station entrances.

Conditionally Required:
- Required if locations.geojson does not exist.
- Optional otherwise.

I can read above as: if as producer I would provide locations.geojson, and place my regular stops in locations.geojson they can suggest to have a valid feed. This breaks existing consumers/parsers, hence is a breaking change. This must be reworded better. "If you don't provide regular public transport stops, but only flexible transport stops, than stops.txt is not required, but locations.json is."

p.s. someone broke gtfs.org.

@westontrillium
Copy link
Contributor

westontrillium commented Jul 23, 2024

I understand how someone with less familiarity with the spec could interpret the requirement the way @skinkie describes when they first read it. But for the sake of the exercise, let's say someone does and thinks to use locations.geojson to represent their stops. Their next step then would be to review the section on locations.geojson. Well, fortunately, the first sentence of that section says "Defines zones where riders can request either pickup or drop off by on-demand services." That should be enough to dissuade someone from using that file in any way other than for zones from the start. :)

That said, I am still open to changing the wording, but we should to be careful not to word it in a way that prevents easy validation: Required if demand-responsive zones, which are defined in locations.geojson, do not exist.

@tzujenchanmbd
Copy link
Collaborator Author

@skinkie I agree that when producers see the description of stops.txt in the Dataset Files section, they might misunderstand that fixed-route stops can also be produced by locations.geojson. Adding clarification here would be helpful.

"If you don't provide regular public transport stops, but only flexible transport stops, than stops.txt is not required, but locations.json is."

In the current spec, stops can also be used by demand responsive services, for instance, by grouping them using location groups. Depending on the actual use case, demand responsive services might only include location groups without areas defined by locations.geojson. This is why the proposed change uses the condition "if...do not exist."

Would @westontrillium 's suggestion here change your vote?
Required if demand-responsive zones, which are defined in locations.geojson, do not exist.

@tzujenchanmbd tzujenchanmbd added Status: Discussion Issues and Pull Requests that are currently being discussed and reviewed by the community. and removed Status: Voting Pull Requests where the advocate has called for a vote as described in the changes.md labels Jul 23, 2024
@skinkie
Copy link
Contributor

skinkie commented Jul 23, 2024

Would @westontrillium 's suggestion here change your vote?

If we can improve the wording my vote change absolutely. I only want to make sure more clarity is added. But I also think that wording is still written down in a suboptimal way, I hope native speakers can improve upon. Is there a reason we don't write: Optional if and only if demand-responsive zones exists and are defined in locations.geojson.

@isabelle-dr isabelle-dr added Status: Voting Pull Requests where the advocate has called for a vote as described in the changes.md and removed Status: Discussion Issues and Pull Requests that are currently being discussed and reviewed by the community. labels Jul 23, 2024
@westontrillium
Copy link
Contributor

I almost always prefer a positive statement over a negative one for clarity's sake but had assumed the use of "required" in the description was necessary. If that is not the case, then I would support the following (tweaked @skinkie's sentence just a little): Optional if demand-responsive zones are defined in locations.geojson. Required otherwise [do we need this too?].

@tzujenchanmbd
Copy link
Collaborator Author

I think conveying the correct message is more important than stylistic consistency (e.g. "required if...").

Revised wording based on the above suggestion - 85f6d08

Thanks @skinkie @westontrillium for the suggested wording!

@skinkie
Copy link
Contributor

skinkie commented Jul 24, 2024

+1 OpenGeo

@leonardehrenfried
Copy link
Contributor

+1 OpenTripPlanner

@gcamp
Copy link
Contributor

gcamp commented Aug 5, 2024

+1 Transit

@LeoFrachet
Copy link
Contributor

+1 Spare

@tzujenchanmbd
Copy link
Collaborator Author

The vote passed on 2024-08-05 at 23:59:59 UTC.

4(+1) votes in favour and no votes against.

Votes came from:
Trillium (@westontrillium)
OpenGeo (@skinkie)
OpenTripPlanner (@leonardehrenfried)
Transit (@gcamp)
Spare (@LeoFrachet) (voted on 2024-08-06)

Thank you to everyone who participated!

@tzujenchanmbd tzujenchanmbd merged commit 4080084 into google:master Aug 7, 2024
2 checks passed
@tzujenchanmbd
Copy link
Collaborator Author

This change has been incorporated into the Canonical GTFS Validator V6.0.0. Please refer to the release page: https://github.com/MobilityData/gtfs-validator/releases/tag/v6.0.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Change: Clarification Revisions of the current specification to improve understanding. Extension: GTFS-Flex Issues and Pull Requests that focus on GTFS-Flex Extension GTFS Schedule Issues and Pull Requests that focus on GTFS Schedule Status: Voting Pull Requests where the advocate has called for a vote as described in the changes.md
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DRT] After the adoption of GTFS-Flex, stops.txt should no longer be a required file.
7 participants