New Endpoint - manifest.json #462

Merged
merged 4 commits into from
Jan 23, 2023

Conversation

mplsmitch
Collaborator

What problem does your proposal solve? Please begin with the relevant issue number. If there is no existing issue, please also describe alternative solutions you have considered.

Discovery of GBFS urls can be challenging for consuming applications, cities and aggregators. Often those seeking GBFS data are forced to guess urls for GBFS feeds for each city or system geography. This also makes the ongoing development and maintenance of the systems.csv data set catalog overly difficult and time consuming.

What is the proposal?

This proposal adds a new endpoint (manifest.json) that contains a comprehensive list of gbfs.json files from the publisher. This idea was initially discussed in issue #306.

Some things to note:
While the initial discussion in #306 was focused on organizing this file around geography, my approach has been to make it lightweight and flexible. This will make it easier to implement and will accommodate the needs of some of the data producers that already have similar files. Here's an example of an existing file from Beryl. This file could also be used by aggregators to list data sets from multiple operators, like Entur currently does here.

Each producer would only publish a single instance of this file. If a producer operates 10 different systems across 10 cities, they only publish 1 manifest.json listing the 10 systems. This should be at a single url that does not change when versions change or they stop operating in a city.

To avoid circular referencing, the URL for this file is not included in gbfs.json; instead, it is contained in a new field in system_information.json.

This file would be required of any producer that publishes multiple gbfs datasets.
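As a rough illustration of the points above, here is a minimal sketch of how a publisher might assemble a single manifest listing all of its gbfs.json files. The `datasets` field name comes from this PR's draft table; the `data` wrapper, the `url` key inside each entry, the function name `build_manifest` and the example.com URLs are assumptions made for illustration only, not the final schema.

```python
import json


def build_manifest(gbfs_urls):
    """Return a manifest-like dict listing every gbfs.json URL a publisher serves."""
    # "datasets" is the field name from this PR's draft table. The "data"
    # wrapper and the per-entry "url" key are assumptions for illustration.
    unique_urls = sorted(set(gbfs_urls))
    return {"data": {"datasets": [{"url": url} for url in unique_urls]}}


# Hypothetical publisher with two systems; the duplicate entry is collapsed.
manifest = build_manifest([
    "https://example.com/gbfs/city_b/gbfs.json",
    "https://example.com/gbfs/city_a/gbfs.json",
    "https://example.com/gbfs/city_a/gbfs.json",
])
print(json.dumps(manifest, indent=2))
```

The point of the sketch is only that one publisher produces one file, however many systems it operates.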

Is this a breaking change?

  • Yes
  • No
  • Unsure

Which files are affected by this change?

gbfs.json
manifest.json
system_information.json

Adds new endpoint (manifest.json) to contain comprehensive list of gbfs.json files from the publisher.
@CLAassistant

CLAassistant commented Oct 30, 2022

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ josee-sabourin
❌ mplsmitch
You have signed the CLA already but the status is still pending? Let us recheck it.

@mplsmitch mplsmitch changed the title from "New disc 10282022" to "New Endpoint - manifest.json" Oct 30, 2022
@Sergiodero
Contributor

Tagging those who were interested in the previous issue (#306) @dirkdk @testower @nighthawk

@testower
Contributor

testower commented Nov 4, 2022

I think this is good.

@futuretap
Contributor

A great addition. +1 from whereto.app!

@cmonagle
Contributor

This is great, +1 from Transit

@josee-sabourin
Contributor

I hereby call a vote on this proposal. Voting will be open for 10 full calendar days until 11:59PM UTC on Friday, January 20th.

Please vote for or against the proposal, and include the organization for which you are voting in your comment.

Please note if you can commit to implementing the proposal.

@josee-sabourin josee-sabourin added the "Vote open" and "v3.0-RC Candidate change for GBFS 3.0 (Major release)" labels Jan 10, 2023
@futuretap
Contributor

futuretap commented Jan 11, 2023

Our official +1 from whereto.app. We'll definitely make use of it.

@testower
Contributor

+1 from Entur

@@ -259,7 +261,7 @@ Field Name | REQUIRED | Type | Defines

### gbfs.json

The `gbfs.json` discovery file SHOULD represent a single system or geographic area in which vehicles are operated. The location (URL) of the `gbfs.json` file SHOULD be made available to the public using the specification’s [auto-discovery](#auto-discovery) function.<br />The following fields are all attributes within the main `data` object for this feed.
The `gbfs.json` discovery file SHOULD represent a single system or geographic area in which vehicles are operated. The location (URL) of the `gbfs.json` file SHOULD be made available to the public using the specification’s [auto-discovery](#auto-discovery) function. To avoid circular references, this file MUST NOT contain links to `manifest.json` files.<br />The following fields are all attributes within the main `data` object for this feed.
Contributor

Mostly just curious: you call out avoid linking to manifest.json to avoid circular dependencies, but system_information.json contains a link to the manifest url, providing a circular link of sorts (gbfs.json -> system_information.json -> manifest.json). Is the idea here that consumers are automatically downloading all files referenced in gbfs.json without regard to type and you are worried about them getting stuck in a download loop? Does a similar loop exist for gbfs_versions.json and has it been a problem?

Collaborator Author

Is the idea here that consumers are automatically downloading all files referenced in gbfs.json without regard to type and you are worried about them getting stuck in a download loop?

Yes - that's exactly the concern. It's true that looping is also possible via system_information.json, but the manifest.json link needs to go someplace and that's the logical location for it. We haven't heard of anyone having problems with gbfs_versions.json. This issue was specifically called out when I started work on this endpoint.
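For consumers that do follow every link they find, the loop discussed here (gbfs.json -> system_information.json -> manifest.json -> gbfs.json) can be guarded against by remembering which URLs have already been fetched. The following is only a sketch: `crawl`, the `fetch` callback, and the in-memory `LINKS` table standing in for the network are all hypothetical names, and the example.com URLs are placeholders.

```python
def crawl(start_url, fetch):
    """Follow every link in fetched documents, visiting each URL at most once."""
    visited = set()
    queue = [start_url]
    order = []
    while queue:
        url = queue.pop(0)
        if url in visited:
            continue  # already downloaded; this check is what breaks the loop
        visited.add(url)
        order.append(url)
        queue.extend(fetch(url))  # fetch returns the URLs found in that document
    return order


# Hypothetical in-memory "network": each URL maps to the links its document contains.
LINKS = {
    "https://example.com/city_a/gbfs.json": [
        "https://example.com/city_a/system_information.json",
    ],
    "https://example.com/city_a/system_information.json": [
        "https://example.com/manifest.json",
    ],
    "https://example.com/manifest.json": [
        "https://example.com/city_a/gbfs.json",  # points back to the start
    ],
}

downloaded = crawl(
    "https://example.com/city_a/gbfs.json",
    lambda url: LINKS.get(url, []),
)
print(downloaded)
```

With the visited set in place, each of the three files is fetched once even though the links form a cycle.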

gbfs.md Outdated

Field Name | REQUIRED | Type | Defines
---|---|---|---
`datasets` | Yes | Array | An array containing
Contributor

Wondering if feeds would be more appropriate. IIRC it's used a lot in the spec.

versions also refers to feeds:

the available versions of a feed

Collaborator Author

I went with datasets because there's been disagreement on what constitutes a "feed". Is a feed a single endpoint or a complete collection of endpoints? If you look at gbfs.json#feeds it's defined as an array of individual endpoints which together represent a single set of data.

@AntoineAugusti
Contributor

It would be very useful to us @ transport.data.gouv.fr! We know company names operating in the country but have a hard time keeping track of where they operate and often rely on "marketing websites" or press articles to keep up to date.

@josee-sabourin
Contributor

Voting on this PR closes in 2 calendar days. Please vote for or against the proposal, and include the organization for which you are voting in your comment. Please note if you can commit to implementing the proposal.

@rootsixtysix

Hey all, I'm Luka, working at nextbike / TIER. I'm in charge of our GBFS Feed now, since Daniel has new tasks. So hello everybody!

@rootsixtysix

+1 from nextbike

@kheraankit

+1 from Lime, we like the idea.

@benwedge

Am I understanding correctly that it would be one manifest per operator? So each of Joyride's customers (the brands that actually operate fleets) would need a manifest, not Joyride publishing one manifest for all of our customers?

@mplsmitch
Collaborator Author

@benwedge I don't know that there's an easy answer to this because it may come down to who owns the data but I think what would be best would be for Joyride to publish one manifest containing all your customers who agree to publish GBFS. Many platforms (like Joyride) offer to publish GBFS on behalf of their customers if they request it, and the idea behind the manifest is that entities that publish would provide a list of all data sets that they are publishing whether or not they are also the operator.

@cait32

cait32 commented Jan 19, 2023

+1 from BCycle

@benwedge

Thanks @mplsmitch. I will not be able to review this with the right people in-house in time for voting to close, so Joyride will abstain from this one.

@sven4all
Contributor

@benwedge I don't know that there's an easy answer to this because it may come down to who owns the data but I think what would be best would be for Joyride to publish one manifest containing all your customers who agree to publish GBFS. Many platforms (like Joyride) offer to publish GBFS on behalf of their customers if they request it, and the idea behind the manifest is that entities that publish would provide a list of all data sets that they are publishing whether or not they are also the operator.

It would be really interesting if a manifest could reference other manifests. In that case Joyride or other shared vehicle software providers could reference a manifest per customer, and there could be one global manifest as an alternative to systems.csv.

@rickbruce

+1 from Ito

@benwedge

I think what would be best would be for Joyride to publish one manifest containing all your customers who agree to publish GBFS. Many platforms (like Joyride) offer to publish GBFS on behalf of their customers if they request it, and the idea behind the manifest is that entities that publish would provide a list of all data sets that they are publishing whether or not they are also the operator.

I did some initial digging on this, and we definitely cannot provide this as there can be commercial agreements where Joyride cannot be publicly associated with a customer. A keen eye could of course discern this from the systems.csv file or critical thinking when looking at the mobile app.

The bottom line: we can happily supply a manifest per operator, but not across operators.

@ezmckinn
Contributor

I like the intent of this — but I have a few questions about how it is to be implemented.

If an operator currently publishes their gbfs.json by geography (e.g. https://mds.linkyour.city/gbfs/us_wa_seattle/gbfs.json) for Superpedestrian's Seattle operations, then are we to publish a https://mds.linkyour.city/gbfs/us_wa_seattle/manifest.json endpoint that references all other markets? Or is the expectation that, in addition to the geography-specific endpoints, we publish a generic "https://mds.linkyour.city/gbfs/manifest.json" file, that references all our geography-specific endpoints?

It seems like the former would needlessly clutter up each geography-specific endpoint. The latter would raise questions about what other files would need to be published via a generic feed. Would that generic gbfs endpoint need to be complete, and include all the other files "required" by GBFS? In that case, what would a generalized "system_information.json" file contain?

Due to the uncertainty, it's a "no" vote from Superpedestrian right now. But I'm open to working out these questions and revisiting it.

@mplsmitch
Collaborator Author

@ezmckinn

If an operator currently publishes their gbfs.json by geography (e.g. https://mds.linkyour.city/gbfs/us_wa_seattle/gbfs.json) for Superpedestrian's Seattle operations, then are we to publish a https://mds.linkyour.city/gbfs/us_wa_seattle/manifest.json endpoint that references all other markets?

No: You would not publish a manifest file for each individual market. Superpedestrian would publish a single manifest.json that would contain links to the gbfs.json files for all of the geography-specific instances of GBFS that you publish. So in your example it would look like "https://mds.linkyour.city/gbfs/manifest.json" and it would contain links to the gbfs.json files for all of your markets.

It seems like the former would needlessly clutter up each geography-specific endpoint.

There's no clutter because this file is not part of the geography specific endpoints. The only thing this adds to each geography-specific endpoint is a single field, manifest_url in system_information.json which contains the url for the publisher's manifest file.

The latter would raise questions about what other files would need to be published via a generic feed.

There's no generic feed, it's just the manifest.json file which is simply a list of gbfs.json urls. No other files need to be published.

I hope you will reconsider your vote - we've got a lot of support for this, but a single "No" vote kills the proposal and the vote does not pass.

Let me know if you have any more questions
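To make the lookup path described in this reply concrete, here is a small sketch of what a consumer might do: read `manifest_url` from one market's system_information.json, then list the gbfs.json URLs in the manifest. The `manifest_url` field and the two linkyour.city URLs for the manifest and the Seattle feed are taken from this thread; the `data` wrapper, the `datasets` entry shape with a `url` key, the second market URL, and both function names are assumptions for illustration.

```python
def manifest_url_from_system_information(system_information):
    """Return the publisher's manifest URL, or None if the field is absent."""
    # The "data" wrapper is assumed here; "manifest_url" is the field named above.
    return system_information.get("data", {}).get("manifest_url")


def gbfs_urls_from_manifest(manifest):
    """Return the gbfs.json URLs listed in a manifest-like dict."""
    # The per-entry "url" key is a hypothetical shape, not the final schema.
    return [entry["url"] for entry in manifest.get("data", {}).get("datasets", [])]


# Sample documents, already parsed from JSON.
system_information = {
    "data": {"manifest_url": "https://mds.linkyour.city/gbfs/manifest.json"}
}
manifest = {
    "data": {
        "datasets": [
            {"url": "https://mds.linkyour.city/gbfs/us_wa_seattle/gbfs.json"},
            # Hypothetical placeholder for a second market:
            {"url": "https://mds.linkyour.city/gbfs/another_market/gbfs.json"},
        ]
    }
}

print(manifest_url_from_system_information(system_information))
print(gbfs_urls_from_manifest(manifest))
```

Nothing beyond the one manifest file and the one extra field is needed on the publisher side for this lookup to work.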

@ezmckinn
Contributor

@mplsmitch — thanks for the clarification. Count me as a "yes." ✅

@schnuerle
Contributor

If the manifest is the single source of all GBFS links from an operator, how will consumers know for sure which feed URL goes with which city/area? I could imagine scenarios where Paris France and Paris Kentucky feeds have similar URLs (e.g. paris. and paris2.). In systems.csv there are Name and Location fields for more context. Is this a concern, or am I missing something here?

@mplsmitch
Collaborator Author

@schnuerle I don't think this is a concern. In fact, in an earlier proposal I had systems organized by country/state/province etc., but that was deemed overkill. The data contained in the manifest file is intended to feed into systems.csv, so that file will still serve as the source for info on individual cities. The primary challenge that this endpoint is trying to address is that very few publishers update systems.csv, so there's a lot of data being published that consumers don't have visibility into. If consumers want to use manifest.json in this way, they'll have to do the legwork of looking at lat/lon in the other files to determine if it's Paris France, Paris Kentucky or Paris Ontario.
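One way a consumer could do that legwork is to average the lat/lon of a feed's stations and pick the closest of a few candidate places by great-circle distance. This is only a sketch under assumptions: the candidate coordinates are approximate, `nearest_city` and `CANDIDATES` are hypothetical names, and a plain average of longitudes is naive for systems that straddle the antimeridian.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Approximate coordinates, for illustration only.
CANDIDATES = {
    "Paris, France": (48.86, 2.35),
    "Paris, Kentucky": (38.21, -84.25),
    "Paris, Ontario": (43.20, -80.38),
}


def nearest_city(stations, candidates=CANDIDATES):
    """Return the candidate closest to the mean position of the stations."""
    lat = sum(s["lat"] for s in stations) / len(stations)
    lon = sum(s["lon"] for s in stations) / len(stations)
    return min(candidates, key=lambda name: haversine_km(lat, lon, *candidates[name]))


# Stations as they might appear after parsing a feed's station list.
stations = [{"lat": 48.85, "lon": 2.34}, {"lat": 48.87, "lon": 2.36}]
print(nearest_city(stations))
```

A dockless system without stations would need the same treatment applied to vehicle positions instead.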

@schnuerle
Contributor

+1 from me at OMF.

Just a note that we were thinking of something like this, but with a different scope, for MDS in some future version.
openmobilityfoundation/mobility-data-specification#628

@josee-sabourin
Contributor

josee-sabourin commented Jan 21, 2023

This vote has now closed, and it passes!

Votes in favor:
whereto.app (consumer)
Entur (consumer)
transport.data.gouv (consumer)
Nextbike (producer)
Lime (producer)
PBSC (producer)
BCycle (producer)
Ito World (producer)
Superpedestrian (producer)
OMF (3rd party)

Abstentions:
Joyride (producer)

There were no votes against.
Thank you to everyone who took the time to review and to vote on this! We will incorporate it into 3.0-RC, which should be ready to go in the coming week.

@josee-sabourin josee-sabourin merged commit 9366fde into MobilityData:master Jan 23, 2023
josee-sabourin added a commit that referenced this pull request Jan 23, 2023
Adds in-line annotations to changes made in v3.0-RC (#460, #454, #457, #462, #470)
Removes "RC" from annotations in v2.3-RC and RC2
josee-sabourin added a commit that referenced this pull request Jan 23, 2023
* Add versions to new and updated fields

Adds in-line annotations to changes made in v3.0-RC (#460, #454, #457, #462, #470)
Removes "RC" from annotations in v2.3-RC and RC2
Editorial changes
@richfab richfab added the gbfs.md label Mar 1, 2024
Labels
gbfs.md v3.0-RC Candidate change for GBFS 3.0 (Major release) Vote Passed