Do we need to include GBFS version? #15

Closed
jcn opened this issue Jan 15, 2016 · 33 comments

Comments

@jcn
Contributor

jcn commented Jan 15, 2016

@mdarveau and @dsgermain raise a good point in #9 about versioning. While we've said that we would version the spec in the repository via git tags, we never actually decided where to indicate the GBFS version in the feed.

I can imagine two logical places for this:

  1. In gbfs.json, in which case we would want to make this file required (though the version field would be the only field that was required - the auto-discovery mechanism would change to optional)
  2. In system_information.json, but this doesn't really make sense since it's metadata about the feed, not about the system

I am leaning toward including it in gbfs.json but would like other people's thoughts.

@emilesalem

How about in the common header info?

@jcn
Contributor Author

jcn commented Jan 15, 2016

@E1000s That's an interesting idea, though I'd be concerned about a system deciding to mix and match their feeds. This shouldn't be supported (i.e. you shouldn't be able to have a 1.0 station_information but a 2.0 station_status), but if the version were in the header of each individual file this might be implied.

@jcn changed the title from "Need to include GBFS version" to "Do we need to include GBFS version?" on Jan 15, 2016

@jcn
Contributor Author

jcn commented Jan 15, 2016

I would also love to hear from people who have implemented GTFS consumers and publishers. The GTFS spec actually indicates that changes should be backward compatible. This is probably something we should shoot for and versioning should be saved for major breaking changes. The problem of course with breaking changes is that they're a pain for everyone (consumers and publishers), so they should really be avoided.

@dsgermain
Contributor

The gbfs.json should maybe list a correspondence between a version number and the URL path of the specified version (/v1, /v2, etc.).
The other option is to pass the version number in headers and for the server to redirect to the proper endpoint. I prefer the first option. Thoughts?

@mdarveau

I agree we should aim for no breaking change. This does not mean we should not version the file so the consumer can know upfront what to expect.

Proposition:

  • In gbfs.json we add a version field that respects http://semver.org/ and we increment the minor version for changes that are backward compatible.
  • If we have a breaking change, we would introduce a new header so publishers can publish the old and new versions, i.e.: <link rel="gbfs-v2" type="application/json" href="http://mybikeshare.com/opendata/v2/gbfs.json" />
    • This would allow breaking changes in the gbfs.json itself

@cubbi
Contributor

cubbi commented Jan 20, 2016

Shouldn't we simply link to other versions of the protocol from the gbfs.json file? This would not allow for breaking changes in gbfs.json itself, but in the majority of cases we will add fields rather than remove them, and it would not require changing the link attribute.

Also, maintaining a few versions inside the gbfs.json file would be more elegant than having 2 or 3 link tags in the header.

@mdarveau

@cubbi Something like this?

{
  "last_updated": 1434054678,
  "ttl": 0,
  "data": {
    "en": {
      "feeds": [
        {
          "name": "system_information",
          "version": "1.0",
          "url": "https://www.example.com/gbfs/en/system_information"
        },
        {
          "name": "station_information",
          "version": "1.0",
          "url": "https://www.example.com/gbfs/en/station_information"
        }
      ]
    },
    "fr": {
      "feeds": [
        {
          "name": "system_information",
          "version": "1.0",
          "url": "https://www.example.com/gbfs/fr/system_information"
        },
        {
          "name": "station_information",
          "version": "1.0",
          "url": "https://www.example.com/gbfs/fr/station_information"
        }
      ]
    }
  }
}

Should we include the minor version or only the major? If we include the minor version, we force the consumer to do semver parsing when looking for the right version. If we don't, the consumer does not know what to expect.

What I like about the idea of adding a new link tag on a major version is that it emphasizes the fact that it is a different (i.e. not backward compatible) protocol. We could still output the version (including the minor version) in the gbfs.json feed.
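
To make the semver-parsing concern concrete, here is a minimal Python sketch of a consumer reading the per-feed layout proposed above. It assumes a plain "MAJOR.MINOR" version string; the helper names (`parse_version`, `find_feed`) and the trimmed sample document are hypothetical, and the per-feed "version" field is only a proposal in this thread.

```python
import json

# Hypothetical sample mirroring the layout proposed above (English feeds only).
SAMPLE = json.loads("""
{
  "last_updated": 1434054678,
  "ttl": 0,
  "data": {
    "en": {
      "feeds": [
        {"name": "system_information", "version": "1.0",
         "url": "https://www.example.com/gbfs/en/system_information"},
        {"name": "station_information", "version": "1.0",
         "url": "https://www.example.com/gbfs/en/station_information"}
      ]
    }
  }
}
""")


def parse_version(text):
    """Split a "MAJOR.MINOR" string into a tuple of ints."""
    major, minor = text.split(".")
    return int(major), int(minor)


def find_feed(gbfs, language, name, supported_major):
    """Return the URL of the named feed if its major version is supported, else None."""
    for feed in gbfs["data"][language]["feeds"]:
        if feed["name"] != name:
            continue
        major, _minor = parse_version(feed["version"])
        if major == supported_major:
            return feed["url"]
    return None


url = find_feed(SAMPLE, "en", "station_information", supported_major=1)
```

A consumer that only compares the major component, as here, can ignore minor bumps entirely, which is one argument for publishing the minor version without forcing anyone to act on it.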

@cubbi
Contributor

cubbi commented Jan 27, 2016

@mdarveau Sorry for the delay. Yes, I think this is the only way we can structure it without breaking v1. The other option would be to use the version number as a key in "data".

With your proposal, should it be allowed to use different versions of different files? E.g. use station_information in 1.0 format, and system_information in 2.0?

@mdarveau

@cubbi Yes, I would apply a versioning scheme per feed thus allowing station_information in 1.0 format, and system_information in 2.0.

@antrim
Contributor

antrim commented Sep 6, 2019

Way back in 2016 @jcn wrote:

I would also love to hear from people who have implemented GTFS consumers and publishers. The GTFS spec actually indicates that changes should be backward compatible. This is probably something we should shoot for and versioning should be saved for major breaking changes. The problem of course with breaking changes is that they're a pain for everyone (consumers and publishers), so they should really be avoided.

GTFS has followed two different practices:

  • Amendments to GTFS (static) are nearly always backwards-compatible (some clarifications to GTFS to correct ambiguities have not been compatible with every implementation). GTFS (static) is not versioned. MobilityData is beginning to implement changes to GTFS as "extensions", which are collections of fields that deliver specific functionalities in consuming applications. GTFS-Pathways and GTFS-Flex provide good examples of this. Existing applications can continue to consume GTFS data that adds these extensions; those applications may or may not add features that utilize the data in the extension format.
  • Changes to GTFS Realtime that change the protocol buffers definition files are not backwards-compatible. There is currently a v1 and a v2.

I recommend that GBFS accept amendments that break backwards compatibility only when absolutely necessary, and, if there are cases when that is necessary, then a versioning scheme would be essential. If feed consumers would find versioning useful for all spec changes, I'd recommend using whole digit version increments (v1, v2…) for non-backwards compatible changes, and numbers to the right of the decimal (v1.1, v1.2…) for backwards-compatible changes.

Some questions:

  • Are there other thoughts from people participating in GBFS governance?
  • Would GBFS consumers find the declaration of a version (e.g. v1.2) in a feed useful, or is it enough to just see what fields are being provided?

@barbeau
Member

barbeau commented Sep 6, 2019

Quick clarification on GTFS-realtime - changes to the .proto file for the Protocol Buffer format are backwards compatible, but we introduced a v2 to define semantic field requirements to the spec (previously the "required" indicator was specific to the Protocol Buffer implementation and didn't really reflect transit-specific logic). This allowed us to have valid v1 feeds in the wild, but tighten the requirements on future v2 feeds to improve data quality in a number of use cases. Details here if anyone is interested:
https://medium.com/@sjbarbeau/whats-new-in-gtfs-realtime-v2-0-cd45e6a861e9

I agree that backwards compatibility is important and breaking changes should only be introduced when absolutely necessary.

As mentioned here:
#139 (comment)

...I think versioning is also helpful to consumers to indicate semantic differences in the feeds - in this case, when the keys in auto-discovery are standardized. This is similar in concept to what we did with GTFS-realtime v2.0.

If we introduce versioning and it's something we're programmatically checking against, I'd suggest having two fields:

  • versionName - A semantic version string like v1.1 following the suggestion by @antrim
  • versionCode - An integer value that is incremented with each version change

@antrim
Contributor

antrim commented Sep 10, 2019

@barbeau

Thanks for the clarification on how GTFS-realtime versioning works!

What do you and others think of this:

  • Semantic differences in field definitions: indicated by whole digit increments.
  • Added fields, clarifying changes: indicated by decimal increments

If there is support for such a change, then I will make a pull request for this, also including versionName and versionCode as you suggest.

Since there are a lot of open proposals and pressing GBFS needs, it seems useful to establish versioning practice before making amendments to the GBFS.

@barbeau
Member

barbeau commented Sep 10, 2019

@antrim This documentation is what GitHub points to for versioning guidance - https://semver.org/.

I'd propose we follow the general guidance listed there for versionName:

Given a version number MAJOR.MINOR.PATCH, increment the:
1. MAJOR version when you make incompatible API changes,
2. MINOR version when you add functionality in a backwards compatible manner, and
3. PATCH version when you make backwards compatible bug fixes.

And for each change/release we can decide how the changes fit into the above categories.

I fully acknowledge that there are various opinions on version naming and this is just one possible scheme. Alternate opinions or suggestions are welcome.
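
As a rough illustration of the semver.org rules quoted above, the following Python sketch increments a MAJOR.MINOR.PATCH string. The function name `bump` and the change labels are hypothetical; how a given spec change maps to a label is exactly the per-release decision mentioned above.

```python
def bump(version, change):
    """Return the next MAJOR.MINOR.PATCH string for the given kind of change.

    Lower-order components reset to 0 when a higher-order one is incremented,
    as semver.org describes.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":   # incompatible API change
        return f"{major + 1}.0.0"
    if change == "minor":   # backwards compatible addition
        return f"{major}.{minor + 1}.0"
    if change == "patch":   # backwards compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


next_version = bump("1.2.3", "minor")
```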

@jcn
Contributor Author

jcn commented Sep 10, 2019

In the semantic versioning parlance, my recommendation would be for the following: MAJOR.MINOR only.

  • MAJOR bump would indicate something that is a completely breaking API, or something so semantically different that it would warrant a new version
  • MINOR bump would indicate a smaller enhancement or clarification (possibly breaking, though hopefully not too significantly)

My reasoning is that we are not building and releasing software where patch versions really make sense. I'm not really sure what the difference would be between a MINOR and PATCH update when talking about an API. A "bug fix" for an API sounds more like a clarification, which in reality is more like a MINOR bump (hopefully backwards compatible). 3 levels of versions is just too complicated for what we're trying to accomplish.

So, implicitly (if not explicitly), we are at v1.0 right now. There are a number of updates to the API that are proposed in pull requests and discussion that I propose would bump us to v1.1. Future updates with additional fields/enhancements would bump the minor version for a while. I'm not actually sure what would bump the major version yet - possibly some deeper ties to other specs like MDS or possibly some of the larger micromobility changes that are being proposed.

In addition, I would recommend that we require API versions to be consistent across an entire collection - i.e. an auto-discovery /gbfs endpoint should not point to mixed versions. From a consumer standpoint, it seems like it would be far more complex to have to possibly support multiple API versions based on what the producer decides on implementing, rather than just being able to say "I am going to support v1 of the spec only for now" (or v2 or whatever).
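
A consumer or validator could enforce the "no mixed versions" rule with a check along these lines. This is only a sketch: it assumes each file's declared version has already been collected into a dict keyed by file name, and the function names are hypothetical.

```python
def declared_versions(feed_versions):
    """Return the set of distinct version strings declared across a collection."""
    return set(feed_versions.values())


def is_consistent(feed_versions):
    """True when every file in the collection declares the same version."""
    return len(declared_versions(feed_versions)) <= 1


# Hypothetical collections: one consistent, one mixing 1.0 and 2.0.
consistent = {"station_information": "1.0", "station_status": "1.0"}
mixed = {"station_information": "1.0", "station_status": "2.0"}
```

With this rule in place, a consumer can decide which spec version it is dealing with once per collection rather than once per file.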

@barbeau
Member

barbeau commented Sep 11, 2019

my recommendation would be for the following: MAJOR.MINOR only.

This generally makes sense to me.

A related question is "when are we bumping version numbers?" If we bump the version on every pull request merge (more akin to the GTFS process where the latest master branch is always the latest release), then PATCH versioning could make sense for things like formatting and editorial changes (e.g., fixing spelling errors). Or, we just don't bump the version for those merges and forgo the PATCH version.

Alternately, we could batch multiple PRs into a release at our discretion. This gives us more control over how things are versioned but has the downside of potentially taking longer to get new features into a new version of the spec. This process would be more akin to de jure standards.

In addition, I would recommend that we require API versions to be consistent across an entire collection - i.e. an auto-discovery /gbfs endpoint should not point to mixed versions.

I agree with this.

A related question - should we specify if/how a producer should deploy multiple versions of the feed in the case when there are breaking changes?

For REST APIs normally there are path designations for versioning (e.g., /v1/), but specifying those for the GBFS version could potentially interfere with existing producer versioning in API paths. This could be defined more as a best practice and not in the spec itself.

@antrim
Contributor

antrim commented Oct 8, 2019

Based on what we heard at the NABSA GBFS Developers Workshop, here is a proposal for how to implement versioning for GBFS. Some of these proposals are a leap beyond group consensus, but I'm making a concrete proposal which can be revised in response to counter proposals.

Why implement versioning?

Backwards-compatibility breaking changes appear to be necessary for GBFS to evolve with industry needs. To support this evolution, coordination with MDS, and effective validation tools, we need to implement versioning.

Approach to spec versioning

  • We use a semantic approach to versioning. A git tag in the form of X.Y establishes versions. Tags may be applied to batch pull requests into versions.
  • A whole integer increase is used for breaking changes. A decimal increase is used for non-breaking changes (definitions below).
  • A versionCode integer value will be associated with each version bump, following how Android does versioning.
  • The GBFS community would make an effort to batch breaking changes together to minimize releases, but there would not be a pre-defined release cycle.

We propose to borrow MDS's definitions of breaking vs. non-breaking changes, quoted below.

Examples of breaking changes include:

  • Adding or removing a required endpoint or field
  • Adding or removing a request parameter
  • Changing the data type or semantics of an existing field, including clarifying previously-ambiguous requirements

Examples of non-breaking changes include:

  • Adding or removing an optional endpoint or field
  • Adding or removing enum values
  • Modifying documentation or spec language that doesn't affect the behavior of the API directly

Expectations for GBFS producers / requesting feeds in a specification version

  • Feed consumers would be expected to support, at a minimum, one version behind the "latest release" version.
  • Path designations (e.g. /v1/) would be used to call different expressions of a feed in different versions of the spec. These outputs of a feed might also include a "canonical version + extension". This would make it possible to conform to a particular version of the spec and also supply experimental fields.
  • A new discovery URL mechanism would be established to call what spec versions are available from a GBFS producer.
  • Feeds will declare the versionName to which they conform in the returned data.
  • Feeds will also declare the associated versionCode.

@barbeau
Member

barbeau commented Oct 8, 2019

Feed consumers would be expected to support, at a minimum, one version behind the "latest release" version.

I suggest changing the above to something like the following:

  • Feed producers are expected to support new releases (versions) of the GBFS spec within X months of release of the new spec. When releasing a new major version of a GBFS feed, producers are expected to support the previous major version of the spec in addition to the latest version for at least Y months.
  • Feed consumers are expected to support new releases (versions) of the GBFS spec within Z months of release of the new spec.

I'm not sure what X, Y, and Z should be (or if they should all be defined here), but I think the above gives a more complete picture of what a transition between versions would look like. The biggest breaking problem that I could see happening is a producer dropping support for the previous major GBFS version before consumers can adopt the new major version, which would effectively result in cutting off the feed for those apps.

Path designations...

I believe these are only needed for major (breaking) changes.

@antrim
Contributor

antrim commented Oct 24, 2019

We need to come to agreement around a versioning scheme to unblock backwards-compatibility breaking changes such as PR #147 ("Rotate bike_id on free_bike_status").

Here's an updated concrete proposal for feedback.

Approach to spec versioning

  • We use a semantic approach to versioning. A git tag in the form of X.Y establishes versions. Tags may be applied to batch pull requests into versions.
  • A whole integer increase is used for breaking changes (MAJOR changes). A decimal increase is used for non-breaking changes (MINOR changes or patches). We borrow MDS's definitions of breaking vs. non-breaking changes.
  • A versionCode integer value will be associated with each version bump, following how Android does versioning.
  • The GBFS community would make an effort to batch breaking changes together to minimize releases, but there would not be a pre-defined release cycle.

Expectations for GBFS producers / requesting feeds in a specification version

  • Feed consumers would be expected to support, at a minimum, one version behind the "latest release" version.
  • URL path designations (e.g. /v1/) would be used to call expressions of a feed in different MAJOR versions of the spec. This follows nextbike's existing practice (ping @j0kan, nextbike CTO -- any comments?)
  • GBFS will define a best practice that producers should maintain GBFS endpoints for all major version releases over the past 12 months.
  • A discovery URL mechanism would be established to call what spec versions are available from a GBFS producer.
  • Feeds will declare the versionName to which they conform in the returned data (including MINOR versions).
  • Feeds will also declare the associated versionCode.
  • Feeds will declare if they include (non-breaking) experimental fields.

I'll draft a concrete PR that defines policies and practices, based on feedback (agreement, objections, alternatives) people post in this thread.

@barbeau
Member

barbeau commented Oct 24, 2019

Above looks good, although for:

GBFS will define a best practice that producers should maintain GBFS endpoints for all major version releases over the past 12 months.

...I'd make this a hard requirement and not a best practice. You can't provide a reliable service to end users as a consuming app if a producer can roll out breaking changes at any given time and stop supporting the previous major version. There needs to be some reliable graceful upgrade period that gives consumers time to adapt to the new major release.

@madupras
Contributor

URL path designations (e.g. /v1/) would be used to call expressions of a feed in different MAJOR versions of the spec. This follows nextbike's existing practice (ping @j0kan, nextbike CTO -- any comments?)

How will this impact the discovery endpoint? Also, would producers need to support URLs both with and without a version, to serve the latest version and to preserve backward compatibility?

For example, all of these would be valid:

/v2/station_status.json
/v1/station_status.json
/station_status.json     # redirects to /v2/station_status.json
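
If unversioned paths were to redirect to the latest major version, as the example suggests, the routing could be sketched in Python as follows. Nothing here is decided in the thread: `AVAILABLE_MAJORS` and `resolve` are hypothetical names, and the redirect behaviour is an assumption.

```python
import re

# Hypothetical: the major versions a producer currently serves.
AVAILABLE_MAJORS = (1, 2)

# Matches paths of the form /v<digits>/<rest>.
_VERSIONED = re.compile(r"^/v(\d+)/(.+)$")


def resolve(path, available=AVAILABLE_MAJORS):
    """Map a request path to the versioned path that would serve it.

    Versioned paths are returned unchanged when that major version is
    available, and None otherwise. Unversioned paths are mapped to the
    latest available major version.
    """
    match = _VERSIONED.match(path)
    if match:
        major = int(match.group(1))
        return path if major in available else None
    return f"/v{max(available)}/{path.lstrip('/')}"


target = resolve("/station_status.json")
```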

@gcamp

gcamp commented Oct 26, 2019

Overall looks good.

I would agree with @barbeau that there should be a minimum amount of time that an old version should be supported. Maybe 12 months is too long; producers can voice their opinion. I'd be ok with 6 months.

I'm not sure we actually need to support minor versions of the spec. What would be the benefits? At the end of the day, if there's a behaviour change that consumers are asked to make, I would consider it a breaking change. If nothing is asked of consumers, then there is no need to actually change the version.

@johnpena

How often are breaking changes expected to happen? If the support window is 12 months long, and breaking changes are made quarterly, then producers will be supporting 4-5 versions concurrently. A situation like this would be very cumbersome, and I don't think it is tenable for producers.
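
The 4-5 figure can be checked with a small Python calculation. The model is an assumption made for illustration: releases land at months 0, 3, 6, and so on, and a producer must serve every version released within the last 12 months (inclusive). The function name is hypothetical.

```python
def concurrent_versions(now, release_interval, support_window):
    """Count releases made within `support_window` months before `now` (inclusive).

    Assumes releases at months 0, release_interval, 2 * release_interval, ...
    """
    count = 0
    month = 0
    while month <= now:
        if now - month <= support_window:
            count += 1
        month += release_interval
    return count


# Quarterly releases with a 12-month window, sampled over one release cycle.
counts = [concurrent_versions(now, 3, 12) for now in (12, 13, 14, 15)]
```

Under these assumptions the count alternates between 5 (on a release month) and 4 (between releases).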

An alternative option would be to have a release process with a long term support (LTS) branch, similar to Django's release process. LTS branches have a long deprecation window and are expected to be maintained for a long amount of time (e.g. a year). Non-LTS branches have a shorter deprecation window. Producers would be expected to maintain the most recent LTS branch until it's replaced.

@antrim
Contributor

antrim commented Oct 28, 2019

@gcamp, @barbeau, and @evansiroky all gave a 👍 to the long-term support (LTS) branch idea. Is there other support? Dissent? This would prevent a GBFS application that hasn't been updated from losing access to GBFS data.

I'd like to hear reactions to the following ideas:

  1. Documentation for an alpha (pre-release) version of GBFS would be maintained to test out implementation in advance of a major release.
  2. There would be no more than 2 major releases in 12 months.
  3. To be considered compliant with GBFS guidelines, producers must provide an endpoint that conforms to the current LTS branch and the latest release branch (within 3 months of release).
  4. Each LTS branch would maintain its LTS status for at least 2 years. The current LTS branch would be changed according to the GBFS governance process (currently a vote of the community).

@barbeau
Member

barbeau commented Oct 29, 2019

Documentation for an alpha (pre-release) version of GBFS would be maintained to test out implementation in advance of a major release.

Would this just be the master branch after the pull request for a proposal is merged, but before a new release is cut?

There would be no more than 2 major releases in 12 months.

I think it's fine to set this expectation, but I wouldn't box-in the spec by making this a hard requirement. I could see a major release at some point having unexpected consequences, which could require another major release in short order. And then you couldn't release another for 12 months.

To be considered compliant with GBFS guidelines, producers must provide an endpoint that conforms to the current LTS branch and the latest release branch (within 3 months of release).

👍

Each LTS branch would maintain its LTS status for at least 2 years. The current LTS branch would be changed according to the GBFS governance process (currently a vote of the community).

I think this is fine, assuming there is only one active LTS release at a time.

@jcn
Contributor Author

jcn commented Oct 31, 2019

I'm not sure we actually need to support minor version of the spec.

@gcamp In my mind, a point revision would be in place either to codify additions (but not breaking changes) to the spec, or to provide language clarity around an existing spec where necessary. For example, we've had some instances where, due to ambiguous language in the spec, some operators are publishing a field as a string, and some as an integer. From a consumer perspective, there is no known consistency anyway, so I wouldn't consider this a breaking change (i.e. the spec was broken anyway). This point release would serve to clarify the original intent of that version of the spec, i.e. a bug fix to the spec itself.

I think this is fine, assuming there is only one active LTS release at a time.

I agree with this, and I'm wondering how minor versioning plays into this. I still see a benefit for clarifications, both for the LTS release and for the new major releases, so I'm hoping someone is able to better articulate how minor changes to the spec might be codified into the documentation.

@barbeau
Member

barbeau commented Oct 31, 2019

@jcn Good points. I think one way to handle this would be for minor versioning items to be "backported" to the LTS release as relevant. In other words, a producer would be responsible for the current LTS branch plus any future revisions to the spec that we consider backwards compatible with the LTS release.

Examples:

  • If LTS is v1.0 with the latest release being v2.0, and then we clarify the data type on a particular field that exists in v1.0 and v2.0, then there would be a v1.1 and v2.1 that both contain this fix. And, v1.1 would be the LTS.
  • If LTS is v1.0 with the latest release being v2.0, and there is a new field foo added in v2.0 that doesn't exist in v1.0, and then there is a clarification of how foo works, then you'd have a v2.1, but v1.0 LTS would remain the same.

A complicating factor to this design is how versionCode works when you have a branch like this with backported fixes. The intention of versionCode is to have a simply incremented integer that you can easily compare to when consuming a feed to make decisions on how to handle that content. If we backport minor release fixes to previous LTS versions, that progress wouldn't be reflected in the versionCode for that release (i.e. the versionCode would need to remain the same as the initial LTS major version release). But maybe that's ok?

@jcn
Contributor Author

jcn commented Oct 31, 2019

If we backport minor release fixes to previous LTS versions, that progress wouldn't be reflected in the versionCode for that release (i.e. the versionCode would need to remain the same as the initial LTS major version release).

LTS doesn't necessarily mean "can't have bug fixes" though, right?

Since any fixes backported to an LTS version should only serve to clarify bugs in the original spec, a consumer that was consuming an LTS v1.0 feed that switches to v1.1 shouldn't see any difference:

  • If v1.0 was already broken and the producer updates to v1.1 and indicates that they've bumped the spec version and are now conforming to what the spec intended, that should be fine for the consumer
  • If the producer continues to serve v1.0, the consumer would have had to deal with the ambiguity anyway

@barbeau
Member

barbeau commented Oct 31, 2019

LTS doesn't necessarily mean "can't have bug fixes" though, right?

Correct, I agree. My main question is what the versionCode integer value would be on a patched LTS release.

For example - let's say we have v1.0 LTS (versionCode 1) and v2.0 (versionCode 2). We clarify something and release v1.1 LTS (still versionCode 1?) and v2.1 (versionCode 3).

The versionCode for LTS releases would just get stuck at whatever the first versionCode was at that LTS release. But like I said, I don't really think it's an issue.

@gcamp

gcamp commented Oct 31, 2019

@jcn you make a good point about the minor version. I'm ok with it.

@jcn
Contributor Author

jcn commented Oct 31, 2019

@barbeau Can you clarify the versionCode for me? I think I am confusing that with a minor version and don't understand the difference between the two (and the fact that we're talking about them like this makes me think that maybe it's too confusing).

@barbeau
Member

barbeau commented Nov 1, 2019

@jcn Sure. The standing proposal for versioning is to add two new fields to each version of the spec, which would also be expressed within each producer's feed:

  • versionName - A semantic version string with the major.minor convention, such as v1.1
  • versionCode - A simple integer value that increments by 1 with each release of the spec

Some examples of how these would change over time with each release:

  • versionName v1.0 - versionCode 1
  • versionName v1.1 - versionCode 2
  • versionName v1.2 - versionCode 3
  • versionName v2.0 - versionCode 4
  • versionName v2.1 - versionCode 5

The idea behind versionCode is that it gives you a clear, programmatic way of reasoning about how to handle different versions of the spec within consuming code without having to parse a semantic version and worry about malformed strings, etc. So, it's a programmatic alternative to the human-readable semantic version.

For example, in consuming code you can have:

if (versionCode < 4) {
  handleLegacyFeed();
} else {
  handleFeed();
}

Above it was proposed to have the major version number expressed within the URL path, like /v1/.... versionCode and versionName could be within the auto-discovery URL response.

If the consensus is that versionCode seems too confusing or isn't helpful in this context, we can drop it. It's a concept that's used on Android in both app versioning and handling different Android platform versions, which is where I use it frequently.
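
To compare the two approaches side by side, here is a Python sketch using the example mapping listed above. It contrasts the integer check with the string parsing that versionCode is meant to avoid. The helper names are hypothetical, and the "vMAJOR.MINOR" format is assumed from the examples.

```python
# Mapping taken from the example list above.
VERSION_CODES = {"v1.0": 1, "v1.1": 2, "v1.2": 3, "v2.0": 4, "v2.1": 5}


def parse_version_name(name):
    """Parse "vMAJOR.MINOR" into a tuple of ints; raises ValueError if malformed."""
    if not name.startswith("v"):
        raise ValueError(f"malformed version name: {name!r}")
    major, minor = name[1:].split(".")
    return int(major), int(minor)


def is_legacy_by_code(version_code):
    """Integer comparison: anything before versionCode 4 (v2.0) is legacy."""
    return version_code < 4


def is_legacy_by_name(version_name):
    """Equivalent check that has to parse and validate the string first."""
    return parse_version_name(version_name) < (2, 0)


# Both checks agree for every version in the example mapping.
agree = all(
    is_legacy_by_code(code) == is_legacy_by_name(name)
    for name, code in VERSION_CODES.items()
)
```

The integer check has no failure mode of its own, while the string version needs error handling for malformed input, which is the trade-off described above.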

@antrim
Contributor

antrim commented Nov 6, 2019

I tried to synthesize everything here into a PR at #188. I didn't see broad support for versionCode so I didn't include that in the first version of the PR. Please comment there if a versionCode should be added.

I propose to close this issue in the next few days to move discussion to the PR, unless anyone objects.

@heidiguenin
Contributor

Closing this now with #188 passed!
