reportVersion semantics are not defined #349
Comments
cc @boneskull |
For me I think it depends on whether it's ok to ask consumers to check the Node.js version as well. If a breaking change will be signalled by a bump in the Node.js major, then the version number as is might be ok with it being bumped every time something new is added to the report. Having said that it may be better to use SemVer and separate out from the Node.js version. For example, a new Node.js major may have no breaking changes in the report so using the Node.js version for when you may need a new version of tools processing reports is sub-optimal. The previous comparisons to N-API and ABI using a single number don't necessarily apply in my mind because ABI is always SemVer major and N-API is only SemVer minor so you only need 1 number to represent them. |
as a tooling author consuming these—and I realize I may be the only one—semver seems like an overcomplication; more difficult to parse and of dubious value. if someone wants to argue for semver, I’m happy to listen about what it’ll enable that I’m failing to consider. if the report format changes in any way (new keys, changed, removed, or even moved since the order is deterministic; changed types of values; consider the entire tree) the version should be bumped. |
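One way to picture the "bump on any change to the tree" rule described above is to compare the shape of two parsed reports. This is only a sketch: `shapeOf` and `formatChanged` are hypothetical helpers, not part of Node.js, and the sample objects are trimmed-down stand-ins for a real report.

```javascript
// Hypothetical helper: reduce a parsed report to its "shape", a list of
// "path:type" entries in key order, ignoring the actual values.
function shapeOf(value, path = '$') {
  if (Array.isArray(value)) {
    // Simplifying assumption: an array is described by its first element only.
    return value.length === 0
      ? [`${path}:array`]
      : [`${path}:array`, ...shapeOf(value[0], `${path}[]`)];
  }
  if (value !== null && typeof value === 'object') {
    return [
      `${path}:object`,
      ...Object.keys(value).flatMap((key) => shapeOf(value[key], `${path}.${key}`)),
    ];
  }
  return [`${path}:${value === null ? 'null' : typeof value}`];
}

// True when two reports differ in keys, key order, or value types.
function formatChanged(before, after) {
  const a = shapeOf(before);
  const b = shapeOf(after);
  return a.length !== b.length || a.some((entry, i) => entry !== b[i]);
}

// Trimmed-down sample reports (illustrative only).
const v1 = { header: { reportVersion: 1, event: 'exception' } };
const sameShape = { header: { reportVersion: 1, event: 'signal' } };
const addedKey = { header: { reportVersion: 1, event: 'signal', cwd: '/tmp' } };
const changedType = { header: { reportVersion: '1', event: 'signal' } };
const movedKey = { header: { event: 'signal', reportVersion: 1 } };

console.log(formatChanged(v1, sameShape));   // false: only values differ
console.log(formatChanged(v1, addedKey));    // true: new key
console.log(formatChanged(v1, changedType)); // true: number became string
console.log(formatChanged(v1, movedKey));    // true: key order changed
```

Under this reading, every `true` result would call for a version bump, including the reordered key.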
I agree with @boneskull . Given that report is a diagnostic feature, IMO modification is all about more data, less data or structural changes, and the main party affected is tools. semver means adding more complexity here. pinging @june07 (who developed an inspector plugin that understands and renders report https://github.com/june07/NiM ) for an opinion |
@gireeshpunathil Thank you for including me in the discussion. While I see the benefit in simplicity vs complexity, I do think that semantic versioning would be the better choice. And given that as software developers we're all very familiar with semver, I'm not even sure I'd categorize using semver as adding to complexity; in fact, I think it will help save us from complexity down the road. @boneskull has done a great job with rtk, and in fact NiM and related tooling (BrakeCODE) use the rtk library, which demonstrates perfectly the sort of pipelines that do/may exist amongst different software, all of which do/may interact with these reports. For example, JSON fields are currently added to reports that are ingested by BrakeCODE tooling (see screenshot) to encapsulate processing done by the rtk library. While this did not have to be included directly in the report schema, I think doing so is helpful as it eliminates introducing yet another parent data type. But having an agreed-upon standard (semver) for bumping report versions might make things nicer for other potential consumers of the data further down the pipeline, giving better transparency into the data schema. NiM is a perfect example, because it is a consumer of reports from BrakeCODE. While adding keys should not equate to breaking changes, it might be nice to have a way to represent those additions via versioning.
No with single integer versioning, but yes with semver. I think a single integer version scheme might be somewhat limiting, since non-breaking changes are essentially invisible. @mhdawson brings up a good point regarding the Node.js version: having report versioning tightly coupled with Node.js versioning, such that report versions are bumped in sync with Node.js, is probably bad, since changes in one don't necessarily indicate breaking changes in the other. Semver obviously solves this problem. And while it is more complicated, ultimately we’re all used to the semver paradigm, and the complication really only lies in how semver will map to JSON schema changes rather than to the code changes we’re all used to; not much differently, I assume. In working together to agree on the answer to...
I feel that semver empowers one to answer that question in the most elegant way, while considering more than breaking changes. |
To be clear, my complexity concern is mainly around avoiding string-parsing acrobatics or pulling in the userland I see semver is useful for developers who can choose what version of something to consume. For example, it enables automated upgrades, assuming the contract is upheld. Or you may know you don’t want to upgrade to a new major release, because it’s likely to contain breaking changes. It works pretty well most of the time! But a tool consuming a report file won’t be able to choose the version of said report file; the concept of an “upgrade” doesn’t really apply to the use case. A tool will change its behavior based on the version of the report, however. Regardless of what a semver version “means” in a report (e.g., minor version means “new field”), the behavior must still be defined based on information that semver cannot provide. Example: If we add a report field called If v2.1.1 is a bug fix to field Likewise with a breaking change in v3.0.0. Note that the semver conventions OTOH, if you are using incrementing integers for versions, the decision-making is exactly the same as in the above example... except a tool doesn’t have to parse semver! Another problem (not technical, but of the “people” variety) is that a tooling author might see semver and think it’s generally okay to restrict the tool’s operation based on the major release number, because that’s how people use semver! For example, only supporting In summary, SemVer was designed to give people better control of automated software upgrades. I just can’t see how our use case fits with that aim. |
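To make the integer-version argument concrete, here is a minimal sketch of a consumer that picks its behavior by exact report version. `header.reportVersion` is the field discussed in this issue; the `readers` table, the `readReport` function, and the choice of which header fields each version exposes are assumptions made up for illustration.

```javascript
// Hypothetical per-version readers. Which fields exist in which version is
// invented for this example and does not describe real report versions.
const readers = new Map([
  [1, (report) => ({ event: report.header.event })],
  [2, (report) => ({ event: report.header.event, trigger: report.header.trigger })],
]);

// Pick a reader by exact integer match; no string parsing is involved.
function readReport(report) {
  const version = report.header.reportVersion;
  const reader = readers.get(version);
  if (reader === undefined) {
    throw new Error(`Unsupported report version: ${version}`);
  }
  return reader(report);
}

const sample = { header: { reportVersion: 2, event: 'exception', trigger: 'Exception' } };
console.log(readReport(sample)); // { event: 'exception', trigger: 'Exception' }
```

The tool still has to know what changed in each version; the integer only tells it which set of rules to apply.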
A simplified two digit semver approach works also here. Just major and minor/patch. |
Not too keen on a two digit approach. Single digit as @boneskull points out is easier for tools to parse. If we need more digits then semver at least is well defined and has existing parsers that could be reused. If tools were doing their own parsing they could just ignore the patch level (since by definition changes to the patch level should not break them). |
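For comparison, a tool doing its own semver parsing and ignoring the patch level could look roughly like the sketch below. Note the assumptions: reports do not carry a semver string today, so the version strings are hypothetical, and `parseReportSemver` and `canRead` are invented names. The compatibility rule (same major, equal or newer minor) also assumes the consumer tolerates unknown extra fields.

```javascript
// Parse "major.minor.patch" without a library; returns null on malformed input.
function parseReportSemver(text) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(text);
  if (match === null) return null;
  const [major, minor, patch] = match.slice(1).map(Number);
  return { major, minor, patch };
}

// A tool written against `supportedText` accepts `actualText` when the major
// matches and the minor is not older. The patch level is deliberately ignored.
function canRead(supportedText, actualText) {
  const supported = parseReportSemver(supportedText);
  const actual = parseReportSemver(actualText);
  if (supported === null || actual === null) return false;
  return actual.major === supported.major && actual.minor >= supported.minor;
}

console.log(canRead('2.1.0', '2.1.7')); // true: only the patch level differs
console.log(canRead('2.1.0', '2.3.0')); // true: additive changes only
console.log(canRead('2.1.0', '2.0.0')); // false: fields the tool expects may be missing
console.log(canRead('2.1.0', '3.0.0')); // false: breaking change
```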
@jasnell From your point-of-view, what are the advantages of using semver here? |
We deliberated on this in the last(-to-last) WG meeting. Since the report is data (as opposed to code), I propose single digit versioning with a version bump on every structural change, wherein a structural change is anything that:
It is reasonable to expect report parsers to parse it in a JSON-native way, instead of line-parsing, char-parsing etc. One of the main motivations for the report to be in JSON format was easy parsing for consuming tools. Because of this, section movements / aesthetic modifications do not cause a version bump. It is reasonable to expect no data type changes being applied to the fields. |
Can we depend on that? Maybe we should add a point to your list that says:
|
makes sense @mhdawson . so the modified proposal is: single digit versioning with a version bump on every structural change, wherein a structural change is anything that:
|
Will a format change of string types be counted as a significant change? Like the format of error stacks. |
@legendecas - I don't think so, as this sort of change implies no change in the report's functionality, but a change in node core w.r.t data representation. |
On the other hand, if the formatting of a string changes, clients consuming the report could break, so maybe this should be considered a breaking change (thus bumping the number)? |
I agree with @mmarchini. @gireeshpunathil said it himself--the report is data, not software. It does not have functionality; it has a format. If that format changes, it should be considered a bumpable change. My preference is that we bump on every change to the format; I don't see the advantage of not bumping. |
Changes in format absolutely have to be handled as breaking, requiring a version bump. I'm absolutely fine with a single digit version but that does leave a couple of additional unanswered questions... (a) Is the version number bumped once per change landed in master? Let's say the current version is 1, and I land two changes in master in separate commits over the course of 2 weeks. Is the version now 2 or 3? (b) What is the backport policy? Let's say that LTS 12.x report version is 3, and master is updated to version 4. Let's say it's an additive change. Are all report changes backportable to 12.x and how do we determine that? (c) What if a bug is found in an LTS branch that does not exist in master? Let's say that LTS 12.x report version is 3, and master has had a breaking change that alters the format and bumps the version to 4. Now a bug in the 12.x version 3 report is found and needs to be fixed only in that branch, what does the new version number become? |
For those so inclined, I think this would be a good read. |
Let me prefix this by saying I really don't want to beat this to death--IMO there's enough analysis paralysis going on in nodejs without this particular issue. So I'm going to decline to comment after this one. From the snowplow article:
I am also concerned with this, which is good! From their definitions:
They go on to offer an example of adding an optional field which would satisfy this requirement. Indeed, this would satisfy the requirement in the context of a well-defined JSON schema which allows arbitrary fields. OTOH, we do not have a well-defined JSON schema for reports (maybe we should), so we cannot make these guarantees. For example, a consumer may base their parsing of a report based on a field count. An addition would change the field count. Because there's no contract--a schema, in this case--which says extra fields are allowed, we can't claim adding a field won't break a consumer. From the section about
Essentially, this means "we don't know". Wouldn't it be better to err on the side of caution and go with So if At this point, anything more than a single number seems like overengineering to me. We can always change it later if we need to. To @jasnell's questions
It's probably easier to say "if you're going to land a PR which changes the format, bump the version" than the alternative, which is... I don't know what it is, but it's a PITA.
What's the prior art?
What's a "bug" in terms of the diagnostic report format? 😄 |
full-semver, only patch and minors backported.
Incorrect data being generated, for example. One possible way to deal with this would be to make the report format version relative to the Node.js major, with the counter reset to zero at each major release. For instance... All 12.x related changes: |
@jasnell I think in that case you could just use the Node.js version number. I don’t think such a format would be helpful, because we will most likely maintain backwards-compatibility up to additive changes in the future, and making it relative to the Node.js version number would make it look like there are breaking changes when there are none. Honestly, I’d just go with semver at this point. In the worst case, it means that we’re exposing more information about the format than necessary, which seems okay to me. |
I think the key question raised by @jasnell is around backports. If the number is going to be meaningful across versions then we will have to either backport all of the changes up to a certain report version or none (ie if you want to backport the change which results in version 10 you need to include all changes which resulted in 1,2,3,4,5,6,7,8,9. You can't just pick individual changes out of sequence). If the number is not going to be meaningful across versions then making it relative to the Node.js version might make sense but I'm not sure that is all that useful as @addaleax mentions. I think if we commit to backporting so that the numbers are meaningful across versions then a single number is probably ok, but I'm also fine with SemVer. |
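The "all or none, in sequence" constraint on backports can be sketched as a small check. Here each format change is identified by the report version it produces; `canBackport` is a hypothetical helper for illustration, not an existing Node.js tool.

```javascript
// A release line currently at report version `current` can only take the
// next changes in order, with no gaps, if the number is to mean the same
// thing on every release line.
function canBackport(current, changeVersions) {
  const sorted = [...changeVersions].sort((a, b) => a - b);
  return sorted.every((version, i) => version === current + i + 1);
}

console.log(canBackport(3, [4, 5])); // true: 4 and 5 follow 3 directly
console.log(canBackport(3, [5]));    // false: skips the change that produced 4
console.log(canBackport(0, [10]));   // false: needs 1 through 9 as well
```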
With regards to the backport issue, it's perfectly valid to declare that the diagnostic report format is not subject to the normal semver rules and therefore breaking changes to the diagnostic format can occur in any release, even within an LTS line. This allows backports to be made any time. I would limit this to format changes. Major changes that impact the overall function of the report mechanism (e.g. removing it, changing the command line flags, etc) would fall under normal semver rules. |
This issue is stale because it has been open many days with no activity. It will be closed soon unless the stale label is removed or a comment is made. |
I would like to resurrect this from being stale, and would like to move forward with single digit versioning and, on top of that, what @jasnell suggested in #349 (comment). Problem: right now we don't have a rule around versioning, and that needs to be fixed. Putting it back on the diagnostics WG agenda. |
@RafaelGSS - you asked, here is the issue with that: if a Node.js version carries more than one structural change in the report, then we cannot differentiate between them. |
As agreed in the diagnostics meeting I said I would take a look. To me #349 (comment) from Chris is a good summary. With that I'm +1 on the suggestion from @gireeshpunathil to move this forward with a single number as it reflects the current implementation and can be revisited. |
Diagnostics report has a version number representing its format, yet its rule is not defined. This doc change specifies the rule. Refs: nodejs/diagnostics#349 Refs: nodejs#28121 (comment)
Diagnostics report has a version number representing its format, yet its rule is not defined. This doc change specifies the rule. Refs: nodejs/diagnostics#349 Refs: #28121 (comment) PR-URL: #45050 Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com> Reviewed-By: Richard Lau <rlau@redhat.com> Reviewed-By: Colin Ihrig <cjihrig@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Chengzhong Wu <legendecas@gmail.com> Reviewed-By: Michael Dawson <midawson@redhat.com>
this is done via nodejs/node#45050 , closing. |
The process.report/--experimental-report feature comes with output that contains a version number; however, it is unclear what that version number means and when it is incremented. For nodejs/node#31386, it would be a good starting point to know if purely additive changes to the JSON format (i.e. only adding previously non-existent keys) should lead to version bumps.
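For reference, the version number in question can be read from the report object itself. The sketch below assumes a Node.js release where process.report is available without a flag (older releases needed --experimental-report); `currentReportVersion` is a hypothetical wrapper name.

```javascript
// process.report.getReport() returns the report as a plain JavaScript object;
// the format version lives at header.reportVersion.
function currentReportVersion(report = process.report.getReport()) {
  return report.header.reportVersion;
}

console.log(`Node.js ${process.version} emits report version ${currentReportVersion()}`);
```

The printed number depends on the Node.js release in use, which is exactly why its bump rule needs to be defined.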
@jasnell also suggested switching to semver, whereas @cjihrig pointed out that in the PR that added versioning, a single-integer versioning scheme was requested. I personally find it hard to make a decision on this question without knowing how consumers are supposed to interact with the version number.
I'm adding this to the WG agenda, I hope that's okay.