Clarify backward compatibility requirements when introducing new capabilities #3954
Comments
What definition other than "upgrading a dependency for existing working code does not require changes to the code for it to continue working" is being evaluated? |
I think we need to clarify what happens if there are more than two parties: e.g. OTel, user, and cloud provider integration/instrumentation.
Also, it's not only about dependency version updates, but also about feature flags and so on. |
It sounds like feature compatibility is being confused with backwards compatibility here. |
Ideally features are not mutually exclusive and new ones don't break existing ones. If they do, it means that the old one is going to be deprecated. It'd be great to clarify this in the scope of this work. |
I believe one of the scenarios that people have different opinions about is the following: Version N: Capability A exists and behaves in a certain way. Version N+1: Capability B is introduced (that did not exist in version N). B is optional and is opt-in, i.e. the user must perform a certain action to start using B after upgrading from version N to N+1. When capability B is not used (user did not perform an action) the capability A in version N+1 behaves the same as in version N, there are no changes in the behavior of A. When capability B is used it changes the behavior of capability A, including but not limited to rendering capability A completely ineffective. [UPDATE] A sub-scenario with an additional nuance here is that capability A may be typically used by a different party (e.g. the cloud provider) than capability B (e.g. the end user). Does this change from version N to N+1 constitute a breaking change? |
Thanks @tigrannajaryan for the summary! It looks great. One more important detail I'd like to be made explicit: the user who opts into capability B may not know that capability A is used by something in their environment. The party that uses capability A does not opt into anything. |
I updated my description to mention this nuance. |
Another nuance I think is important is when Capability B is the "new better version of" Capability A. E.g. I've heard @austinlparker (not unreasonably) referring to the new configuration proposal as "Configuration v2". It doesn't mean we can't introduce "Configuration v2", but I think we should have a transition / deprecation plan around "Configuration v1" in order to avoid user confusion (e.g. similar to the plan to deprecate SpanEvents in favor of Log-based Events). |
Agreed, I just hope configuration v2 would allow cloud providers to set defaults ;) (i.e. capability v1 can be deprecated in favor of v2 if v2 is a superset of v1) |
Here is my personal opinion on this topic. I am going to consider backward compatibility and a couple of other relevant aspects. I will use the configuration proposal as an example in the text below, but I think my reasoning also applies generally to the scenario of capabilities A vs B defined above.
Backward Compatibility
My litmus test for backward compatibility is the following: If I upgrade from version N to version N+1 while keeping everything else unchanged, does the documented functional system behavior change? If the answer is yes then it is a breaking, non-backward compatible change; otherwise it is a non-breaking, backward compatible change.
[UPDATE] I added the qualifier "documented" in the previous sentence to clarify that only documented behavior is important from a compatibility perspective (and that obviously includes API definitions because they are documented). Undocumented, unspecified behavior is not part of compatibility considerations. (Note the emphasis on functional behavior as opposed to e.g. performance behavior, which I don't consider to be part of the regular compatibility guarantees.)
Evaluated using this litmus test, I believe the scenario we are considering is not a breaking change. The fact that capability B is introduced in version N+1, that it is opt-in, and that unless opted in the behavior of capability A does not change is what makes me reach this conclusion.
Using our example, the new configuration proposal that adds an opt-in OTEL_CONFIG_FILE is not a breaking change. It is backward compatible. Unless the user opts in to the new OTEL_CONFIG_FILE the behavior of version N+1 is the same as in version N.
However, I think there is another important aspect that we need to consider when talking about capability changes, an aspect which I think is important in particular for the new configuration proposal.
Degradation of Capabilities
The configuration proposal suggests replacing a particular capability A (the configuration of the SDK by env vars) with a new capability B (configuration from a file). As far as I can tell the new capability is set to become the recommended way of configuring the SDK going forward, and we will likely declare the old way of configuration deprecated sometime after the new configuration file is declared stable. In other words it is eventually going to become a replacement capability.
When a change like this - a replacement capability - is proposed, in addition to backward compatibility I think it is important to consider the following: does the replacement capability allow all use-cases that were previously allowed using the old capability? I believe that unless we explicitly decide that certain use-cases are not necessary we should make a significant effort to continue supporting all use-cases that were supported previously. This is not a backward compatibility requirement. It is a requirement to avoid degradation of capabilities.
Evaluated from this perspective the new config proposal appears to be a degradation. It makes a previously possible use-case impossible or more cumbersome: the ability for the cloud provider and for the end user to supply portions of the configuration without explicit coordination. (I do not yet have sufficient information to fully understand whether the cloud provider's use case is impossible or merely more cumbersome. I guess in some cases it may be impossible, depending on which process has what sort of permissions to write to the filesystem. If it is merely more cumbersome then I would call it a user experience degradation, which is still important but less critical than a complete impossibility, which is what I call degradation of capabilities above.)
So, I think this is what we have in this particular case of the configuration proposal: it is not a breaking change, but it is a capability degradation.
In my opinion we should find a way to avoid the degradation. I have been thinking about how to do it and I think there is a possibility to do it by extending the new configuration proposal without throwing away any of the valuable work that the Configuration SIG did. I will post my thoughts about this on the relevant issue later. |
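For illustration only, the litmus test from the comment above could be sketched in Python roughly as follows. This is a toy model under stated assumptions, not the OpenTelemetry SDK: `resolve_version_n`, `resolve_version_n1`, and `is_breaking` are hypothetical names, the value returned for the file-based path is a placeholder, and the idea that opting in to `OTEL_CONFIG_FILE` makes env vars ineffective is taken from the scenario under discussion rather than from any released behavior.

```python
from typing import Callable, Iterable, Mapping

Env = Mapping[str, str]
Config = Mapping[str, str]
Resolver = Callable[[Env], Config]


def resolve_version_n(env: Env) -> Config:
    """Hypothetical version N: only env vars (capability A) are read."""
    return {"service_name": env.get("OTEL_SERVICE_NAME", "unknown_service")}


def resolve_version_n1(env: Env) -> Config:
    """Hypothetical version N+1: OTEL_CONFIG_FILE (capability B) is opt-in."""
    if "OTEL_CONFIG_FILE" in env:
        # Assumption for this sketch: once the user opts in, env vars are
        # ignored. The returned value is a placeholder, not real file parsing.
        return {"service_name": "name-from-config-file"}
    # Not opted in: behave exactly like version N.
    return resolve_version_n(env)


def is_breaking(old: Resolver, new: Resolver, unchanged_envs: Iterable[Env]) -> bool:
    """Litmus test: does behavior differ when only the version changes?"""
    return any(old(env) != new(env) for env in unchanged_envs)


# Deployments that worked on version N and change nothing but the version.
existing_deployments = [{}, {"OTEL_SERVICE_NAME": "checkout"}]
print(is_breaking(resolve_version_n, resolve_version_n1, existing_deployments))  # False
```

The check only ranges over environments that keep "everything else unchanged", so an environment that newly sets `OTEL_CONFIG_FILE` is deliberately outside its scope. That is why the sketch reports a non-breaking change even though opting in alters what capability A does.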
It seems to me there is no consensus on this, and that's part of the problem. Does file configuration support end user vs. cloud provider separation of config ownership (#3954 (comment))? If it does, but in a different way, I would not consider it a degradation. |
UPDATE: I added a qualifier "documented" in my definition of the breaking change to clarify that only documented behavior is important from compatibility perspective (and that obviously includes API definitions because they are documented). Undocumented, unspecified behavior is not part of compatibility considerations. |
The configuration file proposal introduces new capabilities. Opinions vary on whether this is a breaking change or not.
We need to clarify what constitutes and what does not constitute a breaking change when new capabilities are introduced.
I am assigning this to myself to look into it.
Scenario of Interest
Version N: Capability A exists and behaves in a certain way.
Version N+1: Capability B is introduced (that did not exist in version N). B is optional and is opt-in, i.e. the user must perform a certain action to start using B after upgrading from version N to N+1.
When capability B is not used (user did not perform an action) the capability A in version N+1 behaves the same as in version N, there are no changes in the behavior of A. When capability B is used it changes the behavior of capability A, including but not limited to rendering capability A completely ineffective.
A sub-scenario with an additional nuance here is that capability A may be typically used by a different party (e.g. the cloud provider) than capability B (e.g. the end user).
Does this change from version N to N+1 constitute a breaking change?
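As a rough sketch of the two-party sub-scenario, the following Python models a cloud provider that relies on capability A (env vars) and an end user who opts into capability B (a config file). Everything here is an assumption made for illustration: `effective_config` is a hypothetical function, the dictionary standing in for the file contents is invented, and the rule that the file fully replaces env vars is just one possible reading of "rendering capability A completely ineffective".

```python
from typing import Dict, Optional


def effective_config(
    env: Dict[str, str],
    config_file: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Toy model: capability B, when used, fully replaces capability A."""
    if config_file is not None:
        # The end user opted into B; env vars set by anyone are ignored.
        return dict(config_file)
    # Nobody opted into B; OTEL_* env vars behave as they did in version N.
    return {k: v for k, v in env.items() if k.startswith("OTEL_")}


# Set by the cloud provider, who never opts into anything.
provider_env = {"OTEL_RESOURCE_ATTRIBUTES": "cloud.provider=example"}
# Written by the end user, who may not know the provider uses env vars.
user_file = {"service_name": "checkout"}

before = effective_config(provider_env)
after = effective_config(provider_env, config_file=user_file)

print("OTEL_RESOURCE_ATTRIBUTES" in before)  # True
print("OTEL_RESOURCE_ATTRIBUTES" in after)   # False
```

In this model the provider's setting disappears purely because of an action taken by a different party, which is the nuance the sub-scenario asks about.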