Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What does Breaking Changes mean? #187

Open
Atry opened this issue Mar 5, 2022 · 9 comments
Open

What does Breaking Changes mean? #187

Atry opened this issue Mar 5, 2022 · 9 comments

Comments

@Atry
Copy link
Contributor

Atry commented Mar 5, 2022

We commonly include a Breaking Changes section in release notes. What does the terminology actually mean?

I used to think it is similar to breakages in binary backward compatibility in C++ or Java. However, we don't have a binary release form of a library.

If it means breakage at the source level backward compatibility, it could be too sensitive, and we might have missed many breaking changes.

For example, we used to assume changing a function from the defaults capabilities to pure capability is a backward compatible change. However the assumption is not true if we take the type checker into account:

Given a library including the following function:

function method1(): void {
}

When changing it to

function method1()[]: void {
}

The following user code, which could type check originally, will be failed to type check any more:

function method2(): void {
}

function infer<T>(
  (function()[T]: void) $_,
): (function ((function()[T]: void)):void) {
  return $_ ==> {};
}

function broken(): void {
  $f = infer(method1<>);


  // Typing[4403] Invalid argumentHack(4403)
  // Expected a function that requires the capability set {}
  // But got a function that requires the capability set {AccessGlobals, IO, ImplicitPolicyLocal, RxLocal, System, WriteProperty}
  $f(method2<>); 
}

Similarly, changing the return type of a function into a more specific type is also backward incompatible at source level, even though similar changes are usually considered binary backward compatible in other languages.

Key takeaways

  1. We might want to change the structure of our release notes.
  2. I think currently it is not a big deal because any way we don't promise backward compatibility. However, if we want to create a stable language in the future, we will have to define a stable ABI for distributing libraries, not just the source code. Essentially, any statically typed languages that have the ability of type inference, would need a stable ABI in order to create reusable libraries that would not arbitrarily break the library users.
@fredemmott
Copy link
Contributor

Anything that's reasonably likely to cause practical problems for end users. This is a very fuzzy definition, but these notes are to help humans, so that's fine.

  • the example you gave does not seem particularly likely. Additionally, by this definition, any change of a return type - even narrowing it - would be a breaking change.
  • we also tend to list things like changed typechecker error codes; as we already consider code with FIXMEs invalid, this does not fit any standard definition of a compatibility break, but it does cause static analysis

If it means breakage at the source level backward compatibility, it could be too sensitive to be broken, and we might have missed many breaking changes.

I don't think we should be aiming for a formal definition, with a 100% completeness: this would be less useful than what we have now.

If changes like marking a function as pure are considered breaking changes, this will effectively be a copy of git log. That data is available outside of the release notes, and is much lower signal.

We might want to change the structure of our release notes.

No strong feelings, but given this is meant to be practical advice for humans to take in their projects, I don't really see a need.

we will have to define a stable ABI for distributing libraries, not just the source code

This seems extremely unlikely, ever; bytecode will continue to change as needed for performance.

@Atry
Copy link
Contributor Author

Atry commented Mar 7, 2022

the example you gave does not seem particularly likely. Additionally, by this definition, any change of a return type - even narrowing it - would be a breaking change.

That's exactly why binary backward compatibility (or the corresponding concept in Hack) is a thing. It is much easier to keep binary backward compatibility than source backward compatibility in other languages.

@Atry
Copy link
Contributor Author

Atry commented Mar 7, 2022

we will have to define a stable ABI for distributing libraries, not just the source code

This seems extremely unlikely, ever; bytecode will continue to change as needed for performance.

(copy from internal post)

The goal of keeping binary backward compatibility in other languages is to allow for upgrading transitive dependencies. For example, given an application A depending on library B depending on library C, if library B is updating to a binary backward compatible and source level backward incompatible version, application A must be migrated upon the new version to pass type check. However, if library C is updating to a binary backward compatible and source level backward incompatible version, application A does not have to migrate upon the new version.

In Hack, the maintainer of the application A cannot simply update library C, unless the developer of library B migrate first, because there is no binary form.

Something similar to binary backward compatibility in Hack is "runtime backward compatibility", if we could exclude the vendor directory from type checking.

@fredemmott
Copy link
Contributor

This is possible in existing codebases by generating HHIs if desireed, and using ignored_paths to hide the existing things.

It is not enough though: changes in internals (e.g. warnings in array operations) and builtin functions also change between releases.

That's exactly why binary backward compatibility (or the corresponding concept in Hack) is a thing. It is much easier to keep binary backward compatibility than source backward compatibility in other languages.

Narrowing of return types of functions (or in non-final classes) is not usually considered a BC break by other languages; in the case of C++, the ABI is actually weaker than the API here - a narrowed return type is not an API-breaking change, a widened is - but neither is an ABI-breaking change. The return type is not part of the mangling.

Also, I am not arguing that an ABI would not be useful for Hack - just that it is impractical from an implementation side. It would require:

  • a stable bytecode
  • a ban on adding additional errors to existing bytecodes (e.g. we would not have been able to add 'raise error on array append if array is a dict')
  • a language-level definition of ABI and API

The last is a hard problem; the first two are extremely unlikely to ever be practical, unless Facebook stop needing to optimize HHVM for www performance and reliability.

@Atry
Copy link
Contributor Author

Atry commented Mar 7, 2022

I think "runtime backward compatibility" is more realistic than binary, because currently all inferred type parameters do not change the runtime behavior.

Reified type parameters do change the runtime behavior. Fortunately they cannot be inferred.

@lexidor
Copy link
Contributor

lexidor commented Mar 7, 2022

I'd go with variance rules when documenting signature changes. Non-final methods are invariant on their parameters, return type, and coeffects. All final or free standing functions are contravariant on their parameters, covariant on their return type, and contravariant on their coeffects.

Yes, there are indeed ways the typechecker can emit errors where it previously did not. These callsites can be restored by explicitly typing out the previously inferred types. If that type is not denotable, maybe mention it in the blog.

@Atry
Copy link
Contributor Author

Atry commented Mar 7, 2022

Yes, there are indeed ways the typechecker can emit errors where it previously did not. These callsites can be restored by explicitly typing out the previously inferred types. If that type is not denotable, maybe mention it in the blog.

It is trivial to fix errors like this in a monorepo if all the source files of all dependencies are in the same monorepo. However, it is not easy to fix an even trivial typing error in composer based projects where errors are commonly found in the vendor directory, especially when some dependencies are maintained by third parties.

@lexidor
Copy link
Contributor

lexidor commented Mar 7, 2022

Would you happen to be able to give an example where project B and project C are both independently compatible with hhvm version (next). Project A depends on B, which depends on C. The error only shows up in project A, but not in a source file controlled by A? From my understanding, if C typechecks and B typechecks, wouldn't all new type errors have their primary error location in A?

Edit: I misunderstood the example from the internal post. The issue is not, whose error is this to fix, but are there ways to allow incompatible changes to remain unfixed. If A doesn't know about the new signature, it can pretend to use the old one (provided they are compatible).

@Atry
Copy link
Contributor Author

Atry commented Mar 8, 2022

Yes. The problem is not only about changes introduced by HHVM, but also changes introduced by third party libraries, for users who don't have a monorepo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants