Binary versions in the Scala 3.x era #10244

Closed
sjrd opened this issue Nov 9, 2020 · 26 comments

Comments

@sjrd
Member

sjrd commented Nov 9, 2020

Until Dotty 0.x.y-RC1's, the binary and TASTy formats were free to be broken at any time, and hence the binary version was 0.x (for example 0.27). With 3.0.0-M1, it has become 3.0.0-M1, which corresponds to what the milestones and RCs of Scala 2 do (a full version number).

We need to decide what will happen in Scala 3.0.0 and onwards. There are a few things to take into account.

  • There exists a version 3.T.0 at which we will break and finalize TASTy, as well as change the stdlib to use Scala 3 features in appropriate places. That implies that it will also break binary compatibility.
  • For all versions >= 3.0.0 and < 3.T.0, we will preserve binary compatibility, which also implies staying binary compatible with Scala 2.13.
  • For all versions >= 3.T.0, we will keep backward tasty compatibility, basically forever. We don't know exactly what promises will be in terms of binary compat, but most likely we'll want to at least keep backward and forward binary compat between patch versions.

Therefore, we have at least two binary versions to settle on: the one for < 3.T.0, and the one for >= 3.T.0. I see the following options:

  • 3.0 and 3.T. That choice means: the first minor version that uses the new binary version. It has the benefit that it can develop further in time if we decide that binary versions should stick to binary compatibility, not to TASTy compatibility, even after 3.T.0, since we can introduce another 3.U at some point. It is however confusing that 3.1 is compatible with binary version 3.0 but not 3.T.
  • 3.0 and 3: That choice bets on 3 being the ultimate binary version, which will never change again. The use of 3.0 is a transient state in the meantime. It will be weird that 3.0.0 cannot read 3, but that's not too bad.
  • 3 and 4: After all, there is no hard requirement for the binary version to be related to the general version. That choice makes it explicit that a binary version is not structured. It's flat: when it changes, the whole ecosystem has to be rebuilt, no matter what. (If we preserved backward compatibility, we wouldn't change the binary version to begin with.)
  • And finally, 2.13 and 3. This one will make people scream, but it makes perfect sense given that Scala 3 and 2.13 are forward and backward binary compatible. The ecosystems are therefore not split: it is the same ecosystem. That also makes withDottyCompat and the reverse one for which we don't have a name yet both moot. It does mean that a library cannot release the same version separately for 2.13 and 3. But after all, it shouldn't have to, since we made everything so that the binary built by one version is usable from the other version.
@dwijnand
Member

dwijnand commented Nov 9, 2020

I like 2.13 and 3, because it's ambitious. However it makes it impossible for a library to be cross-built for 2.13 and 3, so a library must decide if it's publishing for one or for the other...

Riffing off your ideas, I kind of like "3 and 3.T" but where "T" is literally "the Scala 3 Tasty" binary version (not, for example "3.6"). It also mirrors OnePlus's phone names (3T, 5T, etc).

@odersky
Contributor

odersky commented Nov 9, 2020

One possible problem might be that we want to allow backward compatibility between minor versions. So if 3.0 is binary 2.13, what is 3.1? It can't be 2.14 because that would imply a global break according to the Scala-2 scheme and it can't be 2.13.x either since that would imply backward and forward compatibility. Do I understand that correctly?

@sjrd
Member Author

sjrd commented Nov 9, 2020

One possible problem might be that we want to allow backward compatibility between minor versions. So if 3.0 is binary 2.13, what is 3.1? It can't be 2.14 because that would imply a global break according to the Scala-2 scheme and it can't be 2.13.x either since that would imply backward and forward compatibility. Do I understand that correctly?

It would be 2.13. You only change the binary version when you break backward binary compatibility, not when you only break forward compatibility. The reason for that is that when you keep backward binary compatibility, libraries do not need to be republished, so it is the same ecosystem.

This scheme has worked without any hiccup for Scala.js 0.6.x and 1.x: we've broken forward binary compatibility many times during the 0.6.x cycle, and already 3 times during the 1.x cycle. But 0.6.x has always used the binary version _sjs0.6 and 1.x has always used _sjs1.
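To make that rule concrete, here is a minimal sketch. The `Release` type, the `assign` function and the version numbers in the usage note are all hypothetical, not taken from Scala.js or any build tool. A new binary version is only started when a release breaks backward compatibility; forward-only breaks keep the current one.

```scala
object BinaryVersions {
  // Hypothetical model of a release: its version string and which
  // kinds of binary compatibility it breaks relative to its predecessor.
  final case class Release(version: String, breaksBackward: Boolean, breaksForward: Boolean)

  /** Pairs each release with a binary-version generation (1, 2, ...).
    * The generation only advances when backward compatibility is broken.
    */
  def assign(releases: List[Release]): List[(String, Int)] =
    releases
      .foldLeft((List.empty[(String, Int)], 0)) { case ((acc, gen), r) =>
        // The very first release always starts generation 1.
        val next = if (r.breaksBackward || acc.isEmpty) gen + 1 else gen
        ((r.version -> next) :: acc, next)
      }
      ._1
      .reverse
}
```

For example, a series `1.0.0`, `1.1.0`, `1.2.0` where the later two only break forward compatibility all land in generation 1, and only a backward-breaking `2.0.0` moves to generation 2.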

@smarter
Member

smarter commented Nov 9, 2020

I think using 2.13 as our binary version is not worth it: library maintainers do want the freedom to publish versions tailored to each compiler (which might require using different sources), forcing everyone to choose one means potentially compromising the user experience on the other one, see for example what @djspiewak wrote in https://gist.github.com/djspiewak/b8f2b4547951442102488964bc351cf9

I think all the other suggestions are fine, but I find using 3.0 for things which are not 3.0 is confusing, so I would rather go with 3abi1 which we would eventually bump to 3abi2 (or 4 if we decide to go in that direction), this makes it clear that the suffix is about binary compatibility and not source or tasty compatibility (as 3.1 would probably bump the minor tasty version).

@odersky
Contributor

odersky commented Nov 9, 2020

It would be 2.13. You only change the binary version when you break backward binary compatibility, not when you only break forward compatibility. The reason for that is that when you keep backward binary compatibility, libraries do not need to be republished, so it is the same ecosystem.

But I had thought that for Scala 2.x, it's a stricter rule that also requires forward compatibility?

@sjrd
Member Author

sjrd commented Nov 9, 2020

I think using 2.13 as our binary version is not worth it: library maintainers do want the freedom to publish versions tailored to each compiler (which might require using different sources), forcing everyone to choose one means potentially compromising the user experience on the other one, see for example what @djspiewak wrote in https://gist.github.com/djspiewak/b8f2b4547951442102488964bc351cf9

Maintainers might want that freedom, but if they actually use it for anything non-trivial (i.e., where the binary API is not 100% the same), they might prevent the downstream ecosystem from actually leveraging the bidirectional interop we built between Scala 2.13 and 3. Because an application A might use libraries X (2.13) and Y (3), both depending on the upstream library L which has variants for 2.13 and 3. And then A ends up with a non-reconcilable classpath.
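The A/X/Y/L situation can be sketched as a check over a resolved classpath. This is only an illustration: `Artifact` and `conflicts` are hypothetical names, not part of sbt or any resolver.

```scala
object Diamond {
  // Hypothetical, simplified view of a resolved artifact:
  // the library name plus the binary version suffix it was published under.
  final case class Artifact(name: String, binaryVersion: String)

  /** Names of libraries that appear under more than one binary version. */
  def conflicts(classpath: Seq[Artifact]): Set[String] =
    classpath
      .groupBy(_.name)
      .collect {
        case (name, variants) if variants.map(_.binaryVersion).distinct.size > 1 => name
      }
      .toSet
}
```

With X built for 2.13, Y built for 3, and L pulled in once per binary version, `conflicts` reports `L` as the library that cannot be reconciled.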

@sjrd
Member Author

sjrd commented Nov 9, 2020

It would be 2.13. You only change the binary version when you break backward binary compatibility, not when you only break forward compatibility. The reason for that is that when you keep backward binary compatibility, libraries do not need to be republished, so it is the same ecosystem.

But I had thought that for Scala 2.x, it's a stricter rule that also requires forward compatibility?

That is not an issue, because it only applies to a) the binary encoding produced by the compiler, which is anyway frozen and b) the binary API of scala-library.jar, which is frozen as well. It doesn't apply to other libraries, in particular not scala3-library.jar.

@djspiewak

My two cents…

From a user's perspective, the question to be answered is, given their set scalaVersion value, is this specific artifactId binary compatible. Thus, given version 2.13.3 (for example), is cats-effect_2.13 a compatible version? Of course, the answer to this is yes, and we can determine this by the fact that the bincompat version suffix is a strict prefix of the actual Scala version within the build.

Note that we don't need to answer the inverse question: for a given bincompat suffix, enumerate all compatible Scala versions. That's an interesting question, but one that never, in practice, arises. This asymmetry suggests that the 3 scheme is the correct one, with only one caveat.

For all versions of Scala 3 prior to 3.T (where we finalize Tasty), the bincompat suffix should mirror the Scala 2 scheme of including major.minor. For example: cats-effect_3.0. Once Tasty is finalized for the entire 3.x future-lineage, the bincompat scheme should drop the minor component: cats-effect_3.

The caveat here is that build tools need to understand this cliff so as to properly warn users that 3 is not a bincompat version for scalaVersion = 3.0.0, despite the fact that it's a strict prefix. These cliffs sunset though as builds are upgraded.
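As a rough sketch of the prefix rule and the cliff described above: `isCompatible` is a hypothetical helper, and `tastyFinalMinor` stands in for the still-unknown T, so any concrete value passed for it is an assumption.

```scala
object SuffixCheck {
  private def segments(s: String): List[String] = s.split('.').toList

  /** True if `suffix` is a segment-wise prefix of `scalaVersion`, except that the
    * bare "3" suffix is rejected for Scala 3 versions before 3.T.0 (the cliff).
    * `tastyFinalMinor` is the hypothetical value of T.
    */
  def isCompatible(suffix: String, scalaVersion: String, tastyFinalMinor: Int): Boolean = {
    val version = segments(scalaVersion)
    // Compare whole segments so that "2.1" does not match "2.13.3".
    val isPrefix = version.startsWith(segments(suffix))
    val beforeCliff = version match {
      case "3" :: minor :: _ => minor.toInt < tastyFinalMinor
      case _                 => false
    }
    isPrefix && !(suffix == "3" && beforeCliff)
  }
}
```

Assuming T = 2 purely for illustration, `isCompatible("3", "3.0.0", 2)` is rejected even though "3" is a prefix, while `isCompatible("3.0", "3.0.0", 2)` and `isCompatible("3", "3.2.1", 2)` are accepted.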

Note that Scala actually went through this kind of transition once before. I can't remember exactly when the shift took place (I believe it was during the 2.8.x line), but way back in 2.7.x days the bincompat suffix included the full Scala version since compatibility was broken with absolutely every release. It wasn't until later that Scala's own compatibility guarantees were hardened such that libraries could publish once for a major.minor version. This same scenario (with build tools having to special-case a particular threshold) was dealt with and, in practice, didn't really seem to create any headaches.

What is going to happen with Tasty is quite analogous to this compatibility hardening nearly a decade ago, except the minor component is now being dropped.


Also to @smarter's point, I would discourage everyone from thinking of the 3.0 and 2.13 ecosystems as being merged. They really are not. There are several reasons why this is the case, but very briefly:

  • Transitive diamond eviction is completely broken when you have dependencies which are cross-published to both. ScalaCheck is a major and current exemplar of this, since 1.15.1 is published for 3.0.0-M1 and 2.13, Discipline has done the same, but Specs2 is only published for 2.13 and thus depends on the 2.13 version of ScalaCheck. This in turn means that anything downstream of both Discipline and Specs2 is forced to exclude ScalaCheck 1.15.1 2.13. If you don't exclude, Sbt will produce an error in dependency resolution which is modestly semantic in nature, but it doesn't give you any help in resolving it.
  • Any Scala 3 migration issues which would have been a roadblock when doing an actual build migration in a given library are forced onto all downstream users if that library chooses to continue publishing for 2.13 and forces their users to employ withDottyCompat. This is kind of analogous to the difference between static and dynamic typing. withDottyCompat is a bit like a runtime cast, wherein your users will discover issues in your API which you would have otherwise discovered for yourself. This is a lot more common than you would think, particularly in any type signatures which involve skolems, since NSC is immensely buggy in these areas while Dotty is much less so. Put bluntly: a not-insignificant percentage of published Scala libraries are unknowingly exploiting bugs in NSC. That exploitation needs to be either addressed by the library authors (by publishing for Scala 3), or by all of their downstream users (using withDottyCompat). In other words, withDottyCompat is a very dangerous thing for a library author to suggest unless they've actually completed the Scala 3 migration in library code, at which point they may as well publish the artifacts.
  • The 2.13 unpickler has been found to have some issues (which I need to file… mea culpa on that one). Given the differences between Dotty and NSC, it wouldn't surprise me if there are more of these left undiscovered. At the very least, it's an unnecessary ecosystem risk.
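For the ScalaCheck diamond in the first bullet, the manual exclusion might look roughly like the following build.sbt fragment. It assumes the sbt-dotty plugin (for withDottyCompat); the artifact names `discipline-core` and `specs2-scalacheck` and both version values are placeholders chosen for illustration, not coordinates confirmed in this thread.

```scala
// build.sbt (sketch; assumes the sbt-dotty plugin is enabled)
val specs2Version     = "x.y.z" // hypothetical: whichever Specs2 release you use
val disciplineVersion = "x.y.z" // hypothetical: a Discipline release published for Scala 3

libraryDependencies ++= Seq(
  // Published for Scala 3, so it brings in the Scala 3 build of ScalaCheck.
  "org.typelevel" %% "discipline-core" % disciplineVersion % Test,
  // Only published for 2.13: resolve it with the _2.13 suffix and drop its
  // transitive scalacheck_2.13 so that a single ScalaCheck build remains.
  ("org.specs2" %% "specs2-scalacheck" % specs2Version % Test)
    .withDottyCompat(scalaVersion.value)
    .exclude("org.scalacheck", "scalacheck_2.13")
)
```

Every project downstream of both libraries has to repeat this by hand, which is the burden being described.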

Tldr, withDottyCompat is useful, but doesn't work as well as you probably think it does. 3.0 is a distinct ecosystem from 2.13, and what the ecosystem is compelled to do at this point is essentially the same as any of the previous 2.(n - 1) -> 2.n upgrades.

@sjrd
Member Author

sjrd commented Nov 10, 2020

From a user's perspective, the question to be answered is, given their set scalaVersion value, is this specific artifactId binary compatible. Thus, given version 2.13.3 (for example), is cats-effect_2.13 a compatible version? Of course, the answer to this is yes, and we can determine this by the fact that the bincompat version suffix is a strict prefix of the actual Scala version within the build.

I disagree. The real question is: given a full version (e.g., 2.13.3), what is the binary suffix that I need to use to resolve %% dependencies (here _2.13). That implies that there is only one possible answer (not several compatible binary suffixes).
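Framed as a function, that question looks something like the sketch below. `binarySuffix` is a hypothetical helper covering only the cases stated in this issue (Dotty 0.x, Scala 2.x, and milestones/RCs); it deliberately returns `None` for stable Scala 3 versions, since that is exactly what remains to be decided here.

```scala
object BinarySuffix {
  // "major.minor.patch" with an optional "-qualifier" such as "-M1" or "-RC1".
  private val Version = """(\d+)\.(\d+)\.(\d+)(-.+)?""".r

  /** The single binary suffix for a full Scala version, where this issue states one. */
  def binarySuffix(fullVersion: String): Option[String] = fullVersion match {
    // Dotty 0.x.y(-RC1): the binary version was 0.x.
    case Version("0", minor, _, _) => Some(s"_0.$minor")
    // Milestones and RCs of an x.y.0 release use the full version number.
    case Version(_, _, "0", qualifier) if qualifier != null => Some(s"_$fullVersion")
    // Scala 2.x releases use major.minor.
    case Version("2", minor, _, _) => Some(s"_2.$minor")
    // Stable Scala 3 versions: undecided in this thread.
    case _ => None
  }
}
```

The point of the `Option[String]` return type is that each full version maps to at most one suffix, never to a set of acceptable ones.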

For all versions of Scala 3 prior to 3.T (where we finalize Tasty), the bincompat suffix should mirror the Scala 2 scheme of including major.minor. For example: cats-effect_3.0. Once Tasty is finalized for the entire 3.x future-lineage, the bincompat scheme should drop the minor component: cats-effect_3.

IIUC your comment, that would mean that Scala 3.1.0 would use binary suffix _3.1. That is not an option, because 3.1.0 would be backward binary compatible with 3.0.x, and therefore must have the same binary version as 3.0.0.

The caveat here is that build tools need to understand this cliff so as to properly warn users that 3 is not a bincompat version for scalaVersion = 3.0.0, despite the fact that it's a strict prefix. These cliffs sunset though as builds are upgraded.

There will be at least one cliff regardless of the version scheme we use. But I agree with you that we can teach build tools to handle the cliffs. That's not an issue we need to be concerned about.

  • Transitive diamond eviction is completely broken when you have dependencies which are cross-published to both. ScalaCheck is a major and current exemplar of this, since 1.15.1 is published for 3.0.0-M1 and 2.13, Discipline has done the same, but Specs2 is only published for 2.13 and thus depends on the 2.13 version of ScalaCheck. This in turn means that anything downstream of both Discipline and Specs2 is forced to exclude ScalaCheck 1.15.1 2.13. If you don't exclude, Sbt will produce an error in dependency resolution which is modestly semantic in nature, but it doesn't give you any help in resolving it.

That would be solved by using 2.13/3. That is actually the main point of using 2.13/3.

  • Any Scala 3 migration issues which would have been a roadblock when doing an actual build migration in a given library are forced onto all downstream users if that library chooses to continue publishing for 2.13 and forces their users to employ withDottyCompat. This is kind of analogous to the difference between static and dynamic typing. withDottyCompat is a bit like a runtime cast, wherein your users will discover issues in your API which you would have otherwise discovered for yourself. This is a lot more common than you would think, particularly in any type signatures which involve skolems, since NSC is immensely buggy in these areas while Dotty is much less so. Put bluntly: a not-insignificant percentage of published Scala libraries are unknowingly exploiting bugs in NSC. That exploitation needs to be either addressed by the library authors (by publishing for Scala 3), or by all of their downstream users (using withDottyCompat). In other words, withDottyCompat is a very dangerous thing for a library author to suggest unless they've actually completed the Scala 3 migration in library code, at which point they may as well publish the artifacts.

I don't think so. Only roadblocks that surface in the public/protected API of the library are forced on downstream users. But I would certainly hope that those are an overwhelming minority, compared to roadblocks that impact the implementation of the library (in particular inside methods). Exploiting bugs in nsc is often inside methods (pattern matching unsoundness for example), not so much in the APIs. And I may be wrong but I don't think it's even possible for a skolem to be part of a public API, is it?

You maintain more libraries with complicated stuff than I do, so maybe I'm wrong on all accounts, here.

  • The 2.13 unpickler has been found to have some issues (which I need to file… mea culpa on that one). Given the differences between Dotty and NSC, it wouldn't surprise me if there are more of these left undiscovered. At the very least, it's an unnecessary ecosystem risk.

We should definitely fix issues.


I may come off as defensive here. We've worked hard on designs and implementations to make this two-way interoperability between Scala 2.13 and 3. We should capitalize on it unless there is strong evidence that it's broken.

@mpilquist
Contributor

Unless I'm misunderstanding, the 2.13 & 3 compatibility does not support macros, which then forces many libraries to publish for both 2.13 and 3. Current list of macro libraries: https://scalacenter.github.io/scala-3-migration-guide/docs/macros/macro-libraries.html

@sjrd
Member Author

sjrd commented Nov 10, 2020

A library that has macros can put both the Scala 2 and Scala 3 macros in the same artifact if it is compiled by Scala 3.

@mpilquist
Contributor

@sjrd Good to hear. Is that supported by SBT yet?

@sjrd
Member Author

sjrd commented Nov 10, 2020

Yes, in the sense that sbt has nothing to do about it. It's supported by the compilers.

@mpilquist
Contributor

Understood, though how would an SBT project be set up to generate a Scala 3 artifact with both 2.13 and 3 macros included? As of now, the community is generally using crossScalaVersions with both 2.13 and 3, along with version specific source directories. I'm guessing we'd remove 2.13 from crossScalaVersions and somehow pass the 2.13 macro sources to the 3 compiler. How does that work though?

@mpilquist
Contributor

Never mind, found it in the "Mixing Macro Definitions" section of https://scalacenter.github.io/scala-3-migration-guide/docs/macros/migration-tutorial.html

@eed3si9n
Member

eed3si9n commented Nov 10, 2020

  • There exists a version 3.T.0 at which we will break and finalize TASTy, as well as change the stdlib to use Scala 3 features in appropriate places. That implies that it will also break binary compatibility.
  • For all versions >= 3.T.0, we will keep backward tasty compatibility, basically forever. We don't know exactly what promises will be in terms of binary compat

So we can think of different criteria for compatibility.

"Can we swap JARs?"-compatibility

If I published an sbt plugin using 1.0.0, the plugin should work today using sbt 1.4.2.
Similarly, if log_2.13:1.0.0 was compiled against cats_2.13:2.3.0-M2, can I JAR-swap it with cats_3.0:2.3.0-M2?


Tpol Chico @tpolecat https://twitter.com/tpolecat/status/1326041680131338240

Can we resolve the diamond simply by evicting lower version numbers from either side?

"Can we compile together?"-compatibility

For Scala 3.0.0, and Scala 3.1.0 even, it seems like we will be able to compile together with either cats_2.13:2.3.0-M2 or cats_3.0:2.3.0-M2?

The point where we can no longer compile against anything older is 3.T. It hasn't come up yet, but I would be curious how we define this situation for 3.(T + 1). Using the tooling today, the answer would be no, but if we assume future compiler technology, it's probably yes?

"Can we link together?"-compatibility

3.(T + 1) might have different bytecode encoding for default value for parameters for example, but we assume that we can download cats_3.T:2.3.0-M2 and link them by regenerating the bytecode. I'm not sure if this future technology exists today, but it sounds doable.

_tsy0.1_2.13

3 and 4: After all, there is no hard requirement for the binary version to be related to the general version.

If we can learn from recent Java, it's that we don't have to be too precious about the first segment of the version number. I don't think we should quite ape their breakneck pace, but calling 3.T Scala 4 and adopting Semantic Versioning may reduce the confusion in this area.

Today, the underscored suffix _2.13 or _sjs1_2.13 denotes "Can we swap JARs?"-compatibility, also known as binary compatibility. Following the Scala.js convention, it might be most tooling-friendly if we could preserve both the TASTy-level information and the standard library version: so _tsy0.1_2.13 for Scala 3.0.0.

When Scala 4.0.0 comes out, we can call it _tsy1.0_4, and Scala 4.1.0 that's TASTY-compat, could be _tsy1.1_4.

The benefit of this approach is that we don't have to handle cliffs. We create a convention instead.
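A small sketch of how such a suffix could be composed. `tastySuffix` and its parameters are hypothetical names; only the resulting strings come from the examples above.

```scala
object TastySuffix {
  /** Builds a "_tsy<major>.<minor>_<stdlib>" suffix from the TASTy version
    * and the standard library binary version.
    */
  def tastySuffix(tastyMajor: Int, tastyMinor: Int, stdlibBinaryVersion: String): String =
    s"_tsy$tastyMajor.${tastyMinor}_$stdlibBinaryVersion"
}
```

`tastySuffix(0, 1, "2.13")` gives `_tsy0.1_2.13` for Scala 3.0.0, and `tastySuffix(1, 0, "4")` gives `_tsy1.0_4` for the proposed Scala 4.0.0, so a tool can read both components without knowing about any cliff.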

@tpolecat

tpolecat commented Nov 10, 2020

Can we resolve the diamond simply by evicting lower version numbers from either side?

Only if they're otherwise identical, which I think will rarely be the case (see below).

"Can we compile together?"-compatibility

I'm not sure how this can work. For a project I'm working on I have the following situation.

| Feature | 2.13 | 3.0 |
|---|---|---|
| Arity abstraction | Shapeless HList | Tuple |
| Enumerated types | Enumeratum | enum |
| Whitebox string interpolator | Macros | Macros |

Plus distinct sets of implicit syntax to paper over the resulting API differences.

How would this work in a single artifact?

I want to emphasize that this is not a complex project. I have intentionally kept it very simple. I think pure center-lane Scala libraries are a bit of a myth and the common case is going to be more like mine than not.

@smarter
Member

smarter commented Nov 10, 2020

Here are some of the issues I can see with us using _2.13 as our binary version:

  • Library authors will still want to cross compile between 2.13 and 3.0 to make sure their code makes sense in both (e.g., you don't want your users to get exhaustiveness warnings when using your API without you knowing about it). The usual mechanism for doing this is crossScalaVersions, but that's not going to work properly when both compilers have the same binary version, because sbt is going to use the same output directory (target/scala-2.13/classes) for both versions. I just tried it with 2.13.3 and 2.13.2, and sbt didn't even complain when overwriting these files, so it's a silent failure. This would have to be fixed somehow.

  • It's hugely restricting for library authors: for example if their API somehow relies on Scala 2 only features (e.g., early initializers) and they don't want to break binary compatibility for their existing users, they're basically stuck. Even if they're ok with breaking binary compatibility, they're still forced to produce something that works with both compilers which might result in a subpar API for both of them.

  • When I fix bugs in the Scala 2 unpickler, I have no idea what I'm doing because there's no spec, for example: dotty-staging@c70f6e5. To get any confidence that what we're doing makes sense, we would need a lot more tests as well as a spec of the Scala 2 pickling format.

  • The Scala 2 Tasty Reader is extremely new and therefore also heavily under-tested. I predict that it's going to take a lot of time to work out the bugs in it and before library authors feel comfortable relying on it.

  • If we bump the Tasty minor version in Scala 3.1 for example, Scala 2.13 users need to wait for a new Scala 2.13 release (which could take several months) with an updated reader to be able to use any library compiled with Scala 3.1.

  • We have several known binary incompatibilities with Scala 2:

    1. Trait encoding (No static $init$ in pure trait extending impure trait #9970)
    2. Value class erasure (Bridge method clash with value class #8001)
    3. Intersection type erasure (Dotty does not erase intersection types like scalac #4619)

    There might be more that we don't know about yet. But of those three, the intersection thing is the worst one by far: it turns out that in Scala 2 the precise way an intersection type is erased depends on compiler implementation details (e.g., if I write A with B in two places, does the compiler represent them with the same type instance via hash-consing or not? This is important since it determines whether it generates one or two fake class symbols to represent them, and these class symbols are taken into account by isNonBottomSubClass). I have a wip branch where I implemented this including over a hundred testcases (https://github.com/smarter/dotty/blob/scala2-intersection-erasure/sbt-dotty/sbt-test/scala2-compat/types/scala2Lib/Api.scala), and I'm still not confident it's going to always match what Scala 2 does. And since it's such a weird scheme my plan was to only use it when erasing Scala 2 types (just like we special case erasure of Java types), so we could use something more reasonable in Scala 3 (the Tasty Reader would have to handle our scheme too), but if we want to guarantee binary compatibility with 2.13, we're stuck with trying to emulate its encoding as best as we can.

This is all just off the top of my head; at no point did we develop Dotty with this goal in mind, so there are likely to be many unknown unknowns. If we want to pursue this, I expect that it will require delaying the Scala 3.0 release by at least 6 months so we can get some confidence that this could work. But I really don't see the point of doing that since in the end this will likely just heavily discourage library authors from using Scala 3 at all in their projects: what's the point if you can't use Scala 3 features anyway but get new bugs?

@sjrd
Member Author

sjrd commented Nov 10, 2020

"Can we compile together?"-compatibility

I'm not sure how this can work. For a project I'm working on I have the following situation.
[...]
Plus distinct sets of implicit syntax to paper over the resulting API differences.

With all those differences in your API, can there even be user code that uses that library with the different Scala versions and the same code!? To me it seems like those differences are big enough that it is a different version of the library.

Library authors will still want to cross compile between 2.13 and 3.0 to make sure their code makes sense in both (e.g., you don't want your users to get exhaustiveness warnings when using your API without you knowing about it).

You'd cross-compile the tests.

It's hugely restricting for library authors: for example if their API somehow rely on Scala 2 only features (e.g., early initializers) and they don't want to break binary compatibility for their existing users, they're basically stuck.

Same as above: if you have a different API for Scala 3, it's a different version of the library to begin with, because you can't write user code that works with both.

Cross-compilation is a moot point if you have a different version of the library.

@tpolecat

With all those differences in your API, can there even be user code that uses that library with the different Scala versions and the same code!? To me it seems like those differences are big enough that it is a different version of the library.

I see this as exactly analogous to current 2.x version shuffles, where we do what we can to provide a source-compatible upgrade path, it's just more involved when moving to 3.x. This is what the community is doing now, and if you're advocating for forks in these cases then I think it moves very much in the opposite direction of your intent; the 2.x and 3.x worlds will be almost entirely disjoint and it puts a gigantic burden on library developers.

@dwijnand
Member

dwijnand commented Nov 10, 2020

I think it's more like there are libraries that will be consumable by both 2.13 and 3 ("that support both 2.13 and 3") and libraries that will drop 2.13 and use 3-only features. The problem is that you can't tell. But this problem doesn't come now, with the "let's reuse _2.13" idea; it's already present with the Tasty Reader supporting some unspecified subset of Scala 3 and consumable artefacts being indistinguishable from those that aren't consumable. And vice-versa Dotty can consume _2.13... except if it has those Scala 2 only features (but I believe these are small, fringe drops).

@smarter
Member

smarter commented Nov 10, 2020

The problem is that you can't tell.

If you have to explicitly put in your build.sbt .withDottyCompat (or something else: #9974), then you can in fact tell (though you don't know what transitive dependencies you're pulling, hence https://gist.github.com/djspiewak/b8f2b4547951442102488964bc351cf9).
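For readers who haven't used it, the explicit opt-in looks roughly like this in build.sbt. This is a sketch assuming the sbt-dotty plugin is enabled; `cats-core` at 2.3.0-M2 is used only as an illustrative dependency, echoing the cats version mentioned earlier in the thread.

```scala
// build.sbt (sketch; assumes the sbt-dotty plugin, which provides withDottyCompat)
scalaVersion := "3.0.0-M1"

// Without withDottyCompat, %% would look for a Scala 3 artifact;
// with it, the _2.13 artifact is resolved instead, and that choice is visible in the build.
libraryDependencies +=
  ("org.typelevel" %% "cats-core" % "2.3.0-M2").withDottyCompat(scalaVersion.value)
```

The marker only covers direct dependencies, which is why the transitive ones remain invisible.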

@djspiewak

@sjrd I don't think we're super far apart on what this should be, just a difference in optimism surrounding interface compatibility. We should try to have a sync-up on this topic.

I don't think so. Only roadblocks that surface in the public/protected API of the library are forced on downstream users. But I would certainly hope that those are an overwhelming minority, compared to roadblocks that impact the implementation of the library (in particular inside methods). Exploiting bugs in nsc is often inside methods (pattern matching unsoundness for example), not so much in the APIs. And I may be wrong but I don't think it's even possible for a skolem to be part of a public API, is it?

(emphasis mine)

This is exactly what I'm saying. More specifically, I'm saying that this is quite common.

The problems that I've seen tend to fall in the following areas:

  • Literally anything involving macros/meta-programming
  • APIs which expose GADTs (which in turn expose skolem unification questions to users by requiring pattern matching)
  • Implicit search space (subtly different on Dotty in ways that I'm modestly convinced are correct)
  • Call-site type inference semantics (NSC does a lot of witchcraft with Nothing which Dotty does more rigorously with hidden unification variables, and the semantics can differ at times)
  • Subtyping, particularly as it interacts with polymorphism

The first area is the kind of pain felt by scodec, circe-generic, skunk, etc. Hopefully it's obvious why this is an issue. I don't know that anyone has actually tested the "compile Scala 2 macro code on Scala 3" route, since we didn't really know that existed until just now, so I can't speak to its viability. Given the nature of some of this macro code though (e.g. Generic), I'm a little skeptical. Some macro code appears to be entirely unportable at present (e.g. log4s), and at the very least requires considerably different semantics to work with Scala 3. Libraries which fall into this bucket (e.g. scodec) are doing things like ripping out shapeless entirely and replacing it with bespoke layers on top of Dotty's built-in metaprogramming functionality. At the very least, this requires some very careful cross-publication to ensure downstream users get the right versions of things.

Regarding GADTs, Scala 2 is just very buggy in this area and I think we all understand that. It's much better than it was years and years ago, but still rough. It's very common to use various tricks to work around these kinds of problems, usually involving using implicits to guide type inference. These tricks change pretty dramatically in Scala 3; many are now unnecessary, while others were always unsound and will simply never work.

Also skipping ahead a bit, the interaction between GADTs and variance is a particularly troublesome point. And again, this is all in the public APIs; these aren't implementation details. Resource in Cats Effect is a decent example of this problem. If you stare at the encoding long enough, you realize that it's actually unsound. This is fixed in Cats Effect 3, but can't be done without breaking binary compatibility in Cats Effect 2. This in turn sets up a situation where Cats Effect 2's users will see some serious usability problems if we don't publish an independent Resource implementation on Scala 3. Also note that the Scala 3 implementations of such things are not always usable on Scala 2, often because of the same bugs that led to the original contortions. My favorite current example of this is the fact that CE2's Resource needs to use G[*] rather than G as a type instantiation in several places on Scala 2, because a structural type lambda is not quite the same as the original type due to dealiasing inconsistency. However, on Scala 3, G[*] is not a structural type lambda but actually a first class type lambda, and it is also not the same as the original type G, but in very different ways. This shows up in user code in the form of visible differences between instantiations of G = F[_] vs G = F[+_]

Looping back to implicit resolution… Scala 3's implicit search space is very slightly different than Scala 2's. In practice, this seems to most often be caused by situations where Scala 3 correctly identifies subtyping relations which Scala 2 simply ignored for unknown reasons. These additional relations open up the search space slightly further, since additional companions are in scope and such. This again is a thing which is very visible to users, since implicit search is performed in their code.

Finally, type inference is often similar, but not always, particularly when polymorphic variables are being instantiated via implicit search. There's a great example of this in Cats where a seemingly innocuous type signature simply does not work on Scala 3 because an implicit parameter is in the wrong order with respect to the others. This is something that Scala 2's magic retconning of Nothing (at times) allows us to get away with, but Scala 3 is much stricter (and much more sound), which means the only option is to push the implicit ahead of the concrete parameters. Scala 3 allows this, and this is exactly what Cats does, but obviously Scala 2 has no such notion. The result is split source code for this definition site. Again, user visible.
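
As a rough illustration of the parameter-ordering point, here is a generic Scala 3 sketch in which a `using` clause comes before the regular parameter list. It is not the actual Cats signature; `Show`, `intShow`, and `render` are made-up names used only for this example.

```scala
// Scala 3 only. Hypothetical type class, not taken from Cats.
trait Show[A] {
  def show(a: A): String
}

// A named alias given for Int.
given intShow: Show[Int] = new Show[Int] {
  def show(a: Int): String = s"Int($a)"
}

// The context parameter is declared ahead of the concrete parameter,
// so it is resolved before the argument `a` is type-checked.
def render[A](using s: Show[A])(a: A): String = s.show(a)

// The given instance is supplied by the compiler; only `a` is passed explicitly.
val rendered: String = render[Int](42)
```

Scala 2 requires the implicit parameter list to come last, so a definition shaped like `render` would need its own Scala 2 source file.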

In my experience, migration issues within method bodies are quite rare unless you're doing something patently weird (like forSome). Migration issues related to APIs are very common, and indeed far and away the norm. This is quite unfortunate because it means that efforts to publish on one version and consume on the other are unlikely to ever work reliably. This in turn means that the only option for most library authors will be to cross-publish to Scala 3 as a completely independent binary version, just the same as we do between 2.12 and 2.13.
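
For readers less familiar with what cross-publishing as an independent binary version involves, a minimal build.sbt sketch might look like the following. The version numbers and the `scala-2` / `scala-3` directory names are assumptions chosen for illustration, not a recommendation from this thread.

```scala
// build.sbt (sketch; version numbers are placeholders)
crossScalaVersions := Seq("2.12.12", "2.13.3", "3.0.0-M1")

// Pick a version-specific source directory so that definitions which
// cannot share a signature across Scala 2 and Scala 3 can be split.
Compile / unmanagedSourceDirectories += {
  val base = (Compile / sourceDirectory).value
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((3, _)) => base / "scala-3" // Scala 3 only sources
    case _            => base / "scala-2" // Scala 2.12 and 2.13 sources
  }
}
```

Running `sbt +publishLocal` would then build and publish one artifact per entry in `crossScalaVersions`.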

Taking this even further, the eviction issue that @tpolecat and @eed3si9n discussed above means that even one or two libraries being in this kind of situation forces all libraries to cross-publish, because you can't really mix the ecosystems without liberal use of exclude. This is something that sbt might be able to solve for us, which would allow us to be a bit more targeted with library migration and rely more heavily on withDottyCompat.

It doesn't really remove the danger though. A library author must be absolutely sure that they aren't hitting any of these Scala 2/3 incompatibility cases before they can recommend withDottyCompat to their users. The only way to be actually sure of this is to do the migration work and upgrade the library to Scala 3, allowing the compiler to catch all such issues. But… once you've done that… why wouldn't you just publish the results? Particularly when the norm (in my experience) is for there to be some subtle API-related problems?
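
For reference, the `withDottyCompat` and `exclude` mechanics mentioned above look roughly like this in a build.sbt. This is a sketch that assumes the sbt-dotty plugin (which provides `withDottyCompat`) is enabled; the `com.example` coordinates and versions are hypothetical.

```scala
// build.sbt (sketch; assumes the sbt-dotty plugin is on the build classpath)
scalaVersion := "3.0.0-M1"

libraryDependencies ++= Seq(
  // Hypothetical library that is cross-published for Scala 3.
  "com.example" %% "app-core" % "1.0.0",

  // Hypothetical library published only for 2.13: consume its _2.13
  // artifact from a Scala 3 build, and drop its transitive _2.13 copy of
  // cats-core so it does not sit next to the Scala 3 build of the same library.
  ("com.example" %% "log" % "1.0.0")
    .withDottyCompat(scalaVersion.value)
    .exclude("org.typelevel", "cats-core_2.13")
)
```

Every such `exclude` has to be maintained by hand, and it does nothing to guarantee that the 2.13 artifact is actually compatible with the Scala 3 build of the excluded library.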


Anyway, this all probably sounds more dire than it actually is. I want to clarify that I think that at worst the 2.13 -> 3.0 migration is about as hard as a typical 2.x -> 2.y migration. I just don't believe that it's any easier, and playing tricks with the bincompat suffixes to try to make it seem like it's easier is just going to tie library authors' hands. At any rate, let's have a discussion on all of this.

@eed3si9n
Member

"Can we compile together?"-compatibility

I'm not sure how this can work. For a project I'm working on I have the following situation. ...

I've been using the term Scala 2.13-3.x sandwich, and maybe we can collectively agree on what components this sandwich actually admits, so we are neither over-promising nor pessimistic.

[Image: tomato sandwiches]

If we accept that the bottom bread must be Scala 2.13, and must consist only of things that do not transitively depend on libraries cross-built for 3.x (like Cats), then we can say that app_3.0 depending on log_2.13 was the issue.

@sjrd
Member Author

sjrd commented Nov 12, 2020

I haven't really had time to process all the feedback yet. I will come back for that.

I'm replying now because it has been brought to my attention that some people felt this was a done deal and started panicking. It's not. The issue is an open question with several options, some pros and cons, and it is very much looking for feedback from anyone involved. There was no internal discussion prior to opening this issue, other than "We need to figure out what the binary versions will be after 3.0.0. Can someone open an issue to discuss it?" and I said I would open an issue.

I apologize if people took this the wrong way.

@sjrd
Member Author

sjrd commented Dec 16, 2020

Fast forward after the offline discussions we had:

  • The _2.13 thing is off.
  • _3a / _3b received some support, but I'm told it could bring back bad memories for scalaz 2.7.x users.
  • Given that we're actually hoping to name 3.T Scala 4, I suggest we take _3 now, and we decide what to do when 3.T comes, if it won't be 4.
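
To make the `_3` suggestion concrete, here is a small sketch of how an artifact suffix could be derived from a full Scala version under that scheme. `binarySuffix` is a hypothetical helper written for this illustration, not sbt's actual implementation, and it deliberately ignores milestone and RC versions, which have used the full version number as their binary version.

```scala
// Hypothetical helper: maps a final Scala release to its artifact suffix,
// assuming all 3.x releases share the single binary version "3".
def binarySuffix(scalaVersion: String): String =
  scalaVersion.split('.').toList match {
    case "3" :: _          => "_3"            // every 3.x.y shares one suffix
    case "2" :: minor :: _ => s"_2.$minor"    // 2.12.x -> _2.12, 2.13.x -> _2.13
    case _                 => s"_$scalaVersion" // anything else: full version
  }
```

Under this assumption a library published for 3.0.0 and one published for 3.1.2 would both carry the `_3` suffix, while 2.13.4 artifacts keep `_2.13`.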
