Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent implying Parallel breaks coherence #1645

Closed
Baccata opened this issue Feb 5, 2021 · 12 comments · Fixed by #1674
Closed

Concurrent implying Parallel breaks coherence #1645

Baccata opened this issue Feb 5, 2021 · 12 comments · Fixed by #1674
Milestone

Comments

@Baccata
Copy link
Contributor

Baccata commented Feb 5, 2021

Using CE 3.0.0-M5

See gitter discussion

In case of monad stacks, the potential parallel nature of each layer should be respected : par-traversing EitherT when the error type is semigroupal and the F layer succeeds but several Eithers fail ought to give an aggregate of errors.

This behaviour is encoded in cats here. However, importing cats.effect.implicits._ results in the Parallel instance coming from GenSpawn to take precedence. This instance is unaware of the potential layering of the effect, and therefore cannot respect it.

Here's an ammonite snippet showing the problem :

@ type Foo[A] = EitherT[IO, Int, A]  
defined type Foo

@ Concurrent[Foo]  
res5: GenConcurrent[EitherT[IO[A], Int, γ$1$], Throwable] = cats.effect.kernel.GenConcurrent$$anon$2@7de35070

@ Parallel[Foo]  
res6: Parallel[EitherT[IO, Int, A]]{type F[x] = cats.data.Nested[[x]cats.effect.kernel.Par.instance.T[cats.effect.IO,x],[β$19$]cats.data.Validated[Int,β$19$],x]} = cats.data.EitherTInstances$$anon$8@2b400bd0

@ import cats.effect.implicits._  
import cats.effect.implicits._ 

@ Parallel[Foo]  
res8: Parallel[EitherT[IO, Int, A]]{type F[x] = cats.effect.kernel.Par.instance.T[[A]cats.data.EitherT[cats.effect.IO,Int,A],x]} = cats.effect.kernel.instances.GenSpawnInstances$$anon$1@10e4ee33

res6 and res8 point to different instances

@Baccata Baccata changed the title [CE3] Concurrent implying Parallel breaks coherence [CE3] Concurrent implying Parallel breaks coherence Feb 5, 2021
@djspiewak
Copy link
Member

Does this actually result in different semantics though? They should be consistent even though they're different instances.

@djspiewak djspiewak added CE 3 and removed CE3 labels Feb 6, 2021
@Baccata
Copy link
Contributor Author

Baccata commented Feb 6, 2021

Does this actually result in different semantics though?

I'm afraid it does 😞 . Here's an ammonite script that shows the difference in behaviour

// scala 2.13.4

import $ivy.`org.typelevel::cats-effect:3.0.0-M5`

import cats._
import cats.effect._
import cats.syntax.all._
import cats.data.EitherT

import scala.concurrent.duration._

// Int being semigroupal and IO having a Parallel instance
// leads to EitherT[IO, Int, *] getting a parallel instance
// from cats-core that respects both layers
val foo: EitherT[IO, Int, Unit] =
  EitherT.liftF[IO, Int, Unit](IO.sleep(1.second)) *>
    EitherT.fromEither[IO](1.asLeft[Unit])


val withStandardInstance: IO[Unit] = {
  (foo, foo).parTupled.value.flatMap(IO.println)
}

val withCEInstance: IO[Unit] = {
  import cats.effect.implicits._
  (foo, foo).parTupled.value.flatMap(IO.println)
}

import cats.effect.unsafe.implicits.global

(withStandardInstance &> withCEInstance).unsafeRunTimed(1.5.second)

prints

Left(2)
Left(1)

@djspiewak
Copy link
Member

Why do you have to make me sad, Oli?

@djspiewak djspiewak added this to the 3.0.0-RC1 milestone Feb 7, 2021
@djspiewak djspiewak changed the title [CE3] Concurrent implying Parallel breaks coherence Concurrent implying Parallel breaks coherence Feb 7, 2021
@djspiewak
Copy link
Member

djspiewak commented Feb 7, 2021

So in a way, this is very similar to the #1576 issue: the effect stack has a desired side-channel semantic which is inconsistent with the semantics which simply emerge from the primitives as defined. However, it's also being done for exactly the same reason: CommutativeApplicative, which IMO is a compelling reason. It puts us into a bind though.

I think it is very, very intuitive that Concurrent implies Parallel. In fact, it's so intuitive that the fact that this isn't implied in CE2 has tripped me up on several occasions. Unfortunately, this is tricky by definition because Concurrent makes no particular promises about the nature of both, which is what a lot of this stuff comes down to. Nor, really, would it make sense for Concurrent to make any particular promises about the nature of both, because both itself is not a primitive operation!

In order to get commutativity here, we would need to get it, effectively, out of flatMap. The both itself boils down to gets on Deferreds, which themselves carry the results. Those results are wrapped, in this case, in Either, but flatMap doesn't understand this (obviously), and the results are disambiguated in a left-biased fashion. More specifically, this is the relevant line where information is lost:

EitherT(F.both(fa.value, fb.value).map(_.tupled))

tupled comes from Apply, which is provided unconditionally on any Either[E, *]. If this instance had instead been made conditional on Semigroup[E], then the ap implementation likely would have taken this information into account. However, that also would have made ap inconsistent with flatMap, which is obviously a no-go.

Which brings us all the way back to Parallel: why is it making these kinds of assumptions? As far as I can tell, this is only being done because it seems right for it to represent a CommutativeApplicative, and I can't really argue with that intuition, but when the rubber meets the road, that semantic is simply incompatible with other basic semantics. And not just semantics in Cats Effect, either: I'm talking about the monad laws here.

The incoherence is very bad and we can't have that. If we take it as a given that we need to resolve this, then there are really only two options:

  • Have a higher priority Concurrent[EitherT[F, E, *] instance, given Concurrent[F] and Semigroup[E], which explicitly sets about to mirror the semantics of the Parallel in Cats. We still need to keep the lower-priority unconstrained instance (with the same semantics we have today), because otherwise we lose things like Async[EitherT[F, which is kind of a big deal. This is really unlikely to resolve the incoherence in general though, because all it takes is one forgotten Semigroup[E] constraint and you have different semantics deep in polymorphic code, which is… bad
  • Rework Parallel in Cats

Hear me out. While it is intuitive that Parallel defines a CommutativeApplicative, it is not written in the stars, and as it turns out, a lot of the downstream implications of that constraint are unsatisfiable. Like, for example, the issue with tupled in this case. We could remove the existing Parallel instances for EitherT and IorT without breaking binary compatibility simply by removing the implicit modifier and marking them package-private, and then adding new implicit instances which are unconstrained.

Similarly, this would automatically provide a resolution to #1576 (which boils down to the exact same problem), since we would no longer need to care about the non-commutative nature of finalizer order.

Tldr, I think Parallel is over-constrained. Heads up @larsrh, @LukaJCB, @johnynek, @SystemFw

@Baccata
Copy link
Contributor Author

Baccata commented Feb 7, 2021

I'm failing to understand what commutativity has to do with the issue, and #1576 does not give enough context to understand.

While it is intuitive that Parallel defines a CommutativeApplicative

I'm not seeing that it does. Commutativity matters to parallel for a subset of its operations. Moreover, the Applicative type tied to Either in its Parallel is Validated, which only defines a CommutativeApplicative if E has itself a CommutativeSemigroup. So we can remove commutativity altogether from my example by changing Int to List[Int] and still witness the issue at hand :

val foo: EitherT[IO, List[Int], Unit] =
  EitherT.liftF[IO, List[Int], Unit](IO.sleep(1.second)) *>  EitherT.fromEither[IO](List(1).asLeft[Unit])

leads to the following being printed

Left(List(1))
Left(List(1, 1))

I think it is very, very intuitive that Concurrent implies Parallel

It is, but it is also very intuitive to me that EitherT[F, E, *] should aggregate errors on par operations, no matter the F, no matter the implicit imports.

Why do you have to make me sad, Oli?

Just trying to make your future-self less sad is all 😄

@Baccata
Copy link
Contributor Author

Baccata commented Feb 7, 2021

because all it takes is one forgotten Semigroup[E] constraint and you have different semantics deep in polymorphic code, which is… bad

Right, I'm starting to link the dots after looking at gitter. Still would like some more explanation around what commutativity has to do with this.

all it takes is one forgotten Semigroup[E] constraint and you have different semantics deep in polymorphic code, which is… bad

I think that now that cats-core typeclass instances for stdlib types are no-longer orphan, that particular issue would be greatly mitigated if GenSpawn[EitherT...] was dependant on whether E is Semigroupal or not. As a matter of fact, a similar risk of surprise arises from cats-core where the semantics differ for a polymorphic EitherT[F, E, A] depending on whether F is constrained by Parallel or not (ie you can Parallel for EitherT even if F doesn't have a Parallel).

So to answer your question to me, the precedent☝️ leads me to think that the presence of a Semigroup on E should be taken into account in GenSpawn instances for EitherT.

@Baccata Baccata closed this as completed Feb 7, 2021
@larsrh
Copy link
Contributor

larsrh commented Feb 7, 2021

Did you close this issue by accident?

@Baccata Baccata reopened this Feb 7, 2021
@Baccata
Copy link
Contributor Author

Baccata commented Feb 7, 2021

I did yeah, that'll teach me to watch github from my phone ...

@djspiewak
Copy link
Member

I'm not seeing that it does. Commutativity matters to parallel for a subset of its operations. Moreover, the Applicative type tied to Either in its Parallel is Validated, which only defines a CommutativeApplicative if E has itself a CommutativeSemigroup. So we can remove commutativity altogether from my example by changing Int to List[Int] and still witness the issue at hand :

You're correct. I went through and read it last night but didn't loop back here. This discovery honestly leads me to the following pair of conclusions:

  • Resource#both doesn't need to evaluate its finalizers in parallel
  • The existing Parallel[EitherT instances (given Semigroup[E]) are simply broken

It is, but it is also very intuitive to me that EitherT[F, E, *] should aggregate errors on par operations, no matter the F, no matter the implicit imports.

Why is that intuitive? It doesn't happen for IO, for example:

IO.raiseError(new Exception("1")) *> IO.raiseError(new Exception("2")) // => error: "1"

You're basically asking Either to behave like Validated, but only in this one specific case. That seems quite odd to me, and it's very inconsistent (as the issue here demonstrates).

So to answer your question to me, the precedent☝️ leads me to think that the presence of a Semigroup on E should be taken into account in GenSpawn instances for EitherT.

I'm not convinced, though. We could take it into account… in both specifically, but there's no way we could take it into account in general because a lot of this stuff just delegates down to the semantics of Apply. Overriding both – which is a combinator derived from combinators derived from several unrelated primitives – to have its own special semantic seems very much like an antipattern.

Honestly, it feels more like the Parallel instances in cats-core are, themselves, wrong and should be replaced with unconstrained ones. Also on that note, it would be super awesome if we could leverage one of @neko-kai's tricks to get rid of the orphans here in Cats Effect, but it's not entirely clear how we would do that without causing problems.

@larsrh
Copy link
Contributor

larsrh commented Feb 8, 2021

PR is at typelevel/cats#3777, please everyone take a look.

@Baccata
Copy link
Contributor Author

Baccata commented Feb 8, 2021

Honestly, it feels more like the Parallel instances in cats-core are, themselves, wrong

Replying in typelevel/cats#3776

@Baccata
Copy link
Contributor Author

Baccata commented Feb 8, 2021

Overriding both – which is a combinator derived from combinators derived from several unrelated primitives – to have its own special semantic seems very much like an antipattern.

This I agree with. I do not agree with your solution of changing the existing semantics of EitherT's parallel.

Why is that intuitive? It doesn't happen for IO, for example:

I did mention it was intuitive "on par" (parallel) operations. My bad for using the prefix as opposed to the whole thing. It's intuitive because it means the properties of the EitherT layer match the one of a mere Either.

The only reason for which I'd personally use EitherT[IO, E, ?] (as opposed to using IO's error channel in combination with some Throwable error) is actually because the ability of the Either layer to go to Validated and vice versa via Parallel when E is semigroupal. This lets me trust that parTraverse, parTuple, parSequence will result in the error aggregation, which is an extremely desirable feature, and fits my personal intuition.

In contrast, EitherT as a a mere side error channel to IO's inherent one, with no added benefits is not something I personally see as enticing.

@djspiewak djspiewak modified the milestones: 3.0.0-RC1, 3.0.0 Feb 9, 2021
@djspiewak djspiewak linked a pull request Feb 11, 2021 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants