-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove casts from Eval, fix stack overflow in Eval #3519
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@non since you wrote this code long ago, I'd appreciate your input.
core/src/main/scala/cats/Eval.scala
Outdated
} | ||
case xx => | ||
// note: xx could be a Defer so calling .value | ||
// could blow the stack? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks unsafe to me. I don't see why we don't need to check for Defer
here. I think this could be the cause of #2587
If this is safe, I think we need a comment as to why xx.value
is okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@non do you know why we wouldn't do loop(xx, Many(c.run, fs))
when we have a Defer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so defer(e).flatMap(f)
is safe because .flatMap
checks for Defer
and uses the thunk as start
, BUT defer(defer(e)).flatMap(f)
isn't safe because flatMap only unwraps one layer of Defer
and then we wind up calling .value
on a defer which blows the stack.
I reproed this failure mode in a test.
Should I combine with this code for the fix or make a follow up PR?
core/src/main/scala/cats/Eval.scala
Outdated
case f :: fs => loop(f(x.value), fs) | ||
case Nil => x.value | ||
case Many(f, fs) => loop(f(x.value), fs) | ||
case Ident(ev) => ev(x.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the calls to x.value
here are safe because Now
, Later
and Always
don't do any recursion on Eval.
} | ||
} | ||
|
||
loop(e.asInstanceOf[L], Nil).asInstanceOf[A] | ||
loop(e, Ident(implicitly[A <:< A])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we have 1 allocation more than we did before: Ident
where as before we did Nil
. But since this is one allocation per call to .value
, I think this is well worth the cost considering how many allocations we already make (1 per depth at least to build up the user level stack).
core/src/main/scala/cats/Eval.scala
Outdated
case mm @ Memoize(eval) => | ||
mm.result match { | ||
case Some(a) => | ||
loop(Now(a), c.run.asInstanceOf[C] :: fs) | ||
loop(nextFs.first(a), nextFs.rest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is an optimization: since we know we already have a Many
, we don't need to recurse.
Codecov Report
@@ Coverage Diff @@
## master #3519 +/- ##
==========================================
- Coverage 91.54% 91.49% -0.05%
==========================================
Files 386 386
Lines 8476 8915 +439
Branches 245 285 +40
==========================================
+ Hits 7759 8157 +398
- Misses 717 758 +41 |
Here are the benchmarks before and after:
|
I went ahead and added the fix for stack safety here. BTW: this, IMO, shows the win of removing the casts. Without the casts, it was clearer to me to follow the logic and it was easier to see what was going on. |
fixes #2587 |
core/src/main/scala/cats/Eval.scala
Outdated
@@ -263,26 +263,13 @@ object Eval extends EvalInstances { | |||
* of FlatMap nodes) or "value" (in the case of Now, Later, or | |||
* Always nodes). | |||
*/ | |||
@tailrec private def advance[A](fa: Eval[A]): Eval[A] = | |||
fa match { | |||
@tailrec private def advance[A](fa: Defer[A]): Eval[A] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a private method, and we only need to call it on Defer[A]
to return a non-Defer result. This simplifies the code since I don't see any reason we need to deal with FlatMap as we did.
new Eval.FlatMap[A] { | ||
type Start = compute.Start | ||
val start: () => Eval[Start] = () => compute.start() | ||
val run: Start => Eval[A] = s => advance1(compute.run(s)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@non can you see why we did this? Can you write a test that shows the need? Removing it is a bit more efficient since we don't add a level of stack to all FlatMaps we call, BUT note, we should only really need to call advance when we know it is a Defer (note I change the argument type).
case cc: FlatMap[c.Start] => | ||
val nextFs = Many(c.run, fs) | ||
loop(cc.start(), Many(cc.run, nextFs)) | ||
case call: Defer[c.Start] => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this branch needs to be added. Note I added a test to the PR the exercises this. Without this change we blow the stack.
@non can you remind us here why we have def defer[A](ea: => Eval[A]): Eval[A] =
Unit.flatMap(_ => ea) all the tests seem to pass and it would, I assume, improve performance of code using Defer since we have fewer branches to test. Also, if we did this, in We probably can't remove |
* so calling .value does not trigger | ||
* any flatMaps or defers | ||
*/ | ||
sealed abstract class Leaf[A] extends Eval[A] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with this, we can use x: Leaf[A]
below to make it clear calling .value
is safe. This helps the reader review the def evaluate
method and see that it is safe (we only call .value on Leafs).
it doesn't impact the performance (hopefully scala removes the type check when it can statically prove that it would always pass, and since this was in the last position of the match that is the case currently).
@travisbrown would you have time to look at this one? I know you have a lot on your plate. |
@johnynek Sure, I'll take a look today. |
c.start() match { | ||
case cc: FlatMap[_] => | ||
loop(cc.start().asInstanceOf[L], cc.run.asInstanceOf[C] :: c.run.asInstanceOf[C] :: fs) | ||
case cc: FlatMap[c.Start] => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure why we don't do:
case c: FlatMap[A1] =>
loop(c.start(), Many(c.run, fs))
and let the rest of the code run.
That would be correct, but I assume @non inlined this because for a FlatMap we know that we have at least one item on the stack and if we do my suggestion above, we allocate a Many
just to pattern match on it in many cases and discard it. So, by nesting the pattern match on the c.start()
we often can often sometimes avoid the allocation and pattern match.
If @non has time to chime in, it would be appreciated, I could update a comment to make it clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like the Leaf
thing. I'm not convinced that it's a good idea to add a type to the public API just to make the safety of some internal code a little clearer (especially since that code is still pretty complicated). It also means that some matches that used to have a reasonable catch-all case now don't (although I guess it's unlikely that someone could add something to the Eval
hierarchy without immediately running into MatchError
s if they mess this up).
I'm also not sure I see all that much value in replacing the old List[Any => Eval[Any]]
approach with a proper type-aligned list, but I guess that since it's turned up bugs / optimizations I'm probably wrong about that.
👍 from me on this. We should publish a 2.2.0-RC2 once it's merged.
If I were designing Eval today I would make all subclasses private. It's a bit strange that the comments say that the details of Eval are opaque yet all the subclasses are public. That said Leaf is somewhat useful to users: it is a case where calling .value is safe at any location. For instance if you were doing a loop where you will call .value at the end you could conceivably match on Leaf and call .value at the beginning, e.g. some map2Eval implementations could possibly make use of it. Note, they could have already matched on Now, Later or Always which are public but this unifies those three into a single subtype. For me, I saw that there were some internal .value calls but I wasn't sure why that was safe: now you can see very clearly. I think that's a win. I even thought of giving Leaf a safeValue method to make it more clear that we only call that inside the evaluate loop. But I ultimately didn't. Anyway, thanks for being flexible and accepting a design you didn't think was your first choice. |
similar to #3518 we use the 2.12 ability to do tailrec when the generic parameter changes to remove the casts.
Instead of using
List[Any => Eval[Any]]
we have to introduce a new type-aligned stack of functions, but I make this private.This simplification leads to one optimization: if we have a finished
Memoize
in FlatMap we don't need toloop(Now...
we can directly apply the flatMap function.Also, I noticed perhaps the source of stack overflows I've seen in the past: #2587