Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Task #894

Closed
wants to merge 7 commits into from
Closed

add Task #894

wants to merge 7 commits into from

Conversation

stew
Copy link
Contributor

@stew stew commented Feb 23, 2016

So I think this contains a working Task implementation. There are some tests, they pass on the JVM, but we are using java.util.concurrent, which won't work on JS, so althogh validateJVM passes, validateJS will not pass.

So clearly this isn't ready for merge.

I don't think we currently have a way to have JVM only stuff currently, right?

@stew
Copy link
Contributor Author

stew commented Feb 23, 2016

aha, ok, I hadn't seen the jvm sub-module. I'm moving Task there now...

@stew
Copy link
Contributor Author

stew commented Feb 23, 2016

still left to do:

  • eq instance (which we will need for ...:)
  • test monad laws
  • semigroup instance
  • have a debate about whether or not it is a comonad (I vote no)

/**
* Run the computation to produce an A
*/
final def unsafePerformIO(): A = eval.value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work in Javascript, because Task is inherently asynchronous and you can't get an A from it without potentially blocking threads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a note, this also goes for some of the Cats type classes, like Foldable, which extends by inheritance to Traversable.

@adelbertc
Copy link
Contributor

@alexandru these are in the JVM specific subproject

override def eval: Eval[A] = {
val cdl = new CountDownLatch(1)
var result: Option[A] = None
complete((a: A) => {result = Some(a); cdl.countDown})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CountDownLatch won't work in Javascript and besides, it's usage is a red flag on the JVM as well. There's absolutely no reason to block a thread in order to get an asynchronous value out of a Task.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you suggest we wait for the value to arrive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just so we are on the same page, I'm going to want to be able to write something like:

def FutureToTask[A](fa: Future[A]): Task[Throwable Xor A] = 
   Task.async { cb =>
        fa.onSuccess { 
           case Success(a) => cb(Xor.right(a))
           case Failure(t) => cb(Xor.left(t))
    }

@codecov-io
Copy link

Current coverage is 88.90%

Merging #894 into master will decrease coverage by -0.15% as of 955e46a

@@            master   #894   diff @@
=====================================
  Files          174    176     +2
  Stmts         2384   2415    +31
  Branches        76     76       
  Methods          0      0       
=====================================
+ Hit           2123   2147    +24
  Partial          0      0       
- Missed         261    268     +7

Review entire Coverage Diff as of 955e46a

Powered by Codecov. Updated on successful CI builds.

stew added 4 commits February 22, 2016 23:35
- add methods to catch exceptions to Eval
- bedazzle Eval on JVM to allow forking an eval to Executor
- create a way on the JVM to run async computations in an Executor
- create a way on the JVM to run an eval asynchronously in an eval
val cdl = new CountDownLatch(1)
var result: Option[A] = None
cb((a: A) => {result = Some(a); cdl.countDown})
cdl.await
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not asynchronous and it's not equivalent to a Task, because this is cheating :-)

I think Cats should strive to expose the same API for both the JVM and Scala.js. For one because it would allow people to write portable code without worrying about which types and functions are available where. But also because it keeps you honest.

When you do a countDownLatch.await, in essence blocking a thread for a result, the JVM pretends that the processing you're doing is synchronous, with a result available immediately. But it's a trick that can backfire. Ever heard of EJB (Enterprise Java Beans)? Ever heard of Corba? THE reason for why these sucked so badly was because their original design was a game of pretend. As in, lets pretend that where the processing happens for these synchronous methods don't matter. But of course, making local and remote method calls look the same ended up being a problem, because in the real world you can have significant latency, bandwidth limitations, security issues, dropped packages, timing issues and all sorts of other networking problems.

But back to the point, blocking threads is dangerous and in order to block safely, you have to know the configuration of the underlying thread-pool. Lets say that our thread-pool is limited to a single thread. Lets say that our countDownLatch.await will block that single thread. At that point there will be no threads left to do the actual processing. A one-thread pool is simplifying the problem of course, but this can happen with any fixed sized thread-pool.

This is why we have a BlockContext in Scala, being able to tell the thread-pool that this and that operations are potentially blocking so you might want to add more threads or whatever. Such a strategy is acceptable for blocking I/O. But this is very inefficient, as it defeats the purpose of using a limited thread-pool, so when doing a lot of synchronous I/O (like for example JDBC stuff), we might as well use a CachedThreadPool with unlimited threads.

Which leads us to the reason for why we want to use limited thread pools (e.g. with a number of threads directly proportional to the number of CPU cores). It's because preemptive multi-threading of more threads than CPU-cores is done by a technique called time slicing and when a CPU core switches between threads/processes it has to do a context switch and context switches are freaking expensive. This is why it's considered a best practice to do async I/O and to do CPU-bound processing on a limited thread-pool and if you have to do synchronous I/O, then people instantiate a second unlimited thread-pool meant only for synchronous I/O.

In other words, on top of the JVM, we have 3 options:

  1. thread-block a limited thread-pool and suffer from non-deterministic deadlocks
  2. thread-block an unlimited thread-pool and suffer the performance penalties
  3. do not block any threads, ever

So you see, from my point of view, this isn't even an issue related to Javascript :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR - sorry for the long message - what I'm saying is that if the design involves any form of blocking threads (like usage of a CountDownLatch), for a foundational library like Cats I think the design is wrong.

@travisbrown
Copy link
Contributor

👍 to @alexandru's critique (although personally I don't really care about Scala.js for its own sake). More generally I'm still not at all convinced that any version of a Task-like type should be in cats-core, though (or possibly in cats at all).

I want to be able to write code that's generic up to something like MonadError and be confident that it will work for whatever abstraction I choose to use to model asynchronous computation. To take just one example, the core module of iteratee.io will never refer to any specific implementation of Future, Task, or whatever.

@travisbrown
Copy link
Contributor

What I'd really like to see is a set of requirements for a Task type that everyone more or less agrees on, and then some experimentation with implementations until we have a better sense of which requirements (if any) need to be relaxed.

@alexandru's suggestion that we should aim for parity for Scala.js (largely for non-Scala.js-related reasons) seems like a good starting point. I'd also like to see better thread affinity than the standard library's futures, no implicit scheduling stuff floating around, close attention to allocations, opt-in-only exception catching, cancellation (maybe), etc.

I've got at least three of my own implementations of something like this—one a much cleaned-up port of Scalaz's Task and a couple of attempts at a little free-like thing that wraps CompletableFuture and a bunch of NIO.2 stuff. None of these are as mature or well-thought-through as @alexandru's Task, but they make different trade-offs that reflect my personal preferences.

Ideally at some point in the near-ish future we'd round up all these proposals and implementations and put together some kind of matrix of the features we care about (and some benchmarks, etc.). Something built on Eval might belong on that list, but I don't think it (or any of the other attempts) should be considered candidates for inclusion in cats for a long, long time, if ever.

@adelbertc
Copy link
Contributor

@travisbrown I think somewhere we need at least an IO - one thing I really like to use in Scalaz is SafeApp or TaskApp

abstract class SafeApp {
  def run(args: Vector[String]): IO[Unit]

  final def main(args: Array[String]): Unit =
    run(args.toVector).unsafePerformIO()
}

If we agree that IO would be useful, the question becomes do we want to just have a plain IO or do we want Task? If the former, would a Task implementation interpret into IO ? Why not just have Task be our IO ? etc. etc.

@mpilquist
Copy link
Member

I've mentioned it before in chat, though I suppose I should record it here. I agree with @travisbrown's position.

It isn't clear to me that cats needs IO/Task when there are implementations available that are easy to integrate. I use FS2's Task quite happily and it only requires a single instance to bridge to cats.

@julien-truffaut
Copy link
Contributor

I would also prefer to see IO or Task defined as a separate project.

@stew stew closed this Feb 27, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants