Use standard algebra types #523
I like the idea of extending where we can, but this naturally raises the question of what further steps we are planning. Should we just add some implicit bijections to map between the two?
I suspect the practical impact of changing |
I'm almost thinking we should drop I think algebird should focus on data-structures as algebraic objects, mostly for big-data. |
Re: removing I'm also a little concerned about dropping too much of the base framework from algebird and taking such a strong dependency on a personal repo (no offense to Erik). I guess that's also an argument for using converters everywhere rather than inheritance.
@jnievelt here is a branch that removes field. There are only 3 fields defined: Float, Double, Boolean. I'm fairly sure no one uses Boolean fields with division (you can just divide things by true to get the input again), and Float and Double are not even lawful. In scalding there are (I think) two places we use field, in the (almost unused?) matrix API, to do division (which, again, can be fixed by requiring Float or Double there).
About algebra: that is a project @avibryant @non @tixxit @larsrh and myself started some time ago. It will soon be moved under the typelevel org, and the work there became cats-kernel: https://github.com/typelevel/cats/tree/master/kernel/src/main/scala/cats/kernel In fact, as I look at this now, I realize that we can just depend on cats-kernel. But the goal is that cats, spire and algebird can all share core types so you don't have all these walls between projects.
As for field, it just does not come up very much. I would be very surprised if there are more than 5 uses at Twitter (and I couldn't guess how many semigroups are there).
Since the name on Group inverse in cats-kernel (nee algebra) collides with what we were using in Field, to get simple subclass interop with cats-kernel (and hence spire, algebra and cats) we should either:
I have pressed for, and received assurances, that cats-kernel (and algebra) will be very conservative on these core interfaces, and they do not export any instances by default (the instances are opt-in). I think the benefit of not siloing all these efforts is worth making some small changes to the project, so I strongly support us moving in this direction.
@@ -52,11 +52,9 @@ object UtilAlgebras {
  implicit def futureMonoid[T: Monoid]: Monoid[Future[T]] = new ApplicativeMonoid[T, Future]
  implicit def futureGroup[T: Group]: Group[Future[T]] = new ApplicativeGroup[T, Future]
  implicit def futureRing[T: Ring]: Ring[Future[T]] = new ApplicativeRing[T, Future]
In fact, Future is not even a group, because Throw/Failure has no inverse; the same goes for Try.
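A minimal sketch of the point above, using `scala.util.Try` over `Int`. The `plus`, `negate`, and `zero` definitions here are hypothetical stand-ins for a lifted group, not Algebird's `ApplicativeGroup`; they only illustrate why a failed value cannot be cancelled back to zero.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical lifted "group" on Try[Int]; illustration only, not Algebird code.
object TryNotAGroup {
  val zero: Try[Int] = Success(0)

  // Add the successes; if either side failed, the result is a Failure.
  def plus(l: Try[Int], r: Try[Int]): Try[Int] =
    for { a <- l; b <- r } yield a + b

  // The natural candidate for an inverse: negate a success, leave a failure as-is.
  def negate(t: Try[Int]): Try[Int] = t.map(x => -x)

  // Group law to check: plus(t, negate(t)) == zero
  def inverseLawHolds(t: Try[Int]): Boolean = plus(t, negate(t)) == zero
}
```

The law holds for every `Success`, but no choice of `negate` can turn a `Failure` back into `Success(0)`, since `plus` propagates the failure.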
Right. I assume that these are here for backwards compatibility?
(We do have algebra.ring.Semiring if you wanted to be more precise, although you might still not want to define exception * 0 as 0.)
Well, we should remove them. Added #548
Force-pushed from 910393f to 053fd81.
   */
  override def additive: AMonoid[T] = this
  override def empty: T = zero
  override def combineAll(t: TraversableOnce[T]): T = sum(t)
Where is zero defined now? AdditiveMonoid?
Yes. Actually AdditiveMonoid is basically identical to our Monoid.
What's the difference between non-additive Monoid and additive Monoid?
Multiplication as "plus" and 1 as "zero" is a valid Monoid. Algebird has just settled this by saying we always prefer additive so that Group can be a subclass of Monoid.
Not everyone agrees with that design. So in algebra, there is a Monoid that has a combine method, and in the ring package, there are Additive and Multiplicative monoids. The difference is only used in spire where the syntax extensions are enabled: + needs an additive Semigroup, * needs a multiplicative Semigroup.
To keep compatibility with our source code, and interop with monoids/semigroups from other packages, I think we need to do what I did (extend Semigroup and AdditiveSemigroup).
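A rough sketch of the "extend both" encoding described above. The traits `GenericSemigroup` and `AdditiveSemigroupLike` are hypothetical, simplified stand-ins for algebra's generic and additive semigroups (the real traits have more members), and `BirdSemigroup` stands in for Algebird's own `Semigroup`.

```scala
// Hypothetical stand-in for algebra's generic semigroup (operation named `combine`).
trait GenericSemigroup[T] {
  def combine(l: T, r: T): T
}

// Hypothetical stand-in for algebra's additive semigroup (operation named `plus`).
trait AdditiveSemigroupLike[T] {
  def plus(l: T, r: T): T
  def additive: GenericSemigroup[T]
}

// Algebird-style semigroup: a single instance satisfies both interfaces.
trait BirdSemigroup[T] extends GenericSemigroup[T] with AdditiveSemigroupLike[T] {
  // The generic operation is defined in terms of the existing `plus`.
  override def combine(l: T, r: T): T = plus(l, r)
  // The additive view of this instance is the instance itself.
  override def additive: GenericSemigroup[T] = this
}

object IntPlus extends BirdSemigroup[Int] {
  def plus(l: Int, r: Int): Int = l + r
}

object InteropSketch {
  // Code written against the generic interface.
  def viaGeneric[T](a: T, b: T)(sg: GenericSemigroup[T]): T = sg.combine(a, b)
  // Code written against the additive interface.
  def viaAdditive[T](a: T, b: T)(sg: AdditiveSemigroupLike[T]): T = sg.plus(a, b)
}
```

Existing code that calls `plus` keeps working, while libraries that ask for either the generic or the additive interface can accept the same instance.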
sorry, still don't understand. Is this just about naming?
I thought a monoid was:
- A set of elements, E
- A (closed) binary operator over the elements in E
- An identity element present in E
You can call the operator "plus" and the identity "zero" or you can call it "times" and "one" but that doesn't really mean anything in terms of how it works right? Or am I missing something and there's a functional difference between these two?
You aren't missing anything. It is totally a naming thing.
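To make the "naming thing" concrete, here is a small sketch with a neutrally named monoid. `SimpleMonoid` is a hypothetical type for illustration, not part of Algebird or algebra; it shows that integer addition and integer multiplication both fit the same three-part definition.

```scala
// Hypothetical minimal monoid: an identity element and a closed binary operator.
final case class SimpleMonoid[T](identity: T, op: (T, T) => T) {
  // Fold a collection using the operator, starting from the identity.
  def combineAll(ts: Iterable[T]): T = ts.foldLeft(identity)(op)

  // Spot-check associativity and the identity laws on sample values.
  def lawsHold(a: T, b: T, c: T): Boolean =
    op(op(a, b), c) == op(a, op(b, c)) &&
      op(identity, a) == a &&
      op(a, identity) == a
}

object MonoidNaming {
  // "plus" with "zero"
  val intAddition: SimpleMonoid[Int] = SimpleMonoid[Int](0, _ + _)
  // "times" with "one"
  val intMultiplication: SimpleMonoid[Int] = SimpleMonoid[Int](1, _ * _)
}
```

Both values have the same type and obey the same laws; only the names attached to the operator and identity differ, which is what the additive and multiplicative traits record.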
Right, this is a distinction that was important to us over in Spire-land, and which @johnynek agreed wouldn't be too difficult to encode here. I think the strategy of having Algebird's instances extend both the generic and additive traits is an elegant solution that preserves the previous semantics while aiding interop.
Super bullish on this, small dependency. Looks like a small code change too and should be able to easily consume downstream. |
@@ -82,7 +87,16 @@ class ArrayGroup[T: ClassTag](implicit grp: Group[T])
    }.toArray
}

object Group extends GeneratedGroupImplicits with ProductGroups {
  class FromAlgebraGroup[T](m: AGroup[T]) extends FromAlgebraMonoid(m) with Group[T] {
why not just make this an implicit class?
It would have to be an inner class of a trait to get the priority right. Personally, I think it is clearer to handle it manually.
Right. Also, I think having one of the implicit classes extend the other would get ugly when both were defined in traits. This encoding seems pretty straightforward to me.
This looks good to me. 👍 I feel bad you had to remove |
class FromAlgebraSemigroup[T](sg: ASemigroup[T]) extends Semigroup[T] {
  override def plus(l: T, r: T): T = sg.combine(l, r)
  override def sumOption(ts: TraversableOnce[T]): Option[T] = sg.combineAllOption(ts)
}
It's slightly annoying to have to do this, but if you could provide the same thing from AdditiveSemigroup as well it would be really helpful.
yeah, I can do that. I can just call the implicit on .additive, right?
Yeah, I think that should do it.
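The idea of routing the additive conversion through `.additive` can be sketched as follows. Everything here is a hypothetical stand-in: `ASemigroupLike` and `AdditiveLike` approximate algebra's traits, `BirdSemigroup` approximates Algebird's `Semigroup`, and the converters are plain methods rather than the implicits used in the PR.

```scala
// Hypothetical stand-in for algebra's generic Semigroup.
trait ASemigroupLike[T] {
  def combine(l: T, r: T): T
}

// Hypothetical stand-in for algebra's AdditiveSemigroup, exposing a generic view.
trait AdditiveLike[T] { self =>
  def plus(l: T, r: T): T
  def additive: ASemigroupLike[T] = new ASemigroupLike[T] {
    def combine(l: T, r: T): T = self.plus(l, r)
  }
}

// Hypothetical stand-in for Algebird's Semigroup.
trait BirdSemigroup[T] {
  def plus(l: T, r: T): T
}

// Wrapper in the spirit of FromAlgebraSemigroup from the diff above.
class FromGeneric[T](sg: ASemigroupLike[T]) extends BirdSemigroup[T] {
  override def plus(l: T, r: T): T = sg.combine(l, r)
}

object SemigroupConverters {
  def fromGeneric[T](sg: ASemigroupLike[T]): BirdSemigroup[T] = new FromGeneric(sg)
  // The additive converter reuses the generic one by going through `.additive`.
  def fromAdditive[T](sg: AdditiveLike[T]): BirdSemigroup[T] = fromGeneric(sg.additive)
}

object StringConcat extends AdditiveLike[String] {
  def plus(l: String, r: String): String = l + r
}
```

With this shape only one wrapper class is needed; the additive case is a one-line delegation.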
I compiled scalding against this with the diff below. This looks really minimal to me, and I think is well worth the ability for spire, cats, and algebird to all interoperate. What do you think @jnievelt of the solution I have for field? I used a normal object, not a package object, because scala recommends not nesting classes in package objects.
diff --git a/build.sbt b/build.sbt
index 846af97..f573fb0 100644
--- a/build.sbt
+++ b/build.sbt
@@ -19,7 +19,7 @@ def scalaBinaryVersion(scalaVersion: String) = scalaVersion match {
}
def isScala210x(scalaVersion: String) = scalaBinaryVersion(scalaVersion) == "2.10"
-val algebirdVersion = "0.12.1"
+val algebirdVersion = "0.12.2-SNAPSHOT"
val apacheCommonsVersion = "2.2"
val avroVersion = "1.7.4"
val bijectionVersion = "0.9.1"
diff --git a/scalding-core/src/main/scala/com/twitter/scalding/mathematics/Matrix.scala b/scalding-core/src/main/scala/com/twitter/scalding/mathematics/Matrix.scala
index a053efd..f94a886 100644
--- a/scalding-core/src/main/scala/com/twitter/scalding/mathematics/Matrix.scala
+++ b/scalding-core/src/main/scala/com/twitter/scalding/mathematics/Matrix.scala
@@ -16,6 +16,7 @@ limitations under the License.
package com.twitter.scalding.mathematics
import com.twitter.algebird.{ Monoid, Group, Ring, Field }
+import com.twitter.algebird.field._ // backwards compatibility support
import com.twitter.scalding._
import cascading.pipe.assembly._
@@ -461,7 +462,7 @@ class Matrix[RowT, ColT, ValT](val rowSym: Symbol, val colSym: Symbol, val valSy
def /(that: LiteralScalar[ValT])(implicit field: Field[ValT]) = {
field.assertNotZero(that.value)
- mapValues(elem => field.div(elem, that.value))(field)
+ mapValues(elem => field.div(elem, that.value))
}
def /(that: Scalar[ValT])(implicit field: Field[ValT]) = {
@@ -469,7 +470,7 @@ class Matrix[RowT, ColT, ValT](val rowSym: Symbol, val colSym: Symbol, val valSy
.mapValues({ leftRight: (ValT, ValT) =>
val (left, right) = leftRight
field.div(left, right)
- })(field)
+ })
}
// Between Matrix value reduction - Generalizes matrix addition with an arbitrary value aggregation function
diff --git a/scalding-core/src/test/scala/com/twitter/scalding/mathematics/Matrix2Test.scala b/scalding-core/src/test/scala/com/twitter/scalding/mathematics/Matrix2Test.scala
index bd66a47..f4c19c4 100644
--- a/scalding-core/src/test/scala/com/twitter/scalding/mathematics/Matrix2Test.scala
+++ b/scalding-core/src/test/scala/com/twitter/scalding/mathematics/Matrix2Test.scala
@@ -15,6 +15,7 @@ limitations under the License.
*/
package com.twitter.scalding.mathematics
+import com.twitter.algebird.field._
import com.twitter.scalding._
import com.twitter.scalding.serialization._
import com.twitter.scalding.source.TypedText
diff --git a/scalding-core/src/test/scala/com/twitter/scalding/mathematics/MatrixTest.scala b/scalding-core/src/test/scala/com/twitter/scalding/mathematics/MatrixTest.scala
index 8dfc963..33a2621 100644
--- a/scalding-core/src/test/scala/com/twitter/scalding/mathematics/MatrixTest.scala
+++ b/scalding-core/src/test/scala/com/twitter/scalding/mathematics/MatrixTest.scala
@@ -15,6 +15,7 @@ limitations under the License.
*/
package com.twitter.scalding.mathematics
+import com.twitter.algebird.field._
import com.twitter.scalding._
import cascading.pipe.joiner._
import org.scalatest.{ Matchers, WordSpec }
added some notes.
@@ -160,13 +163,14 @@ def module(name: String) = {
}

lazy val algebirdCore = module("core").settings(
  test := { }, // All tests reside in algebirdTest
I put some basic tests of the algebra interop here (they don't use any of the test properties we export, so I think it is the right place).
Current coverage is 62.56% (diff: 34.78%)
@@             develop   #523   diff @@
==========================================
  Files          110    111     +1
  Lines         4435   4493    +58
  Methods       4041   4079    +38
  Messages         0      0
  Branches       355    375    +20
==========================================
- Hits          2819   2811     -8
- Misses        1616   1682    +66
  Partials         0      0
👍 not super familiar with the code so I'd recommend waiting for @isnotinvain / @jnievelt to take a look before merging. |
Looks good. Is the plan to eventually deprecate the algebird types?
  def product(iter: TraversableOnce[T]): T = Ring.product(iter)(this)
trait Ring[@specialized(Int, Long, Float, Double) T] extends Group[T] with ARing[T] {
  def one: T
  def times(a: T, b: T): T
nit: do we already use a, b somewhere? I do like l, r a little better, to emphasize that these are not commutative.
@@ -112,38 +114,84 @@ object LongRing extends Ring[Long] {
    else Some(sum(t))
}

object FloatRing extends Ring[Float] {
Maybe def product = iter.product ?
That would be no faster than the default product. That said, adding a custom sumOption and product there is a chance for improvement because we can remove half of the boxing that has to go on with TraversableOnce.
@jnievelt I don't think we will deprecate algebird types in the short term, only because the resolution for built-in types is done without imports in algebird. Twitter (and others) would need to add an algebra import to get the old behavior. Also, some behavior would be different (for instance the very questionable |
Makes sense 👍 |
Proper 👍
👍 for Sam's changes. |
👍 thanks oscar! |
since scalding should probably update its algebird dependency now.
@koertkuipers yeah. https://github.com/twitter/algebird/releases/tag/0.12.4 should not have been published as 0.12.4, since it is not backwards compatible. That was a goof-up, but the horse is out of the barn. 0.13.0 will be released soon, and then we will release a scalding version shortly thereafter.
great thanks |
@koertkuipers yeah sorry about the 0.12.4 push. We'll have a 0.13.0 version out soon and I'll be updating Scalding to pick it up as well. |
This is a first step to using standard scala algebra classes.
This depends on non/algebra (the base types Semigroup, Monoid, etc. are now in typelevel/cats). There may be some changes, but the goal is to get a shared common set of basic mathematical typeclasses, and algebird will then migrate to just being about the implementations we have here (specifically the big-data/sketch algorithm examples).
Since Field is so seldom used, I propose we remove it here to smooth interop, and if someone needs it direct them to non/algebra (which will soon be typelevel/algebra). The problem is that algebra's Group.inverse is what we called negate, so Field.inverse cannot be 1/x. The type signature is the same, so it is dangerous to leave it around.
/cc @non @avibryant
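A small sketch of why the shared signature is dangerous. `GroupLike` is a hypothetical stand-in for a group interface whose `inverse` means the additive inverse; the two instances below are illustrative only and are not Algebird's or algebra's definitions.

```scala
// Hypothetical stand-in: `inverse` is expected to be the additive inverse (negate).
trait GroupLike[T] {
  def inverse(a: T): T
}

// Instance that follows the group meaning: inverse(a) = -a.
object DoubleNegate extends GroupLike[Double] {
  def inverse(a: Double): Double = -a
}

// Instance that keeps the old Field meaning: inverse(a) = 1/a.
// It compiles, because the type signature is identical.
object DoubleReciprocal extends GroupLike[Double] {
  def inverse(a: Double): Double = 1.0 / a
}

object InverseCollision {
  // Generic code relying on the group law a + inverse(a) == 0.
  def plusInverse(a: Double)(g: GroupLike[Double]): Double = a + g.inverse(a)
}
```

The compiler accepts both instances, so the mismatch only shows up as wrong results at runtime: `plusInverse(4.0)` gives zero with the first instance and a non-zero value with the second.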