Optimise boilerplate generators, use instance constructors #3871

joroKr21 · 2021-04-26T08:51:56Z

Make it a bit more readable by formatting constraints on a new line.

This change results in around 40% jar size reduction on average of cats-kernel 😲

| Artifact | This PR (bytes) | main (bytes) | Reduction |
| --- | ---: | ---: | ---: |
| cats-kernel_2.12.jar | 3068152 | 5034406 | 39.0% |
| cats-kernel_2.13.jar | 3299524 | 5262009 | 37.3% |
| cats-kernel_3.0.0-RC3.jar | 1418767 | 2491382 | 43.0% |
| cats-kernel_sjs1_2.12.jar | 6336434 | 10614940 | 40.3% |
| cats-kernel_sjs1_2.13.jar | 6883734 | 10614940 | 35.1% |
| cats-kernel_sjs1_3.0.0-RC3.jar | 2197991 | 4255929 | 48.3% |
| cats-kernel_native0.4_2.12.jar | 6330092 | 10796161 | 41.4% |
| cats-kernel_native0.4_2.13.jar | 6862749 | 11322497 | 39.4% |
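
For readers who want to relate the relative and absolute numbers, here is a minimal sketch of the arithmetic, assuming the reduction is computed as `1 - pr / main`. The object and method names are made up for illustration; the byte counts are copied from the cats-kernel_2.12.jar row. The table's percentages agree with this formula to within roughly a tenth of a percentage point.

```scala
object JarSizeReduction {
  // Relative reduction in percent: how much smaller the PR artifact is than the one on main.
  def reductionPercent(prBytes: Long, mainBytes: Long): Double =
    (1.0 - prBytes.toDouble / mainBytes.toDouble) * 100.0

  // Absolute saving in bytes for the same pair of artifacts.
  def savedBytes(prBytes: Long, mainBytes: Long): Long =
    mainBytes - prBytes

  // Sizes in bytes, copied from the cats-kernel_2.12.jar row.
  val kernel212Percent: Double = reductionPercent(3068152L, 5034406L)
  val kernel212Saved: Long = savedBytes(3068152L, 5034406L)

  def main(args: Array[String]): Unit =
    println(f"cats-kernel_2.12.jar: $kernel212Percent%.1f%% smaller, $kernel212Saved bytes saved")
}
```

In absolute terms that row corresponds to a little under 2 MB per jar.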

I've also given cats-core a similar treatment, but the savings will be much smaller there.

joroKr21 · 2021-04-27T06:53:07Z

Hmm I still see classes generated - is that expected?

joroKr21 · 2021-04-27T08:21:28Z

TLDR - it's good for Scala 3, Scala 2 is deceiving itself into generating a class file 😭 See scala/scala3#5928

joroKr21 · 2021-04-27T22:41:18Z

As an alternative - use instance constructors that take functions. The externally provided functions will be delambdafied in both Scala 2 and Scala 3.
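
A minimal sketch of the idea, using a local stand-in for `cats.kernel.Semigroup` so that it is self-contained. The names `InstanceConstructorSketch`, `tuple2Semigroup`, `intAddition` and `stringConcat` are hypothetical and not taken from the generated boilerplate. The single anonymous class lives inside `instance`; each generated instance only supplies a function.

```scala
object InstanceConstructorSketch {
  // Hypothetical stand-in for cats.kernel.Semigroup, kept local so the sketch compiles on its own.
  trait Semigroup[A] {
    def combine(x: A, y: A): A
  }

  object Semigroup {
    // The only anonymous class is defined here; callers pass plain functions.
    def instance[A](cmb: (A, A) => A): Semigroup[A] =
      new Semigroup[A] {
        def combine(x: A, y: A): A = cmb(x, y)
      }
  }

  // Boilerplate-style tuple instance: a lambda is handed to `instance`
  // instead of writing `new Semigroup[(A, B)] { ... }` at this definition site.
  def tuple2Semigroup[A, B](implicit A: Semigroup[A], B: Semigroup[B]): Semigroup[(A, B)] =
    Semigroup.instance[(A, B)]((x, y) => (A.combine(x._1, y._1), B.combine(x._2, y._2)))

  implicit val intAddition: Semigroup[Int] = Semigroup.instance[Int](_ + _)
  implicit val stringConcat: Semigroup[String] = Semigroup.instance[String](_ + _)

  // Usage: combine two pairs component-wise.
  val combined: (Int, String) = tuple2Semigroup[Int, String].combine((1, "a"), (2, "b"))
}
```

The trade-off is one extra call through the function value on every `combine`.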

johnynek

I think this is worth doing but I do wonder if we can make it a bit more principled by using InvariantFunctor, ContravariantCartesian, or similar typeclasses on these kernel typeclasses (typeclasses can have typeclasses too!)

johnynek · 2021-07-09T01:56:30Z

kernel/src/main/scala/cats/kernel/CommutativeGroup.scala

+  /**
+   * Create a `CommutativeGroup` instance from the given inverse and combine functions and empty value.
+   */
+  @inline def instance[A](emp: A, inv: A => A, cmb: (A, A) => A): CommutativeGroup[A] =


What if instead we make an instance of InvariantFunctor[CommutativeGroup] and then we use the InvariantFunctor instances in the tuple code Gen?

You probably mean InvariantSemigroupal so that we can use tupledN right? That's doable but it would mean more allocations and reduced performance at runtime (going from (a, b, c, d) to (a, (b, (c, d))) and back) so I'm not sure that would be acceptable for cats-kernel which is also used by algebra.
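
To make the allocation concern concrete, here is a rough sketch of a 3-tuple instance derived from `product` and `imap`, the two operations an `InvariantSemigroupal` instance would provide. Everything here is a simplified, hypothetical stand-in (local `Semigroup`, free-standing `product` and `imap`), not the cats API itself.

```scala
object NestedTupleSketch {
  // Hypothetical, simplified Semigroup.
  trait Semigroup[A] { def combine(x: A, y: A): A }

  // Pairs two instances; the result works on nested pairs.
  def product[A, B](fa: Semigroup[A], fb: Semigroup[B]): Semigroup[(A, B)] =
    new Semigroup[(A, B)] {
      def combine(x: (A, B), y: (A, B)): (A, B) =
        (fa.combine(x._1, y._1), fb.combine(x._2, y._2))
    }

  // Transports an instance along a pair of conversion functions.
  def imap[A, B](fa: Semigroup[A])(f: A => B)(g: B => A): Semigroup[B] =
    new Semigroup[B] {
      def combine(x: B, y: B): B = f(fa.combine(g(x), g(y)))
    }

  // Generic derivation: every combine re-nests both inputs into (a, (b, c)),
  // combines the nested pairs, then flattens the result back to (a, b, c).
  def tuple3[A, B, C](a: Semigroup[A], b: Semigroup[B], c: Semigroup[C]): Semigroup[(A, B, C)] =
    imap[(A, (B, C)), (A, B, C)](product(a, product(b, c))) {
      case (x, (y, z)) => (x, y, z)
    } {
      case (x, y, z) => (x, (y, z))
    }

  val intAdd: Semigroup[Int] = new Semigroup[Int] { def combine(x: Int, y: Int): Int = x + y }
  val sum: (Int, Int, Int) = tuple3(intAdd, intAdd, intAdd).combine((1, 2, 3), (10, 20, 30))
}
```

Each `combine` on the flat tuple allocates several intermediate pairs that a hand-written tuple instance would not need.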

joroKr21 · 2021-08-26T19:31:55Z

Also optimized AlgebraBoilerplate now that #3918 was merged

armanbilge · 2021-08-26T19:33:34Z

Thanks for being on top of this!

johnynek · 2021-08-26T21:47:51Z

algebra-core/src/main/scala/algebra/ring/Ring.scala

@@ -115,4 +115,13 @@ trait RingFunctions[R[T] <: Ring[T]] extends AdditiveGroupFunctions[R] with Mult

 object Ring extends RingFunctions[Ring] {
  @inline final def apply[A](implicit ev: Ring[A]): Ring[A] = ev
+
+  private[algebra] def instance[A](z: A, o: A, neg: A => A, add: (A, A) => A, mul: (A, A) => A): Ring[A] =


Why are these private[algebra] when it seems the others are @inline? Is there a reason they aren't all the same?

Cats already has public instance methods for some (but not all) type classes and they are @inline whereas Algebra doesn't have any. I just tried to follow the conventions of each project, but I think it doesn't hurt to add the annotation in Algebra too.

joroKr21 · 2021-08-27T12:24:15Z

I don't know why the "Microsite" job is failing but I'm quite sure it's not related to this PR

rossabaker

The build failure could be related to #3974. I'll bring it up there.

armanbilge · 2021-11-26T02:10:06Z

Personally, I'm not convinced we need or even want this change. Apologies if I'm missing the point.

In Move typelevel/algebra into cats repo #3877 (comment) @joroKr21 mentioned that classes are loaded/unloaded dynamically.
Scala.js and Native involve a static linking step that removes all unused code from the final generated JS/binary.
I understand the relative numbers are significant, but in absolute terms we're basically talking megabytes, right?

Additionally, isn't this de-optimizing the current implementations, by replacing dedicated classes with lambdas and thus adding a level of indirection and a larger per-instance footprint? Like, these are micro-optimizations, but isn't saving a few megabytes one too? :)

joroKr21 · 2021-11-26T06:40:33Z

> In Move typelevel/algebra into cats repo #3877 (comment) @joroKr21 mentioned that classes are loaded/unloaded dynamically.

Yes, that's mostly true on the JVM these days unless you are using the CMS GC without class unloading enabled.

> Scala.js and Native involve a static linking step that removes all unused code from the final generated JS/binary.

Oh that's cool, I didn't think much about that. Do you know how fine-grained that is? Does it mean that the concerns for including java.time instances are unfounded (e.g. #3910)?

> I understand the relative numbers are significant, but in absolute terms we're basically talking megabytes, right?

Well yes, that's megabytes per download per jar. I don't know what the multiplier is, but there is some multiplier. I would prefer to fix that in the compiler to be honest but we can't because of binary compatibility constraints. (Aside: at least that's what I've been told. Now that I think about it - those classes should be private anyway so why not? 🤔 Maybe I will give it a try but let's not discuss that here).

> Additionally, isn't this de-optimizing the current implementations, by replacing dedicated classes with lambdas and thus adding a level of indirection and a larger per-instance footprint? Like, these are micro-optimizations, but isn't saving a few megabytes one too? :)

That's a good question - do we have any benchmarks on that? I can't say without trying.

joroKr21 · 2021-11-26T07:14:05Z

Some shower thoughts - the biggest costs of these instances are probably boxing of primitives (it would be interesting to check in which cases it occurs) and tuple allocations. Lambda calls should not be that expensive because otherwise we would never get SAMs on the JVM. That leaves open the question about the cost of the additional indirection (which is necessary because of the compiler bug).

joroKr21 · 2022-02-09T09:56:40Z

Since we've reached the limit of our JVM knowledge - what about applying this only to the SAM type classes and leaving the multiple method type classes as they were? SAM conversion is supposed to work automatically by the compiler but because of the bug it's not the case. Obviously that means our jar size savings will be much less, but there will be no danger of performance regressions.

armanbilge · 2022-02-09T18:15:05Z

👍 that sounds like a good way forward.

joroKr21 · 2022-02-10T08:57:11Z

@armanbilge done - now only SAM type classes are optimised in this way. I converted multi method type class instances back to anonymous classes. There were not that many actually - only Hash, Group, CommutativeGroup, Ring, Rig, Rng and Semiring.
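
For illustration, this is roughly what the anonymous-class style looks like for a multi-method type class. `Group` here is a simplified local stand-in with three abstract members, and `tuple2Group` and `intGroup` are hypothetical names rather than the generated code.

```scala
object MultiMethodSketch {
  // Hypothetical, simplified Group: three abstract members, so it is not a SAM type.
  trait Group[A] {
    def empty: A
    def inverse(a: A): A
    def combine(x: A, y: A): A
  }

  // Anonymous-class style: no function indirection on each call,
  // at the cost of one generated class per instance definition.
  def tuple2Group[A, B](implicit A: Group[A], B: Group[B]): Group[(A, B)] =
    new Group[(A, B)] {
      def empty: (A, B) = (A.empty, B.empty)
      def inverse(x: (A, B)): (A, B) = (A.inverse(x._1), B.inverse(x._2))
      def combine(x: (A, B), y: (A, B)): (A, B) =
        (A.combine(x._1, y._1), B.combine(x._2, y._2))
    }

  implicit val intGroup: Group[Int] = new Group[Int] {
    def empty: Int = 0
    def inverse(a: Int): Int = -a
    def combine(x: Int, y: Int): Int = x + y
  }

  // Usage: invert both components of a pair.
  val inverted: (Int, Int) = tuple2Group[Int, Int].inverse((1, -2))
}
```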

armanbilge · 2022-02-10T11:09:39Z

core/src/main/scala/cats/Show.scala

-  def show[A](f: A => String): Show[A] =
-    new Show[A] {
-      def show(a: A): String = f(a)
-    }
+  def show[A](f: A => String): Show[A] = f(_)


Is there a practical difference between these? e.g. no class is emitted.

Yes, indeed 👍 - because Show has only one abstract method

Right, thanks. Is this affected by the aforementioned Scala 2 bug?

Also, can Semigroup etc. get the same treatment?

cats/kernel/src/main/scala/cats/kernel/Semigroup.scala

Lines 151 to 154 in 6ba7e4b

  @inline def instance[A](cmb: (A, A) => A): Semigroup[A] =
    new Semigroup[A] {
      override def combine(x: A, y: A): A = cmb(x, y)
    }

I think for Semigroup it doesn't matter because it has other (non-abstract) methods. So then the bug applies and there would be an anonymous class generated even if we use the SAM syntax. So the only benefit would be aesthetics.

I see, so it's not just that Show has one abstract method, it's that it has no other methods.
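
A small sketch of the distinction, with simplified local stand-ins for `Show` and `Semigroup` (the `combineN` body is illustrative, not the cats implementation). Both traits have exactly one abstract method, so lambda syntax compiles for both. Whether a class file is still emitted for the second one depends on the compiler version, as described in this thread.

```scala
object SamSketch {
  // Hypothetical, simplified Show: exactly one abstract method and nothing else.
  trait Show[A] {
    def show(a: A): String
  }

  // Hypothetical, simplified Semigroup: one abstract method plus a concrete one.
  trait Semigroup[A] {
    def combine(x: A, y: A): A
    def combineN(a: A, n: Int): A =
      if (n <= 1) a else combine(a, combineN(a, n - 1))
  }

  // SAM syntax is accepted for both traits.
  val showInt: Show[Int] = (a: Int) => s"Int($a)"
  val addInt: Semigroup[Int] = (x: Int, y: Int) => x + y
}
```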

I agree it's only aesthetics on Scala 2, but does the bug apply to Scala 3 as well?

> I agree it's only aesthetics on Scala 2, but does the bug apply to Scala 3 as well?

I don't remember - that's a good question.

The reason I ask is, I don't care so much about the handful of Semigroup.instance etc. that can be re-written.

But I'm wondering if in Scala 3 the boilerplate instances themselves could be written directly like this instead of relying on instance. So we get the win in terms of jar size without introducing any indirection at all.

Even if that's possible it would be quite hard to split like this 😄

FTR Scala 3 doesn't have this bug - but again I'm sceptical about version-specific boilerplate generators 🤔

If there's interest it can be a followup PR. I think it shouldn't be too hard (famous last words): currently all the instances are in the same file I think, when they could easily be split among a few files. And some of those files can go into the scala-2 and scala-3 srcs.

armanbilge

I didn't review the boilerplate code in detail, but I did spend some time studying the decompiled bytecode of the generated classes. The extra indirection is unfortunate but quite possibly not a big deal (and potentially avoidable for Scala 3 in follow-up work). Given the various constraints I think this is good 👍

DavidGregory084 · 2022-05-03T12:45:26Z

> This inv lambda must also be storing references to A1 and A2.

Sorry for coming to this late but there is a great page about the lambda translation process here and a good blog post with some further explanation of what happens at runtime here.

Scalac uses the LambdaMetafactory provided in java.lang.invoke and it works in a similar way.

In the example that you provide:

val cmb = (x, y) => (A1.cmb(x._1, y._1), A2.cmb(x._2, y._2))
val inv = x => (A1.inv(x._1), A2.inv(x._2))

These lambdas are capturing lambdas, but they are not instance-capturing lambdas (they do not use this or super because A1 and A2 are constructor arguments of TupleGroup), so they are encoded as static methods. That does make me wonder if it behaves differently when you use an implicit val argument instead of using a context bound or just an implicit argument though!

The static methods will be declared with the lambda parameters with any captured variables prepended to the parameter list. You can actually see this in the bytecode for the INVOKEDYNAMIC call that @joroKr21 posted above:

// handle kind 0x6 : INVOKESTATIC
cats/kernel/instances/TupleMonoidInstances.$anonfun$catsKernelMonoidForTuple2$1(Lcats/kernel/Monoid;Lcats/kernel/Monoid;Lscala/Tuple2;Lscala/Tuple2;)Lscala/Tuple2; itf,

The first two parameters are cats.kernel.Monoid instances. The ALOAD 1 and ALOAD 2 instructions just before the INVOKEDYNAMIC instruction push those "static arguments" onto the stack for the bootstrap method that creates the lambda object.
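
A sketch of the shape being discussed, assuming a simplified local `Group` and a free-standing `instance` constructor (both hypothetical stand-ins). The two lambdas close over the method parameters `A1` and `A2` and never touch `this`, which is the situation described above.

```scala
object CapturingLambdaSketch {
  // Hypothetical, simplified Group.
  trait Group[A] {
    def empty: A
    def inverse(a: A): A
    def combine(x: A, y: A): A
  }

  // Function-taking constructor in the style of the `instance` methods in this PR.
  def instance[A](emp: A, inv: A => A, cmb: (A, A) => A): Group[A] =
    new Group[A] {
      def empty: A = emp
      def inverse(a: A): A = inv(a)
      def combine(x: A, y: A): A = cmb(x, y)
    }

  def tuple2Group[A, B](A1: Group[A], A2: Group[B]): Group[(A, B)] = {
    // Capturing lambdas: each one holds references to A1 and A2, but not to an enclosing instance.
    val cmb: ((A, B), (A, B)) => (A, B) =
      (x, y) => (A1.combine(x._1, y._1), A2.combine(x._2, y._2))
    val inv: ((A, B)) => (A, B) =
      x => (A1.inverse(x._1), A2.inverse(x._2))
    instance[(A, B)]((A1.empty, A2.empty), inv, cmb)
  }

  val intGroup: Group[Int] = instance[Int](0, x => -x, _ + _)
  val inverted: (Int, Int) = tuple2Group(intGroup, intGroup).inverse((1, -2))
  val combined: (Int, Int) = tuple2Group(intGroup, intGroup).combine((1, 2), (3, 4))
}
```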

The Java Language Specification (§15.27.4) says that:

> At run time, evaluation of a lambda expression is similar to evaluation of a class instance creation expression, insofar as normal completion produces a reference to an object. Evaluation of a lambda expression is distinct from execution of the lambda body.
>
> Either a new instance of a class with the properties below is allocated and initialized, or an existing instance of a class with the properties below is referenced. If a new instance is to be created, but there is insufficient space to allocate the object, evaluation of the lambda expression completes abruptly by throwing an OutOfMemoryError.

So like @armanbilge suggests, these lambda expressions will capture A1 and A2 separately as the static arguments to those lambda expression objects. Each static lambda expression will probably result in the creation of one lambda object with a reference to each of those static arguments, although the behaviour is deliberately left completely up to the VM implementors so it's hard to be sure that different VMs will even behave the same:

> These rules are meant to offer flexibility to implementations of the Java programming language, in that:
>
> - A new object need not be allocated on every evaluation.
> - Objects produced by different lambda expressions need not belong to different classes (if the bodies are identical, for example).
> - Every object produced by evaluation need not belong to the same class (captured local variables might be inlined, for example).
> - If an "existing instance" is available, it need not have been created at a previous lambda evaluation (it might have been allocated during the enclosing class's initialization, for example).

As of relatively recent versions of the JDK it seems that:

> In the current implementation, the metafactory delegates to code that uses an internal, shaded copy of the ASM bytecode libraries to spin up an inner class that implements the target type.

joroKr21 · 2022-05-03T13:53:12Z

Thanks for the detailed explanation @DavidGregory084 - I think that supports our current approach of applying this change to SAM type classes only.

joroKr21 · 2022-07-14T08:32:05Z

Hey, is there any reason not to get this merged? 🙏

armanbilge · 2022-07-14T12:48:31Z

I'm happy to move forward with this as of #3871 (review). I feel like I remember someone (you?) saying we should still benchmark it or something before merging, but maybe I'm confused.

joroKr21 · 2022-07-14T20:48:44Z

> I feel like I remember someone (you?) saying we should still benchmark it or something before merging, but maybe I'm confused.

I thought that referred to the instances which have more than one abstract method which are now removed.
I personally won't have time to benchmark the rest 😄

danicheg

This is tremendous.

armanbilge · 2022-07-15T18:46:43Z

We have previous approvals from Ross and Oscar as well, let's go ahead 🚀

armanbilge · 2022-07-16T17:52:33Z

In unrelated research, I found out that SAMs still generate classes on JS (unsurprisingly). So this PR actually didn't help things there, although I don't think it made things worse.

Not sure about the situation on Native, but if I had to guess it would be the same as JS.

joroKr21 · 2022-07-16T20:40:17Z

I assume both JS and Native have functions though - otherwise I can't explain the reduction of jar size I observed in both. Note that we use functions here, not SAM syntax directly to work around the Scala 2 bug.

armanbilge · 2022-07-16T20:41:46Z

Ah, sorry, to clarify I'm not talking about the bytecode/SJSIR size. I'm talking about the size of the final generated JavaScript. Since JS is delivered into browsers this is a sensitive subject 😉

armanbilge · 2022-07-16T20:45:56Z

> Note that we use functions here, not SAM syntax directly to work around the Scala 2 bug.

Right, sorry, I forgot this :)

joroKr21 changed the title ~~Refactor KernelBoiler, use SAM instances when possible~~ Refactor KernelBoiler, use instance constructors Apr 28, 2021

joroKr21 mentioned this pull request May 3, 2021

Use helper constructors to instantiate type classes #3870

Closed

joroKr21 changed the title ~~Refactor KernelBoiler, use instance constructors~~ Refactor boilerplate, use instance constructors May 8, 2021

joroKr21 changed the title ~~Refactor boilerplate, use instance constructors~~ Optimise boilerplate generators, use instance constructors May 8, 2021

joroKr21 mentioned this pull request May 18, 2021

Move typelevel/algebra into cats repo #3877

Closed

joroKr21 mentioned this pull request Jun 8, 2021

Avoid generating anonymous classes moia-oss/teleproto#142

Merged

1 task

johnynek reviewed Jul 9, 2021

joroKr21 force-pushed the sam-boiler branch from e4bea48 to 50a4dec Compare August 26, 2021 19:21

johnynek previously approved these changes Aug 26, 2021

joroKr21 dismissed johnynek’s stale review via c00d7c1 August 27, 2021 05:49

joroKr21 force-pushed the sam-boiler branch from 50a4dec to c00d7c1 Compare August 27, 2021 05:49

rossabaker previously approved these changes Aug 27, 2021

joroKr21 added 4 commits August 29, 2021 15:19

Refactor KernelBoiler, use SAM instances when possible

09972d8

Make it a bit more readable by formatting constraints on a new line.

Use instance constructors instead of SAM types

5b7bd81

Due to limitations in Scala 2 SAM types often end up generating classes after all. `instance` constructors don't suffer from this issue and also let us handle type classes with multiple abstract methods.

Optimize cats-core boilerplate

da85e2e

Optimize AlgebraBoilerplate by using instance constructors

8ae7a40

joroKr21 dismissed rossabaker’s stale review via 8ae7a40 August 29, 2021 12:19

joroKr21 force-pushed the sam-boiler branch from c00d7c1 to 8ae7a40 Compare August 29, 2021 12:19

johnynek previously approved these changes Aug 29, 2021

joroKr21 mentioned this pull request Oct 1, 2021

Introducing .flatMapN(f) as a shorthand for .mapN(f).flatten #4003

Closed

rossabaker previously approved these changes Nov 18, 2021

armanbilge mentioned this pull request Feb 8, 2022

Add F[TupleN] syntax #4125

Closed

joroKr21 added 5 commits February 10, 2022 07:33

Merge branch 'main' into sam-boiler

531b51a

Restore AlgebraBoilerplate to classes

f24bb2d

Use Show.show

1185c8b

Simplify GenTupleMonadInstances

7600153

Convert back multi-method type class instances to anonymous classes

6ba7e4b

armanbilge reviewed Feb 10, 2022

armanbilge approved these changes Feb 11, 2022

armanbilge mentioned this pull request Mar 7, 2022

Algebra in core erikerlandson/coulomb#254

Merged

armanbilge mentioned this pull request Mar 31, 2022

experiment with named classes in operations erikerlandson/coulomb#265

Merged

armanbilge mentioned this pull request May 16, 2022

Added flatMapN #4009

Merged

danicheg approved these changes Jul 15, 2022

armanbilge added the optimization label Jul 15, 2022

armanbilge added this to the 2.9.0 milestone Jul 15, 2022

armanbilge merged commit fbad4be into typelevel:main Jul 15, 2022

joroKr21 deleted the sam-boiler branch July 16, 2022 09:55

armanbilge mentioned this pull request Aug 11, 2022

Use SAM syntax for typeclass instances where possible #4279

Merged

8 tasks
