Use Cogen for arbitrary instances #1666

ceedubs · 2017-05-13T19:47:49Z

Resolves #1605.

This is a replacement for #1619. The approach taken in that PR led to
implicit resolution failures in scala.js.

Resolves typelevel#1605. This is a replacement for typelevel#1619. The approach taken in that PR led to implicit resolution failures in scala.js.

kailuowang

I think this is fine. Much better than the existing ones.

edmundnoble

Fine with me. Very, very curious how Scala.js can have implicit ambiguities when JVM doesn't.

ceedubs · 2017-05-14T18:00:48Z

@edmundnoble that surprised me too.

kailuowang · 2017-05-15T15:02:20Z

merge with two sign-off and no objections.

kailuowang · 2017-05-15T16:05:00Z

after merging build fails on

Kleisli[Option, Int, String].monoid.isEmpty *** FAILED ***
[info] GeneratorDrivenPropertyCheckFailedException was thrown during property evaluation.
[info] (Discipline.scala:14)
[info] Falsified after 1 successful property evaluations.
[info] Location: (Discipline.scala:14)
[info] Occurred when passed generated values (
[info] arg0 = Kleisli()
[info] )
[info] Label of failing property:
[info] (true ?== false) failed

I'm investigating.

ceedubs · 2017-05-15T16:42:45Z

@kailuowang I have been looking into this too. So far I've only seen it failing in the JS build and scala versions less than 2.12.

ceedubs · 2017-05-15T16:46:49Z

@kailuowang one other thing that I noticed is that the isId method that is called in the isEmpty tests may be a little troublesome conceptually when it comes to Kleisli/function instances. It will generate random Kleislis and ensure that we get the same result from calling isEmpty on that Kleisli instance and using the Kleisli Eq instance to compare the generated Kleisli with the result of calling empty on the Kleisli monoid. I think that it's possible (though I suspect extremely unlikely) that we could get inconsistent results on those two calls based on the 50 random values that we happen to pass into the kleisli instances when checking for "equality".

ceedubs · 2017-05-15T17:06:07Z

Could this be some sort of race condition heisenbug in scala.js? I can't seem to recreate it once I put in printlns to try to track down values that it's failing on.

Edit: I was able to recreate it with printlns in place (though it did take many repetitions, so this theory isn't completely ruled out)

ceedubs · 2017-05-15T17:44:40Z

Sorry for the live stream of my thought process while debugging this.

I just realized that a difference here is that for the JVM we are using 50 values to test function equality and for JS we are only using 5. So I think that the real issue here is just what I brought up with #1666 (comment) and it's not as unlikely as I thought when I posted that comment since we are only using 5 values in this case.

So essentially the problem here is that with our testing Eq instance that we use for Kleisli, there's a chance that the Monoid instance for Kleisli returns true for isEmpty, even if the Kleisli doesn't always return an empty value. This is an issue with our tests and not the Kleisli Monoid itself, but I don't know of a particularly good solution to it.

kailuowang · 2017-05-15T21:37:20Z

@ceedubs the law is basically testing isEmpty being consistent with empty. In Monoid the isEmpty is using Eq[A], for Kleisli, it means that it requires an Eq instance for Kleisli, i.e. an Eq for A => F[B] , there simply isn't a true Eq for Functions, so I think the isEmpty method is simply not available for Monoid[Kleisli[F, A, B] (and there is no way to override that method in the Kleisli instance). We were able to test this because we have a proximate Eq for Functions, but I don't think we should be testing it against laws. Thus we should not be testing this law against Kleisli. WDYT?

ceedubs · 2017-05-15T21:41:40Z

Thus we should be testing this law against Kleisli.

@kailuowang sorry I don't think that I followed you there at the end. We are testing this law against Kleisli aren't we?

kailuowang · 2017-05-15T21:42:56Z

I meant simply put if there is no Eq for kleisli, we shouldn't be testing isEmpty which isn't available for kleisli.

ceedubs · 2017-05-15T21:53:35Z

@kailuowang I see. It's currently part of the monoid laws, so you would need to remove it from there to not test it for Kleisli. You can't call the monoid laws without an Eq instance, so I guess it kind of makes sense that they would test isEmpty as well.

Another potential thing that you could do is change the isEmpty law to check that Monoid[A].isEmpty(Monoid[A].empty) is true without passing arbitrary A values in and performing the comparison. This would change the semantics a bit in that it would no longer enforce that there's only "one" empty value for a monoid.

kailuowang · 2017-05-15T21:56:00Z

I was just thinking should empty be unique?

edmundnoble · 2017-05-15T23:52:27Z

I think empty has to be unique, by contradiction, as @alexknvl stated in the gitter:
e1 === e1 combine e2 === e2.

kailuowang · 2017-05-16T02:18:49Z

I agree that empty has to be unique by law.
After further inspection of the tests, it seems to me that the test is basically testing the two separatedly generated instances of Eq[A => F[B]] being consistent with each other. This Eq in tests is implemented by passing a number of arbitrary input and comparing the outputs.
On scalajs, the number is only 5, which makes it subject to false positive, i.e. two different functions having a chance to return the same result for 5 inputs and thus become a false positive equal. So the test fails when one Eq instance report false equal while the other doesn't. I verified this hypothesis by turning the number to 3, which made the test fail consistently.
For a short-term solution, I propose we increase the number in scalajs to 10. I just created a PR #1671 to test how it behave on travis.

edmundnoble · 2017-05-16T05:23:21Z

In the long term, Eq for Function1 is giving us false positives, which translate to false negatives on the left side of an implication. Should we discourage negation of the Eq instance for this reason, or remove it altogether? I don't think false negatives are acceptable given property-based testing's already mammoth issues with reproducibility.

ceedubs · 2017-05-16T14:22:56Z

Thanks @kailuowang. I think that bumping up the number of checks for the JS build is a pretty reasonable solution, at least for the short-to-medium- term.

@edmundnoble I agree that this isn't great. Just to make sure that we are on the same page, you know that the Eq instance only exists in the test scope, right? The problem with removing it is that then we wouldn't be able to test laws on our Function1/Kleisli/State instances at all, which seems worse to me than longer build times and possible occasional build failures due to false positives. I would love to find a more elegant solution, though.

edmundnoble · 2017-05-16T20:17:50Z

Yeah, I'm aware it's only in test scope. My only issue is with negating the Eq causing false positives to translate to false negatives. The issue is with our usage of the Eq instance.

ceedubs · 2017-05-16T20:19:46Z

@edmundnoble could you please expand upon what you mean by "negating the Eq"? You've said that a couple of times now, and I don't think that I understand it.

hobwekiva · 2017-05-16T23:26:36Z

When you check

P(f, g) = f === g

using Eq[A => B], it's the same as just adding an extra parameter like this:

P(f, g, a) = f(a) === g(a)

At worst you might get some false positives (f is not the same as g, but on your test set they are the same).

Now consider

P(f, g) = !(f === g)

this will instead produce false negatives (f is not the same as g, but on your test set they coincide)

There are other boolean expressions (apart from negation) that you can not use on results of === in tests. For instance, consider if (f === g) P(f, g) else R(f, g). And it is not limited to functions either.

kailuowang · 2017-05-17T00:03:52Z

Thanks @alexknvl

ceedubs · 2017-05-17T14:25:40Z

Thanks @alexknvl. @edmundnoble is this something that we are doing in our tests? Also I guess the hope would be that with enough checks over enough rng seeds (a new one each build), false negatives would eventually be uncovered.

kailuowang · 2017-05-17T14:53:27Z

To help facilitate the discussion. Here is the code
The test is Rules.isId("isEmpty", A.empty)(A.isEmpty)
which translate to

forAll {( x: A) => Eq[A].eqv(x, A.empty) ?== A.isEmpty(x) }

The default implementation of A.isEmpty is

Eq[A].eqv(a, A.empty)

So for instances that don't override the default implementation this test is equvilent to

forAll { ( x: A) => 
 Eq[A].eqv(x, A.empty) ?== Eq[A].eqv(x, A.empty)
}

This is a work in progress and there is a bit more work that should probably be done before merging this. However, I have already put a fair amount of time into this and wanted to see what people thought about it before pushing through to do all of the relevant work. Cats has a lot of instances for function-like types. When we go to check the laws for these types, we are required to supply an `Eq` instance. But defining equality for functions is awkward. So far we've been approaching this by generating a bunch of values and passing them into both functions and ensuring that we get the same results from both. This can produce false positives if two functions differ but we just happen to sample values that produce the same output in both. For some purposes, it isn't a big deal if we get some occasional false positives, because over many runs of the tests with different RNG seeds, we should eventually catch any discrepancies. But here be dragons. Some tests use the results of these equality checks on the left side of an implication, so a false positive in the equality check becomes a false _negative_ (and thus a build failure) in the test result. See [here](typelevel#1666 (comment)) for further discussion. This is where my adventure with this PR begins. Cats builds have been timing out quite a bit recently, so I tried to reduce the number of random values that we sample when comparing two functions for equality. While this did speed up the build a little bit, it started leading to a much higher frequency of build failures due to false negatives in tests. So I started to rethink how we determine function equivalence. Instead of relying on nondeterministic behavior for equality, we can only provide function equality for functions whose domains are small enough to exhaustively check. If two functions produce the same output for the entirety of their domain, they are equivalent. I've introduced an `ExhaustiveCheck[A]` type class that is similar to `Gen[A]` but produces a `Stream[A]` of the entire domain of `A`. I made the name somewhat specific to tests as opposed to something like `Finite[A]`, because types like `Int` have a finite domain but would be prohibitively expensive to exhaustively check in tests and therefore shouldn't have an instance for this type class. I also added some `Eq` instances for function-like types whose domains have `ExhaustiveCheck` instances. For the sake of compatibility, I didn't remove the old `Eq` instances, but I've put them in a lower implicit priority scope, and I've changed the sites that were using them to use the new instances (well not quite all of them yet -- that's why this PR isn't quite complete yet). The main benefits of this change as I see it are: 1. Remove some nondeterministic behavior from the build. 2. Allow for shrinking of the number of values checked to improve build times without triggering build failures. 3. Allow for future deprecation of some problematic instances that are exposed through cats-laws but that users should probably not depend on. The main potential downside that I can think of is that we might be checking 15 examples where we were checking 50 before, which could be considered a reduction in test coverage. However, I think that all of the places where this sort of approach is used are parametric on the type, so I don't think that it should matter much that the domain for this type is much smaller. Let me know what you think. If people like this approach then I can switch over the remaining bits.

) * Don't depend on random sampling to determine function equivalence This is a work in progress and there is a bit more work that should probably be done before merging this. However, I have already put a fair amount of time into this and wanted to see what people thought about it before pushing through to do all of the relevant work. Cats has a lot of instances for function-like types. When we go to check the laws for these types, we are required to supply an `Eq` instance. But defining equality for functions is awkward. So far we've been approaching this by generating a bunch of values and passing them into both functions and ensuring that we get the same results from both. This can produce false positives if two functions differ but we just happen to sample values that produce the same output in both. For some purposes, it isn't a big deal if we get some occasional false positives, because over many runs of the tests with different RNG seeds, we should eventually catch any discrepancies. But here be dragons. Some tests use the results of these equality checks on the left side of an implication, so a false positive in the equality check becomes a false _negative_ (and thus a build failure) in the test result. See [here](#1666 (comment)) for further discussion. This is where my adventure with this PR begins. Cats builds have been timing out quite a bit recently, so I tried to reduce the number of random values that we sample when comparing two functions for equality. While this did speed up the build a little bit, it started leading to a much higher frequency of build failures due to false negatives in tests. So I started to rethink how we determine function equivalence. Instead of relying on nondeterministic behavior for equality, we can only provide function equality for functions whose domains are small enough to exhaustively check. If two functions produce the same output for the entirety of their domain, they are equivalent. I've introduced an `ExhaustiveCheck[A]` type class that is similar to `Gen[A]` but produces a `Stream[A]` of the entire domain of `A`. I made the name somewhat specific to tests as opposed to something like `Finite[A]`, because types like `Int` have a finite domain but would be prohibitively expensive to exhaustively check in tests and therefore shouldn't have an instance for this type class. I also added some `Eq` instances for function-like types whose domains have `ExhaustiveCheck` instances. For the sake of compatibility, I didn't remove the old `Eq` instances, but I've put them in a lower implicit priority scope, and I've changed the sites that were using them to use the new instances (well not quite all of them yet -- that's why this PR isn't quite complete yet). The main benefits of this change as I see it are: 1. Remove some nondeterministic behavior from the build. 2. Allow for shrinking of the number of values checked to improve build times without triggering build failures. 3. Allow for future deprecation of some problematic instances that are exposed through cats-laws but that users should probably not depend on. The main potential downside that I can think of is that we might be checking 15 examples where we were checking 50 before, which could be considered a reduction in test coverage. However, I think that all of the places where this sort of approach is used are parametric on the type, so I don't think that it should matter much that the domain for this type is much smaller. Let me know what you think. If people like this approach then I can switch over the remaining bits. * Remove ExhaustiveCheck.map method * Fix InvariantMonoidal tests * Fix tests with failing implicit resolution

Use Cogen for arbitrary instances

2d6165c

Resolves typelevel#1605. This is a replacement for typelevel#1619. The approach taken in that PR led to implicit resolution failures in scala.js.

ceedubs added the in progress label May 13, 2017

ceedubs mentioned this pull request May 13, 2017

Allow non-constant arbitrary functions for (Co)Kleisli, State, Func #1619

Closed

kailuowang approved these changes May 13, 2017

View reviewed changes

edmundnoble approved these changes May 14, 2017

View reviewed changes

kailuowang merged commit 461781e into typelevel:master May 15, 2017

kailuowang removed the in progress label May 15, 2017

kailuowang mentioned this pull request May 16, 2017

Fixing Eq[Function1] in testsJS; break JS build to separate matrix build. #1671

Merged

ceedubs deleted the issue-1605-take-3 branch May 16, 2017 14:23

kailuowang modified the milestone: 1.0.0-MF May 18, 2017

ceedubs mentioned this pull request Oct 21, 2018

Don't depend on random sampling to determine function equivalence #2577

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use Cogen for arbitrary instances #1666

Use Cogen for arbitrary instances #1666

ceedubs commented May 13, 2017

kailuowang left a comment

edmundnoble left a comment

ceedubs commented May 14, 2017

kailuowang commented May 15, 2017

kailuowang commented May 15, 2017

ceedubs commented May 15, 2017

ceedubs commented May 15, 2017

ceedubs commented May 15, 2017 •

edited

Loading

ceedubs commented May 15, 2017 •

edited

Loading

kailuowang commented May 15, 2017 •

edited

Loading

ceedubs commented May 15, 2017

kailuowang commented May 15, 2017 •

edited

Loading

ceedubs commented May 15, 2017

kailuowang commented May 15, 2017

edmundnoble commented May 15, 2017

kailuowang commented May 16, 2017

edmundnoble commented May 16, 2017

ceedubs commented May 16, 2017

edmundnoble commented May 16, 2017

ceedubs commented May 16, 2017

hobwekiva commented May 16, 2017 •

edited

Loading

kailuowang commented May 17, 2017

ceedubs commented May 17, 2017

kailuowang commented May 17, 2017

Use Cogen for arbitrary instances #1666

Use Cogen for arbitrary instances #1666

Conversation

ceedubs commented May 13, 2017

kailuowang left a comment

Choose a reason for hiding this comment

edmundnoble left a comment

Choose a reason for hiding this comment

ceedubs commented May 14, 2017

kailuowang commented May 15, 2017

kailuowang commented May 15, 2017

ceedubs commented May 15, 2017

ceedubs commented May 15, 2017

ceedubs commented May 15, 2017 • edited Loading

ceedubs commented May 15, 2017 • edited Loading

kailuowang commented May 15, 2017 • edited Loading

ceedubs commented May 15, 2017

kailuowang commented May 15, 2017 • edited Loading

ceedubs commented May 15, 2017

kailuowang commented May 15, 2017

edmundnoble commented May 15, 2017

kailuowang commented May 16, 2017

edmundnoble commented May 16, 2017

ceedubs commented May 16, 2017

edmundnoble commented May 16, 2017

ceedubs commented May 16, 2017

hobwekiva commented May 16, 2017 • edited Loading

kailuowang commented May 17, 2017

ceedubs commented May 17, 2017

kailuowang commented May 17, 2017

ceedubs commented May 15, 2017 •

edited

Loading

ceedubs commented May 15, 2017 •

edited

Loading

kailuowang commented May 15, 2017 •

edited

Loading

kailuowang commented May 15, 2017 •

edited

Loading

hobwekiva commented May 16, 2017 •

edited

Loading