Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make NonEmpty functions less gratuitously lazy #107

Open
treeowl opened this issue Nov 21, 2022 · 26 comments
Open

Make NonEmpty functions less gratuitously lazy #107

treeowl opened this issue Nov 21, 2022 · 26 comments

Comments

@treeowl
Copy link

treeowl commented Nov 21, 2022

This is going to be ... hard. Some decisions, I think, will not be very controversial. Others will likely be quite controversial. Let's go through them one by one and try to figure things out. But first, I'd like to mention that along with the definition we have,

data NonEmpty a = a :| [a]

there's another, equally valid, expression of the concept of a nonempty list:

data NonEmpty' a = EndNE a | ConsNE a (NonEmpty' a)

Each of these expressions is better at certain things and worse at certain things. Personally, I find the NonEmpty' expression more natural or fundamental, and therefore I will tend towards implementations that reflect the "natural" strictness we'd find there. However, I don't want to push for that where it feels unnatural.


unfold

This function was deprecated ages and ages ago, and the time has long since come to delete it. There's no point discussing its strictness.

uncons

Currently,

uncons :: NonEmpty a -> (a, Maybe (NonEmpty a))
uncons ~(a :| as) = (a, nonEmpty as)

I propose

uncons (a :| as) = (a, nonEmpty as)

What I actually want, based on my stated bias, is

uncons :: NonEmpty a -> Either a (a, NonEmpty a)
uncons (a :| []) = Left a
uncons (a :| b : bs) = Right (a, b :| bs)

Would that be a step too far? If so, would it be worth offering such a function by another name?

init and last

These have irrefutable patterns, but they're actually strict. Confusing, but just an implementation issue we don't need to discuss.

<| and cons

This is defined

a <| ~(b :| bs) = a :| b : bs

I believe this is the correct amount of laziness and we should leave it as is.

toList

Currently,

toList ~(a :| as) = a : as

I propose

toList (a :| as) = a : as

lift

Currently,

lift :: Foldable f => ([a] -> [b]) -> f a -> NonEmpty a
lift f = fromList . f . Foldable.toList

I don't have much intuition about how this function should behave. If we change toList, then it will automatically get stricter when applied to NonEmptys; is that okay?

map

Currently,

map f ~(a :| as) = f a :| fmap f as

I propose to remove the irrefutable pattern.

inits, inits1, tails, and tails1

I have yet to form any opinion on these. The change in toList behavior affects them too.

insert

Currently,

insert  :: (Foldable f, Ord a) => a -> f a -> NonEmpty a
insert a = fromList . List.insert a . Foldable.toList

I think the proposed change in toList behavior is fine for this. It might make a difference for some sort of degenerate Ord instance, but I don't imagine we'll get any complaints.

scanl, scanr, scanl1, scanr1

Currently, these are lazy; I propose to make them strict.

intersperse

Currently,

intersperse :: a -> NonEmpty a -> NonEmpty a
intersperse a ~(b :| bs) = b :| case bs of
    [] -> []
    _ -> a : List.intersperse a bs

I'd definitely remove the irrefutable pattern. My bias would suggest forcing bs as well, but I doubt that's what people will actually want.

reverse

Currently, reverse is actually strict, if I read it right. I'm fine with leaving it that way.

take

Currently,

take :: Int -> NonEmpty a -> [a]
take n = List.take n . toList

The proposed change to toList would make this strict, which I think would be better.

drop

Currently,

drop :: Int -> NonEmpty a -> [a]
drop n = List.drop n . toList

which is lazy when n <= 0 and strict otherwise. The proposed change to toList would make it unconditionally strict, which I think would be better.

splitAt

Currently,

splitAt :: Int -> NonEmpty a -> ([a],[a])
splitAt n = List.splitAt n . toList

This is odd the same way drop is. The proposed change to toList will make it strict.

takeWhile, dropWhile

There's a pattern here; I think these are better with stricter toList.

zip, zipWith

Currently,

zipWith :: (a -> b -> c) -> NonEmpty a -> NonEmpty b -> NonEmpty c
zipWith f ~(x :| xs) ~(y :| ys) = f x y :| List.zipWith f xs ys

I would remove the irrefutable patterns.

unzip

Currently,

unzip :: Functor f => f (a,b) -> (f a, f b)
unzip xs = (fst <$> xs, snd <$> xs)

I propose changing this to

unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b)
unzip ((a, b) :| abs) = (a :| as, b :| bs)
  where
    (as, bs) = List.unzip abs

Again, my bias would suggest forcing abs, but I don't think that's what people will want.

transpose

No clear opinion.

append

See <> below.

appendList

This is strict, and I think should remain so.

prependList

This is strict, and I think should remain so.

Foldable instance

We have

  foldr f z ~(a :| as) = f a (List.foldr f z as)
  foldl f z (a :| as) = List.foldl f (f z a) as
  foldl1 f (a :| as) = List.foldl f a as
  foldr1 f (p :| ps) = foldr go id ps p
    where
      go x r prev = f prev (r x)
  foldMap f ~(a :| as) = f a `mappend` foldMap f as
  fold ~(m :| ms) = m `mappend` fold ms
  toList ~(a :| as) = a : as

I propose to remove all the irrefutable patterns.

Functor instance

Currently,

fmap f ~(a :| as) = f a :| fmap f as
b <$ ~(_ :| as)   = b   :| (b <$ as)

I propose to remove the irrefutable patterns.

Traversable instance

Currently,

  traverse f ~(a :| as) = liftA2 (:|) (f a) (traverse f as)

I propose to remove the irrefutable pattern.

Semigroup instance

We currently have

(a :| as) <> ~(b :| bs) = a :| (as ++ b : bs)

This looks correct to me.

@clyring
Copy link
Member

clyring commented Nov 23, 2022

This has been bothering me for a long time. Thanks for making this proposal, @treeowl. I would suggest the following principle:

  • For every NonEmpty function that is different from a corresponding List function only in the presence of NonEmpty in its type, both the List and NonEmpty functions should have the same strictness properties.

Mostly this is so I don't have to keep two different sets of subtly different strictness behaviors in my mind, but also it may eventually make sense to make the NonEmpty versions just casts-with-proof around the List versions.

In a quick scan, I disagree with your initial assessments of the following functions:

  • cons and (<|): Technically we don't have Data.List.cons, so my principle doesn't exactly apply. But x <| y = x :| toList y is the natural definition, and is (with the proposed change to toList) slightly stricter than its current definition, since it doesn't allocate a (:) cell until y is evaluated. The two selector thunks the current version tends to allocate when used can seriously hurt performance when recursively building a NonEmpty. (And thanks to worker/wrapper-for-CPR the toList is often free in this scenario.)
  • (<>): Similarly, I would expect (a :| as) <> bs = a :| (as ++ toList bs), with no immediate allocation of a (:) cell corresponding to the second argument.

@nomeata
Copy link

nomeata commented Nov 23, 2022

@treeowl, can you explain why NonEmpty' is more natural to you?

(It's not to me - I don't see why I should have to worry about whether there are more elements after the first just to get the first element. But then my personal natural model for a non-empty list type is a refinement type (or, in first approximation, an invariant-protecting newtype) of the list type.)

@treeowl
Copy link
Author

treeowl commented Nov 23, 2022

@nomeata, I probably said that too strongly. One thing, for me, is that consing and unconsing seem like really basic operations for something I'm calling a list, and for NonEmpty those require shuffling elements through different positions, with both potential run-time costs and also the unpleasant question of how strict cons should be.

@nomeata
Copy link

nomeata commented Nov 23, 2022

Oh, absolutely! I share that sentiment, and was one reason I was briefly considering experimenting with a newtype based approach for NonEmpty (but abandoned it for deficienies of the module system of Haskell, which means I couldn't protect the invariant as much as I hoped for): https://gitlab.haskell.org/ghc/ghc/-/issues/22270

clyring added a commit to haskell/bytestring that referenced this issue Nov 28, 2022
* Add basic benchmarks for inits/tails

* Add NonEmpty variants of inits and tails

  The lazy versions use new implementations:
  - Lazy tails got about 10% faster with ghc-9.2. (A happy accident!)
  - Lazy inits got much faster:
    - For the first few chunks it is about 1.5x faster, due to better list fusion.
    - When there are many chunks it is about 4x faster.

* Formatting and comments, as suggested in review

* Add link to a relevant CLC issue about NonEmpty

  - haskell/core-libraries-committee#107
vdukhovni pushed a commit to vdukhovni/bytestring that referenced this issue Dec 7, 2022
* Add basic benchmarks for inits/tails

* Add NonEmpty variants of inits and tails

  The lazy versions use new implementations:
  - Lazy tails got about 10% faster with ghc-9.2. (A happy accident!)
  - Lazy inits got much faster:
    - For the first few chunks it is about 1.5x faster, due to better list fusion.
    - When there are many chunks it is about 4x faster.

* Formatting and comments, as suggested in review

* Add link to a relevant CLC issue about NonEmpty

  - haskell/core-libraries-committee#107
clyring added a commit to haskell/bytestring that referenced this issue Dec 29, 2022
* Add basic benchmarks for inits/tails

* Add NonEmpty variants of inits and tails

  The lazy versions use new implementations:
  - Lazy tails got about 10% faster with ghc-9.2. (A happy accident!)
  - Lazy inits got much faster:
    - For the first few chunks it is about 1.5x faster, due to better list fusion.
    - When there are many chunks it is about 4x faster.

* Formatting and comments, as suggested in review

* Add link to a relevant CLC issue about NonEmpty

  - haskell/core-libraries-committee#107

(cherry picked from commit d4933c6)
@parsonsmatt
Copy link

I think I second @nomeata 's request for clarification - there's a lot of "I want this" and "I propose this change" but not a lot of justification or explanation why.

Most of these changes seem pretty reasonable, but I'm also a bit hesitant to make breaking changes in the laziness/semantics of a datatype. Following in the pattern of containers and many other datatype-providing libraries, why not export these new definitions from a module Data.List.NonEmpty.Strict? The existing ones could be moved to Data.List.NonEmpty.Lazy, and the existing Data.List.NonEmpty would re-export the functions from Data.List.NonEmpty.Lazy.

This approach wouldn't break anyone's code, and would allow for folks to explicitly select lazy vs strict variants of functions. The two modules could have documentation suggesting when you would want one or the other.

In the event that the community decides that Data.List.NonEmpty.Strict is universally preferable, we could WARNING on Data.List.NonEmpty for a release, telling folks "This will change to exporting Data.List.NonEmpty.Strict instead, either switch to that now or switch to Data.List.NonEmpty.Lazy to preserve existing behavior." Then the next major release actually does the switch and removes the WARNING.

uncons

Can you explain why NonEmpty a -> Either a (a, NonEmpty a) is more in line with your stated bias?

That feels pretty unnatural to me. You have to duplicate the code for handling the "known" case (definitely have at least one a), and you can't use it easily in a pattern match without writing multiple cases.

case uncons xs of
    Left a -> foo a
    Right (a, as) -> foo a <> foos as

case uncons xs of
    (a, mas) -> foo a <> foldMap foos mas

let (a, mas) = uncons xs

(a, mas) <- do ...; pure (uncons xs)

toList

Verifying my own understanding here:

λ> import Data.List.NonEmpty
λ> toList undefined
[*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  undefined, called at <interactive>:2:8 in interactive:Ghci2
λ> toList' (a :| as) = a : as
λ> toList' undefined
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  undefined, called at <interactive>:4:9 in interactive:Ghci3

The new variant will undefined when the toList call is evaluated, whereas the current version will undefined when the result of that toList call is evaluated. This appears to be in line with Set.toList.

lift

Oof, this one is gnarly - fromList is partial, and would really prefer to see Maybe in that result type.

map

Removing the irrefutable pattern would mean that map f undefined would be undefined, while currently map f undefined = undefined :| undefined. This seems pretty reasonable.

@hasufell
Copy link
Member

@parsonsmatt I like the idea, but do I understand correctly that in order to maintain both variants, we kind of have to fix the type signature of Data.List.NonEmpty.Lazy.unzip, because Data.List.NonEmpty.Strict.unzip will not use Functor. Otherwise we end up with type divergence, which seems like a problem (no drop-in replacement possible in the worst case).

@TeofilC
Copy link

TeofilC commented Feb 19, 2023

I think we should be really careful about introducing .Lazy/.Strict variants.
Most of the time there aren't two clear options and it can make things very confusing and/or code us into a dead end.

For containers it makes a lot of sense. .Lazy is spine-strict and .Strict is additionally value strict. It makes sense here because the design space has already been narrowed down quite a lot by making the .Lazy variant already quite strict (on the other hand it's a bit confusing that the lazy one is already quite strict).

But in other cases it makes a lot less sense. I think Control.Monad.Trans.State.Strict is an example of this. It's stricter than the .Lazy variant but then there are stricter possibilities as well and it gives users of the library unearned confidence that they are avoiding space leaks.

In the case of Data.NonEmpty I can think of two possible variants off the top of my head that would make sense for the Data.NonEmpty.Strict module to be.

  • data NonEmptySpineStrict = EndESpineStrict a | ConsESpineStrict a !NonEmptySpineStrict, which is spine strict
  • data NonEmptyValueStrict= EndEValueStrict !a | ConsEValueStrict !a !NonEmptyValueStrict, which is value strict.

I could imagine that at some point in the future we might want to add something like the value strict variant to base and it would be a shame if the Lazy/Strict distinction was already taken.

@Bodigrim
Copy link
Collaborator

The basic problem is that instance Functor NonEmpty is different (lazier) than the instance one can obtain via DeriveFunctor. This is outright a bug in my books, because it breaks expectations which every other instance Functor in base adheres to.

@Bodigrim
Copy link
Collaborator

I would suggest the following principle:

  • For every NonEmpty function that is differs from a corresponding List function only in the presence of NonEmpty in its type, both the List and NonEmpty functions should have the same strictness properties.

Yes, nicely said. I fully agree with this principle stated. I followed it when designing https://github.com/Bodigrim/infinite-list#laziness.


Just to reiterate the problem. A function returning a record type with a single constructor can always return this very constructor before even looking at its arguments, not even weak head normal form. Notice the irrefutable pattern matching with ~:

data Pair a = Pair a a 

myFmap :: (a -> b) -> Pair a -> Pair b
myFmap f ~(Pair x y) = Pair (f x) (f y)

Under the hood this definition translates to

myFmap f p = Pair (let Pair x _ = p in f x) (let Pair _ y = p in f y) 

On the first glance it might look like a good idea: there is nothing else other than Pair we can return, so let's be lazy to the core. The problem with such definition arises later, when you try to fight space leaks. If you seq the result of myFmap f x virtually nothing happens: no way to trigger evaluation of x at all, you just hold Pair with two thunks in it. For example,

> myFmap undefined undefined `seq` ()
()

That's not what we usually want, and that's why {-# LANGUAGE DeriveFoldable #-} does not generate irrefutable patterns in such cases. A normal derived instance would be

instance Functor Pair where
  fmap f (Pair x y) = Pair (f x) (f y)

and

> fmap undefined (undefined :: Pair ()) `seq` ()
*** Exception: Prelude.undefined

This principle holds for every data type in base, including normal lists, except... NonEmpty, which defines fmap manually with an irrefutable pattern:

instance Functor NonEmpty where
  fmap f ~(a :| as) = f a :| fmap f as

or equivalently

instance Functor NonEmpty where
  fmap f aas = (let a :| _ = aas in f a) :| (let _ :| as = aas in fmap f as)

There are multiple issues:

  1. The behaviour of instance Functor NonEmpty semantically differs from the one which would be derived automatically.
  2. It also behaves differently from instance Functor [].
  3. It also does not follow intuition about Functor instances for other record types in base.
  4. It is bad for space leaks, usual seq and combinators built atop it do nothing, one has to deepseq to force arguments to WHNF.
  5. Lastly, it is bad for performance, because you pattern-match on everything twice.

Other functions in Data.List.NonEmpty are equally misbehaving, but we could have defined Data.List.NonEmpty.Strict with stricter versions. We cannot do this for instance Functor and I'd say that in practice it is the crux: if we do not want to fix it, it's better not to touch this at all and keep NonEmpty at least internally consistent.

@parsonsmatt
Copy link

That's convincing to me. +1

@mixphix
Copy link
Collaborator

mixphix commented Mar 26, 2023

This issue spurred me into making llun, which uses pattern synonyms to address @treeowl's point that Either a (a, NonEmpty a) is another useful representation. I haven't had a chance to figure out benchmarks, or document it, but most functions are implemented locally so it should be easy to compare with the current implementations.

@Bodigrim
Copy link
Collaborator

Dear CLC members. Could you please provide (non-binding) opinions on the proposal? Do you agree with the principle suggested in #107 (comment)? Would you like to apply it to Data.List.NonEmpty?

@tomjaguarpaw @chshersh @angerman @hasufell @mixphix @parsonsmatt

@tomjaguarpaw
Copy link
Member

The correspondence principle between [] and NonEmpty seems desirable to me.

@hasufell
Copy link
Member

How do we assess impact? I also like the idea, but I'm not sure I see enough motivation if this can break code.

@chshersh
Copy link
Member

chshersh commented Mar 28, 2023

I agree with the proposal and with further suggestions in #107 (comment).

My view is simple: if you label arguments with ~ explicitly, there should be a good reason to do so. Ideally, it should be documented in each case why arguments use irrefutable patterns. I don't see a good reason for NonEmpty, so let's remove it.

Side comment: In addition to cons x xs = x :| toList xs I also want cons' x (x :| xs) = x :| (x : xs) so cons' x1 $ cons' x2 $ cons' x3 $ singleton x4 would be equivalent to x1 :| [x2, x3, x4] but this could be done via a separate proposal.


How do we assess impact? I also like the idea, but I'm not sure I see enough motivation if this can break code.

I think, clc-stackage could also run tests in addition to the compilation. And I'm pretty sure that Stackage already runs non-disabled tests for the entire snapshot. It would be nice to use these capabilities for the impact assessment procedure but I'm not sure CLC has either the budget or capacity to implement this, so this should be taken to Haskell Foundation.

@Bodigrim
Copy link
Collaborator

I think, clc-stackage could also run tests in addition to the compilation.

Not in its current structure, unfortunately. Besides, running all tests of all packages will likely fail too often. Stackage curators maintain a list of test suites which should be excluded. Maybe one can run stackage-curator with all Stackage metadata but an updated GHC?.. I don't know about their infrastructure.

@Bodigrim
Copy link
Collaborator

@treeowl how would you like to proceed with this? There seems to be enough support of the idea, but a convincing impact assessment is likely to be very hard.

CC @juhp @DanBurton @cdornan @alexeyzab @mihaimaruseac as Stackage Curators (and sorry if I missed anyone else). Is there an easy way to build a Stackage snapshot and run all enabled test suites against a custom GHC? We would like to take the latest GHC 9.4 release, modify laziness of routines in Data.List.NonEmpty as described above and run tests to ensure that there is no breakage.

@juhp
Copy link

juhp commented Apr 14, 2023

@Bodigrim it should be possible - we use a dedicated buildserver (thanks to @fpco) to run curator with quite a bit of diskspace - but I am not sure how reproducible our build environment is, though most of it is in a container. Also cc @bergmark.

Also curator just uses stack underneath, so for the custom ghc, it should be possible to setup, but I am not sure if curator makes it harder.

I don't think we can use the Stackage server to run your tests (we had a similar request a while back, which we couldn't fulfil), but we can try to help with questions/issues that arise.

(This is my personal take, other curators can also chime in if they have something to add.)

@cdornan
Copy link

cdornan commented Apr 14, 2023 via email

@Bodigrim
Copy link
Collaborator

Thanks @juhp and @cdornan. In such case I imagine this proposal awaits an enthusiatic volunteer to setup a clone of Stackage build server and run Stackage tests with a patched GHC. On constrast to our usual practices, just building clc-stackage with Cabal is not enough: there is no change in type signatures, the only change is runtime behaviour, so one has to run actual tests to provide a meaningful impact assessment.

@Bodigrim Bodigrim added the awaits-impact-assessment Awaits an impact assessment (e. g., using https://github.com/Bodigrim/clc-stackage) label Apr 15, 2023
@Bodigrim
Copy link
Collaborator

I'm afraid we are stuck here, unless there is an enthusiast to run tests for all Stackage packages. We might have a better luck with finding such individual, if there was an MR at hand. @treeowl could you possibly prepare one?

I strongly believe that this is an important issue, it would be a shame to drop it.

@Bodigrim
Copy link
Collaborator

@treeowl is there is no progress within two weeks, I'll close this as abandoned. We can return back anytime there are resources to execute.

@Bodigrim
Copy link
Collaborator

Bodigrim commented Jul 2, 2023

Closing as abandoned, feel free to reopen when there are resources to make some progress.

@Bodigrim Bodigrim closed this as completed Jul 2, 2023
@Bodigrim Bodigrim added abandoned Abandoned by proposer (no-show) and removed awaits-impact-assessment Awaits an impact assessment (e. g., using https://github.com/Bodigrim/clc-stackage) labels Jul 2, 2023
@sergv
Copy link

sergv commented Oct 8, 2024

Can the proposal please be reopened?

I want to take over this proposal since the original author apparently let it go. I've prepared MR with the changes in base (https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12824) and an impact assessment of building Stackage snapshot and running its tests.

@Bodigrim Bodigrim reopened this Oct 8, 2024
@Bodigrim Bodigrim removed the abandoned Abandoned by proposer (no-show) label Oct 8, 2024
@sergv
Copy link

sergv commented Oct 17, 2024

So let me provide the detailed update about removing ~ from Data.List.NonEmpty. The most important parts below are under Merge request and Testing conclusion section.

Merge request

First, a merge request to the GHC repository was prepared at https://gitlab.haskell.org/ghc/ghc/-/merge_requests/12824, initially with the suggestions from the first message in this issue. Later, during review by @clyring a bit more changes were proposed, notably removing ~ some other function like (<|), cons, (<>), and (>>=).

The overall idea was to have the same laziness as Data.List which seems to have been achieved.

Testing methodology

Then was the question of testing, which from the beginning was focusing on building full of Stackage and running its tests. Having failed to reuse original Stackage’s tooling, I’m still confident I managed to honor the spirit of the intended testing and in some respects actually run more tests that the original tooling would have.

So the testing methodology ultimately boils down to downloading Stackage snapshot in the form of cabal.config file, e.g. https://www.stackage.org/lts-22.38/cabal.config, parsing it and building each package one by one along with its tests and benchmarks then running the tests just like regular cabal build and cabal test would. In the process the outputs of cabal build and, more importantly, cabal test are stored in the files named after the package being processed so that these logs can later be compared between the two runs with different GHCs.

For doing the runs I developed following tool https://github.com/sergv/stackage-tester which does what I outlined in the previous paragraph. The builds are actually performed in the docker image that Stackage is itself built with, e.g. ghcr.io/commercialhaskell/stackage/build:lts22.

To get a bit more detailed about the build process, one important note is that this is not an original stackage tooling so it doesn’t pay attention to Stackage’s build-constraints.yaml that specifies which tests should and should not be run. My tool runs all the tests with the expectation that there should be no material differences between test runs, e.g. no tests started failing with the change of GHC.

The builds of each package are carried in such a way that only the packages and versions from the specified snapshot will be used. There we subtle points regarding Setup.hs not picking up versions as specyified in https://www.stackage.org/lts-22.38/cabal.config and leading to build failures but they were tracked down and resolved (ask me how if you really want to know).

Results of test run

I have built LTS 22.36 snapshot about a week ago with my tool and stored produced logs in https://github.com/sergv/stackage-build-logs. You can see two commits there, first is the baseline 116d212 that built with vanilla GHC, then the results on top of it in 07e962e to better see the difference.

There are 4 folders, storing relevant logs for successful builds in build-logs, failed builds in build-logs-failed, successfull test runs in test-logs, and failed test runs in test-logs-failed.

The build-logs directory contains both logs from building given package named build-${PACKAGE_NAME}-${VERSION}.log and logs showing builds of their dependencies named as build-deps-${PACKAGE_NAME}-${VERSION}.log. Changes in building dependencies are not really interesting if they succeeded so you can easily skip those.

Please note that some build failures are expected - some Stackage packages build the library but disable tests completely so Stackage doesn’t even try building them. When my tool builds everything and tests don’t build with current snapshot then it will be recorded as a build failure since the point of the exercise is to run tests.

Failing tests are also expected since Stackage disables tests for some packages based on author’s wishes. Surprisingly, tests for aeson are disabled on Stackage so had I used the original Stackage tooling they wouldn’t be covered in this report.

What to look for in the tests

Please feel free to double check build and test results, I could have missed something. Overall run is summarised by

  • LTS 22.36, GHC 9.6.6 vanilla - 234 out of 3343 tests failed (10508.26s)
  • LTS 22.36, GHC 9.6.6 with backport - 233 out of 3343 tests failed (14337.96s)

There are around 230 total packages were either did not build or whose tests failed during the run.

What I was primarily looking for to find breakages caused by the removal of ~ is packages whose tests were passing in vanilla GHC but then turned into failing in the patched GHC. I’m glad to announce that only two packages did this: lapack-0.5.2 and cdar-mBound-0.1.0.4, which are explained below.

$ git show --summary 07e962eb31b04658e1880cae4a2b08f1695ae5ba
commit 07e962eb31b04658e1880cae4a2b08f1695ae5ba (HEAD -> master, origin/patched-ghc-perf, origin/master, patched-ghc-perf)
Author: Sergey Vinokurov <serg.foo@gmail.com>
Date:   Mon Oct 7 22:05:21 2024 +0100

    Build LTS 22.36 with patched GHC 9.6.6, --flavour=perf --bignum=native

 create mode 100644 test-logs-failed/test-cdar-mBound-0.1.0.4.test.log
 delete mode 100644 test-logs-failed/test-hexml-lens-0.2.2.courses.log
 rename {test-logs => test-logs-failed}/test-lapack-0.5.2.lapack-test.log (97%)
 delete mode 100644 test-logs/test-cdar-mBound-0.1.0.4.test.log
 rename {test-logs-failed => test-logs}/test-friday-0.2.3.2.test.log (68%)
 create mode 100644 test-logs/test-hexml-lens-0.2.2.courses.log
 rename {test-logs-failed => test-logs}/test-json-rpc-1.0.4.test-json-rpc.log (72%)

Explaining why there’s 234 failures with vanilla GHC but only 233 with the patched one

There’s less failures so it should be OK. On a more serious note the changes seem to be in intermittent failures causes of which are not known.

With patched GHC the tests for hexml-lens-0.2.2, json-rpc-1.0.4, and friday-0.2.3.2 got fixed, for cdar-mBound-0.1.0.4 and lapack-0.5.2 started failing. So overall there’s 1 fix more than before, therefore 1 failure less.

Explaining the two failures I’ve chosen to care about and ignoring all the other noise

  • Diff shows renaming of file {test-logs => test-logs-failed}/test-lapack-0.5.2.lapack-test.log: lapack-0.5.2 due to test which seems unrelated to Data.List.Nonempty
+BandedHermitian.eigenvaluesDeterminant.Float: *** Failed! Falsified (after 199 tests):
+(Array (StorableArray.fromList (BandedHermitian (BandedHermitian {bandedHermitianOffDiagonals = unary u9, bandedHermitianOrder = RowMajor, bandedHermitianSize = ZeroBased {zeroBasedSize = 5}})) [72.0,-14.0,5.0,1.0,-10.0,0.0,0.0,0.0,0.0,0.0,42.0,-27.0,-19.0,-12.0,0.0,0.0,0.0,0.0,0.0,0.0,29.0,6.0,34.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,44.0,12.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,76.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]),Match)
  • Diff shows creation of new new file test-logs-failed/test-cdar-mBound-0.1.0.4.test.log - cdar-mBound-0.1.0.4 failed due to timeout, this one was notably jittery and it involves QuickCheck generation
      sqr . sqrt:                  TIMEOUT (0.54s)
        Timed out after 0.5s
        Use -p '/sqr . sqrt/' to rerun this test only.

Notably some packages were failing but became working with the patched GHC friday-0.2.3.2, this does not seem to have anything to do with Data.List.NonEmpty:

-  Sum of an normalized histogram equals its size: [Failed]
-*** Failed! Falsified (after 10 tests):
-0.5
-Histogram {shape = Z :. 256, vector = [8,8,3,5,16,14,19,2,13,3,1,15,1,3,14,13,14,8,3,8,2,16,13,16,2,1,12,11,16,16,18,12,15,6,16,1,12,2,3,16,15,9,4,2,2,2,1,9,3,15,7,14,16,9,8,3,14,14,3,4,9,2,1,15,5,6,15,10,11,7,10,4,9,6,14,15,12,11,2,4,8,6,5,11,27,7,7,9,9,1,12,1,5,16,7,7,9,6,14,12,7,11,8,13,9,13,16,1,3,14,2,5,3,11,7,14,14,6,6,2,7,4,3,11,9,11,15,14,4,4,8,2,6,4,1,8,10,5,3,7,11,12,11,6,4,7,13,14,16,8,9,4,12,8,15,4,4,16,4,9,10,12,5,11,15,4,8,16,2,15,15,11,9,9,16,3,13,14,9,12,5,10,11,14,3,7,9,14,9,3,2,13,3,5,4,7,1,16,15,8,14,9,14,12,12,10,8,7,16,10,13,16,6,13,13,13,14,6,2,4,4,8,6,4,10,13,12,4,9,15,9,6,3,3,13,1,16,11,27,14,8,10,8,12,1,13,3,8,12,7,15,1,7,15,16,16]}
-(used seed 6881873416147142106)
+  Sum of an normalized histogram equals its size: [OK, passed 100 tests]

Overall I did not expect to fix any tests with the removal of ~ so those changes are probably not worth investigating.

Testing conclusion

Looks like Stackage tests don’t really depend on laziness in Data.List.NonEmpty, at least not to the extent that anyone cared to exercise it in package’s testsuite.

I may have missed something in this sea of diffs but it seems to be safe to stsate that out of 3109 packages no testsuite started failing when compiled with GHC that included the MR that implements this proposal’s change.

@Bodigrim
Copy link
Collaborator

Bodigrim commented Nov 1, 2024

Thanks for the analysis and extensive testing, @sergv.

Dear CLC members, I'd like to open the vote soon. Any more opinions / comments / suggestions / considerations? See #107 (comment) for a kind of executive summary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests