Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add groupBy for vectors #427

Merged
merged 1 commit into from
Jan 11, 2022
Merged

Conversation

alexkalderimis
Copy link
Contributor

This adds groupBy for generic vectors, returning a list of
subvectors.

This is similar to #143, aiming to resolve #192

Some of the comments on #143 have been applied here.

Input on testing would be welcome

@Shimuuar
Copy link
Contributor

Shimuuar commented Jan 7, 2022

Looks reasonable to me. Could you add specialized version to Data.Vector, etc and add some haddocks?

@alexkalderimis
Copy link
Contributor Author

Thanks for taking a look @Shimuuar - I'll ping you once this has been updated.

vector/src/Data/Vector.hs Outdated Show resolved Hide resolved
vector/src/Data/Vector.hs Outdated Show resolved Hide resolved
vector/src/Data/Vector.hs Outdated Show resolved Hide resolved
vector/src/Data/Vector/Generic.hs Outdated Show resolved Hide resolved
vector/src/Data/Vector/Generic.hs Outdated Show resolved Hide resolved
vector/src/Data/Vector/Generic.hs Outdated Show resolved Hide resolved
vector/src/Data/Vector/Primitive.hs Outdated Show resolved Hide resolved
vector/src/Data/Vector/Storable.hs Outdated Show resolved Hide resolved
vector/src/Data/Vector/Unboxed.hs Outdated Show resolved Hide resolved
@alexkalderimis
Copy link
Contributor Author

Thanks for the suggestions @konsumlamm - please review the updated haddocks, which include some changes to fix failing doctests

@alexkalderimis
Copy link
Contributor Author

@Shimuuar - the haddocks and specialized versions should be ready for your review now

@Shimuuar
Copy link
Contributor

Thank you! PR look great.

Doctests fails because of extra whitespace. And could also mentiaion that function does not fuse, and resulting vectors are slices of original vector

@alexkalderimis
Copy link
Contributor Author

Thank you! PR look great.

Thanks

Doctests fails because of extra whitespace.

D'oh - easy fix at least

And could also mentiaion that function does not fuse

Can do - thought this was not necessary since the result of the function is a list, not a Vector

and resulting vectors are slices of original vector

The haddocks say: "Split a vector into a list of slices." - is this enough?

@alexkalderimis
Copy link
Contributor Author

Back to you, @Shimuuar

@Shimuuar
Copy link
Contributor

Still one test failure. (It's really annoying that doctests only work on 8.6 & 8.8) otherwise good to go. Also could you add changelog entry while you at it

This adds `groupBy` and `group` for generic vectors, returning a list of
slices of the input vector.

Specialized versions of groupBy and group are added for:

- Data.Vector
- Data.Vector.Unboxed
- Data.Vector.Storable
- Data.Vector.Primitive

These functions are all `O(n)`, and do not fuse, but because they
return slices of the input, they do not copy.
@alexkalderimis
Copy link
Contributor Author

Still one test failure. (It's really annoying that doctests only work on 8.6 & 8.8) otherwise good to go. Also could you add changelog entry while you at it

It is a little annoying, especially since when I use 8.8.4 locally and run the test suite, I get errors like:

src/Data/Vector.hs:978: failure in expression `import qualified Data.Vector as V'
expected: 
 but got: 
          ^
          <no location info>: error:
              Could not find module ‘Data.Vector’
              Perhaps you meant Data.Functor (from base-4.13.0.0)

Tips on reproducing doctests locally would certainly, be appreciated, but I notice that the checks are running now on push so that is not a big deal.

It looks like I have fixed the remaining issues, so I have added the changelog entry, and squashed the commits for a cleaner history. Back to you @Shimuuar

@Shimuuar
Copy link
Contributor

To run doctests one need not very secret incantation:

cabal build all --write-ghc-environment-files=always

I still hope we'll get doctests which just works and don't combust spontaneously.

Thank you, it's been a pleasure. Also thank you @konsumlamm for proofreading haddocks

@Shimuuar Shimuuar merged commit 3883438 into haskell:master Jan 11, 2022
@Shimuuar Shimuuar mentioned this pull request Jan 11, 2022
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Aug 19, 2022
# Changes in version 0.13.0.0

 * `mkType` from `Data.Vector.Generic` is deprecated in favor of
   `Data.Data.mkNoRepType`
 * The role signatures on several `Vector` types were too permissive, so they
   have been tightened up:
   * The role signature for `Data.Vector.Mutable.MVector` is now
     `type role MVector nominal representational` (previously, both arguments
     were `phantom`). [#224](haskell/vector#224)
   * The role signature for `Data.Vector.Primitive.Vector` is now
     `type role Vector nominal` (previously, it was `phantom`).
     The role signature for `Data.Vector.Primitive.Mutable.MVector` is now
     `type role MVector nominal nominal` (previously, both arguments were
     `phantom`). [#316](haskell/vector#316)
   * The role signature for `Data.Vector.Storable.Vector` is now
     `type role Vector nominal` (previous, it was `phantom`), and the signature
     for `Data.Vector.Storable.Mutable.MVector` is now
     `type role MVector nominal nominal` (previous, both arguments were
     `phantom`). [#235](haskell/vector#235)

     We pick `nominal` for the role of the last argument instead of
     `representational` since the internal structure of a `Storable` vector is
     determined by the `Storable` instance of the element type, and it is not
     guaranteed that the `Storable` instances between two representationally
     equal types will preserve this internal structure.  One consequence of this
     choice is that it is no longer possible to `coerce` between
     `Storable.Vector a` and `Storable.Vector b` if `a` and `b` are nominally
     distinct but representationally equal types. We now provide
     `unsafeCoerce{M}Vector` and `unsafeCast` functions to allow this (the onus
     is on the user to ensure that no `Storable` invariants are broken when
     using these functions).
 * Methods of type classes `Data.Vector.Generic.Mutable.MVector` and
   `Data.Vector.Generic.Vector` use concrete monads (`ST`, etc) istead of being
   polymorphic (`PrimMonad`, etc). [#335](haskell/vector#335).
   This makes it possible to derive `Unbox` with:
   * `GeneralizedNewtypeDeriving`
   * via `UnboxViaPrim` and `Prim` instance
   * via `As` and `IsoUnbox` instance: [#378](haskell/vector#378)
 * Add `MonadFix` instance for boxed vectors: [#312](haskell/vector#312)
 * Re-export `PrimMonad` and `RealWorld` from mutable vectors:
   [#320](haskell/vector#320)
 * Add `maximumOn` and `minimumOn`: [#356](haskell/vector#356)
 * The functions `scanl1`, `scanl1'`, `scanr1`, and `scanr1'` for immutable
   vectors are now defined when given empty vectors as arguments,
   in which case they return empty vectors. This new behavior is consistent
   with the one of the corresponding functions in `Data.List`.
   Prior to this change, applying an empty vector to any of those functions
   resulted in an error. This change was introduced in:
   [#382](haskell/vector#382)
 * Change allocation strategy for `unfoldrN`: [#387](haskell/vector#387)
 * Remove `CPP` driven error reporting in favor of `HasCallStack`:
   [#397](haskell/vector#397)
 * Remove redundant `Storable` constraints on to/from `ForeignPtr` conversions:
   [#394](haskell/vector#394)
 * Add `unsafeCast` to `Primitive` vectors: [#401](haskell/vector#401)
 * Make `(!?)` operator strict: [#402](haskell/vector#402)
 * Add `readMaybe`: [#425](haskell/vector#425)
 * Add `groupBy` and `group` for `Data.Vector.Generic` and the specialized
   version in `Data.Vector`, `Data.Vector.Unboxed`, `Data.Vector.Storable` and
   `Data.Vector.Primitive`. [#427](haskell/vector#427)
 * Add `toArraySlice` and `unsafeFromArraySlice` functions for conversion to and
   from the underlying boxed `Array`: [#434](haskell/vector#434)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

group and groupBy
3 participants