
Compilation for big datatypes with Generics is quite slow #5

Open
mgsloan opened this issue Apr 21, 2016 · 5 comments

mgsloan commented Apr 21, 2016

Currently, Data.Store.Instances uses GHC generics to define Store instances for all TH datatypes. This makes the module take 1 minute 40 seconds to compile instead of 15 seconds.

I'm thinking we should do the following:

  1. Add a clear warning to the docs that using generics for this will slow down your compile times quite a lot

  2. Implement TH definition of Store instances
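
For context, the generics-based approach in question follows the standard GHC.Generics deriving pattern. Below is a minimal, self-contained sketch of that pattern using an illustrative Size class (not the real Store class, which also has poke/peek methods); the class and instance names here are hypothetical stand-ins:

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}

import GHC.Generics

-- Illustrative stand-in for a Store-like class: compute a serialized size.
class Size a where
  size :: a -> Int
  default size :: (Generic a, GSize (Rep a)) => a -> Int
  size = gSize . from

-- Generic worker class: one instance per representation constructor.
class GSize f where
  gSize :: f p -> Int

instance GSize U1 where
  gSize _ = 0

instance Size c => GSize (K1 i c) where
  gSize (K1 x) = size x

instance GSize f => GSize (M1 i c f) where
  gSize (M1 x) = gSize x

instance (GSize f, GSize g) => GSize (f :*: g) where
  gSize (x :*: y) = gSize x + gSize y

instance (GSize f, GSize g) => GSize (f :+: g) where
  gSize (L1 x) = 1 + gSize x  -- one tag byte for the constructor
  gSize (R1 y) = 1 + gSize y

instance Size Int where
  size _ = 8

data Pair = Pair Int Int deriving Generic
instance Size Pair  -- method filled in by the Generic default

main :: IO ()
main = print (size (Pair 1 2))
```

The compile-time cost comes from GHC having to build and then optimize away the Rep structure (the nested M1/K1/:*: wrappers) for every instance, which is cheap here but expensive for the large TH datatypes.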


mgsloan commented Apr 28, 2016

I have done (2), and unfortunately the TH-generated instances take a long time to compile as well. I have an idea for how to resolve it:

  • See if explicit type annotations speed things up
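
To make the idea concrete, here is a sketch of what TH-derived instances with explicit annotations might look like. The Size class and deriveSize function are hypothetical illustrations, not store's actual API; the point is that each generated subexpression gets an explicit :: Int signature so the typechecker has less inference to do:

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

-- Hypothetical stand-in for the Store class.
class Size a where
  size :: a -> Int

instance Size Int where
  size _ = 8

-- Generate  instance Size T  where size is the sum of the field sizes,
-- annotating each summand with an explicit type signature.
deriveSize :: Name -> Q [Dec]
deriveSize tyName = do
  TyConI (DataD _ _ _ _ [NormalC conName fields] _) <- reify tyName
  xs <- mapM (\_ -> newName "x") fields
  let summands = [ [| size $(varE x) :: Int |] | x <- xs ]
      body     = foldr (\a b -> [| $a + $b |]) [| 0 :: Int |] summands
  method <- funD 'size
    [clause [conP conName (map varP xs)] (normalB body) []]
  return [InstanceD Nothing [] (AppT (ConT ''Size) (ConT tyName)) [method]]

data Pair = Pair Int Int
$(deriveSize ''Pair)

main :: IO ()
main = print (size (Pair 1 2))
```

This only handles single-constructor types; a real implementation would also generate the constructor-tag dispatch for sum types.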


bitonic commented May 4, 2016

Most likely due to https://ghc.haskell.org/trac/ghc/ticket/5642


mgsloan commented May 12, 2016

Actually, I don't think it's the quadratic typechecking thing, though that probably isn't helping much. I can load store into ghci in less than 10 seconds, so it must be the amount of work done during optimization / codegen.

I can rule out the problem being codegen: stack build --fast (which disables optimization) takes only 8.5s. Perhaps the issue is overzealous {-# INLINE ... #-} pragmas.


mgsloan commented May 14, 2016

Wow, somewhere along the line the full time to build store crept up to 4m36.683s. Ryan Scott's blog post, recently on Reddit, http://ryanglscott.github.io/2016/05/12/whats-new-with-ghc-generics-in-80/ , has a section on the performance of generics. It turns out that splitting up the methods of your generics class helps compile time. After that change, the build took 2m5.591s.
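
The splitting trick can be sketched as follows, again with illustrative GSize/GEncode classes rather than store's real generic machinery. Instead of one worker class carrying every method (so GHC must compile all methods for every instance together), each operation gets its own single-method class that can be compiled and inlined independently:

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeOperators #-}

import GHC.Generics

-- Before the split, one generic worker class carried every method:
--
--   class GStore f where
--     gsize   :: f p -> Int
--     gencode :: f p -> [Int]
--
-- After the split, one single-method class per operation:

class GSize f where
  gsize :: f p -> Int

class GEncode f where
  gencode :: f p -> [Int]

instance GSize (K1 i Int) where
  gsize _ = 8
  {-# INLINE gsize #-}

instance GEncode (K1 i Int) where
  gencode (K1 n) = [n]
  {-# INLINE gencode #-}

instance (GSize f, GSize g) => GSize (f :*: g) where
  gsize (x :*: y) = gsize x + gsize y
  {-# INLINE gsize #-}

instance (GEncode f, GEncode g) => GEncode (f :*: g) where
  gencode (x :*: y) = gencode x ++ gencode y
  {-# INLINE gencode #-}

main :: IO ()
main = do
  let v = K1 3 :*: K1 4 :: (K1 R Int :*: K1 R Int) ()
  print (gsize v)
  print (gencode v)
```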

This can be dropped further, to 1m26.969s, by removing all the INLINE pragmas on the generic instances. However, doing so hurts runtime performance.

With INLINEs removed:

benchmarking encode/ (SmallProduct)
time                 48.96 ns   (48.94 ns .. 49.00 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 49.00 ns   (48.97 ns .. 49.05 ns)
std dev              132.4 ps   (83.64 ps .. 188.9 ps)

benchmarking encode/ ([SmallSum])
time                 657.6 ns   (652.1 ns .. 665.3 ns)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 656.6 ns   (653.1 ns .. 663.4 ns)
std dev              15.99 ns   (8.863 ns .. 28.37 ns)
variance introduced by outliers: 32% (moderately inflated)

benchmarking decode/ (SmallProduct)
time                 73.27 ns   (69.96 ns .. 76.63 ns)
                     0.989 R²   (0.984 R² .. 0.996 R²)
mean                 72.48 ns   (70.61 ns .. 75.44 ns)
std dev              7.300 ns   (5.777 ns .. 9.999 ns)
variance introduced by outliers: 91% (severely inflated)

benchmarking decode/ ([SmallSum])
time                 364.0 ns   (350.1 ns .. 384.8 ns)
                     0.991 R²   (0.982 R² .. 0.999 R²)
mean                 355.7 ns   (352.1 ns .. 363.8 ns)
std dev              19.19 ns   (9.734 ns .. 33.50 ns)
variance introduced by outliers: 72% (severely inflated)

Whereas with INLINE pragmas:

benchmarking encode/ (SmallProduct)
time                 28.30 ns   (28.20 ns .. 28.51 ns)
                     0.996 R²   (0.993 R² .. 0.998 R²)
mean                 30.28 ns   (29.37 ns .. 32.07 ns)
std dev              3.907 ns   (2.238 ns .. 5.873 ns)
variance introduced by outliers: 95% (severely inflated)

benchmarking encode/ ([SmallSum])
time                 199.5 ns   (181.3 ns .. 210.5 ns)
                     0.961 R²   (0.948 R² .. 0.976 R²)
mean                 166.3 ns   (155.9 ns .. 179.0 ns)
std dev              36.38 ns   (29.29 ns .. 39.93 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking decode/ (SmallProduct)
time                 45.71 ns   (42.66 ns .. 47.69 ns)
                     0.968 R²   (0.954 R² .. 0.981 R²)
mean                 38.56 ns   (36.03 ns .. 41.67 ns)
std dev              8.639 ns   (7.505 ns .. 9.239 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking decode/ ([SmallSum])
time                 261.7 ns   (243.4 ns .. 288.3 ns)
                     0.948 R²   (0.940 R² .. 0.961 R²)
mean                 316.1 ns   (299.1 ns .. 328.3 ns)
std dev              49.71 ns   (38.45 ns .. 60.29 ns)
variance introduced by outliers: 96% (severely inflated)

So I think it's worth keeping those INLINEs for now. Note that the extra 30 seconds is mostly due to using generic deriving on some quite large datatypes, namely all the TH types.


mgsloan commented May 14, 2016

I closed this due to the drastic improvement. However, we may still want to revisit it.
