
Compilation for big datatypes with Generics is quite slow #5

Open
mgsloan opened this issue Apr 21, 2016 · 5 comments

mgsloan commented Apr 21, 2016

Currently, Data.Store.Instances uses GHC generics to define Store instances for all TH datatypes. This makes the module take 1 minute 40 seconds to compile instead of 15 seconds.

I'm thinking we should do the following:

  1. Add a clear warning to the docs that using generics for this will slow down your compile times quite a lot

  2. Implement TH definition of Store instances
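
For context, the generics-based approach in question follows the standard GHC.Generics deriving pattern. Below is a minimal, self-contained sketch of that pattern using an illustrative Size class (not the real Store class, which also has poke/peek methods); the class and instance names here are hypothetical stand-ins:

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}

import GHC.Generics

-- Illustrative stand-in for a Store-like class: compute a serialized size.
class Size a where
  size :: a -> Int
  default size :: (Generic a, GSize (Rep a)) => a -> Int
  size = gSize . from

-- Generic worker class: one instance per representation constructor.
class GSize f where
  gSize :: f p -> Int

instance GSize U1 where
  gSize _ = 0

instance Size c => GSize (K1 i c) where
  gSize (K1 x) = size x

instance GSize f => GSize (M1 i c f) where
  gSize (M1 x) = gSize x

instance (GSize f, GSize g) => GSize (f :*: g) where
  gSize (x :*: y) = gSize x + gSize y

instance (GSize f, GSize g) => GSize (f :+: g) where
  gSize (L1 x) = 1 + gSize x  -- one tag byte for the constructor
  gSize (R1 y) = 1 + gSize y

instance Size Int where
  size _ = 8

data Pair = Pair Int Int deriving Generic
instance Size Pair  -- method filled in by the Generic default

main :: IO ()
main = print (size (Pair 1 2))
```

The compile-time cost comes from GHC having to build and then optimize away the Rep structure (the nested M1/K1/:*: wrappers) for every instance, which is cheap here but expensive for the large TH datatypes.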


mgsloan commented Apr 28, 2016

I have done (2), and unfortunately the TH-generated instances take a long time to compile as well. I have an idea for how to resolve it:

  • See if explicit type annotations speed things up
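
To make the idea concrete, here is a sketch of what TH-derived instances with explicit annotations might look like. The Size class and deriveSize function are hypothetical illustrations, not store's actual API; the point is that each generated subexpression gets an explicit :: Int signature so the typechecker has less inference to do:

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

-- Hypothetical stand-in for the Store class.
class Size a where
  size :: a -> Int

instance Size Int where
  size _ = 8

-- Generate  instance Size T  where size is the sum of the field sizes,
-- annotating each summand with an explicit type signature.
deriveSize :: Name -> Q [Dec]
deriveSize tyName = do
  TyConI (DataD _ _ _ _ [NormalC conName fields] _) <- reify tyName
  xs <- mapM (\_ -> newName "x") fields
  let summands = [ [| size $(varE x) :: Int |] | x <- xs ]
      body     = foldr (\a b -> [| $a + $b |]) [| 0 :: Int |] summands
  method <- funD 'size
    [clause [conP conName (map varP xs)] (normalB body) []]
  return [InstanceD Nothing [] (AppT (ConT ''Size) (ConT tyName)) [method]]

data Pair = Pair Int Int
$(deriveSize ''Pair)

main :: IO ()
main = print (size (Pair 1 2))
```

This only handles single-constructor types; a real implementation would also generate the constructor-tag dispatch for sum types.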


bitonic commented May 4, 2016

Most likely due to https://ghc.haskell.org/trac/ghc/ticket/5642


mgsloan commented May 12, 2016

Actually, I don't think it's the quadratic typechecking thing, though that probably isn't helping much. I can load store into ghci in less than 10 seconds, so it must be the amount of work done during optimization / codegen.

I can rule out the problem being codegen: stack build --fast (which disables optimization) takes only 8.5s. Perhaps the issue is overzealous {-# INLINE ... #-} pragmas.


mgsloan commented May 14, 2016

Wow, somewhere along the line the full time to build store crept up to 4m36.683s. Ryan Scott's blog post, recently on Reddit, http://ryanglscott.github.io/2016/05/12/whats-new-with-ghc-generics-in-80/ , has a section on the performance of generics. It turns out that splitting up the methods of your generics class helps compile time. After that change, the build took 2m5.591s.
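
The splitting trick can be sketched as follows, again with illustrative GSize/GEncode classes rather than store's real generic machinery. Instead of one worker class carrying every method (so GHC must compile all methods for every instance together), each operation gets its own single-method class that can be compiled and inlined independently:

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeOperators #-}

import GHC.Generics

-- Before the split, one generic worker class carried every method:
--
--   class GStore f where
--     gsize   :: f p -> Int
--     gencode :: f p -> [Int]
--
-- After the split, one single-method class per operation:

class GSize f where
  gsize :: f p -> Int

class GEncode f where
  gencode :: f p -> [Int]

instance GSize (K1 i Int) where
  gsize _ = 8
  {-# INLINE gsize #-}

instance GEncode (K1 i Int) where
  gencode (K1 n) = [n]
  {-# INLINE gencode #-}

instance (GSize f, GSize g) => GSize (f :*: g) where
  gsize (x :*: y) = gsize x + gsize y
  {-# INLINE gsize #-}

instance (GEncode f, GEncode g) => GEncode (f :*: g) where
  gencode (x :*: y) = gencode x ++ gencode y
  {-# INLINE gencode #-}

main :: IO ()
main = do
  let v = K1 3 :*: K1 4 :: (K1 R Int :*: K1 R Int) ()
  print (gsize v)
  print (gencode v)
```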

This can be dropped further, to 1m26.969s, by removing all the INLINE pragmas on the generic instances. However, doing so hurts runtime performance.

With INLINEs removed:

benchmarking encode/ (SmallProduct)
time                 48.96 ns   (48.94 ns .. 49.00 ns)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 49.00 ns   (48.97 ns .. 49.05 ns)
std dev              132.4 ps   (83.64 ps .. 188.9 ps)

benchmarking encode/ ([SmallSum])
time                 657.6 ns   (652.1 ns .. 665.3 ns)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 656.6 ns   (653.1 ns .. 663.4 ns)
std dev              15.99 ns   (8.863 ns .. 28.37 ns)
variance introduced by outliers: 32% (moderately inflated)

benchmarking decode/ (SmallProduct)
time                 73.27 ns   (69.96 ns .. 76.63 ns)
                     0.989 R²   (0.984 R² .. 0.996 R²)
mean                 72.48 ns   (70.61 ns .. 75.44 ns)
std dev              7.300 ns   (5.777 ns .. 9.999 ns)
variance introduced by outliers: 91% (severely inflated)

benchmarking decode/ ([SmallSum])
time                 364.0 ns   (350.1 ns .. 384.8 ns)
                     0.991 R²   (0.982 R² .. 0.999 R²)
mean                 355.7 ns   (352.1 ns .. 363.8 ns)
std dev              19.19 ns   (9.734 ns .. 33.50 ns)
variance introduced by outliers: 72% (severely inflated)

Whereas with INLINE pragmas:

benchmarking encode/ (SmallProduct)
time                 28.30 ns   (28.20 ns .. 28.51 ns)
                     0.996 R²   (0.993 R² .. 0.998 R²)
mean                 30.28 ns   (29.37 ns .. 32.07 ns)
std dev              3.907 ns   (2.238 ns .. 5.873 ns)
variance introduced by outliers: 95% (severely inflated)

benchmarking encode/ ([SmallSum])
time                 199.5 ns   (181.3 ns .. 210.5 ns)
                     0.961 R²   (0.948 R² .. 0.976 R²)
mean                 166.3 ns   (155.9 ns .. 179.0 ns)
std dev              36.38 ns   (29.29 ns .. 39.93 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking decode/ (SmallProduct)
time                 45.71 ns   (42.66 ns .. 47.69 ns)
                     0.968 R²   (0.954 R² .. 0.981 R²)
mean                 38.56 ns   (36.03 ns .. 41.67 ns)
std dev              8.639 ns   (7.505 ns .. 9.239 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking decode/ ([SmallSum])
time                 261.7 ns   (243.4 ns .. 288.3 ns)
                     0.948 R²   (0.940 R² .. 0.961 R²)
mean                 316.1 ns   (299.1 ns .. 328.3 ns)
std dev              49.71 ns   (38.45 ns .. 60.29 ns)
variance introduced by outliers: 96% (severely inflated)

So I think it's worth keeping those INLINEs for now. Note that the extra 30 seconds is mostly due to using generic deriving on some quite large datatypes, namely all the TH types.


mgsloan commented May 14, 2016

I closed this due to the drastic improvement. However, we may still want to revisit it.
