RFC: use pairwise summation for sum, cumsum, and cumprod #4039

stevengj · 2013-08-13T03:28:53Z

This patch modifies the sum and cumsum functions (and cumprod) to use pairwise summation for summing AbstractArrays, in order to achieve much better accuracy at a negligible computational cost.

Pairwise summation recursively divides the array in half, sums the halves recursively, and then adds the two sums. As long as the base case is large enough (here, n=128 seemed to suffice), the overhead of the recursion is negligible compared to naive summation (a simple loop). The advantage of this algorithm is that its mean error grows as O(sqrt(log n)), versus O(sqrt(n)) for naive summation; O(sqrt(log n)) growth is almost indistinguishable from the O(1) error growth of Kahan compensated summation.
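To make the recursion concrete, here is a minimal sketch of the idea in present-day Julia syntax. It is not the code from this patch: the name `pairwise_sum` and the `base` keyword are hypothetical, and only the base-case size of 128 is taken from the description above.

```julia
# Hypothetical helper, not the patched Base code: pairwise summation of
# A[lo:hi], falling back to a plain left-to-right loop for short ranges.
function pairwise_sum(A::AbstractVector, lo::Int=firstindex(A), hi::Int=lastindex(A);
                      base::Int=128)
    n = hi - lo + 1
    if n <= base
        # Base case: simple loop, so the recursion overhead is amortized.
        s = zero(eltype(A))
        for i in lo:hi
            s += A[i]
        end
        return s
    end
    mid = lo + (n >> 1) - 1   # split the range roughly in half
    return pairwise_sum(A, lo, mid; base=base) +
           pairwise_sum(A, mid + 1, hi; base=base)
end

println(pairwise_sum(collect(1:1000)))
println(pairwise_sum(fill(0.1, 10^5)))
```

The base case is deliberately a left-associative loop, so this is only "pairwise" above the block size.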

For example:

A = rand(10^8)
es = sum_kbn(A)
(oldsum(A) - es)/es, (newsum(A) - es)/es

where oldsum and newsum are the old and new implementations, respectively, gives (-1.2233732622777777e-13,0.0) on my machine in a typical run: the old sum loses three significant digits, whereas the new sum loses none. On the other hand, their execution time is nearly identical:

@time oldsum(A);
@time newsum(A);

gives

elapsed time: 0.116988841 seconds (105816 bytes allocated)
elapsed time: 0.110502884 seconds (64 bytes allocated)

(The difference is within the noise.)
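For anyone who wants to repeat the accuracy comparison without `oldsum`, `newsum`, or `sum_kbn`, the following is a rough sketch in present-day Julia. It assumes that `foldl(+, A)` is a fair stand-in for the old left-to-right loop and that Base's current `sum` stands in for the new behavior; the reference value is computed in `BigFloat` instead of with `sum_kbn`.

```julia
A = rand(10^6)                  # smaller than the 10^8 used above
ref = sum(big.(A))              # BigFloat reference value

# Relative error of a Float64 result s against the reference.
relerr(s) = Float64(abs(big(s) - ref) / ref)

err_left = relerr(foldl(+, A))  # strict left-to-right accumulation
err_base = relerr(sum(A))       # whatever Base's sum does today
println((err_left, err_base))
```

The exact numbers depend on the random data, so none are quoted here.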

@JeffBezanson and @StefanKarpinski, I think I mentioned this possibility to you at Berkeley.

stevengj · 2013-08-13T03:43:18Z

(We could use the old definitions for AbstractArray{T<:Integer}, as pairwise summation has no accuracy advantage there, although it doesn't really hurt either.)

GunnarFarneback · 2013-08-13T07:11:33Z

If you do image processing with single precision, this is an enormously big deal. E.g.

julia> mean(fill(1.5f0, 3000, 4000))
1.766983f0

(Example chosen for its striking effect. Unfortunately the mean implementation is disjoint from the sum implementation that this patch modifies.)

stevengj · 2013-08-13T12:32:23Z

@GunnarFarneback, yes, it is a pain to modify all of the different sum variants, although it would be possible in principle. But it would help with the case you mentioned:

julia> newsum(fill!(Array(Float32, 3000*4000), 1.5f0)) / (3000*4000) 
1.5f0

stevengj · 2013-08-13T14:02:39Z

Of course, in single precision, the right thing is to just do summation and other accumulation in double precision, rounding back to single precision at the end.
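As an illustration of the difference, here is a small sketch in present-day Julia syntax. The helper names `naive_sum` and `sum_in_double` are hypothetical and are not functions in Base; the array is the one from the example above.

```julia
# Naive left-to-right accumulation in the element type (Float32 here).
function naive_sum(A)
    s = zero(eltype(A))
    for x in A
        s += x
    end
    return s
end

# Same loop, but accumulating in Float64 and rounding once at the end.
function sum_in_double(A::AbstractArray{Float32})
    acc = 0.0
    for x in A
        acc += Float64(x)
    end
    return Float32(acc)
end

A = fill(1.5f0, 3000, 4000)
n = length(A)
mean_naive  = naive_sum(A) / n      # drifts once the running sum gets large
mean_double = sum_in_double(A) / n  # rounding happens only once, at the end
println((mean_naive, mean_double))
```

The naive Float32 loop goes wrong because, once the running sum is large enough, adding 1.5 can no longer be represented exactly and each step is rounded.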

GunnarFarneback · 2013-08-13T14:02:55Z

To clarify, my example was meant to show the insufficiency of the current state of summing. I considered it obvious that pairwise summation is an effective solution, so I'm all in favor of this patch. That mean and some other functions fail to take advantage of it is a separate issue.

GunnarFarneback · 2013-08-13T14:05:08Z

The right thing with respect to precision, not necessarily with respect to speed.

StefanKarpinski · 2013-08-13T14:31:51Z

I'm good with this as a first step and think we should, as @GunnarFarneback points out, integrate even further so that mean and other stats functions also use pairwise summation.

stevengj · 2013-08-13T15:39:59Z

Updated to use pairwise summation for mean, var, and varm as well, at least when operating on the whole array. For sums etc. along a subset of the dimensions of a larger array, we need a pairwise variant of reducedim (for associative operations).

The variance computation is also more efficient now because (at least when operating on the whole array) it no longer constructs a temporary array of the same size. (Would be even better if abs2 were inlined, but that will be easy once something like #3796 lands.)

Also, I noticed that var was buggy for complex matrices, since it used x.*x instead of abs2(x). Our varm function used dot, which did the complex conjugation, but it returned a complex number with zero imaginary part instead of a real number. Now these both should work and return a real value; I added a test case.
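The complex-variance issue can be shown with a tiny sketch in present-day Julia. The names `varm_sketch` and `varm_buggy` are hypothetical and are not the patched Base functions; they only contrast `abs2` with plain squaring.

```julia
# Sum of squared deviations from a known mean m, using abs2 so that the
# result is real. Passing a function to sum also avoids a temporary array.
varm_sketch(A, m) = sum(x -> abs2(x - m), A) / (length(A) - 1)

# The buggy pattern: squaring without conjugation keeps the result complex.
varm_buggy(A, m) = sum(x -> (x - m) * (x - m), A) / (length(A) - 1)

z = [1.0 + 1.0im, 1.0 - 1.0im]
m = sum(z) / length(z)
println(varm_sketch(z, m))   # real-valued
println(varm_buggy(z, m))    # complex-valued, with a negative real part
```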

stevengj · 2013-08-13T16:18:40Z

Also added an associative variant of mapreduce which uses pairwise reduction.

_Question:_ Although reduce and mapreduce currently do left-associative reduction ("fold left"), this isn't documented in the manual. Is this intentional, i.e. are these operations supposed to be unconstrained in their association ordering? (The docs should really be explicit on this point.) If so, then we don't need a separate mapreduce_associative function, right?

StefanKarpinski · 2013-08-13T16:26:25Z

Another approach would be to have an Associativity type that can be passed into reduce and mapreduce as a keyword (defaulting to pairwise?), and then we can specialize the internals of the reductions on that type, which will minimize the overhead, i.e. similar to the sorting infrastructure.

pao · 2013-08-13T17:02:47Z

> Although reduce and mapreduce currently do left-associative reduction ("fold left"), this isn't documented in the manual. Is this intentional, i.e. are these operations supposed to be unconstrained in their association ordering? (The docs should really be explicit on this point.)

That would be a good idea--for instance, https://github.com/pao/Monads.jl/blob/master/src/Monads.jl#L54 relies on the fold direction.

stevengj · 2013-08-13T17:09:32Z

The only sensible associativities to have in Base are left, right, and unspecified. A Pairwise associativity makes no sense because we don't even want to implement a fully pairwise case (because, for performance, the base case should be a simple left-associative loop) and we want to be free to change the details as needed. Note also that for parallel operations you really want to have unspecified associativity so that more efficient tree algorithms can be used.

My suggestion would be that mapreduce and reduce should be explicitly documented as having implementation-dependent associativity, and then to have separate reduce_left and mapreduce_left functions which enforce left-associativity. (And maybe right-associative variants? But those can't easily be implemented for iterators, and I'm not sure how useful they are; I would just punt on those until there is demand.) Since there are only three useful cases, defining a type for this doesn't make sense to me. But probably this should be a separate issue.
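A short illustration of why the association order matters, written against present-day Julia, where `foldl` and `foldr` exist in Base. These names are not part of this patch, and the `reduce_left`/`mapreduce_left` functions proposed above are not used here.

```julia
# Subtraction is not associative, so the grouping changes the answer.
left  = foldl(-, 1:4)   # ((1 - 2) - 3) - 4
right = foldr(-, 1:4)   # 1 - (2 - (3 - 4))
println((left, right))

# Floating-point addition is only approximately associative.
x = (0.1 + 0.2) + 0.3
y = 0.1 + (0.2 + 0.3)
println((x, y, x == y))
```

Code that depends on one particular grouping needs an explicitly ordered fold, while a reduction with unspecified associativity is free to regroup, as pairwise summation does.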

stevengj · 2013-08-13T18:23:57Z

@StefanKarpinski, should I go ahead and merge this?

ViralBShah · 2013-08-13T18:44:44Z

This is great, and I would love to see this merged. Also, we should probably not export sum_kbn anymore, given this patch.

stevengj · 2013-08-13T18:48:13Z

sum_kbn is still more accurate in some cases, e.g. sum([1 1e-100 -1]) still gives 0.0, whereas sum_kbn gives 1.0e-100. But I agree that the need for sum_kbn is greatly reduced.
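To show where the extra accuracy comes from, here is a minimal sketch of Kahan-Babuska-Neumaier compensated summation in present-day Julia. The name `kbn_sum` is hypothetical and this is not the `sum_kbn` source; it is only meant to reproduce the example above, using a vector rather than a row matrix.

```julia
# Compensated summation: c accumulates the low-order bits that are lost
# each time x is added to the running sum s.
function kbn_sum(A)
    s = zero(eltype(A))
    c = zero(eltype(A))
    for x in A
        t = s + x
        if abs(s) >= abs(x)
            c += (s - t) + x   # low-order bits of x were lost
        else
            c += (x - t) + s   # low-order bits of s were lost
        end
        s = t
    end
    return s + c
end

v = [1.0, 1e-100, -1.0]
println(foldl(+, v))   # plain left-to-right sum
println(kbn_sum(v))    # compensated sum
```

With only three elements, pairwise summation falls into its base case, so it behaves like the plain loop here.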

StefanKarpinski · 2013-08-13T18:58:58Z

I'm good with merging this. @JeffBezanson?

timholy · 2013-08-14T07:28:26Z

This is great. I had played with my own variant of this idea, breaking it up into blocks of size sqrt(N), but for small arrays the sqrt was a performance-killer. This is much better, thanks for contributing it.

stevengj and others added 2 commits August 13, 2013 10:53

use pairwise summation for sum, cumsum, and cumprod

d987550

use pairwise summation for mean, var, varm; also fix bug in var for complex arrays (should use absolute value and return a real number)

d4d9d39

use pairwise reduction for sum(f, A) and prod(f, A) via new (non-exported) mapreduce_associative

01ff268

stevengj mentioned this pull request Aug 13, 2013

(un)specify associativity of reduce and mapreduce #4046

Closed

JeffBezanson added a commit that referenced this pull request Aug 13, 2013

Merge pull request #4039 from stevengj/pairwise

c8f89d1

RFC: use pairwise summation for sum, cumsum, and cumprod

JeffBezanson merged commit c8f89d1 into JuliaLang:master Aug 13, 2013

stevengj added a commit that referenced this pull request Aug 13, 2013

mention #4039 in NEWS

de7bc3f

This was referenced Aug 14, 2013

implement a better summation algorithm #199

Closed

sum does not use K-B-N summation for >1D arrays #1258

Closed

simonster mentioned this pull request Oct 29, 2013

Pairwise reductions? lindahua/NumericExtensions.jl#21

Closed

stevengj mentioned this pull request Jan 6, 2015

more accurate cumsum #9648

Closed

stevengj mentioned this pull request Jan 28, 2015

var and std do not work for Any[] #8319

Closed

stevengj deleted the pairwise branch December 5, 2015 02:36

stevengj mentioned this pull request May 3, 2016

performance regression in sum(a) #16185

Closed

stevengj mentioned this pull request Sep 5, 2016

cumsum fixes (fixes #18363 and #18336) #18364

Merged

TestSubjector mentioned this pull request Jun 12, 2017

Add imf function JuliaAstro/AstroLib.jl#19

Merged

stevengj mentioned this pull request Nov 22, 2017

use a larger base case for pairwise summation rreusser/summation-algorithms#3

Open

helgee mentioned this pull request Mar 18, 2018

Ported Tdbtt JuliaAstro/AstroTime.jl#14

Merged

stevengj mentioned this pull request Oct 19, 2018

reduce round-off errors in field integrals and fluxes with pairwise summation NanoComp/meep#557

Open

oschulz mentioned this pull request Nov 8, 2019

Add unconjugated dot product dotu #27677

Open

simeonschaub mentioned this pull request Apr 16, 2022

mapreduce: use simpler reduction order in some cases #45000

Open
