You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Feb 24, 2019. It is now read-only.
The right way to implement pairwise summation, from a performance perspective, is simply to use a larger base case. Then you can still use explicit recursion, and the cost should be very similar to naive summation.
Around N=100–200 is usually sufficient to amortize the overhead of recursion, in my experience. For a sufficiently large array (a few thousand elements or more, IIRC), the error almost indistinguishable from that of a base case of N=1.
This is what was done in NumPy (numpy/numpy#3685) and Julia (JuliaLang/julia#4039) to achieve the accuracy of pairwise summation without a performance penalty.
("Coarsening" the base case is a standard technique to improve the performance of any recursive algorithm.)
The text was updated successfully, but these errors were encountered:
Thanks for the feedback! Yeah, I was poking around in the dark a bit here when I should have just been reading more code. That makes perfect sense. (I need to apply a base case to my gpu fft code… it's been eating away at me in the back of my head, but hard to find the time 😞) The other goal here was to really nail down empirically the numerical stability of the different approaches, though I didn't do a good job there either. 😞
TBH the real takeaway for a lot of this is that it's really difficult to write good numerical software in your spare time. If you're interested in people doing this right and not just cross-compiling into js, stdlib is the place to look. @kgryte is doing an impressive job of digging into existing implementations and tackling this like the large undertaking that it actually is. I believe he's currently rounding the corner on an ndarray implementation and slowly chipping away at blas (native bindings in node; wasm + idiomatic native js implementation for client-side).
The right way to implement pairwise summation, from a performance perspective, is simply to use a larger base case. Then you can still use explicit recursion, and the cost should be very similar to naive summation.
Around N=100–200 is usually sufficient to amortize the overhead of recursion, in my experience. For a sufficiently large array (a few thousand elements or more, IIRC), the error almost indistinguishable from that of a base case of N=1.
This is what was done in NumPy (numpy/numpy#3685) and Julia (JuliaLang/julia#4039) to achieve the accuracy of pairwise summation without a performance penalty.
("Coarsening" the base case is a standard technique to improve the performance of any recursive algorithm.)
The text was updated successfully, but these errors were encountered: