Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate (cum)sum_kbn #24869

Merged
merged 1 commit into from
Dec 8, 2017
Merged

Deprecate (cum)sum_kbn #24869

merged 1 commit into from
Dec 8, 2017

Conversation

ararslan
Copy link
Member

This removes sum_kbn and cumsum_kbn in favor of sum and cumsum, respectively, with a note about the potential loss in precision. Those can be moved to a package if someone wants them, as was done with rref.

Fixes #24804

@ararslan ararslan added arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation maths Mathematical functions labels Nov 30, 2017
cv2 = OffsetArray([1,-1e100,-1e100,2], (5,))*1000
@test isequal(cumsum_kbn(v), cv)
@test isequal(cumsum_kbn(v2), cv2)
@test isequal(sum_kbn(v), sum_kbn(parent(v)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps modify rather than eradicate these tests? I conjecture that testing OffsetArray is these tests' purpose rather than testing [cum]sum_kbn specifically :).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, even if I convert these to BigFloats, using cumsum rather than cumsum_kbn seems to give a different answer:

julia> v = OffsetArray(BigFloat[1,1e100,1,-1e100], (-3,))*1000
OffsetArray{BigFloat,1,Array{BigFloat,1}} with indices -2:1:
  1.0e+03
  1.000000000000000015902891109759918046836080856394528138978132755774783877217038e+103
  1.0e+03
 -1.000000000000000015902891109759918046836080856394528138978132755774783877217038e+103

julia> cumsum(v)
OffsetArray{BigFloat,1,Array{BigFloat,1}} with indices -2:1:
 1.0e+03
 1.000000000000000015902891109759918046836080856394528138978132755774783877217038e+103
 1.000000000000000015902891109759918046836080856394528138978132755774783877217038e+103
 1.0e+03

julia> cumsum_kbn(v)
OffsetArray{BigFloat,1,Array{BigFloat,1}} with indices -2:1:
 1.0e+03
 1.000000000000000015902891109759918046836080856394528138978132755774783877217038e+103
 1.000000000000000015902891109759918046836080856394528138978132755774783877217038e+103
 2.0e+03

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps need higher precision, and/or cumsum doesn't preserve precision in the accumulation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just use isapprox here. The fact that the result was exact before must have been due to the way the KBN algorithm worked in that particular case, but we don't really care here since the goal is just to test that it works on OffsetArray.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with isapprox in this case is that 1000 and 2000 aren't approximately equal. 😛

Copy link
Member

@nalimilan nalimilan Dec 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I misread. That's just because BigFloat doesn't have enough precision by default. It works with a higher precision:

julia> setprecision(BigFloat, 500)
500

julia> cumsum(BigFloat[1,1e100,1,-1e100]*1000)
4-element Array{BigFloat,1}:
 1.0e+03                                                                                                    
 1.0000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815105e+103
 1.0000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815106e+103
 2.0e+03                                                                                                    

julia> cumsum_kbn(BigFloat[1,1e100,1,-1e100]*1000)
4-element Array{BigFloat,1}:
 1.0e+03                                                                                                    
 1.0000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815105e+103
 1.0000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815106e+103
 2.0e+03                                                                                                    

But it's really weird to mix testing OffsetArray with testing precision with very high values. Maybe split them into two separate tests, with a comment saying what is being tested?

test/reduce.jl Outdated
cs = cumsum(z)
@test (es - cs[end]) < es * 1e-13
@test (es2 - cs[10^5]) < es2 * 1e-13
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether these tests' purpose is testing [cum]sum accuracy rather than testing sub_kbn per se? And if so, perhaps retaining these tests with some other mechanism to calculate an accurate sum would be worthwhile?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like using BigFloat?

@musm
Copy link
Contributor

musm commented Dec 1, 2017

Instead of deprecating this, how about following the OPs suggestion of adding a named optional argument for the algorithm:

...A more general and Julian approach would be to use something like sum(x, alg=KahanBabuskaNeumaier) or sum(KahanBabuskaNeumaier, x) (or just KBN).

@KristofferC
Copy link
Member

I think that is an excessive generalization. This seems very much in package territory to me.

@ararslan
Copy link
Member Author

ararslan commented Dec 2, 2017

@ViralBShah
Copy link
Member

Wow all green!

@ararslan ararslan force-pushed the aa/dep-kbn branch 3 times, most recently from 8992771 to be6141a Compare December 5, 2017 03:13
@ararslan
Copy link
Member Author

ararslan commented Dec 5, 2017

CircleCI failure:

ERROR: LoadError: LoadError: cannot obtain socket name: operation not permitted (EPERM)
uv_error at ./libuv.jl:68 [inlined]
_sockname(::TCPSocket, ::Bool) at ./socket.jl:1001
bind_client_port at ./distributed/managers.jl:492 [inlined]
socket_reuse_port() at ./distributed/managers.jl:471
connect_to_worker(::SubString{String}, ::UInt16) at ./distributed/managers.jl:498
connect(::TopoTestManager, ::Int64, ::WorkerConfig) at ./distributed/managers.jl:437
create_worker(::TopoTestManager, ::WorkerConfig) at ./distributed/cluster.jl:513
setup_launched_worker(::TopoTestManager, ::WorkerConfig, ::Array{Int64,1}) at ./distributed/cluster.jl:459
(::getfield(Base.Distributed, Symbol("##41#44")){TopoTestManager,WorkerConfig})() at ./task.jl:348

...and 7 more exception(s).

Surely unrelated. FreeBSD CI was down when CI last ran for this PR. Everything else is green though.

This removes `sum_kbn` and `cumsum_kbn` in favor of `sum` and `cumsum`,
respectively, with the `*_kbn` functions moving to a package.

Fixes #24804
@ararslan
Copy link
Member Author

ararslan commented Dec 7, 2017

I've rebased this, so unless there are any further comments, I plan to merge this pretty soon.

@yurivish
Copy link
Contributor

yurivish commented Dec 7, 2017

@JeffreySarnoff posted this very minimal sum_kbn use case in Slack today that I found quite interesting as someone who hasn't thought deeply about summation algorithms. Not sure it says anything either way about this PR, but I thought I'd post it here to preserve it once the Slack history is lost to the sands of time:

a, b, c = 1.0, 1.0e16, -1.0e16;
fwd = [a, b, a, c];
rev = [c, a, b, a];

sum(fwd), sum(rev), sum_kbn(fwd), sum_kbn(rev)
0.0, 1.0, 2.0, 2.0

@ararslan
Copy link
Member Author

ararslan commented Dec 8, 2017

CI failures are unrelated. AppVeyor i686 is a failure to download Cygwin and Circle x86-64 is something about deserializing a remote libgit2 exception.

@ararslan ararslan merged commit 5f929a4 into master Dec 8, 2017
@ararslan ararslan deleted the aa/dep-kbn branch December 8, 2017 18:27
@JeffreySarnoff
Copy link
Contributor

I put sum_kbn and cumsum_kbn here.

@diegozea
Copy link
Contributor

diegozea commented Dec 8, 2017

This is missing in NEWS.md

@ararslan
Copy link
Member Author

ararslan commented Dec 8, 2017

@JeffreySarnoff I already made and registered https://github.com/JuliaMath/KahanSummation.jl.

@diegozea Good catch, I'll rectify that. Thanks.

@JeffreySarnoff
Copy link
Contributor

@ararslan good!

evetion pushed a commit to evetion/julia that referenced this pull request Dec 12, 2017
This removes `sum_kbn` and `cumsum_kbn` in favor of `sum` and `cumsum`,
respectively, with the `*_kbn` functions moving to a package.

Fixes JuliaLang#24804
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] deprecation This change introduces or involves a deprecation maths Mathematical functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants