Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of inv(::ComplexF64) #39303

Merged
merged 1 commit into from
Feb 4, 2021

Conversation

thchr
Copy link
Contributor

@thchr thchr commented Jan 18, 2021

This is essentially just a rebase of #29058 (but my fork of that had gotten deleted so I had to open it anew): leads to a speedup of about 30% for inv(::ComplexF64) without changing the output.

The gist of the changes are the same as in the original PR: speed up inv(::ComplexF64) by combining two if statements into an if-elseif and opt for a cheaper variant of max, because we know we won't benefit from checking NaN or Inf values in this context. Also now use muladd explicitly in one spot.

Benchmarks:

v = rand(ComplexF64, 100000);

1.6-beta:

@btime inv($v) # 2.244 ms (2 allocations: 1.53 MiB)

PR:

@btime inv($v) # 1.562 ms (2 allocations: 1.53 MiB)

Tagging @simonbyrne again, in case you still recall the PR that #29058 branched off originally (~2 years ago now 😮 ).

@thchr
Copy link
Contributor Author

thchr commented Jan 18, 2021

Test error in spawn testset doesn't seem related afaict.

@dkarrasch dkarrasch added maths Mathematical functions performance Must go faster labels Jan 20, 2021
@thchr
Copy link
Contributor Author

thchr commented Jan 25, 2021

Bump; just so I don't forget this for 2 years again.

(I suspect nothing here is too controversial: maybe the main point to consider is whether to use muladd or not - using it can change the output of inv very slightly when fma is available, but it should be an improvement in both accuracy and speed, afaiu)

@oscardssmith
Copy link
Member

In my opinion, the use of muladd here is perfectly justified. We very explicitly do not guarantee exact outputs of floating point arithmatic. I'm also assigning myself as a reviewer for this so we can hopefully get it merged soon.

@oscardssmith oscardssmith self-requested a review January 25, 2021 18:36
@oscardssmith
Copy link
Member

Overall, the code for this looks good. The worst error I've seen from this is 2.6 ULP (about the same as before the PR), but I don't really know what we aim for for the accuracy of complex arithmatic.

@thchr
Copy link
Contributor Author

thchr commented Jan 26, 2021

The worst error I've seen from this is 2.6 ULP (about the same as before the PR).

Thanks for checking this explicitly; good to know.

base/complex.jl Show resolved Hide resolved
@oscardssmith oscardssmith added the merge me PR is reviewed. Merge when all tests are passing label Jan 28, 2021
@oscardssmith
Copy link
Member

@thchr any idea why the windows tests are failing? I don't want this to get lost. It might be worth rebasing and hoping they just go away.

@thchr
Copy link
Contributor Author

thchr commented Feb 3, 2021

@thchr any idea why the windows tests are failing? I don't want this to get lost. It might be worth rebasing and hoping they just go away.

Hmm, no; when I looked at it originally, it didn't look related.
Anyway, I've rebased and pushed again, so let's see if it's clear now; thanks for the prompt to do so :).

@thchr
Copy link
Contributor Author

thchr commented Feb 3, 2021

Oki, all green now test-wise.

@oscardssmith
Copy link
Member

Anyone have objections to this merging?

@JeffBezanson JeffBezanson merged commit b3d82d3 into JuliaLang:master Feb 4, 2021
@thchr thchr deleted the complexf64-inv branch February 4, 2021 21:15
@simeonschaub simeonschaub removed the merge me PR is reviewed. Merge when all tests are passing label Apr 3, 2021
ElOceanografo pushed a commit to ElOceanografo/julia that referenced this pull request May 4, 2021
antoine-levitt pushed a commit to antoine-levitt/julia that referenced this pull request May 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maths Mathematical functions performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants