exec: fix NaN comparison logic #38881

Merged: 1 commit into cockroachdb:master from fix-nan-comparisons, Jul 16, 2019


solongordon
Contributor

I added special NaN handling for float comparisons. In SQL, NaNs are
treated as less than any other float value.

Thankfully I'm not seeing a performance hit when I run our sort
benchmarks with float64 values.

Fixes #38751

Release note: None
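
The ordering described above can be sketched as a three-way comparison. This is a minimal illustration, not the merged code: the name `compareFloat64` is hypothetical, and treating two NaNs as equal is an assumption consistent with the nested `math.IsNaN` checks visible in the review fragment further down.

```go
package main

import (
	"fmt"
	"math"
)

// compareFloat64 is a hypothetical helper that returns -1, 0, or 1.
// NaN is ordered before every other float value, and two NaNs compare
// as equal (an assumption made for this sketch).
func compareFloat64(a, b float64) int {
	if a < b {
		return -1
	}
	if a > b {
		return 1
	}
	// Neither a < b nor a > b holds: the values are equal,
	// or at least one of them is NaN.
	if a == b {
		return 0
	}
	if math.IsNaN(a) {
		if math.IsNaN(b) {
			return 0
		}
		return -1
	}
	// a is not NaN, so b must be NaN, which sorts first.
	return 1
}

func main() {
	nan := math.NaN()
	fmt.Println(compareFloat64(nan, math.Inf(-1))) // -1: NaN sorts before -Inf
	fmt.Println(compareFloat64(1.5, nan))          // 1
	fmt.Println(compareFloat64(nan, nan))          // 0
}
```

Putting the ordinary `<` and `>` checks first means the NaN branches are reached only when both comparisons fail, so inputs without NaNs pay for at most one extra equality test.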

@solongordon solongordon requested review from a team July 15, 2019 19:24
@cockroach-teamcity
Member

This change is Reviewable

Contributor

@asubiotto asubiotto left a comment

LGTM. It's great how clean this change is.

@solongordon
Contributor Author

bors r+

@jordanlewis
Member

Mind blown that this doesn't affect benchmarks too much. Does it affect selection benchmarks?

@solongordon
Contributor Author

Yeah, good point, I should have run that one too. Noticeable drop-off in performance there unfortunately.

name                                                      old time/op    new time/op    delta
SelLTFloat64Float64ConstOp/useSel=true,hasNulls=true-4      1.28µs ± 1%    2.68µs ± 2%  +109.95%  (p=0.016 n=4+5)
SelLTFloat64Float64ConstOp/useSel=true,hasNulls=false-4      816ns ± 9%    1601ns ± 1%   +96.10%  (p=0.008 n=5+5)
SelLTFloat64Float64ConstOp/useSel=false,hasNulls=true-4     1.19µs ± 2%    1.82µs ± 3%   +52.50%  (p=0.008 n=5+5)
SelLTFloat64Float64ConstOp/useSel=false,hasNulls=false-4     645ns ± 3%    1877ns ± 1%  +191.04%  (p=0.008 n=5+5)
SelLTFloat64Float64Op/useSel=true,hasNulls=true-4           1.80µs ± 4%    2.41µs ± 2%   +33.47%  (p=0.008 n=5+5)
SelLTFloat64Float64Op/useSel=true,hasNulls=false-4           807ns ± 4%    1303ns ± 2%   +61.52%  (p=0.008 n=5+5)
SelLTFloat64Float64Op/useSel=false,hasNulls=true-4          1.38µs ±11%    2.76µs ± 2%   +99.86%  (p=0.008 n=5+5)
SelLTFloat64Float64Op/useSel=false,hasNulls=false-4          641ns ± 8%    1171ns ± 2%   +82.77%  (p=0.008 n=5+5)

name                                                      old speed      new speed      delta
SelLTFloat64Float64ConstOp/useSel=true,hasNulls=true-4    6.40GB/s ± 1%  3.05GB/s ± 2%   -52.35%  (p=0.016 n=4+5)
SelLTFloat64Float64ConstOp/useSel=true,hasNulls=false-4   10.1GB/s ± 9%   5.1GB/s ± 1%   -49.09%  (p=0.008 n=5+5)
SelLTFloat64Float64ConstOp/useSel=false,hasNulls=true-4   6.86GB/s ± 2%  4.50GB/s ± 3%   -34.41%  (p=0.008 n=5+5)
SelLTFloat64Float64ConstOp/useSel=false,hasNulls=false-4  12.7GB/s ± 2%   4.4GB/s ± 1%   -65.62%  (p=0.008 n=5+5)
SelLTFloat64Float64Op/useSel=true,hasNulls=true-4         9.08GB/s ± 4%  6.80GB/s ± 2%   -25.12%  (p=0.008 n=5+5)
SelLTFloat64Float64Op/useSel=true,hasNulls=false-4        20.3GB/s ± 4%  12.6GB/s ± 2%   -38.12%  (p=0.008 n=5+5)
SelLTFloat64Float64Op/useSel=false,hasNulls=true-4        11.9GB/s ±10%   5.9GB/s ± 2%   -50.14%  (p=0.008 n=5+5)
SelLTFloat64Float64Op/useSel=false,hasNulls=false-4       25.6GB/s ± 8%  14.0GB/s ± 2%   -45.34%  (p=0.008 n=5+5)

@jordanlewis
Member

I figured. I bet if we changed things around to do another pass over the vectors searching for NaNs, things would be different... but I think the effort there is probably too high to justify. You could also imagine doing something similar to nulls/sel array for NaNs, and fast path when you know there's no NaNs at all, but again I don't think it's worth it here.
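
The extra-pass idea in the comment above might look roughly like the following. This is only a sketch under stated assumptions: `hasNaN`, `nanAwareLess`, and `selectLT` are hypothetical names, the column is a plain `[]float64` rather than the vectorized engine's batch type, and nothing here was benchmarked, so whether it recovers the lost selection throughput is an open question.

```go
package main

import (
	"fmt"
	"math"
)

// hasNaN makes one pass over the column looking for NaNs.
func hasNaN(col []float64) bool {
	for _, v := range col {
		if v != v { // NaN is the only value that is not equal to itself
			return true
		}
	}
	return false
}

// nanAwareLess reports a < b under the ordering where NaN sorts before
// every other value and two NaNs are equal.
func nanAwareLess(a, b float64) bool {
	if math.IsNaN(b) {
		return false // nothing sorts strictly before NaN
	}
	if math.IsNaN(a) {
		return true
	}
	return a < b
}

// selectLT appends to sel the indices i for which col[i] < c.
// When neither the constant nor the column contains a NaN, it takes
// the fast path and uses the plain < operator.
func selectLT(col []float64, c float64, sel []int) []int {
	if !math.IsNaN(c) && !hasNaN(col) {
		for i, v := range col {
			if v < c {
				sel = append(sel, i)
			}
		}
		return sel
	}
	for i, v := range col {
		if nanAwareLess(v, c) {
			sel = append(sel, i)
		}
	}
	return sel
}

func main() {
	col := []float64{1.0, math.NaN(), 3.0, -2.0}
	fmt.Println(selectLT(col, 2.0, nil))        // [0 1 3]
	fmt.Println(selectLT(col, math.NaN(), nil)) // []
}
```

The second suggestion in the comment, tracking NaN presence alongside the nulls and selection metadata, would let the `hasNaN` pass be skipped when the batch is already known to be NaN-free.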

@yuzefovich
Member

Hm, it appears that bors just ignored "r+" from several hours ago :)

		return 1
	}
	if math.IsNaN(a) {
		if math.IsNaN(b) {
Member

I wonder if we could avoid this second check by changing a == b to be int64(a) == int64(b) or whatever - aren't the bits of NaN the same?

Contributor Author

Good thought. I think we would use math.Float64bits.

Contributor Author

BTW, I was worried this might be slower than the normal float equality check, but it's actually faster! Presumably because the normal one has to do NaN checks.
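
For reference, a bit-pattern equality check built on `math.Float64bits` could be sketched as below. The helper name `bitsEqual` is hypothetical and this is not necessarily what was merged. Two caveats apply to any such check: not every NaN shares the same bits (IEEE 754 allows many NaN payloads), and `+0` and `-0` are `==` yet differ in their sign bit.

```go
package main

import (
	"fmt"
	"math"
)

// bitsEqual is a hypothetical helper reporting whether a and b have
// identical IEEE 754 bit patterns.
func bitsEqual(a, b float64) bool {
	return math.Float64bits(a) == math.Float64bits(b)
}

func main() {
	nan := math.NaN()
	fmt.Println(nan == nan)          // false: == never matches NaN
	fmt.Println(bitsEqual(nan, nan)) // true: same bit pattern

	negZero := math.Copysign(0, -1)
	fmt.Println(0.0 == negZero)          // true
	fmt.Println(bitsEqual(0.0, negZero)) // false: sign bits differ

	// A NaN with a different payload is still NaN but has different bits.
	otherNaN := math.Float64frombits(math.Float64bits(nan) ^ 1)
	fmt.Println(math.IsNaN(otherNaN), bitsEqual(nan, otherNaN)) // true false
}
```

A comparison routine relying on bit equality therefore still needs a fallback for the cases where the bits differ but the values should be treated as equal.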

@solongordon
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Jul 16, 2019
38767: exec: fix planning of count operator r=yuzefovich a=yuzefovich

Previously, when planning a count operator, we would add it to the
flow and would ignore any post-operator planning (like projections).
Now, this is fixed.

Additionally, this commit fixes slicing within projections operators -
previously, we would always slice up to BatchSize, but the underlying
memory not always has sufficient capacity (for example, count operator
uses a batch with a capacity of 1) which would cause an index out of
bounds.

Fixes: #38752.

Release note: None

38881: exec: fix NaN comparison logic r=solongordon a=solongordon

I added special NaN handling for float comparisons. In SQL, NaNs are
treated as less than any other float value.

Thankfully I'm not seeing a performance hit when I run our sort
benchmarks with float64 values.

Fixes #38751

Release note: None

38891: c-deps: bump rocksdb for macOS build fix r=ajkr a=ajkr

Pick up cockroachdb/rocksdb#39

Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Solon Gordon <solon@cockroachlabs.com>
Co-authored-by: Andrew Kryczka <andrew.kryczka2@gmail.com>
@craig
Contributor

craig bot commented Jul 16, 2019

Build succeeded

@craig craig bot merged commit 9bec27f into cockroachdb:master Jul 16, 2019
@solongordon solongordon deleted the fix-nan-comparisons branch July 16, 2019 17:34
Development

Successfully merging this pull request may close these issues.

exec: float 'NaN' comparisons are incorrect