From f272c47ad5d0b626ff5fb14e1abe2a86e330309a Mon Sep 17 00:00:00 2001
From: Piotr Findeisen
Date: Thu, 29 Jul 2021 14:36:37 +0200
Subject: [PATCH] Fix -NaN ordering in spec

In Java, -NaN is not distinguishable from NaN (is the same value, as
introspectable with `Float.floatToIntBits` or
`Double.doubleToLongBits`).

Moreover, Java sorts all possible `NaN` values within single peer group.
The different NaN values compare equal, even if they are not the same.
They always compare as "greater" than positive infinity.
---
 site/docs/spec.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/site/docs/spec.md b/site/docs/spec.md
index 6bcfd379d7f8..b189c09631c4 100644
--- a/site/docs/spec.md
+++ b/site/docs/spec.md
@@ -267,7 +267,8 @@ A sort order is defined by an sort order id and a list of sort fields. The order
 
 Order id `0` is reserved for the unsorted order.
 
-Sorting floating-point numbers should produce the following behavior: `-NaN` < `-Infinity` < `-value` < `-0` < `0` < `value` < `Infinity` < `NaN`. This aligns with the implementation of Java floating-point types comparisons.
+Sorting floating-point numbers should produce the following behavior: `-Infinity` < `-value` < `-0` < `0` < `value` < `Infinity` < `NaN`.
+The different `NaN` representation can be in an arbitrary order. This aligns with the implementation of Java floating-point types comparisons.
 
 A data or delete file is associated with a sort order by the sort order's id within [a manifest](#manifests). Therefore, the table must declare all the sort orders for lookup. A table could also be configured with a default sort order id, indicating how the new data should be sorted by default. Writers should use this default sort order to sort the data on write, but are not required to if the default order is prohibitively expensive, as it would be for streaming writes.
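
The behavior described in the commit message can be checked with a small Java sketch. It is an illustration only and not part of the patch: the class `FloatSortOrderSketch`, the helpers `sortedCopy` and `negativeNaN`, and the chosen sign-bit-set NaN bit pattern are all assumptions made for this example. It relies on `Arrays.sort(double[])` following the total order of `Double.compare`, as documented in the JDK.

```java
import java.util.Arrays;

class FloatSortOrderSketch {

    // Hypothetical helper: returns a sorted copy. Arrays.sort(double[]) is
    // documented to use the total order imposed by Double.compareTo.
    static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Hypothetical helper: builds a NaN whose sign bit is set ("-NaN").
    // The exact bit pattern is an illustrative choice.
    static double negativeNaN() {
        return Double.longBitsToDouble(0xfff8000000000000L);
    }

    public static void main(String[] args) {
        double negNaN = negativeNaN();

        // doubleToLongBits collapses every NaN to one canonical value,
        // so "-NaN" and NaN look identical through it.
        System.out.println(
            Double.doubleToLongBits(negNaN) == Double.doubleToLongBits(Double.NaN));

        // All NaN values compare equal to each other...
        System.out.println(Double.compare(negNaN, Double.NaN) == 0);

        // ...and greater than positive infinity.
        System.out.println(Double.compare(negNaN, Double.POSITIVE_INFINITY) > 0);

        double[] sorted = sortedCopy(new double[] {
            Double.NaN, 1.5, 0.0, negNaN, -0.0,
            Double.POSITIVE_INFINITY, -1.5, Double.NEGATIVE_INFINITY
        });
        // Expected order: -Infinity, -1.5, -0.0, 0.0, 1.5, Infinity, then both NaNs.
        System.out.println(Arrays.toString(sorted));
    }
}
```

Both NaN inputs land at the end of the sorted array. Their relative order there is not meaningful, which matches the spec wording that different `NaN` representations can appear in an arbitrary order.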