The `quantile` function can return incorrect results for integer arrays (Int8, Int16, Int32) #119

yurivish · 2022-06-27T22:41:11Z

For example, using quantile to compute the maximum of the array [-128, 127] incorrectly returns -129 if the array consists of Int8 values:

julia> using Statistics

julia> quantile([-128, 127], 1)     # right
127

julia> quantile(Int8[-128, 127], 1) # wrong
-129

This happens because of the following code in _quantile:

Statistics.jl/src/Statistics.jl

Line 1010 in 0588f2c

return a + γ*(b-a)

For the above array, that line computes the following expression, causing the incorrect result:

julia> Int8(-128) + Int64(Int8(127) - Int8(-128))
-129

This behavior occurs with arrays of Int8, Int16, and Int32:

julia> function test(type)
           A = [typemin(type), typemax(type)]
           result = quantile(A, 1)
           correct_result = quantile(map(Int, A), 1)
           (; result, correct_result)
       end;

julia> test(Int8)
(result = -129, correct_result = 127)

julia> test(Int16)
(result = -32769, correct_result = 32767)

julia> test(Int32)
(result = -2147483649, correct_result = 2147483647)

Using quantile to compute the median also returns an answer that is not correct:

julia> quantile(Int8[-128, 127], .5)
-128.5

The equivalent function in NumPy, np.quantile, returns the correct result in all cases:

julia> using PyCall

julia> np = pyimport("numpy");

julia> a = np.array(Int8[-128, 127], dtype=np.int8);

julia> np.quantile(a, 1)
127

julia> function test_python()
	quantiles = (1, 0.5)
	types = (
		Int8 => np.int8,
		Int16 => np.int16,
		Int32 => np.int32
	)
	for q in quantiles, (type, dtype) in types
	    A = np.array([typemin(type), typemax(type)]; dtype)
	    result = np.quantile(A, q)
	    correct_result = quantile(map(Int, A), q)
	    println((; q, type, result, correct_result))
    end
end;

julia> test_python()
(q = 1, type = Int8, result = 127, correct_result = 127)
(q = 1, type = Int16, result = 32767, correct_result = 32767)
(q = 1, type = Int32, result = 2147483647, correct_result = 2147483647)
(q = 0.5, type = Int8, result = -0.5, correct_result = -0.5)
(q = 0.5, type = Int16, result = -0.5, correct_result = -0.5)
(q = 0.5, type = Int32, result = -0.5, correct_result = -0.5)

The text was updated successfully, but these errors were encountered:

nalimilan · 2023-09-30T20:33:42Z

Fixed by #145.

yurivish changed the title ~~The quantile function can return incorrect results for some integer arrays (Int8, Int16, Int32)~~ The quantile function can return incorrect results for integer arrays (Int8, Int16, Int32) Jun 27, 2022

nalimilan closed this as completed Sep 30, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

The `quantile` function can return incorrect results for integer arrays (Int8, Int16, Int32) #119

The `quantile` function can return incorrect results for integer arrays (Int8, Int16, Int32) #119

yurivish commented Jun 27, 2022 •

edited

Loading

nalimilan commented Sep 30, 2023

The quantile function can return incorrect results for integer arrays (Int8, Int16, Int32) #119

The quantile function can return incorrect results for integer arrays (Int8, Int16, Int32) #119

Comments

yurivish commented Jun 27, 2022 • edited Loading

nalimilan commented Sep 30, 2023

The `quantile` function can return incorrect results for integer arrays (Int8, Int16, Int32) #119

The `quantile` function can return incorrect results for integer arrays (Int8, Int16, Int32) #119

yurivish commented Jun 27, 2022 •

edited

Loading