Hello,
It seems possible for var() to return negative variances because of numerical inaccuracies: nanops.py, line 120, does not take the absolute value of the result. A negative variance then causes std() to return NaN when it should return 0.
The code below should reproduce the problem (probabilistically, since the data are random). It could also be turned into a unit test.
Thanks!
from pandas import DataFrame
import numpy as np
# Ten identical rows, so every column is constant and its true variance is 0
random_repeated_rows = np.array([np.random.random((10000,))] * 10)
my_var = DataFrame(random_repeated_rows).var()
len(my_var[my_var < 0])  # count of negative variances; a negative shows up slightly less than half of the time
np.min(DataFrame(random_repeated_rows).var())  # returns a tiny negative value, e.g. -9.8686491077791697e-16
np.min(DataFrame(random_repeated_rows).values.var(axis=0))  # returns 0
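For illustration, here is a minimal sketch of one possible guard, assuming any negative variance is pure round-off noise. The helper name `std_from_variance` is hypothetical and is not part of pandas. It clamps the variance at zero before taking the square root, which is a slightly different choice from the absolute value suggested above.

```python
import math


def std_from_variance(variance):
    """Return the standard deviation for a variance that may be slightly negative.

    Hypothetical helper (not a pandas function). Assumes a negative input
    is floating-point round-off around a true variance of 0.
    """
    # Clamp at zero so the square root is always defined.
    return math.sqrt(max(variance, 0.0))


# The tiny negative variance reported in the example above.
tiny_negative = -9.8686491077791697e-16
result = std_from_variance(tiny_negative)
```

Clamping maps every round-off negative to exactly 0, whereas an absolute value would turn it into a tiny positive number; either way the square root no longer produces NaN.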