bottleneck.move_std() produces nans, doesn't match pandas.rolling_std() #50
Comments
I don't know what you mean by produces NaNs. Could you give a short example showing the input (x) and what you would like bottleneck to return? |
Test.pyx has the x vector in the gist link, with exact instructions to reproduce the results. The results should look like the factor version or the pandas version, without the nans. Closed by accident... |
Too much work to go through your example. Perhaps you can make a brief example as below. Are these the NaNs you don't like? (I can't change that behavior)
Do you want to normalize by N:
Or N-1?
|
To reproduce it, just copy/paste this entire thing into the shell:
|
The preceding nans are not what I'm talking about--there are nans in the middle of the data. However, pandas has a kwarg
|
And finally, I'm aware of the sample standard deviation vs population standard deviation. I suppose that accounts for the difference in the non-nan values. Maybe the docs should say you're using the sample std? Edit: I guess you're calculating the biased std, ddof=0. There's no description of ddof in the parameters list. |
Yes, that's a bug (good find!). The docstring should include the description of ddof, which we can copy from np.std:
|
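For readers unfamiliar with ddof, here is a minimal sketch of the convention being discussed, written in plain Python so the divisor is visible. It is not the bottleneck implementation; the helper name `std` is hypothetical. It assumes the np.std meaning of ddof, where the sum of squared deviations is divided by N - ddof.

```python
import math
import statistics


def std(values, ddof=0):
    """Standard deviation with an np.std-style ddof (hypothetical helper).

    The sum of squared deviations is divided by N - ddof, so ddof=0
    normalizes by N (biased) and ddof=1 normalizes by N - 1 (unbiased).
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_deviations / (n - ddof))


x = [1.0, 2.0, 3.0, 4.0]
biased = std(x, ddof=0)    # normalize by N
unbiased = std(x, ddof=1)  # normalize by N - 1
print(biased, unbiased)
```

The two values differ only by the divisor, which would explain a constant-factor mismatch between two libraries that default to different ddof values.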
I added |
You might want to try |
Same bug.
|
If there are three nans in a row in x then what should the std be when the window is 3? |
There are no nans in my x at all. But you need the full precision, hence loading it from the file as described above.
|
If there were nans in the x input, I suppose I don't care what |
It might be that you're taking the square root of a negative number. From move_std.pyx:
|
Yes, it's taking the
Here's the print statements I added:
if count == window:
    foo = (a2sum - asum * asum / count) / (count - ddof)
    if foo < 0.:
        print "%.50f, %.50f, %d, %d" % (a2sum, asum, count, ddof)
        print "%.50f" % foo
    y[i0] = sqrt(foo)
else:
    y[i0] = NAN
|
Note that these negative numbers are also less than negative zero, or else |
Yep, we found the same thing. We plan to set the output to 0 when |
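The failure mode discussed above can be sketched in plain Python. This is an illustration under assumptions, not the actual bottleneck code: the function name `rolling_std_running_sums` and its `clamp` flag are hypothetical. It keeps running sums like the `asum` and `a2sum` variables quoted from move_std.pyx, and sets a slightly negative variance (a floating-point rounding artifact) to 0 before taking the square root.

```python
import math


def rolling_std_running_sums(x, window, ddof=0, clamp=True):
    """Rolling std from running sums of x and x*x (hypothetical sketch).

    Positions before the first full window are NaN. With clamp=True a
    variance that rounds to a small negative number is treated as 0;
    with clamp=False it yields NaN, mimicking sqrt of a negative number.
    """
    out = [math.nan] * len(x)
    asum = 0.0   # running sum of the values in the window
    a2sum = 0.0  # running sum of the squared values in the window
    for i, value in enumerate(x):
        asum += value
        a2sum += value * value
        if i >= window:
            # Drop the value that just left the window.
            old = x[i - window]
            asum -= old
            a2sum -= old * old
        if i >= window - 1:
            var = (a2sum - asum * asum / window) / (window - ddof)
            if var < 0.0:
                var = 0.0 if clamp else math.nan
            out[i] = math.sqrt(var)
    return out


# Well-conditioned data: matches the direct population std per window.
small = rolling_std_running_sums([1.0, 2.0, 3.0, 4.0, 5.0], window=3)

# Nearly constant large values: the subtraction a2sum - asum**2 / window
# loses most of its precision, so its sign is unreliable without clamping.
large = rolling_std_running_sums([1e8 + 0.1] * 50, window=3)
print(small[2:], any(math.isnan(v) for v in large[2:]))
```

Clamping removes the NaNs but does not restore the lost precision. A numerically stabler update (for example a Welford-style recurrence) is a different technique and is not what the thread describes.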
Awesome. What about adding a |
Pandas also has |
BUG move_std and move_nanstd neg sqrt bugs fixed. As reported in issue #50.
The negative sqrt issue is fixed: |
docstring is fixed and negative sqrt bug is fixed, so I'm closing this issue. You can open a new issue with the |
https://gist.github.com/3624548
pandas-dev/pandas#1840