DEPR: mad (mean absolute difference) functions #11787
Comments
we would do this as |
|
In R, the Having an option to use the median in Pandas seems like a good idea; robustness is a common reason to use the MAD rather than standard deviation, but the current implementation is certainly not robust. I think
etc.... If this seems like a reasonable end-state, I can work on this issue. |
We could add a |
where the constant is by default ~1.48 which makes it comparable to the standard deviation, the center is provided by the user (could be median or anything). R manual
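A rough sketch of what an R-style signature could look like on top of pandas primitives. The name `scaled_mad` and its parameters are hypothetical, not an existing pandas API. The default constant 1.4826 and the median default for `center` mirror R's `mad()`.

```python
import pandas as pd


def scaled_mad(s: pd.Series, center=None, constant: float = 1.4826) -> float:
    """Hypothetical helper: constant * median(|x - center|).

    `center` defaults to the median of `s`, as in R's mad().
    """
    if center is None:
        center = s.median()
    # Absolute deviations from the chosen center, summarized by their median.
    return constant * (s - center).abs().median()


s = pd.Series([1, 2, 3, 4, 100])

robust = scaled_mad(s)                                     # centered on the median
around_mean = scaled_mad(s, center=s.mean(), constant=1.0)  # user-supplied center
```

With the default constant the result is on a scale comparable to the standard deviation for normally distributed data; passing `constant=1.0` returns the raw median of the absolute deviations.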
Although there exists such a thing, there is not much use for a mean absolute deviation. Your data is either normally distributed and you use standard deviation and mean or it is not normally distributed and you cannot use this but might use median absolute deviation and median. |
Given all the potential confusion, I am almost inclined to deprecate this method instead. There are lots of ways to implement MAD, and it's almost better to force the user to implement their own rather than use the built-in one with mistaken assumptions. Changing the implementation of methods like this is generally a non-starter because of backwards-compatibility concerns. |
@shoyer @TomAugspurger I'm interested in trying to implement this but wanted to check in first before getting to work, as the discussion was inconclusive. Does the recently added good first issue label mean this feature would be welcome? I think the improvement in code readability would be awesome. |
I'm not convinced that it makes sense to iterate on this method. Given that we provide the primitives to implement this as a one-liner, I would recommend users write their own helper function for |
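For illustration, a minimal sketch of such user-side helpers, built only from `Series.mean`, `Series.median`, and `Series.abs`. The function names are made up for this example and are not part of pandas.

```python
import pandas as pd


def mean_abs_dev(s: pd.Series) -> float:
    """Mean absolute deviation around the mean."""
    return (s - s.mean()).abs().mean()


def median_abs_dev(s: pd.Series) -> float:
    """Median absolute deviation around the median (unscaled)."""
    return (s - s.median()).abs().median()


s = pd.Series([1, 2, 3, 4, 100])

mean_based = mean_abs_dev(s)      # pulled up strongly by the outlier at 100
median_based = median_abs_dev(s)  # barely affected by the outlier
```

Writing the helper explicitly makes the choice of center and summary statistic visible at the call site, which is the ambiguity this thread is about.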
i would be ok with deprecating .mad() entirely |
@jreback - I was thinking of taking this ticket up assuming that the plan is to deprecate .mad()? If that's the case then should this ticket title be changed from ENH to CLN (or DEPR - is this still being used?) Secondly - where should the deprecation test reside? I was thinking of putting it into the Generic class under pandas/tests/generic/test_generic.py? |
You don’t have to change the title. The test should go near where the mad tests are now. |
I am working on this as part of Walmart PandasHack |
The generic function `.mad()` calculates the mean absolute difference of a set of data, but in some cases the median absolute difference is more appropriate. In R, the `mad()` function accepts a `center` argument to specify how the average absolute difference should be calculated. I propose to add the same to the pandas function.