Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fit mle weighted laplace #1676

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

dmetivie
Copy link

@dmetivie dmetivie commented Feb 7, 2023

I added fit_mle(Laplace, x, w) method using the weighted median plus a test in the test section of Laplace fit.

@codecov-commenter
Copy link

codecov-commenter commented Feb 7, 2023

Codecov Report

Base: 83.60% // Head: 85.60% // Increases project coverage by +1.99% 🎉

Coverage data is based on head (d5e4f6c) compared to base (c431d20).
Patch coverage: 88.88% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1676      +/-   ##
==========================================
+ Coverage   83.60%   85.60%   +1.99%     
==========================================
  Files         130      130              
  Lines        6642     8189    +1547     
==========================================
+ Hits         5553     7010    +1457     
- Misses       1089     1179      +90     
Impacted Files Coverage Δ
src/univariate/continuous/laplace.jl 94.52% <88.88%> (-0.03%) ⬇️
src/genericfit.jl 66.66% <0.00%> (-8.34%) ⬇️
src/univariate/continuous/rician.jl 94.87% <0.00%> (-1.85%) ⬇️
src/univariate/discrete/bernoullilogit.jl 12.00% <0.00%> (-1.64%) ⬇️
src/univariate/continuous/arcsine.jl 88.88% <0.00%> (-1.44%) ⬇️
src/samplers/poisson.jl 90.78% <0.00%> (-1.40%) ⬇️
src/multivariate/mvlognormal.jl 93.02% <0.00%> (-0.73%) ⬇️
src/multivariates.jl 44.82% <0.00%> (-0.63%) ⬇️
src/univariates.jl 74.54% <0.00%> (-0.59%) ⬇️
src/matrix/matrixnormal.jl 94.11% <0.00%> (-0.53%) ⬇️
... and 113 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Comment on lines +140 to +150
function fit_mle(::Type{<:Laplace}, x::AbstractArray{<:Real}, w::AbstractArray{<:Real})
n = length(x)
if n != length(w)
throw(DimensionMismatch("Inconsistent array lengths."))
end
xc = similar(x)
copyto!(xc, x)
m = median(xc, weights(w)) # https://github.com/JuliaStats/StatsBase.jl/blob/0b64a9c8a16355da16469d0fe5016e0fdf099af5/src/weights.jl#L788
xc .= abs.(x .- m)
return Laplace(m, mean(xc, weights(w)))
end
Copy link
Contributor

@ParadaCarleton ParadaCarleton Sep 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I've taken the opportunity to remove an unneeded allocation; there's no need to create an extra array for the centered variables.

Suggested change
function fit_mle(::Type{<:Laplace}, x::AbstractArray{<:Real}, w::AbstractArray{<:Real})
n = length(x)
if n != length(w)
throw(DimensionMismatch("Inconsistent array lengths."))
end
xc = similar(x)
copyto!(xc, x)
m = median(xc, weights(w)) # https://github.com/JuliaStats/StatsBase.jl/blob/0b64a9c8a16355da16469d0fe5016e0fdf099af5/src/weights.jl#L788
xc .= abs.(x .- m)
return Laplace(m, mean(xc, weights(w)))
end
function fit_mle(::Type{<:Laplace}, data::AbstractArray{<:Real})
med = median(data)
# TODO: is it possible to do a single-pass median and mean absolute deviation?
mean_abs_deviation = mean(data) do x
return abs(x - med)
end
return Laplace(med, mean_abs_dev)
end
function fit_mle(::Type{<:Laplace}, data::AbstractArray{<:Real}, w::AbstractWeights)
if length(data) != length(w)
throw(DimensionMismatch("Inconsistent array lengths."))
end
# TODO: can this be done in a single-pass?
# code for weighted median can be found at:
# https://github.com/JuliaStats/StatsBase.jl/blob/0b64a9c8a16355da16469d0fe5016e0fdf099af5/src/weights.jl#L788
med = median(data, weights(w))
mean_abs_deviation = mean(zip(data, weights)) / sum(w) do x, weight
return weight * abs(x - med)
end
return Laplace(med, mean_abs_deviation)
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion breaks the non-weighted implementation without improving its allocations (I don't think the non-weighted algorithm can be improved allocationwise).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion breaks the non-weighted implementation without improving its allocations

I made a mistake earlier. Not sure if it works now, but @dmetivie can double-check hopefully. (Is there an easy way to run tests on suggestions?)

I don't think the non-weighted algorithm can be improved allocationwise

The code for both algorithms is the same. But in any case, there's definitely no need to copy an array just to center it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in any case, there's definitely no need to copy an array just to center it.

The code on master seems to be as efficient as possible: The copy is used by the median calculations and then only reused by the centering operation. Note that the code uses median! - if you switch to median, a copy is created internally and discarded afterwards. So IMO the current code should not be changed.

dw = fit(dist, x, w)
@test isa(dw, dist)
@test isapprox(location(dw), 5.0, atol=0.02)
@test isapprox(scale(dw), 3.0, atol=0.03)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you test the weighted method gives the same answers as the unweighted method when w is a set of unit weights?

@test isa(d, dist)
@test isapprox(location(d), 5.0, atol=0.02)
@test isapprox(scale(d) , 3.0, atol=0.03)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test for the case where the weights don't sum to 1?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And ideally one where the weights sum to 0, making sure we get either a NaN or an error?)

@dmetivie
Copy link
Author

dmetivie commented Sep 7, 2023

Thanks for the review, I'll do your suggestion and test, when I find some time.

Do you have a citation for this being equivalent to the MLE of a Laplace distribution?

I was going to write down the proof because I don't see it written explicitly anywhere.

@ParadaCarleton
Copy link
Contributor

I was going to write down the proof because I don't see it written explicitly anywhere.

Yeah, I think it's a well-known enough result that we can use it. (The proof should be identical to the one for the unweighted case).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants