Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement StatsAPI.pvalue #1719

Merged
merged 4 commits into from
May 9, 2023
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion Project.toml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
name = "Distributions"
uuid = "31c24e10-a181-5473-b8eb-7969acd0382f"
authors = ["JuliaStats"]
version = "0.25.89"
version = "0.25.90"

[deps]
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
Expand All @@ -15,6 +15,7 @@ Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
SpecialFunctions = "276daf66-3868-5448-9aa4-cd146d93841b"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsAPI = "82ae8749-77ed-4fe6-ae5f-f523153014b0"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
StatsFuns = "4c63d2b9-4356-54db-8cca-17b64c39e42c"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
Expand All @@ -26,6 +27,7 @@ FillArrays = "0.9, 0.10, 0.11, 0.12, 0.13, 1"
PDMats = "0.10, 0.11"
QuadGK = "2"
SpecialFunctions = "1.2, 2"
StatsAPI = "1.6"
StatsBase = "0.32, 0.33, 0.34"
StatsFuns = "0.9.15, 1"
julia = "1.3"
Expand Down
4 changes: 4 additions & 0 deletions src/Distributions.jl
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ using Random
import Random: default_rng, rand!, SamplerRangeInt

import Statistics: mean, median, quantile, std, var, cov, cor
import StatsAPI
import StatsBase: kurtosis, skewness, entropy, mode, modes,
fit, kldivergence, loglikelihood, dof, span,
params, params!
Expand Down Expand Up @@ -306,6 +307,9 @@ include("pdfnorm.jl")
include("mixtures/mixturemodel.jl")
include("mixtures/unigmm.jl")

# Interface for StatsAPI
include("statsapi.jl")

# Extensions: Implementation of DensityInterface and ChainRulesCore API
if !isdefined(Base, :get_extension)
include("../ext/DistributionsChainRulesCoreExt/DistributionsChainRulesCoreExt.jl")
Expand Down
29 changes: 29 additions & 0 deletions src/statsapi.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
function _check_tail(tail::Symbol)
if tail !== :both && tail !== :left && tail !== :right
throw(ArgumentError("`tail=$(repr(tail))` is invalid"))
end
end

function StatsAPI.pvalue(dist::DiscreteUnivariateDistribution, x::Number; tail::Symbol=:both)
_check_tail(tail)
if tail === :both
p = 2 * min(ccdf(dist, x-1), cdf(dist, x))
min(p, oneunit(p)) # if P(X = x) > 0, then possibly p > 1
elseif tail === :left
cdf(dist, x)
else # tail === :right
ccdf(dist, x-1)
end
end

function StatsAPI.pvalue(dist::ContinuousUnivariateDistribution, x::Number; tail::Symbol=:both)
_check_tail(tail)
if tail === :both
p = 2 * min(cdf(dist, x), ccdf(dist, x))
min(p, oneunit(p)) # if P(X = x) > 0, then possibly p > 1
elseif tail === :left
cdf(dist, x)
else # tail === :right
ccdf(dist, x)
end
end
1 change: 1 addition & 0 deletions test/runtests.jl
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,7 @@ const tests = [
"multivariate/product",
"eachvariate",
"univariate/continuous/triangular",
"statsapi",

### missing files compared to /src:
# "common",
Expand Down
26 changes: 26 additions & 0 deletions test/statsapi.jl
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
using Distributions
using StatsAPI: pvalue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have Distributions reexport pvalue? I'd be generally in favor but don't feel too strongly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this break HypothesisTests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not after HypothesisTests extends pvalue from StatsAPI.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but all existing releases of HypothesisTests would suddenly be broken, wouldn't they? Certainly, this could be fixed by modifying the registry but I think this makes it less attractive to re-export pvalue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge and release without re-exporting pvalue for now but this can be reconsidered and changed in a subsequent release if desired.


using Test

@testset "pvalue" begin
# For two discrete and two continuous distribution
for dist in (Binomial(10, 0.3), Poisson(0.3), Normal(1.4, 2.1), Gamma(1.9, 0.8))
# Draw sample
x = rand(dist)

# Draw 10^6 additional samples
ys = rand(dist, 1_000_000)

# Check that empirical frequencies match pvalues of left/right tail approximately
@test pvalue(dist, x; tail=:left) ≈ count(y -> y ≤ x, ys) / length(ys) rtol=5e-3
@test pvalue(dist, x; tail=:right) ≈ count(y -> y ≥ x, ys) / length(ys) rtol=5e-3
devmotion marked this conversation as resolved.
Show resolved Hide resolved

# Check consistency of pvalues of both tails
@test pvalue(dist, x; tail=:both) ==
min(1, 2 * min(pvalue(dist, x; tail=:left), pvalue(dist, x; tail=:right)))

# Incorrect value for keyword argument
@test_throws ArgumentError("`tail=:l` is invalid") pvalue(dist, x; tail=:l)
end
end