Fisher's Exact test does extraneous calculations when printing #224
Comments
The cause is the calculation of the confidence intervals. In particular, the root-finding step. MWE:
julia> using HypothesisTests;
julia> x = FisherExactTest(777,888,999,1000);
julia> level = .95; tail = :right;
julia> dist(ω) = HypothesisTests.FisherNoncentralHypergeometric(x.a+x.b, x.c+x.d, x.a+x.c, ω);
julia> obj(ω) = pvalue(dist(ω), x.a, tail=tail) - (1-level);
julia> using BenchmarkTools;
julia> HypothesisTests.fzero(obj, HypothesisTests.find_brackets(obj)...)
0.7835029530244386
julia> @time HypothesisTests.fzero(obj, HypothesisTests.find_brackets(obj)...)
19.567936 seconds (51.44 k allocations: 640.463 MiB, 0.15% gc time)
0.7835029530244386
The Frustratingly, our implementation of However the R code generates an extra branch based on the value of the This is probably above my paygrade so I am cc-ing @andreasnoack as someone who might have deeper insight into what is going on. R code to verify is
x = fisher.test(matrix(c(777,888,999,1000),ncol=2,byrow=T))
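For readers following the MWE: the interval bound is obtained as a root of `obj(ω)`, so the total cost is roughly the number of objective evaluations times the cost of one p-value. Below is a minimal sketch of bracketing plus bisection on a cheap stand-in objective. `expand_bracket`, `bisect` and `toy_obj` are hypothetical names made up for illustration; they are not the `find_brackets` or `fzero` internals of HypothesisTests.jl, whose actual strategy may differ.

```julia
# Hypothetical helpers for illustration; not the HypothesisTests.jl internals.

# Widen an interval around 1 geometrically until f changes sign across it.
function expand_bracket(f; lo=1.0, hi=1.0, factor=2.0, maxiter=100)
    for _ in 1:maxiter
        lo /= factor
        hi *= factor
        if sign(f(lo)) != sign(f(hi))
            return lo, hi
        end
    end
    error("no sign change found")
end

# Plain bisection that also counts how often the objective is evaluated.
function bisect(f, lo, hi; tol=1e-10)
    flo = f(lo)
    evals = 1
    while hi - lo > tol
        mid = (lo + hi) / 2
        fmid = f(mid)
        evals += 1
        if sign(fmid) == sign(flo)
            lo, flo = mid, fmid   # root lies in the upper half
        else
            hi = mid              # root lies in the lower half
        end
    end
    return (lo + hi) / 2, evals
end

# Cheap monotone stand-in for `obj(ω)`, with its root at ω = 0.8.
toy_obj(ω) = log(ω) - log(0.8)

lo, hi = expand_bracket(toy_obj)
root, evals = bisect(toy_obj, lo, hi)
println("root ≈ ", root, " after ", evals, " objective evaluations")
```

With the real objective every evaluation is a full tail probability of the noncentral hypergeometric distribution, so a slow distribution implementation is multiplied by the evaluation count.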
I think there are two issues here. One is that the The other issue is the
to reduce the costs.
I don't think the first issue is what's going on. So if we get our calculation of the confidence interval to be as fast as R's, the
I decided to implement the method of Liao and Rosen. With JuliaStats/Distributions.jl#1277 I'm getting
julia> @time show(x);
Fisher's exact test
-------------------
Population details:
parameter of interest: Odds ratio
value under h_0: 1.0
point estimate: 0.875908
95% confidence interval: (0.7673, 0.9999)
Test summary:
outcome with 95% confidence: reject h_0
two-sided p-value: 0.0497
Details:
contingency table:
777 888
999 1000
0.002868 seconds (627 allocations: 20.422 KiB)
instead of 23 seconds on my machine. @pdeffebach it would be great if you could review the PR.
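As background on why a recursive evaluation can be this cheap, here is a minimal sketch of computing the whole pmf of Fisher's noncentral hypergeometric distribution from the ratio of consecutive terms, done in log space. This is an assumption about the general flavour of such methods, not the code in the Distributions.jl PR and not a faithful reproduction of Liao and Rosen's algorithm. `fnch_pmf` is a hypothetical helper, and its argument order simply mirrors the `(x.a+x.b, x.c+x.d, x.a+x.c, ω)` call in the MWE.

```julia
# Hypothetical helper for illustration; not the code in the Distributions.jl PR.
# pmf of Fisher's noncentral hypergeometric distribution on its support, for an
# urn with ns successes and nf failures, n draws, and odds ratio ω.
function fnch_pmf(ns::Int, nf::Int, n::Int, ω::Float64)
    lo, hi = max(0, n - nf), min(n, ns)
    logw = zeros(hi - lo + 1)          # log of the unnormalised weights
    for (k, x) in enumerate(lo+1:hi)
        # f(x) / f(x-1) = (ns-x+1)/x * (n-x+1)/(nf-n+x) * ω
        logw[k+1] = logw[k] + log(ns - x + 1) - log(x) +
                    log(n - x + 1) - log(nf - n + x) + log(ω)
    end
    w = exp.(logw .- maximum(logw))    # rescale so the largest weight is 1
    return lo:hi, w ./ sum(w)
end

# Margins of the 2x2 table from the MWE, evaluated at ω = 1.
support, p = fnch_pmf(1665, 1999, 1776, 1.0)
mean_x = sum(support .* p)
right_tail = sum(p[support .>= 777])   # P(X ≥ 777)
```

One pass over the support gives every probability, so means, variances and tail sums all come from the same vector with no special-function calls per term.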
I can confirm that this works on Distributions.jl master. Does this deserve a patch release?
I've made a patch release of Distributions, see JuliaRegistries/General#30355, so the fix will be available shortly. Nothing should be needed here so I'll close.
See this discussion on Reddit.
With the `;` the calculation is very fast. But when we omit the `;` it takes a very long time to print the result, because it's calculating other things, like computing the p-value, on the fly.
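The effect of the `;` can be reproduced with a toy type. In the REPL, a trailing `;` suppresses display, so `show` never runs, and any work done inside `show` is paid only when the value is printed. The sketch below uses a hypothetical `LazyReport` type and a call counter; none of these names come from HypothesisTests.jl.

```julia
# Hypothetical type for illustration; not part of HypothesisTests.jl.
struct LazyReport
    data::Vector{Float64}
end

# Counts how many times `show` has been invoked.
const SHOW_CALLS = Ref(0)

# Stand-in for an expensive statistic that is computed only for display.
expensive_summary(r::LazyReport) = sum(abs2, r.data) / length(r.data)

function Base.show(io::IO, r::LazyReport)
    SHOW_CALLS[] += 1
    print(io, "LazyReport(mean square = ", expensive_summary(r), ")")
end

r = LazyReport([1.0, 2.0, 3.0]);   # construction only; `show` is not called
text = sprint(show, r)             # roughly what happens when `;` is omitted
```

Constructing the object is cheap and the summary is only computed on display, which matches the behaviour described above.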