-
Notifications
You must be signed in to change notification settings - Fork 414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New type hierarchy for ValueSupport #945
Conversation
…ypes for countable and continuous support respectively, and discontinuous distributions in general.
I've just stripped type hierarchy out of #941 without any improvements to the distributions as requested by @matbesancon. Do feel free to stick with the original PR if it's manageable on its own! May be of interest to @mschauer also. |
Codecov Report
@@ Coverage Diff @@
## master #945 +/- ##
==========================================
- Coverage 77.96% 77.53% -0.43%
==========================================
Files 112 112
Lines 5363 5409 +46
==========================================
+ Hits 4181 4194 +13
- Misses 1182 1215 +33
Continue to review full report at Codecov.
|
minor comments, otherwise this will be good with some tests |
alright, we'll freeze this one and wait for #945 to be merged, then the diff will be much smaller |
Co-Authored-By: Mathieu Besançon <mathieu.besancon@gmail.com>
Co-Authored-By: Mathieu Besançon <mathieu.besancon@gmail.com>
…amework. Minor bugfix for MixtureModel with non-fully parameterised types and add testing.
Was that last message intended for #941 @matbesancon? In the meanwhile I've added some very minor fixes to distributions (and tests) that are currently overriding the default behaviour because they use non-Float64's in continuous distributions. |
Yes absolutely sorry! |
Thanks for appveyor good catch, maybe we should have this as a separate PR and merge it right away in a separate one? |
@richardreeve just did, you can merge master into your branch :) |
I'm now pretty sure that this is complete. The only other thing I could add to this branch is switching pdf() -> pmf() for all of the discrete distributions. |
Cool, let's see what the coverage is saying, also let's keep pmf for the next PR, that's big enough I think |
Aside from the comment you made above (that I still don't understand!), is this good to go? |
Sorry for the misunderstanding, I was thinking I'm going to have a last review and let others voice their opinion, but looks good yup :) |
Ah, no - I understood the comment about pmf() - it was the (review) comment about the docs above that I didn't understand. |
Ah ok sorry, it's about not forcing it to |
Okay, right. The new "default" is to use At some point it would be nice to generalise this, but it's non-trivial. In particular being about to extend to allow Unitful numbers for many continuous distributions, which is the first thing I'd like to do, requires us to do detailed type checking since |
having this alias may create a barrier to these generic re-writes, (which we very much want to achieve). I would use |
I would like to react to this first, can we agree on a time-line when this is merged? |
Let us say in one week from now? |
That would be great, thank you a lot. |
Why? We’ve already had a week (three weeks since the first PR, and over two since this one started) and we’ve achieved nothing except that I now have very little time or energy left to finish this work. I introduced this PR to improve the package, as the several thousand other lines of code that I’ve contributed to this and Distributions.jl, which have added functionality, fixed bugs and added improved testing. Why are we waiting any longer? |
There will likely be no release in a week from now, I still need to fix the "no-argument-check" thing, the PR I created introduces too much complexity, fixing it will not happen before then |
Talking about it, I think PR this will break I am not sure about |
This PR or #941? |
I can do #951 in a non-breaking way if you think it’s a desirable... it would basically involve removing things like I actually think that’s desirable as there are too many const aliases knocking around, but it’s not currently implemented... |
This one, because subtyping
|
Fair enough. I hadn’t spotted this. Phylo is my package so easily fixed, but in any event I can easily fix this by changing |
It sounds reasonable yes. Maybe we could deprecate |
Well, it is a PR which substantially changes the internals, so some breakage is rather expected. This is just what I found when looking. So much from myself now. |
Yes, I was thinking about Edit: actually, I don’t really believe this in this situation, so I’ll go with Support. Thanks for spotting it anyway, @mschauer. |
@richardreeve did you have time to do the changes? |
Hi @matbesancon - I was actually walking in the alps last week (must stop checking github when I get a signal!), so not yet, but I'm back now and I'll get it done tonight! Edit: I've got corrections to two theses to finish going through (somewhat!) unexpectedly so it'll have to be Thursday, sorry. |
@@ -67,10 +80,11 @@ into an array, depending on the variate form. | |||
nsamples(t::Type{Sampleable}, x::Any) | |||
nsamples(::Type{D}, x::Number) where {D<:Sampleable{Univariate}} = 1 | |||
nsamples(::Type{D}, x::AbstractArray) where {D<:Sampleable{Univariate}} = length(x) | |||
nsamples(::Type{D}, x::AbstractVector) where {D<:Sampleable{Multivariate}} = 1 | |||
nsamples(::Type{D}, x::AbstractArray{<:AbstractVector}) where {D<:Sampleable{Multivariate}} = length(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an unrelated change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was fixing the handling of numbers throughout the code and I spotted that multivariate was missing this function - this isn't a deletion... the edited code is in the next line:
nsamples(::Type{D}, x::AbstractVector{<:Number}) where {D<:Sampleable{Multivariate}} = 1
|
||
function expectation(distr::CountableUnivariateDistribution, | ||
g::Function, epsilon::Real) | ||
sum(support(distr)) do x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make sense, this is a loop over typically 2ˆ64 elements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a slight editing of the original code:
f = x->pdf(distr,x)
(leftEnd, rightEnd) = getEndpoints(distr, epsilon)
sum(x -> f(x)*g(x), leftEnd:rightEnd)
I don't have a strong opinion about whether it's sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are dropping the epsilon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. That's because this is for non-ContiguousSupport
types - they are not necessarily ordered, so we can't use getEndpoints()
. At the moment, the only type that falls into this category is DiscreteNonParametric
, but it will also include Dirac
, neither of which are expected to have anything like 2^64 elements..
length(searchsorted(support(d), x)) > 0 | ||
insupport(d::DiscreteNonParametric{T}, x::Number) where {T<:Number} = | ||
length(searchsorted(support(d), convert(T, x))) > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer support
to do the right thing here and not insupport
to second guess the needed transformation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you expect support
to do here? It's just returning the support, and then we're checking for presence of the value in that set.
FYI: I still have the "postponed by a week" in mind and will come back later today. |
One thing, to be clear, what would you consider the set of features I would have to reproduce when I would like to replace this PR by my own attempt? Besides making the additional tests pass? |
The problem with having split this off from #941 is that the new tests, which will emerge from the new distributions, are not included. The tests changed here mostly just change the test type hierarchy to match the new changes. This PR provides a mechanism for supporting non-Int, non-Float64 eltypes for countable and continuous support respectively, and discontinuous distributions in general, in a consistent and comprehensible way. It does this whilst keeping consistency with the old code base so there are no - or very few! - breaking changes, and without increasing the number of things that people will forget to do when implementing their own distributions. That's what it tries to do - it's an abstract thing, not a specific set of features per se. |
Note: This PR is now largely superseded by #951, which includes all of these changes, but also totally removes all references to the old const aliases from the codebase, and in so doing removes several new bugs that weren't identified here. This means that anyone referring to the code to understand how to write a new distribution will see the new API in action. I would strongly recommend merging that rather than this. Although it adds a few hundred lines of code, they are nearly all just replacing old aliases with more explicit types. |
Allows non-Int, non-Float64 eltypes for countable and continuous support respectively, and discontinuous distributions in general.