Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Allow DiscreteNonParametric to have non-Real support #916

Closed
wants to merge 1 commit into from
Closed

RFC: Allow DiscreteNonParametric to have non-Real support #916

wants to merge 1 commit into from

Conversation

DilumAluthge
Copy link

@DilumAluthge DilumAluthge commented Jun 18, 2019

Description

Let vs denote the list of support values used when creating an instance of DiscreteNonParametric. Currently, we require that:

  • vs::Ts where Ts<:AbstractVector{T} and T<:Real.

This pull request relaxes the requirement so that it is now:

  • vs::Ts where Ts<:AbstractVector{T}

This allows you to create instances of DiscreteNonParametric that have non-Real support.

All of the existing DiscreteNonParametric tests are preserved, to show that this change is backwards compatible and will have no effect on DiscreteNonParametric when T<:Real.

I also add some tests for DiscreteNonParametric when T is String.

Motivation

Suppose you have trained a multiclass classifier, and you would now like run the classifier on a single sample. For each of the classes, the classifier will output the probability of that class. For example, the output might be setosa=0.1, versicolor=0.7, verginica=0.2.

Since the output of a multiclass classifier (when run on a single sample) is a discrete non-parametric distribution on the set of classes, it would be natural to store it in a DiscreteNonParametric. However, currently DiscreteNonParametric requires that the support be a list of real values. This pull request relaxes the requirement and allows you to create a DiscreteNonParametric where the support values can be of any type (such as String).

@codecov-io
Copy link

codecov-io commented Jun 18, 2019

Codecov Report

Merging #916 into master will increase coverage by 2.35%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #916      +/-   ##
=========================================
+ Coverage   73.94%   76.3%   +2.35%     
=========================================
  Files         107     107              
  Lines        5269    5114     -155     
=========================================
+ Hits         3896    3902       +6     
+ Misses       1373    1212     -161
Impacted Files Coverage Δ
src/samplers/discretenonparametric.jl 100% <100%> (ø) ⬆️
src/univariate/discrete/discretenonparametric.jl 98.07% <100%> (+8.79%) ⬆️
src/univariates.jl 65.69% <100%> (+1.24%) ⬆️
src/multivariate/mvtdist.jl 52.74% <0%> (+0.57%) ⬆️
src/univariate/discrete/binomial.jl 73.4% <0%> (+0.77%) ⬆️
...c/univariate/continuous/generalizedextremevalue.jl 88.18% <0%> (+0.79%) ⬆️
src/matrix/wishart.jl 58.82% <0%> (+1.13%) ⬆️
src/multivariates.jl 79.03% <0%> (+1.25%) ⬆️
src/univariate/continuous/beta.jl 62.65% <0%> (+1.47%) ⬆️
... and 34 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 432545a...d438ccc. Read the comment docs.

@DilumAluthge
Copy link
Author

Bump. Any thoughts on this?

@richardreeve
Copy link
Contributor

Not that I have any say, but this looks like a good idea to me. However, am I right in thinking that DiscreteNonParametric currently rounds to integers? If so, then that seems like a slightly crazy restriction in its own right, and your code doesn't fix it... e.g.

logcdf(d::DiscreteUnivariateDistribution, x::Real) = logcdf(d, floor(Int,x))

@DilumAluthge
Copy link
Author

I wanted this pull request to be backwards-compatible, which is why I didn't make any changes to the behavior of DiscreteNonParametric when the support is Real.

I imagine that removing those calls to round, floor, etc. would probably be a breaking change.

@richardreeve
Copy link
Contributor

True, but only to people doing some seriously crazy stuff!

@DilumAluthge
Copy link
Author

Closing in favor of #941

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants