Some comments on the route to Distributions 1.0 #1109

Open
5 tasks
azev77 opened this issue May 5, 2020 · 1 comment

Comments

@azev77
Contributor

azev77 commented May 5, 2020

  • Make it easy to locate all relevant distributions.
    E.g., suppose I want all continuous univariate distributions with support [0, +∞).
    MLJ.jl makes this easy:
    using MLJ; X, y = @load_boston;
    m = models(): creates a vector of all 132 models.
    m = models(matching(X, y)): vector of 53 models that work w/ the data
    m = models(matching(X, y), x -> x.prediction_type == :deterministic): vector of 50 models
    Distributions.jl doesn't currently have an equivalent:
    Distributions.continuous_distributions uses lowercase names: arcsine should be Arcsine, etc.
    The following gets us part of the way there:
    filter(!isabstracttype, subtypes(Distribution))
    filter(!isabstracttype, subtypes(UnivariateDistribution))
    filter(!isabstracttype, subtypes(MultivariateDistribution))
    filter(!isabstracttype, subtypes(MatrixDistribution))
    filter(!isabstracttype, subtypes(ContinuousDistribution))
    filter(!isabstracttype, subtypes(ContinuousMultivariateDistribution))
    To find distributions whose support matches the data, we discussed: all(insupport.(dist, data))
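
A rough sketch of how the lookup above could work today. `subtypes` only returns direct subtypes, so this walks the type tree recursively. The helpers `concrete_subtypes`, `default_instance`, and `has_nonnegative_support` are hypothetical names, not part of Distributions.jl, and the support check assumes it is acceptable to judge support from the zero-argument constructor via `minimum`/`maximum`. That check misses types with no defaults, and it cannot tell open from closed endpoints.

```julia
using Distributions
using InteractiveUtils: subtypes  # needed outside the REPL

# Hypothetical helper: collect every non-abstract subtype of `T`, at any depth.
function concrete_subtypes(T::Type)
    found = Any[]
    for S in subtypes(T)
        if isabstracttype(S)
            append!(found, concrete_subtypes(S))
        else
            push!(found, S)
        end
    end
    return found
end

# Hypothetical helper: try the zero-argument constructor;
# return `nothing` when the type has no default parameters.
function default_instance(D)
    try
        return D()
    catch
        return nothing
    end
end

# Assumption: support is judged from the default instance only.
function has_nonnegative_support(D)
    d = default_instance(D)
    d === nothing && return false
    try
        return minimum(d) == 0 && maximum(d) == Inf
    catch
        return false
    end
end

univariate_continuous = concrete_subtypes(ContinuousUnivariateDistribution)
nonneg = filter(has_nonnegative_support, univariate_continuous)
```

Here `nonneg` should contain types such as `Exponential` and exclude `Normal`. The result depends on the default parameters, so it is only a starting point for a proper registry.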

  • It would simplify testing and other things if there were a slightly more structured (almost cookie-cutter) template for adding distributions.
    Some have no default parameters: Chi() throws MethodError: no method matching Chi()
    For an undefined or infinite mean, some distributions return NaN and others return Inf.
    mean(LogitNormal()) gives an error (perhaps compute it numerically?)
    For some distributions, entropy throws an error instead of returning NaN or Inf.
    Perhaps use a friendlier message: "The entropy for this distribution has not been coded. Please submit a PR."
    If no closed-form entropy is coded (or none exists), perhaps entropy() should compute it numerically?
    Fitting truncated distributions: see "Feature Request: Fit truncated normal" #1108
    Some distributions don't have quantiles coded: PGeneralizedGaussian, Skellam, VonMises
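
To illustrate the numerical fallback suggested above, here is a minimal sketch that approximates the mean and entropy of `LogitNormal()` from its `pdf` and `logpdf`. The helper `midpoint_integral` is a hypothetical name, and the midpoint rule is just one simple choice (a real fallback would probably use an adaptive quadrature package). It also assumes the support (0, 1) is known in advance.

```julia
using Distributions

# Hypothetical helper: midpoint-rule approximation of the integral of `f`
# over (a, b). The midpoints never touch the endpoints themselves.
function midpoint_integral(f, a, b; n = 10_000)
    h = (b - a) / n
    return h * sum(f(a + (i - 0.5) * h) for i in 1:n)
end

d = LogitNormal()  # default parameters; support is (0, 1)

# E[X] = ∫ x * pdf(x) dx
numeric_mean = midpoint_integral(x -> x * pdf(d, x), 0.0, 1.0)

# Differential entropy = -∫ pdf(x) * log(pdf(x)) dx
numeric_entropy = midpoint_integral(x -> -pdf(d, x) * logpdf(d, x), 0.0, 1.0)
```

With the default parameters the density is symmetric about 0.5, so `numeric_mean` should come out very close to 0.5.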

  • A convenient way to loop through all available (non-abstract type) distributions.
    Doing so turns up some inconsistencies:
    fieldnames(Normal) gives Unicode names: (:μ, :σ)
    fieldnames(Dirichlet) gives ASCII names: (:alpha, :alpha0, :lmnB)
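
A small sketch of the kind of loop meant here, flagging which distributions use non-ASCII field names. The helper `has_unicode_fields` is a hypothetical name, and the tuple of types is a hand-picked sample rather than a full listing.

```julia
using Distributions

# Hypothetical helper: true if any field name of `D` contains non-ASCII characters.
has_unicode_fields(D) = any(name -> !isascii(String(name)), fieldnames(D))

# Hand-picked sample; a full loop would iterate over all concrete distribution types.
for D in (Normal, Gamma, Beta, Dirichlet)
    flag = has_unicode_fields(D) ? "Unicode" : "ASCII"
    println(rpad(string(D), 12), rpad(flag, 10), fieldnames(D))
end
```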

  • Before 1.0, check out @cscherrer's note.

  • This repo is still missing many useful distributions (see the R Task Views).
    This seems like a great job for a student (maybe JSoC, GSoC, or otherwise).

@nickrobinson251
Contributor

xref #880


2 participants