[ENH] design discussion - `pdf` and `pmf` in distributions, discrete, continuous, and mixed #229
This PR adds a `pmf` and `log_pmf` method to the base interface. Fixes #289.

In accordance with #229, these return 0 and `-np.inf`, respectively, if the distribution is continuous.

Also makes the following, connected changes:

* `pdf` returns 0 for discrete distributions
* removes the discrete/continuous handling logic from the `scipy` adapter, as this is now in the base class

I've also changed the way in which `TestScipyAdapter` queries the distributions: by inheritance, not by tag. This is because the tag is "mechanical" (for internal testing only), and it might confuse users to see a value in `object_type` which is not related to an external API property.
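A minimal sketch of the default behaviour described in this PR, for illustration only. The class names (`BaseDistributionSketch`, `StandardNormalSketch`, `BernoulliSketch`), the `kind` attribute, and the private `_pdf` / `_pmf` hooks are hypothetical and are not taken from the actual code base; only the return values (0 for the "wrong" method, `-np.inf` for its log) follow the description above.

```python
import numpy as np


class BaseDistributionSketch:
    """Hypothetical base class; names are illustrative, not the library's API."""

    # assumed flag, either "continuous" or "discrete"
    kind = "continuous"

    def _pdf(self, x):
        raise NotImplementedError

    def _pmf(self, x):
        raise NotImplementedError

    def pdf(self, x):
        x = np.asarray(x, dtype=float)
        # a discrete distribution has no absolutely continuous part
        if self.kind == "discrete":
            return np.zeros_like(x)
        return self._pdf(x)

    def pmf(self, x):
        x = np.asarray(x, dtype=float)
        # a continuous distribution has no point masses
        if self.kind == "continuous":
            return np.zeros_like(x)
        return self._pmf(x)

    def log_pmf(self, x):
        # log(0) evaluates to -inf; silence the divide-by-zero warning
        with np.errstate(divide="ignore"):
            return np.log(self.pmf(x))


class StandardNormalSketch(BaseDistributionSketch):
    kind = "continuous"

    def _pdf(self, x):
        return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)


class BernoulliSketch(BaseDistributionSketch):
    kind = "discrete"

    def __init__(self, p):
        self.p = p

    def _pmf(self, x):
        # mass p at 1, mass 1 - p at 0, zero elsewhere
        return np.where(x == 1, self.p, np.where(x == 0, 1 - self.p, 0.0))


normal = StandardNormalSketch()
bern = BernoulliSketch(p=0.3)
```

With these definitions, `normal.pmf(x)` is zero everywhere and `normal.log_pmf(x)` is `-np.inf` everywhere, while `bern.pdf(x)` is zero everywhere and `bern.pmf` carries all of the probability mass.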
This is a design discussion on how to handle `pdf` and `pmf` in distributions, which can be discrete, continuous (short for "absolutely continuous"), or mixed. The assumption is a domain on the real numbers, and distributions without singular component.

`scipy` handles these as follows:

* `pmf` is present and `pdf` is not present, for discrete distributions.
* `pdf` is present and `pmf` is not present, for continuous distributions.

I think it would be more consistent with composition and unified interfaces a la `sklearn` if all distributions had all these methods, and they corresponded to the measures in the Lebesgue decomposition. That is,

* `pmf` and `pdf` are present in all distributions
* `pmf` and `pdf` together define a probability measure

In particular, this would mean:

* for discrete distributions, `pmf` sums to one, and `pdf` is always zero
* for continuous distributions, `pdf` integrates to one, and `pmf` is always zero
* for mixed distributions, the integral of `pdf` and the sum of `pmf` add up to one. In general, neither the `pdf` integral nor the `pmf` sum is equal to one by itself.

Being faithful to the Lebesgue decomposition also has an advantage in mixtures: the `pdf` and `pmf` of a mixture `m = Mixture([d1, d2], [w1, w2])` satisfy `m.pdf = w1 * d1.pdf + w2 * d2.pdf` and `m.pmf = w1 * d1.pmf + w2 * d2.pmf`, irrespective of the components `d1`, `d2` being continuous, discrete, or mixed (assuming `w1 + w2 == 1`).

In a sense, this seems to be the convention that treats all edge cases consistently.
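
To make the mixture argument concrete, here is a rough numerical sketch. It is not the proposed `Mixture` class: each distribution is represented as a hypothetical `(pdf, pmf)` pair of plain callables, and the helper names (`normal_pdf`, `bernoulli_pmf`, `zero`, `mixture`) as well as the weights 0.6 and 0.4 are assumptions chosen for illustration.

```python
import numpy as np


def zero(x):
    """The identically-zero function, used for the missing part of a distribution."""
    return np.zeros_like(np.asarray(x, dtype=float))


def normal_pdf(x):
    """Density of the standard normal distribution."""
    x = np.asarray(x, dtype=float)
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)


def bernoulli_pmf(x, p=0.3):
    """Point masses of a Bernoulli(p): p at 1, 1 - p at 0, zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where(x == 1, p, np.where(x == 0, 1 - p, 0.0))


def mixture(components, weights):
    """Return the (pdf, pmf) pair of a mixture, as weighted sums of the parts."""

    def pdf(x):
        return sum(w * c_pdf(x) for w, (c_pdf, _) in zip(weights, components))

    def pmf(x):
        return sum(w * c_pmf(x) for w, (_, c_pmf) in zip(weights, components))

    return pdf, pmf


d1 = (normal_pdf, zero)      # continuous: pmf is always zero
d2 = (zero, bernoulli_pmf)   # discrete: pdf is always zero
m_pdf, m_pmf = mixture([d1, d2], [0.6, 0.4])

# Riemann sum of the mixture pdf over a wide grid (continuous part of the mass)
grid = np.linspace(-10, 10, 20001)
dx = grid[1] - grid[0]
continuous_mass = float(np.sum(m_pdf(grid)) * dx)

# sum of the mixture pmf over the support of the discrete component
discrete_mass = float(np.sum(m_pmf(np.array([0.0, 1.0]))))

total_mass = continuous_mass + discrete_mass
```

Here the mixture is a genuinely mixed distribution: the continuous part carries a mass of about 0.6, the discrete part carries 0.4, and only their sum is one. The same `mixture` function works unchanged if a component is itself mixed, which is the point of having both methods on every distribution.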
Thoughts?