
Adding StatsBase.predict to the API #466

Open
sethaxen opened this issue Feb 20, 2023 · 7 comments

@sethaxen
Member

In Turing, StatsBase.predict is overloaded to dispatch on DynamicPPL.Model and MCMCChains.Chains (https://github.com/TuringLang/Turing.jl/blob/d76d914231db0198b99e5ca5d69d80934ee016b3/src/inference/Inference.jl#L532-L564). This effectively does batch prediction: it conditions the model on each draw in the chain and calls rand on the conditioned model. We also want to do the same thing for InferenceData (see #465).
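Conceptually, that overload does something along these lines (an illustrative sketch only, using a hypothetical `draws` iterable of NamedTuples extracted from the chain; the actual implementation in Inference.jl handles the chain bookkeeping differently):

```julia
using Random
using DynamicPPL: Model, condition

# Illustrative sketch of batch prediction: condition the model on each
# posterior draw and sample the remaining (unobserved) variables.
function predict_sketch(rng::Random.AbstractRNG, model::Model, draws)
    return [rand(rng, condition(model, draw)) for draw in draws]
end
```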

It would be convenient if StatsBase.predict were added to the DynamicPPL API; it's already an indirect dependency of this package. As suggested by @devmotion in #465 (comment), its default implementation could simply call rand on a conditioned model:

StatsBase.predict(rng::AbstractRNG, model::DynamicPPL.Model, x) = rand(rng, condition(model, x))
StatsBase.predict(model::DynamicPPL.Model, x) = predict(Random.default_rng(), model, x)
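With those two methods defined, a call for a toy model could look like this (a hedged sketch; the `demo` model is made up and the methods above are assumed to be in place):

```julia
using DynamicPPL, Distributions, Random, StatsBase

@model function demo()
    μ ~ Normal()
    y ~ Normal(μ, 1)
end

# Condition on a single draw for μ and sample the remaining variable;
# returns a NamedTuple such as (y = 0.81,).
StatsBase.predict(Random.default_rng(), demo(), (μ = 0.3,))
```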
@devmotion
Member

Maybe this could even be part of AbstractPPL and be defined on AbstractPPL.AbstractProbabilisticProgram: condition is already part of its API; only rand is not clearly specified there yet (which probably should be done anyway).

@sethaxen
Member Author

Yeah, makes sense.

@torfjelde
Member

I'm down with this, but it's worth pointing out that just calling rand(rng, condition(model, x)) is probably not the greatest idea as it defaults to NamedTuple which can blow up compilation times for many models.

And regarding adding this to APPL: we would then need to propagate that change back to v0.5 too, because v0.6 is currently not compatible with DPPL (see #440).

@sethaxen
Member Author

I'm down with this, but it's worth pointing out that just calling rand(rng, condition(model, x)) is probably not the greatest idea as it defaults to NamedTuple which can blow up compilation times for many models.

Would rand(rng, OrderedDict, condition(model, x)) be the way to go then?

@torfjelde
Member

Would rand(rng, OrderedDict, condition(model, x)) be the way to go then?

For maximal model-compat, yes. But you do of course take a performance hit as a result 😕

@sethaxen
Member Author

Hrm. Maybe predict should then use a NamedTuple if x is a NamedTuple (imperfect, because you can have few parameters but many data points). Or provide an API for specifying the return type, like rand does (but supporting two optional positional parameters, rng and T, complicates the interface).
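One way to spell out the first option (purely illustrative, not a settled API) would be to pick the output container from the type of x:

```julia
using Random, StatsBase
using OrderedCollections: OrderedDict
using DynamicPPL: Model, condition

# Illustrative only: NamedTuple input keeps the fast NamedTuple path,
# anything else falls back to the more general OrderedDict output.
StatsBase.predict(rng::Random.AbstractRNG, model::Model, x::NamedTuple) =
    rand(rng, NamedTuple, condition(model, x))
StatsBase.predict(rng::Random.AbstractRNG, model::Model, x) =
    rand(rng, OrderedDict, condition(model, x))
```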

@devmotion
Member

Or provide an API for specifying the return type, like rand does (but supporting two optional positional parameters rng and T complicates the interface)

Adding T to predict (with some default) would be in line with our API for rand though - there the type T can already be specified.
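A sketch of what that could look like, mirroring the rand signatures (the NamedTuple default is an arbitrary choice here, not a decision from this thread):

```julia
using Random, StatsBase
using DynamicPPL: Model, condition

# Hypothetical: expose the output container type T, as rand already does.
StatsBase.predict(rng::Random.AbstractRNG, ::Type{T}, model::Model, x) where {T} =
    rand(rng, T, condition(model, x))
StatsBase.predict(rng::Random.AbstractRNG, model::Model, x) =
    StatsBase.predict(rng, NamedTuple, model, x)
StatsBase.predict(model::Model, x) =
    StatsBase.predict(Random.default_rng(), NamedTuple, model, x)
```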

bors bot pushed a commit to TuringLang/AbstractPPL.jl that referenced this issue Feb 25, 2023
This PR adds a 3-arg form of `rand` (suggested by @devmotion in TuringLang/DynamicPPL.jl#466 (comment)) to the interface for `AbstractProbabilisticProgram` and implements the default 1- and 2-arg methods that dispatch to this.

Currently tests fail because this breaks the fallbacks for `GraphPPL.Model`, which expects `rand` to forward to its `rand!` method. I'm not certain how we want to define the interface for this `Model`.

Co-authored-by: Xianda Sun <sunxdt@gmail.com>
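The dispatch chain described in that commit message might look roughly like this (a sketch against AbstractPPL, not the exact code from the PR; the NamedTuple default is assumed):

```julia
using Random
using AbstractPPL: AbstractProbabilisticProgram

# The 1- and 2-arg methods funnel into the 3-arg form, which concrete model
# types (e.g. DynamicPPL.Model) are expected to implement.
Base.rand(rng::Random.AbstractRNG, model::AbstractProbabilisticProgram) =
    rand(rng, NamedTuple, model)
Base.rand(::Type{T}, model::AbstractProbabilisticProgram) where {T} =
    rand(Random.default_rng(), T, model)
Base.rand(model::AbstractProbabilisticProgram) =
    rand(Random.default_rng(), NamedTuple, model)
```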