Generative AI meets Probabilistic Programming.
ppchain: an open-source toolkit for intuitive, effective modeling.
Your copilot to build model internal representations and optimize your Bayesian workflow.
ppchain aims to ease the pains of building a model.
Following the 3 main steps of the Bayesian data analysis process, as defined in [1], ppchain provides a (progressively growing) toolbox of AI-assisted functions aiming to make your life easier along the way:
- Setting up a full probability model: a joint probability distribution for all observable and unobservable quantities in a problem. ppchain searches for domain knowledge about your underlying problem and helps build an internal representation that is consistent with both background knowledge and collected data.
- Conditioning on observed data: calculating and interpreting the appropriate posterior distribution, that is, the conditional probability distribution of the unobserved quantities of ultimate interest, given the observed data.
- Evaluating the fit of the model and the implications of the resulting posterior distribution: How well does the model fit the data? Are the substantive conclusions reasonable? How sensitive are the results to the modeling assumptions made?
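To make the three steps concrete, here is a minimal sketch in plain Python using a conjugate Beta-Binomial model. It does not use the ppchain API: the `Beta` class, the `update` helper, the prior parameters, and the data (7 successes in 10 trials) are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(alpha, beta) distribution over a probability theta in [0, 1]."""

    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def update(prior: Beta, successes: int, trials: int) -> Beta:
    """Condition on Binomial data.

    The Beta prior is conjugate to the Binomial likelihood, so the
    posterior is again a Beta with the observed counts added in.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return Beta(prior.alpha + successes, prior.beta + (trials - successes))


# Step 1: full probability model = Beta prior on theta + Binomial likelihood.
prior = Beta(alpha=2.0, beta=2.0)  # illustrative, weakly informative prior

# Step 2: condition on (made-up) observed data: 7 successes in 10 trials.
posterior = update(prior, successes=7, trials=10)

# Step 3: a crude sensitivity check: refit under a flat Beta(1, 1) prior
# and compare the two posterior means.
flat_posterior = update(Beta(alpha=1.0, beta=1.0), successes=7, trials=10)

print(posterior, posterior.mean)
print(flat_posterior, flat_posterior.mean)
```

If the two posterior means differ by more than the precision you care about, the conclusions depend noticeably on the prior and the prior deserves more scrutiny.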
ppchain provides a (progressively growing) set of AI-assisted functions to progress through the following workflow:
- Define the problem statement
  - Problem statement (conversational AI)
  - Specify hypothesis
  - Select model type
  - Data collection method
- Formalize priors, $P(\theta)$
  - Search for background knowledge
  - Prior elicitation
  - Formalize prior distributions
  - Prior predictive checks
- Determine the likelihood function, $P(y \mid \theta)$
  - Search for background knowledge
  - Load & preprocess data
  - Formalize the likelihood function
- Compute the posterior distribution, $P(\theta \mid y) \propto P(y \mid \theta) \, P(\theta)$
  - Variables selection, identifying the subset of predictors to include in the model
  - Determine the functional form of the model
  - Fit the model to the observed data to estimate the unknown model parameters
  - Compute posterior distribution
- Run posterior inference
  - Compute posterior inference
  - Posterior predictive checking
  - Sensitivity analysis
  - Make predictions about future events
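The workflow above can be sketched end to end without any probabilistic programming library. The following example is an assumption-laden illustration, not the ppchain API: it approximates $P(\theta \mid y) \propto P(y \mid \theta) \, P(\theta)$ on a grid and runs prior and posterior predictive simulations. All function names, the Beta(2, 2)-shaped prior, and the data (7 successes in 10 trials) are hypothetical.

```python
import math
import random


def prior_density(theta):
    """Unnormalized Beta(2, 2) prior, P(theta) up to a constant."""
    return theta * (1 - theta)


def binomial_likelihood(successes, trials):
    """Return P(y | theta) as a function of theta for fixed data y."""
    coeff = math.comb(trials, successes)
    return lambda theta: coeff * theta**successes * (1 - theta) ** (trials - successes)


def grid_posterior(prior, likelihood, grid):
    """Normalize prior(theta) * likelihood(theta) over a grid of theta values."""
    unnormalized = [prior(t) * likelihood(t) for t in grid]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def simulate_successes(rng, thetas, weights, trials, draws):
    """Predictive simulation: draw theta from the weights, then y given theta."""
    sampled = rng.choices(thetas, weights=weights, k=draws)
    return [sum(rng.random() < t for _ in range(trials)) for t in sampled]


grid = [i / 200 for i in range(1, 200)]  # theta values strictly inside (0, 1)
prior_weights = [prior_density(t) for t in grid]
posterior_weights = grid_posterior(prior_density, binomial_likelihood(7, 10), grid)

rng = random.Random(0)  # fixed seed so the simulation is reproducible
# Prior predictive check: what data does the prior alone consider plausible?
prior_pred = simulate_successes(rng, grid, prior_weights, trials=10, draws=2000)
# Posterior predictive check: does data simulated from the fit resemble y = 7?
post_pred = simulate_successes(rng, grid, posterior_weights, trials=10, draws=2000)

posterior_mean = sum(t * w for t, w in zip(grid, posterior_weights))
print(f"posterior mean of theta: {posterior_mean:.3f}")
print(f"prior predictive mean: {sum(prior_pred) / len(prior_pred):.2f}")
print(f"posterior predictive mean: {sum(post_pred) / len(post_pred):.2f}")
```

Grid approximation only scales to a handful of parameters; for realistic models, a sampler from a probabilistic programming library would replace `grid_posterior`, while the predictive checks keep the same shape.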
- Documentation: https://ppchain.readthedocs.io
Contributions are very welcome, whether it is in the form of a new feature, improved infrastructure, or better documentation. For detailed information on how to contribute, see CONTRIBUTING.
If you are interested in getting further involved with the ValueGrid team, please contact us.
Usage is provided under the MIT license. See LICENSE for full details.
- Initial inspiration for ppchain came from Thomas Wiecki, PhD and Daniel Lee, as explained in more detail in this LinkedIn post and Medium article.
- This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
[1] Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A. & Rubin, D. B. (2013). Bayesian Data Analysis (3rd ed.). Chapman & Hall/CRC.