gtypes is a prototype implementation of a probabilistic programming language that (i) features a coroutine-based paradigm for implementing generative models and custom inference guides (e.g., proposals for importance sampling), and (ii) uses a novel guide-type system to ensure that the distributions specified by a model and its guide have the same support; as a consequence, the model-guide pair is provably sound for probabilistic inference. Currently, our implementation compiles models and guides to Pyro programs, for either importance sampling or variational inference.
- Installing opam:
$ sh <(curl -sL https://raw.githubusercontent.com/ocaml/opam/master/shell/install.sh)
$ opam init -c 4.10.0
$ eval $(opam env)
- Installing the required packages:
$ opam install dune dune-build-info core menhir
$ pip3 install pyro-ppl greenlet
- Compiling gtypes:
Run make
to compile the main binary. You may then
also run make install
to install gtypes as an opam package.
gtypes inputs are monadic typed functional programs. Primitive types include
unit
, bool
, nat[n]
for natural numbers smaller than n
, nat
for natural
numbers, ureal
for real numbers on the unit interval, preal
for positive
real numbers, and real
for real numbers. Basic types are primitive types,
arrows (for functions), products (for pairs), Tb dist
for distributions on
inhabitants of basic type Tb
, and (Tp, [m1; m2; ...]) tensor
for tensors of
primitive type Tp
and shape [m1; m2; ...]
. For interaction with external code,
gtypes also allows external types, which are arbitrary identifiers.
gtypes programs separate expressions and commands. Expressions describe pure, effect-less computations, i.e., deterministic computations without any randomness. The expression language is essentially a simply-typed lambda calculus with constructors for primitive types. Commands deal with randomness, introduced by sampling commands, as well as communication among coroutines. gtypes adapts a message-passing mechanism; in a system, there are some named channels, each of which connects two processes, and processes can send/receive a value on channels.
Below is a list of probability distributions supported by gtypes:
-
BER(prob)
: Bernoulli distributions -
UNIF
: The uniform distribution on the unit interval -
BETA(concentration1; concentration0)
: Beta distributions -
GAMMA(concentration; rate)
: Gamma distributions -
Normal(loc; scale)
: Normal distributions -
CAT(prob1; prob2; ...; probn)
: Categorical distributions -
BIN(total_count; prob)
: Binomial distributions -
GEO(prob)
: Geometric distributions -
POIS(rate)
: Poisson distributions
Commands are organized in a monadic syntax:
-
return E
: Evaluate the expressionE
and return its value. -
x <- C1; C2
: Execute the commandC1
, bind the value tox
, and continue to the commandC2
. -
Pname(E1; E2; ...)
: Call the procedurePname
with argumentsE1
,E2
, ... -
sample{ch}(E)
: Evaluate the expressionE
to get a distributiond
, draw a random samplev
fromd
, and sendv
to the channel namedch
. -
observe{ch}(E)
: Evaluate the expressionE
to get a distributiond
, receive a valuev
from the channel namedch
, and proceed as ifv
is sampled fromd
. -
if{ch} E then C1 else C2
: Evaluate the expression E to get a Booleanb
, pickC1
orC2
based onb
, and sendb
to the channel namedch
. -
if{ch} . then C1 else C2
: Receive a Booleanb
from the channelch
, and proceed toC1
orC2
based on the value ofb
. -
if E then C1 else C2
: A standard conditional command. -
loop[n; Einit]( fn (x : Tb) -> C )
: Loop forn
iterations with an accumulator of basic typeTb
and initial valueEinit
; in each iteration, bind the accumulator from last iteration tox
and execute the commandC
, whose return value is the next accumulator. -
iter[E; Einit]( fn t (x : Tb) -> C )
: EvaluateE
to get a tensor, and then loop through the tensor along its first dimension with an accumulator of basic typeTb
and initial valueEinit
; in each iteration, bind the sub-tensor tot
and the accumulator from last iteration tox
, then execute the commandC
.
A program consists of a sequence of top-level bindings, each of which takes one of the forms below:
-
external ename : Tb
Declare an external binding of type
Tb
. This is usually used to allow operations on values of external types. -
type Gname
Declare a guide-type identifer
Gname
. gtypes requires each channel in the system to be associate with such an identifier; when type-checking, gtypes infers a communication protocol for each channel and ensures that different parties that operates on the same channel have followed the same communication protocol. -
proc Pname(x1 : Tb1; x2 : Tb2; ...) -> Tbr | ch1 : Gname1 | ch2 : Gname2 = C
Define a procedure
Pname
. Its formal parameters arex1
,x2
, ... with basic typesTb1
,Tb2
, ..., respectively. Its return type isTbr
. gtypes assumes each process is an invocation of a procedure, and the process can access at most two channels: it consumesch1
of guide-type identifierGname1
, and providesch2
of guide-type identiferGname2
. In case a procedure does not consume or provide a channel, you can write.
for thech : Gname
part.PS: in some inference algorithms, it is useful to have a notion for global variables, which are not possible in gtypes. For example, variational inference defines guides as parameterized distributions and then optimizes the parameters. To aid the inference, gtypes allows
proc Pname[y1 : Tb1; y2 : Tb2; ...](...) ...
to specify those global parametersy1
,y2
, ...
Examples can be found in the folder bench
.
For a description of the command-line usage options, run gtypes help
.
gtypes has several modes that specify different stages of compilation to Pyro:
-
only-parse FILE
: Only parse the input program and check lexing/parsing errors -
type-check FILE
: Type-check the input program, infer and check all the communication protocols -
normalize FILE
: Translate the input program to a-normal-form (this mode currently does not have any output, so it is mainly used to test normalization) -
compile-m FILE -o file.py
: Compile a model program to a Pyro program and write it tofile.py
-
compile-g -o file.py -m FILE1 -g FILE2
: Compile a guide program (for a specific model program) to a Pyro program and write it tofile.py
Example usage:
$ gtypes compile-g -m bench/anglican/branching -g bench/anglican/branching-proposal -o guide.py