
Add rewrite for sum of normal RVs #239

Merged: 1 commit merged into aesara-devs:main on Apr 19, 2023

Conversation

@larryshamalama (Contributor) commented Mar 7, 2023

Closes 1/3 of #238, for now. As of opening this PR, the rewrite works for scalar-valued normal RVs:

import aesara.tensor as at
from aeppl.rewriting import construct_ir_fgraph

import aesara

srng = at.random.RandomStream(0)
X_rv = srng.normal(1, 3)
Y_rv = srng.normal(2, 4)
Z_rv = X_rv + Y_rv

fgraph, _, _ = construct_ir_fgraph({Z_rv: Z_rv.copy()})

aesara.dprint(fgraph.outputs[0])
# (below is subject to change...)
# ValuedVariable [id A]
#  |normal_rv{0, (0, 0), floatX, False}.1 [id B]
#  | |RandomGeneratorSharedVariable(<Generator(PCG64) at 0x12EB2D7E0>) [id C]
#  | |TensorConstant{[]} [id D]
#  | |TensorConstant{11} [id E]
#  | |Elemwise{add,no_inplace} [id F]
#  | | |TensorConstant{1} [id G]
#  | | |TensorConstant{2} [id H]
#  | |Elemwise{sqrt,no_inplace} [id I]
#  |   |Elemwise{add,no_inplace} [id J]
#  |     |Elemwise{pow,no_inplace} [id K]
#  |     | |TensorConstant{3} [id L]
#  |     | |TensorConstant{2} [id M]
#  |     |Elemwise{pow,no_inplace} [id N]
#  |       |TensorConstant{4} [id O]
#  |       |TensorConstant{2} [id P]
#  |normal_rv{0, (0, 0), floatX, False}.1 [id B]
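
For reference, the identity this rewrite encodes is that, for independent X ~ N(mu_X, sigma_X) and Y ~ N(mu_Y, sigma_Y), X + Y ~ N(mu_X + mu_Y, sqrt(sigma_X^2 + sigma_Y^2)), which is what the means and standard deviations in the IR above show. A quick NumPy sanity check of the identity (illustrative only, not part of this PR):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(1, 3, size=500_000)
y = rng.normal(2, 4, size=500_000)

# X + Y should be distributed as N(1 + 2, sqrt(3**2 + 4**2)) = N(3, 5)
print((x + y).mean(), (x + y).std())  # approximately 3 and 5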

I'm happy to receive comments and then address the list below. The main to-do is to extend to matrix-valued normal RVs.

Some questions:

  • Checking independence in AePPL: I believe this identity would also hold under conditional independence, but I'd have to think about this.
  • What to put for the rng argument in the make_node? For now, I left it as the rng of the first RV input, but this is incorrect.
  • I added the rewrite as an EquilibriumGraphRewriter. Also not sure about this.
  • I added this rewrite in a new file, math_stat.py. Would there be a better name for this file? Or should this be added to an existing file?

Happy to hear any comments!

@larryshamalama larryshamalama marked this pull request as draft March 7, 2023 00:03
@larryshamalama larryshamalama force-pushed the conv-norm-rv branch 2 times, most recently from 590044e to c8ceb6d on March 7, 2023 12:15
codecov bot commented Mar 7, 2023

Codecov Report

Patch coverage: 95.65% and project coverage change: -0.01 ⚠️

Comparison is base (906c10d) 95.76% compared to head (325afb8) 95.76%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #239      +/-   ##
==========================================
- Coverage   95.76%   95.76%   -0.01%     
==========================================
  Files          12       13       +1     
  Lines        2006     2029      +23     
  Branches      243      246       +3     
==========================================
+ Hits         1921     1943      +22     
  Misses         46       46              
- Partials       39       40       +1     
Impacted Files Coverage Δ
aeppl/convolutions.py 95.65% <95.65%> (ø)


@brandonwillard brandonwillard added the enhancement, graph rewriting, and rv-transforms labels Mar 7, 2023
(Resolved, now-outdated review comment on aeppl/math_stat.py)
@brandonwillard (Member) left a comment

This looks good!

I've added some comments regarding the handling of RNG objects, but that's a more general concern that we're trying to address in all of our rewriting work, and not a statement about this specific approach/implementation.

mu_y, sigma_y = Y_rv.owner.inputs[-2:]

new_node = normal.make_node(
    X_rv.owner.inputs[0],  # temporary rng?
@brandonwillard (Member) commented Mar 7, 2023

This is a tough one we've been trying to deal with more generally. The real complications come from any "updates" that are associated with the RNG objects.

More specifically, RNG objects are usually SharedVariables, and those can have SharedVariable.default_update attributes that hold onto other Variables (i.e. graphs representing the new value of the SharedVariable after each call to a compiled Aesara function with this update). In the case of RNG objects created by RandomStreams, those default updates are the RNG objects output after drawing a sample from a RandomVariable node. In other words, the updates mechanism replaces a shared RNG object with a copy of the RNG object after a sample has been drawn using it, so the Aesara updates mechanism is emulating an in-place update of the RNG (e.g. just as rng.normal() automatically updates the internal state of rng in NumPy).

When performing replacements like this, it's possible that rng = X_rv.owner.inputs[0] will have a rng.default_update containing the original X_rv graph, and, if someone attempted to aesara.compile the resulting new_node, Aesara would pick up that old graph from the re-used RNG's default_update attribute and add it to the compiled results. We definitely don't want to have to sample some unrelated graphs just to update the RNG objects, especially when those update graphs aren't consistent with the underlying sampling process.

Anyway, I'm pointing this out because it's a general design issue and usability complication of which I'm trying to make more people aware—mostly so we can fix it/improve the usability.

In this exact case, we can probably just clone the RNG SharedVariables and add our own default_updates to those (e.g. similar to how RandomStream.gen does). Also, the graphs produced here are only supposed to be used as an intermediate representation for obtaining log-probabilities, so, as long as we don't expect people to actually compile and sample these graphs, the default updates shouldn't matter, and we can probably just clone the RNGs and remove their default updates altogether.

@larryshamalama (Contributor, Author) commented Mar 9, 2023

Hmm, there's a lot of info here, but I think I have the general picture. Thanks for the explanation.

When you say clone the RNG SharedVariable, do you mean:

new_rng = X_rv.owner.inputs[0].clone()
# something more needs to be done to `new_rng`...

new_node = normal.make_node(rng, *other_inputs)
new_node.inputs[0].default_update = new_node

Something is certainly off because x_rv.owner.inputs[0].default_update == x_rv yields False.

I'm also confused because it feels like there are two RNGs here, one from X_rv and one from Y_rv... Or should I create a new RNG object akin to what's being done in RandomStream.gen?

@brandonwillard (Member) commented Mar 18, 2023

When you say clone the RNG SharedVariable, do you mean:

Yeah, with one important difference, though:

new_rng = X_rv.owner.inputs[0].clone()
# something more needs to be done to `new_rng`...

new_node = normal.make_node(rng, *other_inputs)
new_rng.default_update = new_node.outputs[0]

In other words, it's the new_rng that needs to be updated with the values of the RNGs output by new_node.

I'm also confused because it feels like there are two RNGs here, one from X_rv and one from Y_rv

Which RNG we use is really just a choice. The only thing we need to consider is the user-level seeding, and, as long as we choose an RNG from one of the existing RandomVariables, we should maintain some consistency with the seeding. If we use RandomStream.gen, then we can't generate a connection with the user's seeding unless we have their RandomStream instance.

@brandonwillard (Member) added:

Also, because this whole RandomVariable + SharedVariable.default_update situation is such a mess, I've created this: aesara-devs/aesara#1478.

(Resolved, now-outdated review comments on tests/test_math_stat.py and aeppl/__init__.py)
@brandonwillard (Member) commented Mar 15, 2023

This is an elaboration of #239 (comment).

  • What to put for the rng argument in the make_node? For now, I left it as the rng of the first RV input, but this is incorrect.

That's a very important question! Those inputs are usually SharedVariables with non-None SharedVariable.default_update values. More specifically, those values are the RNG objects output by a RandomVariable Op—i.e. they're the updated RNGs obtained after sampling a random variable.

Altogether, the SharedVariable.default_update value of an RNG is a Variable representing a graph that draws samples, and it's intended to serve as a way of specifying something like an in-place update on the RNG state.

Here's an illustration:

import aesara
import aesara.tensor as at


srng = at.random.RandomStream(23092)

X_rv = srng.normal(0, 1, name="X")

aesara.dprint(X_rv, print_default_updates=True)

# normal_rv{0, (0, 0), floatX, False}.1 [id A] 'X'
#  |RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F44C55C1120>) [id B] <- [id A]
#  |TensorConstant{[]} [id C]
#  |TensorConstant{11} [id D]
#  |TensorConstant{0} [id E]
#  |TensorConstant{1} [id F]
#
# Default updates:
#
# normal_rv{0, (0, 0), floatX, False}.0 [id A]
#  |RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F44C55C1120>) [id B] <- [id A]
#  |TensorConstant{[]} [id C]
#  |TensorConstant{11} [id D]
#  |TensorConstant{0} [id E]
#  |TensorConstant{1} [id F]

Notice the .1 at the end of the first normal_rv line; that indicates that the second output (i.e. the sample value) is returned from the RandomVariable node given by id A. The first input to that node is a SharedVariable RNG object (i.e. id B), and its debug print line ends in [id B] <- [id A], which indicates that the SharedVariable is updated by one of the outputs of id A. The default update graph printed at the end shows that the first output (i.e. the .0) of the same id A node is the "update value".

When this graph is aesara.function-compiled, the SharedVariable.default_update graph will be discovered and added as a new output to the entire compiled graph, then, when the compiled function is evaluated, the Linker that executes the code will replace the value in the SharedVariable with the value of that new update output that was added (i.e. the RNG value after sampling).
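
As a rough sanity check (not part of the original comment), one can watch the update happen by compiling and calling the graph above; the shared RNG's state changes after each call:

rng_shared = X_rv.owner.inputs[0]
X_fn = aesara.function([], X_rv)

state_before = rng_shared.get_value().bit_generator.state
X_fn()
state_after = rng_shared.get_value().bit_generator.state

# The default update replaced the shared RNG with the post-sampling RNG
assert state_before != state_after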

In pure Python, this whole situation is similar to the following:

from copy import copy
import numpy as np


rng = np.random.default_rng(2309)


def draw_normal(rng):
    new_rng = copy(rng)
    res = new_rng.normal(0, 1)
    return new_rng, res


# Replace the old `rng` with the returned RNG
rng, res = draw_normal(rng)

Now, problems arise when a SharedVariable RNG with a non-None .default_update is used as an input to a different RandomVariable Op, because then the old sampling graph will show up in the compiled results.

This is what happens when we re-use an RNG with default updates for another graph:

# The RNG from `X`
rng_X = X_rv.owner.inputs[0]

# Manually create a `RandomVariable` with a specific RNG
Y_rv = at.random.gamma(0.5, 0.5, name="Y", rng=rng_X)

# Compile the function
Y_rv_fn = aesara.function([], Y_rv)

# View the compiled graph
aesara.dprint(Y_rv_fn)

# gamma_rv{0, (0, 0), floatX, True}.1 [id A] 'Y' 1
#  |RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F44C55C1120>) [id B]
#  |TensorConstant{[]} [id C]
#  |TensorConstant{11} [id D]
#  |TensorConstant{0.5} [id E]
#  |TensorConstant{2.0} [id F]
# normal_rv{0, (0, 0), floatX, False}.0 [id G] 0
#  |RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7F44C55C1120>) [id B]
#  |TensorConstant{[]} [id C]
#  |TensorConstant{11} [id D]
#  |TensorConstant{0} [id H]
#  |TensorConstant{1} [id I]

As you can see, the compiled graph is sampling a gamma and normal variable, but the normal sampling is only done in order to update the RNG SharedVariable used by the gamma!

If we want to reuse existing RNG SharedVariable objects we need to get rid of the default updates on them. Ultimately, this is a big design complication, because it adds "state" to RandomVariables and—more specifically—ties RNG objects to specific graphs. Some related issues/discussions: aesara-devs/aesara#898, aesara-devs/aesara#738, aesara-devs/aesara#543, aesara-devs/aesara#454, aesara-devs/aesara#1251.

In AePPL, we generally don't sample the IR we produce, so this isn't an immediate problem, but it could easily creep into graphs somewhere down the line and cause real issues.

As far as this rewrite is concerned, we're probably fine copying the SharedVariable and (re)specifying the .default_update graph.
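
A rough sketch of that workaround, reusing the names from the snippet above (illustrative only, not the code used in this PR; new_rng and Y_rv2 are just placeholder names):

# Copy the shared RNG and (re)specify its default update so the unrelated
# normal-sampling graph is no longer dragged into the compiled function.
new_rng = rng_X.clone()

Y_rv2 = at.random.gamma(0.5, 0.5, name="Y", rng=new_rng)

# The first output of a RandomVariable node is the updated RNG; make it the
# clone's default update.
new_rng.default_update = Y_rv2.owner.outputs[0]

aesara.dprint(aesara.function([], Y_rv2))
# The compiled graph should now contain only the gamma sampling step.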

@larryshamalama (Contributor, Author) left a comment

Thanks @brandonwillard for the thorough reply; it took me a while to understand the problem you're describing. @rlouf, I incorporated your suggestions into the unit tests.

As far as this rewrite is concerned, we're probably fine copying the SharedVariable and (re)specifying the .default_update graph.

The only thing I have yet to fully address is the RNG of the newly created normal RV node; I gave some details in a comment in this code review.

(Resolved, now-outdated review comment on aeppl/convolutions.py)
@larryshamalama (Contributor, Author) commented:
@brandonwillard Is this PR close to being done? Should I get started on the remaining rewrites in #238 and build on top of this PR?

@brandonwillard (Member) left a comment

@brandonwillard Is this PR close to being done? Should I get started on the remaining rewrites in #238 and build on top of this PR?

As far as the rewrite is concerned, it looks good. We can merge it as-is, but I don't think it will be used because of the rewrite ordering and existing transform-based approach.

We'll need to change the rewrite DB setup in order to allow rewrites like this (i.e. ones that provide more/better distribution information) to come first in place of the generic transforms approach. That concern is independent of this work, though.

(Resolved, now-outdated review comment on aeppl/convolutions.py)
@brandonwillard (Member) commented:

OK, I'm about to push a change that adds support for subtraction and changes the DB to which the rewrite is registered. I'll merge after that.

@brandonwillard (Member) commented:

This should be good to merge once the tests pass.

Thanks again, @larryshamalama; this was a great addition!

@brandonwillard brandonwillard merged commit 93c83e4 into aesara-devs:main Apr 19, 2023
@larryshamalama larryshamalama deleted the conv-norm-rv branch May 4, 2023 20:53