In order to enable more flexible losses and handling of return values, I would suggest the following changes.
**Getting rid of `B`**

I suggest that the `forward` function of the inference network only receives what we currently call `A`, instead of `A` and `B`. Likewise, `LogRatioEstimator` only receives `A` components. The split into marginal and joint examples is then handled within that object, instead of within `SwyftModule`
.

Pro: The interface of the inference network becomes simpler, and handling the marginal/joint split inside the `LogRatioEstimator` enables more flexible losses.

Con: We lose the ability to pass a small number of observations in `A` together with a large number of prior samples in `B`; instead everything has to go through `A`. In order to make log-ratio evaluation efficient, we probably should allow different components of `A` to have different batch sizes (such that the observation key can have batch size one and the parameter key batch size 1024, for instance). That might lead to unforeseen problems.
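To make the batch-size point concrete, here is a minimal sketch of how a `forward` that receives only `A` could broadcast a single observation against many parameter samples. All names below (`embedding`, `ratio_head`, the keys `x` and `z`) are hypothetical, not current swyft API:

```python
import torch

class RatioNet(torch.nn.Module):
    """Hypothetical inference network whose forward receives only A."""

    def __init__(self, n_features: int, n_params: int):
        super().__init__()
        self.embedding = torch.nn.LazyLinear(n_features)
        self.ratio_head = torch.nn.Linear(n_features + n_params, 1)

    def forward(self, A: dict) -> torch.Tensor:
        x = A["x"]  # shape (1, obs_dim): a single observation
        z = A["z"]  # shape (N, n_params): many prior samples
        f = self.embedding(x)          # (1, n_features)
        f = f.expand(z.shape[0], -1)   # broadcast to (N, n_features), no copy
        # Evaluate the ratio head once per parameter sample.
        return self.ratio_head(torch.cat([f, z], dim=-1)).squeeze(-1)  # (N,) log ratios
```

Usage would be something like `net({"x": x_obs.unsqueeze(0), "z": prior_samples})`, with one observation and, say, 1024 prior samples.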
**More informative return types & flexible losses**

Instead of returning `LogRatioSamples` from the inference network, we directly return the `LogRatioEstimator` objects (or similar other objects). We add a new method, `calc_loss`, to these objects, which is then called during training. During evaluation, which does not require the contrastive examples, we can call another method, like `get_log_ratio_samples`.
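A strawman for what such a returned object could look like; the class name and loss form below are assumptions for discussion, not existing swyft code:

```python
import torch
import torch.nn.functional as F

class LogRatioOutput:
    """Hypothetical object returned by the inference network."""

    def __init__(self, logratios_joint: torch.Tensor, logratios_marginal: torch.Tensor):
        self.logratios_joint = logratios_joint        # (N,): matched (x, z) pairs
        self.logratios_marginal = logratios_marginal  # (N,): contrastive (scrambled) pairs

    def calc_loss(self) -> torch.Tensor:
        # NRE-style binary cross-entropy: joint pairs labelled 1, marginal 0.
        # Note softplus(-r) = -log sigmoid(r).
        return (F.softplus(-self.logratios_joint).mean()
                + F.softplus(self.logratios_marginal).mean())

    def get_log_ratio_samples(self) -> torch.Tensor:
        # Evaluation path; contrastive examples are not needed here.
        return self.logratios_joint
```

`training_step` in `SwyftModule` would then reduce to summing `calc_loss()` over all returned objects.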
**More general sampling objects**

Right now the main sampling object is `LogRatioSamples`, which is expected to contain weighted posterior samples, generated from unweighted prior samples plus `logratios` as weights. If we want to allow for NPE, or for results based on, for instance, nested sampling, we need the ability to store and handle more general samples. In the case of NPE, we could easily handle this by setting `logratios = 0` and weighting all samples the same. But nested sampling leads to more involved reweighting factors for the dead points. I would suggest that in general we introduce a `Sample` object that can cover all these cases. It would be enough to introduce two kinds of weights, one being `p(x|z)/p(x)` (our `logratios`) and the other being `p(z)` (effectively constant for what we currently have). We can call those `logratio` and `logprior`. Setting one or both of them to `None` would imply that they are constant. Thoughts about that?
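As a concrete strawman for that `Sample` object (the field names `logratio` and `logprior` follow the proposal above; everything else is made up for illustration):

```python
from dataclasses import dataclass
from typing import Dict, Optional
import numpy as np

@dataclass
class Sample:
    """General weighted samples covering NRE, NPE and nested sampling."""
    values: Dict[str, np.ndarray]          # parameter arrays, keyed by name
    logratio: Optional[np.ndarray] = None  # log p(x|z)/p(x); None = constant (e.g. NPE)
    logprior: Optional[np.ndarray] = None  # log p(z); None = constant (current case)

    def log_weights(self) -> np.ndarray:
        """Unnormalized log importance weights; None terms contribute a constant."""
        n = len(next(iter(self.values.values())))
        lw = np.zeros(n)
        for term in (self.logratio, self.logprior):
            if term is not None:
                lw = lw + term
        return lw
```

The NPE case then falls out naturally: `Sample(values=posterior_draws)` with both weights set to `None` gives equally weighted samples.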
**Infer function**

Right now inference is handled through the `infer` method, which essentially calls Lightning's `predict` under the hood. It evaluates the network many times, with different batches of prior samples. If we generalise the framework to other settings, it becomes less obvious how to do that, as the optimal solution appears case-dependent. Other settings include:

- NPE: we add a `get_samples` method to the NPE object. Usage would then be to call the inference network once with the observed result, grab the returned NPE objects, and call `get_samples`. I'm not entirely sure how to generate "truncated samples" in the case of NPE.
- Slice sampling: the `get_samples` method would call a slice sampler. That `get_samples` method would also require input regarding prior densities or hyper-cube mapping. We could also return truncated samples automatically. It is not clear to me whether this should be the same return type as the posterior samples (probably not, in order to avoid confusion).
- GEDA: `get_samples` calls the GEDA sampler. Prior information has to be provided as well. "Truncated samples" would here be samples with a tempered likelihood. Again, it should probably be a different return type than posterior samples, in order to avoid confusion.
- Our current use-case: here the `get_samples` option does not really work, since the entire point of the framework has been up to now to move simulator-generated prior samples through the ratio estimator. Doing that with a `get_samples` function is less obvious, since it would then happen outside of the network. If we want to look at posteriors for derived parameters, those parameter derivations could not happen inside the network anymore, but would need to be passed as some kind of transformation hook to the `LogRatioEstimator`, in order to be able to apply this transformation on-the-fly when calling `get_samples` on prior samples from the simulator (a possible shape for such a hook is sketched after this list).
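One possible shape for such a transformation hook; the constructor argument and the `get_samples` signature are invented for this sketch:

```python
import torch

class LogRatioEstimator(torch.nn.Module):
    """Sketch: ratio estimator with an on-the-fly derived-parameter transform."""

    def __init__(self, net: torch.nn.Module, transform=None):
        super().__init__()
        self.net = net
        # e.g. transform=lambda z: {"m_tot": z["m1"] + z["m2"]} for a derived parameter
        self.transform = transform

    def get_samples(self, obs, prior_samples: dict):
        # Apply the derived-parameter transformation outside the network,
        # right before evaluating the ratio on prior samples.
        z = self.transform(prior_samples) if self.transform else prior_samples
        logratios = self.net(obs, z)
        return z, logratios  # weighted posterior samples for the derived parameters
```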
All of the above procedures should ideally return the same `PosteriorSamples` object, in order to enable a uniform plotting and testing framework, and the same type of truncated samples, in order to feed them back to the simulator.

It looks like we might end up with heterogeneous APIs for different use-cases. Is that acceptable or even desirable? What are the problems that this could bring further down the line? @NoemiAM @james-alvey-42, I would appreciate your thoughts about this.
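To make the shared-return-type requirement concrete, the different back-ends could be tied together by a small protocol. `PosteriorSamples`, `TruncatedSamples` and the `get_samples` signature below are placeholders for discussion, not a proposed final API:

```python
from dataclasses import dataclass
from typing import Dict, Protocol, Tuple
import numpy as np

@dataclass
class PosteriorSamples:
    """Common container enabling a uniform plotting and testing framework."""
    values: Dict[str, np.ndarray]  # parameter arrays, keyed by name
    log_weights: np.ndarray        # per-sample log weights

@dataclass
class TruncatedSamples:
    """Common container fed back to the simulator for the next round."""
    bounds: Dict[str, np.ndarray]  # e.g. per-parameter truncation bounds

class SamplingBackend(Protocol):
    """Implemented by NPE objects, slice/GEDA sampler wrappers, and NRE."""
    def get_samples(self, obs, num_samples: int) -> Tuple[PosteriorSamples, TruncatedSamples]: ...
```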
---

**Reply** (1 comment): Just responding with some questions that we can discuss.