
Support linear combinations of MeanParameters #19

Open

alex-lew opened this issue Mar 6, 2021 · 0 comments

The goal is to allow parameters (possibly from different classes) to be transformed before they are used as arguments to distributions. For example, linear combinations of normally-distributed parameters can still be used as the mean of a Gaussian observation. This may be useful in cases where we want to model multiple causes; for example, the rent of an apartment may be normally distributed around a linear combination of parameters representing the effects of (1) the apartment's location, (2) the apartment's size, (3) the apartment's landlord, etc.
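
For concreteness, here is a minimal generative sketch of that rent model in plain Python/NumPy. This is not PClean syntax; the effect names and hyperparameters are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-class effects, each a normally distributed MeanParameter.
location_effect = rng.normal(0.0, 100.0)   # effect of the apartment's location
size_effect     = rng.normal(0.0, 100.0)   # effect of the apartment's size
landlord_effect = rng.normal(0.0, 100.0)   # effect of the apartment's landlord

# Observed rent: Gaussian around a linear combination of the effects.
sigma = 50.0
rent = rng.normal(location_effect + size_effect + landlord_effect, sigma)
```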

There are (at least) two general strategies we could take for supporting this:

  1. We could maintain, as global state during inference, a mean vector and covariance matrix for the multivariate Gaussian posterior over all MeanParameters in a program, updating them as necessary when values are observed or unobserved. Then we could resample all MeanParameters jointly in a single blocked Gibbs update (see the first sketch after this list). This is probably the best approach from an inference perspective, as it fully accounts for all posterior correlations between the variables. I haven't yet worked out the math for the update rules, or how expensive they would be. A useful reference is Dario Stein's recent paper, which describes the implementation of a language with Gaussian variables and affine transformations that supports exact conditioning, using the same "maintain a covariance matrix and mean vector" approach: https://arxiv.org/pdf/2101.11351.pdf

  2. We could perform individual Gibbs updates separately on each MeanParameter. When observing that N(x1 + x2, sigma) == y, we would treat it as an observation that x1 = y - x2 when updating x1, and that x2 = y - x1 when updating x2 (see the second sketch after this list). This requires fewer changes to the current architecture, at the cost of possibly worse inference: more Gibbs sweeps are needed to converge to the same local posterior that the blocked update would sample from directly.
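
For option 1, the exact joint posterior in the fully Gaussian case is given by the standard conjugate linear-Gaussian formulas, so a blocked update can sample from it directly. Below is a minimal NumPy sketch of that calculation, not PClean's internals; the function name and variables are hypothetical, and it recomputes the posterior from scratch rather than maintaining it incrementally as Stein's paper does:

```python
import numpy as np

rng = np.random.default_rng(0)

def blocked_gaussian_update(mu0, Sigma0, A, y, sigma):
    """Sample all parameters jointly from the exact posterior, given
    prior x ~ N(mu0, Sigma0) and likelihood y ~ N(A @ x, sigma^2 * I)."""
    prec0 = np.linalg.inv(Sigma0)              # prior precision
    prec = prec0 + (A.T @ A) / sigma**2        # posterior precision
    Sigma = np.linalg.inv(prec)                # posterior covariance
    mu = Sigma @ (prec0 @ mu0 + (A.T @ y) / sigma**2)
    return rng.multivariate_normal(mu, Sigma)  # one blocked Gibbs draw

# Two MeanParameters x1, x2; one observation of their sum.
A = np.array([[1.0, 1.0]])                     # the row encodes x1 + x2
sample = blocked_gaussian_update(np.zeros(2), np.eye(2), A, np.array([3.0]), 0.5)
```

Incrementally updating (mu, Sigma) as values are observed or unobserved, rather than re-inverting each time, is presumably what the "maintain a covariance matrix and mean vector" approach would add on top of this.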
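For option 2, each single-site update is a one-dimensional conjugate Gaussian computation on the pseudo-observation y minus the other terms. A minimal sketch of the alternating scheme, again with hypothetical names and made-up hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(1)

def gibbs_update(m, s, pseudo_obs, sigma):
    """Posterior sample for x ~ N(m, s^2) given pseudo_obs ~ N(x, sigma^2)."""
    prec = 1 / s**2 + 1 / sigma**2
    mean = (m / s**2 + pseudo_obs / sigma**2) / prec
    return rng.normal(mean, np.sqrt(1 / prec))

# y observed as N(x1 + x2, sigma); alternate single-site updates.
m1, s1, m2, s2, sigma, y = 0.0, 1.0, 0.0, 1.0, 0.5, 3.0
x1, x2 = rng.normal(m1, s1), rng.normal(m2, s2)
for _ in range(100):
    x1 = gibbs_update(m1, s1, y - x2, sigma)  # observe x1 as y - x2
    x2 = gibbs_update(m2, s2, y - x1, sigma)  # observe x2 as y - x1
```

Note that in this example only the sum x1 + x2 is constrained by the data, so the single-site chain has to random-walk along the resulting posterior ridge; that slow mixing is the inference cost mentioned above.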
