
AffineAutoregressive leaks global state #42

Open · feynmanliang opened this issue Feb 18, 2021 · 4 comments

@feynmanliang (Collaborator)

The static attribute at https://github.com/stefanwebb/flowtorch/blob/master/flowtorch/bijectors/affine_autoregressive.py#L18 is effectively a global singleton, so its state leaks across runs.

To reproduce the unexpected behavior:

  • Train an AffineAutoregressive against a 2D distribution (default_param_fn.permutation gets set to a size-2 permutation).
  • Initialize a new AffineAutoregressive and train it against something that is not 2D; this fails because default_param_fn.permutation is already set, so the new shape is not used.
@stefanwebb (Owner)

Thanks for catching this! That attribute is placed as a class-level variable on AffineAutoregressive so that mypy passes, although we'll need to make some modifications to stop this unintended behaviour.
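
For context, a minimal sketch of the pattern being described (the class and attribute names below are illustrative, not the actual flowtorch source):

# Illustrative sketch only: a class-level (static) attribute is created once
# and shared by every instance, so state written while fitting the first
# instance leaks into later, unrelated instances.
class ParamFn:
    def __init__(self):
        self.permutation = None  # set lazily from the event shape during training

class SketchAffineAutoregressive:
    default_param_fn = ParamFn()  # one shared object for the entire class

a = SketchAffineAutoregressive()
b = SketchAffineAutoregressive()
a.default_param_fn.permutation = [1, 0]    # "trained" against a 2D distribution
print(b.default_param_fn.permutation)      # [1, 0] -- the state leaked into b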

@stefanwebb (Owner)

@feynmanliang could you post a code snippet here to reproduce? This would save me some time

@feynmanliang (Collaborator, Author)

import torch
import torch.distributions as dist
import flowtorch
import flowtorch.bijectors as bijectors
# Lazily instantiated flow plus base and target distributions
flow = bijectors.AffineAutoregressive()
base_dist = dist.Normal(torch.zeros(2), torch.ones(2))
target_dist = dist.Normal(torch.zeros(2)+5, torch.ones(2)*0.5)
# Instantiate transformed distribution and parameters
new_dist, params = flow(base_dist)
# Training loop
opt = torch.optim.Adam(params.parameters(), lr=5e-2)
for idx in range(501):
    opt.zero_grad()
    # Minimize KL(p || q)
    y = target_dist.sample((1000,))
    loss = -new_dist.log_prob(y).mean()
    if idx % 100 == 0:
        print('epoch', idx, 'loss', loss)
        
    loss.backward()
    opt.step()

Running the 2D example above and then the following 3D example in the same Python session reproduces the failure: the second run errors out because default_param_fn.permutation is still the size-2 permutation left over from the first run.

import torch
import torch.distributions as dist
import flowtorch
import flowtorch.bijectors as bijectors
# Lazily instantiated flow plus base and target distributions
flow = bijectors.AffineAutoregressive()
base_dist = dist.Normal(torch.zeros(3), torch.ones(3))
target_dist = dist.Normal(torch.zeros(3)+5, torch.ones(3)*0.5)
# Instantiate transformed distribution and parameters
new_dist, params = flow(base_dist)
# Training loop
opt = torch.optim.Adam(params.parameters(), lr=5e-2)
for idx in range(501):
    opt.zero_grad()
    # Minimize KL(p || q)
    y = target_dist.sample((1000,))
    loss = -new_dist.log_prob(y).mean()
    if idx % 100 == 0:
        print('epoch', idx, 'loss', loss)
        
    loss.backward()
    opt.step()

@stefanwebb (Owner)

Make the argument in question an Optional and instantiate the object inside the method if it is None.
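
A minimal sketch of that approach, assuming the parameter object is passed through __init__ (names below are illustrative, not the actual flowtorch API):

# Illustrative sketch of the proposed fix -- not the actual flowtorch code.
from typing import Optional

class ParamFn:
    def __init__(self):
        self.permutation = None

class SketchAffineAutoregressive:
    def __init__(self, param_fn: Optional[ParamFn] = None) -> None:
        # Instantiate a fresh object per instance when nothing is passed,
        # so no state is shared at the class level and mypy still sees a
        # concrete ParamFn type on self.param_fn.
        self.param_fn = param_fn if param_fn is not None else ParamFn()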
