
DenseAutoregressive.init_weights causes unstable learning for 1D target distributions #43

Open
feynmanliang opened this issue Feb 18, 2021 · 2 comments


@feynmanliang (Collaborator) commented Feb 18, 2021

When I run the example on flowtorch.ai with a 1D distribution:

import torch
import torch.distributions as dist

import flowtorch
import flowtorch.bijectors as bijectors

# Lazily instantiated flow plus base and target distributions
flow = bijectors.AffineAutoregressive(
    flowtorch.params.DenseAutoregressive()
)
base_dist = dist.Normal(torch.zeros(1), torch.ones(1))
target_dist = dist.Normal(torch.zeros(1) + 5, torch.ones(1))

# Instantiate transformed distribution and parameters
new_dist, params = flow(base_dist)

# Training loop
opt = torch.optim.Adam(params.parameters(), lr=1e-3)
for idx in range(501):
    opt.zero_grad()
    # Minimize KL(p || q) by maximizing the likelihood of target samples
    y = target_dist.sample((1000,))
    loss = -new_dist.log_prob(y).mean()
    if idx % 100 == 0:
        print('epoch', idx, 'loss', loss.item())
    loss.backward()
    opt.step()

import pandas as pd
import seaborn as sns

# The target is 1D, so plot a histogram of samples rather than the
# 2D scatter used in the flowtorch.ai example
sns.displot(
    data=pd.DataFrame(new_dist.sample((100,)).detach().numpy()),
    x=0,
)

The loss goes to `NaN` unless the learning rate is set extremely low (e.g. `1e-15` gives sensible results).

Removing the call to `self._init_weights` in `DenseAutoregressive` resolves the issue and allows a more reasonable `1e-3` learning rate.
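
For anyone hitting this in the meantime, a minimal workaround sketch (assuming `_init_weights` is an ordinary instance method on `flowtorch.params.DenseAutoregressive`, per the description above) is to replace it with a no-op so the layers keep PyTorch's default initialization:

import flowtorch

# Workaround sketch: neutralize the custom initializer so DenseAutoregressive
# falls back to PyTorch's default layer initialization. Assumes _init_weights
# is a regular instance method, as referenced above.
flowtorch.params.DenseAutoregressive._init_weights = lambda self, *args, **kwargs: None

Note this patch has to run before the flow's parameters are instantiated (i.e. before calling `flow(base_dist)`).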

@feynmanliang changed the title from "Extremely low learning rates required for 1D target distributions" to "DenseAutoregressive.init_weights causes unstable learning for 1D target distributions" on Feb 18, 2021
@stefanwebb (Owner) commented

How about we disable my new initialization scheme and treat it more like a research topic? I'll add a flag, and it will be disabled by default.
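
A possible shape for that flag, purely as a sketch (the keyword name below is hypothetical, not a committed API):

import flowtorch

# Hypothetical API sketch: the experimental initializer becomes opt-in and is
# disabled by default. The keyword name is an assumption, not the final API.
dense_params = flowtorch.params.DenseAutoregressive(experimental_init_weights=True)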

@feynmanliang (Collaborator, Author) commented

Yeah, I think we should break off an experimental submodule.
