
MBAR gives a statistical error orders of magnitude larger than the bootstrap error #519

xiki-tempula opened this issue Jan 5, 2024 · 4 comments


@xiki-tempula

The analytical statistical error is 40.48, while the bootstrap error is 0.35.
Code to reproduce:

import pymbar
import numpy as np

u_nk = np.load("u_nk.npy")  # reduced potentials for 12 states
N_k = [10000] * 12  # 10000 samples from each of the 12 states

# Analytical (asymptotic) error estimate: n_bootstraps=0
mbar = pymbar.MBAR(
    u_nk,
    N_k,
    maximum_iterations=10000,
    relative_tolerance=1e-07,
    verbose=True,
    initial_f_k=None,
    solver_protocol="robust",
    n_bootstraps=0,
)
out = mbar.compute_free_energy_differences(return_theta=True, uncertainty_method=None)

print(out["dDelta_f"][0][-1])  # analytical error: 40.48

# Bootstrap error estimate from 50 resamples
mbar = pymbar.MBAR(
    u_nk,
    N_k,
    maximum_iterations=10000,
    relative_tolerance=1e-07,
    verbose=True,
    initial_f_k=None,
    solver_protocol="robust",
    n_bootstraps=50,
)
out = mbar.compute_free_energy_differences(
    return_theta=True, uncertainty_method="bootstrap"
)
print(out["dDelta_f"][0][-1])  # bootstrap error: 0.35

u_nk.npy.zip

@mikemhenry
Contributor

Have you had a chance to check if you see the same behavior with pymbar 3?

@mrshirts
Collaborator

So this is a fascinating case. I am not sure that it's necessarily wrong. Interestingly, the first state has nearly, but not exactly, zero overlap with the other 11 states (see mbar.compute_overlap()['matrix']), so the analytical estimate is definitely off for that state. However, the bootstrap shows very little variation.

One weird thing is the u_nk matrix. After sample 11, only state 0 has any energy - the rest are zeros. This seems odd and is likely unintended. Also, in sample 0, both states 0 and 11 have zero energy, which seems weird, since state 0 has large energy for the rest of the run.
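
The overlap check mentioned above can be done like this (a minimal sketch, assuming the attached u_nk.npy and the pymbar 4 API; this code is not from the original comment):

import numpy as np
import pymbar

u_kn = np.load("u_nk.npy")
N_k = [10000] * 12

mbar = pymbar.MBAR(u_kn, N_k, solver_protocol="robust")
overlap = mbar.compute_overlap()["matrix"]  # (12, 12) overlap matrix
print(np.round(overlap[0, :], 4))  # overlap of state 0 with each state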

@xiki-tempula
Author

@mrshirts Thanks for taking a look.
When you say that

After sample 11, only state 0 has any energy - the rest are zeros.

Which row or column do you mean?

@mrshirts
Collaborator

mrshirts commented Jan 12, 2024

1) Sorry for being so unclear! 2) I gave some numbers from a case where I was manipulating the data to test a hypothesis.

The first paragraph above is completely correct: the first state has very little overlap with the other states, so the analytical estimate is going to be off. But, surprisingly, that doesn't rule out the bootstrap being correct in some situations. The interesting thing is that the covariance matrices from the analytical case and the bootstrap are very close to each other (within ~10%, about what you would expect between variance methods), except for the variances from state 0 to the other states, which are very different. Note also that the bootstrap estimate of the variance from the other states to state 0 is much larger than the estimated variance among the other 11 states, but by 1 order of magnitude rather than 3 orders of magnitude; a sketch of this comparison is below.
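
A minimal sketch of that comparison (not the exact code used here): build one MBAR object with bootstraps enabled, pull out both uncertainty matrices, and inspect their elementwise ratio.

import numpy as np
import pymbar

u_kn = np.load("u_nk.npy")
N_k = [10000] * 12

mbar = pymbar.MBAR(u_kn, N_k, solver_protocol="robust", n_bootstraps=50)
analytical = mbar.compute_free_energy_differences()["dDelta_f"]
bootstrap = mbar.compute_free_energy_differences(uncertainty_method="bootstrap")["dDelta_f"]

# The ratio should be ~1 everywhere except row/column 0 (the poorly
# overlapping state); the diagonal is 0/0, which becomes nan.
with np.errstate(invalid="ignore"):
    ratio = analytical / bootstrap
print(np.round(ratio, 2))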

The 2nd paragraph was where I was trying to test a hypothesis from the first observation (basically, trying to change the reference state), but I failed to construct the correct u_kn matrix, so that part is wrong. Apologies for that.

The hypothesis was: is the bootstrap variation low because I used the poorly overlapping state as the reference state, and would the variances be larger if I relabeled the states without changing the physics? When I successfully switched the 0 and 11 state labels (one way to do this is sketched below), I found that the uncertainties were essentially preserved (a huge analytical uncertainty from state 11, a bootstrap uncertainty to state 11 of about 0.3), meaning that the low bootstrap variance is NOT an artifact of state 0 being the reference.
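
One way to do that relabeling (a minimal sketch; the exact code used for this test was not posted): because every state has 10000 samples, N_k is unchanged, and only the rows of u_kn plus the matching blocks of sample columns need to be swapped.

import numpy as np
import pymbar

u_kn = np.load("u_nk.npy")  # shape (12, 120000)
N_k = [10000] * 12

# Relabel states 0 <-> 11: permute the evaluation rows, then swap the
# corresponding 10000-sample column blocks so the bookkeeping stays consistent.
perm = np.arange(12)
perm[[0, 11]] = perm[[11, 0]]
u_swapped = u_kn[perm, :]
blocks = np.split(u_swapped, 12, axis=1)
blocks[0], blocks[11] = blocks[11], blocks[0]
u_swapped = np.concatenate(blocks, axis=1)

mbar = pymbar.MBAR(u_swapped, N_k, solver_protocol="robust", n_bootstraps=50)
out = mbar.compute_free_energy_differences(uncertainty_method="bootstrap")
print(out["dDelta_f"][11][0])  # bootstrap uncertainty to the relabeled state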

So, the conclusion: I think this is a case where the bootstrap is right (or much closer to right) and the analytical estimate is wrong. But if you want better results, and uncertainties that agree, it is better to run simulations with good overlap between all states.

orbeckst pushed a commit to alchemistry/alchemlyb that referenced this issue May 21, 2024

* Problem: in the backward and forward convergence analysis, the initial set of points (for example, the first 10% of the data) contains so few samples that the overlap can be very poor, which gives a terrible statistical error.
   Solution: if the statistical error is too large, use the bootstrap error instead; see choderalab/pymbar#519.
* New kwarg error_tol in convergence.forward_backward_convergence() allows the user to specify an error tolerance; if error > error_tol, switch to using the bootstrap error (usage sketched below).
* Update CHANGES
* add test
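
A usage sketch of the new kwarg (u_nk_list and the tolerance value are illustrative, not taken from the commit):

from alchemlyb.convergence import forward_backward_convergence

# u_nk_list: a list of alchemlyb u_nk DataFrames, one per lambda window
# (illustrative placeholder). If the MBAR statistical error exceeds
# error_tol, the bootstrap error is reported instead.
df = forward_backward_convergence(u_nk_list, estimator="MBAR", error_tol=3.0)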