Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The discrepancy in calculating u_kln between parallel-tempering-2dpmf.py and umbrella-sampling.py #479

Open
Vinvin-Zhang opened this issue Nov 13, 2022 · 1 comment

Comments

@Vinvin-Zhang
Copy link

Vinvin-Zhang commented Nov 13, 2022

After reading the examples for calculating the PMF based on data by parallel-templering and umbrella-sampling simulations, i found that there is a discrepancy in calculating u_kln(reduced potential energy).

Following the line 222 to 226 in (https://github.com/mrshirts/pymbar/blob/master/examples/parallel-tempering-2dpmf/parallel-tempering-2dpmf.py), reduced potential energy( u_kln[k,l,n] ) of trajectory segment n of temperature k evaluated at temperature l was calculated by following:.

    print("Computing reduced potential energies...")
    u_kln = numpy.zeros([K,K,N_max]) # u_kln[k,l,n] is reduced potential energy of trajectory segment n of temperature k evaluated at temperature l
    for k in range(K):
       for l in range(K):
          u_kln[k,l,0:N_k[k]] = beta_k[l] * U_kn[k,0:N_k[k]]

which means that u_kln[k,l,n] = beta_k[l] * U_kn[k,n] = beta_k[l] * U_kn[k,l,n] ( without bias, U_kn[k,n] = U_kn[k,1,n] = U_kn[k,2,n] = ... = U_kn[k,l,n] = .... ). [equation 1]

While in the examples of analyzing the data by umbrella sampling, as shown in the line 141 to 143 of https://github.com/mrshirts/pymbar/blob/master/examples/umbrella-sampling-pmf/umbrella-sampling.py, it is calculated by:

    # Compute energy of snapshot n from simulation k in umbrella potential l
    u_kln[k,:,n] = u_kn[k,n] + beta_k[k] * (K_k/2.0) * dchi**2

where

    u_kn[k,n] = beta_k[k] * (float(tokens[2]) - float(tokens[1])) # reduced potential energy without umbrella restraint(line 89)

which indicates: u_kln[k,l,n] = beta_k[k] * U0_kn[k,n] + beta_k[k] * (K_k[l] / 2.0) * dchi[l]**2 = beta_k[k] * ( U0_kn[k,n] + (K_k[l] / 2.0) * dchi[l]**2 ) = beta_k[k] * U_kn[k,l,n] [ equation 2]

It seems that the [equation 1] is the correct expression, while the [ equation 2 ] is not.

Generally, based on [ euqation 2 ], for umbrella sampling of constanst temperature(beta_k[k] = beta_k[l]), it is ok, while for different temperature, it will lead to incorrect result.

Here are my suggestions:
1)In line 89 of https://github.com/mrshirts/pymbar/blob/master/examples/umbrella-sampling-pmf/umbrella-sampling.py,

  u_kn[k,n] = beta_k[k] * (float(tokens[2]) - float(tokens[1])) # reduced potential energy without umbrella restraint

should be changed to :

  U_kn[k,n] = float(tokens[2]) - float(tokens[1])  # potential energy without umbrella restraint

2)In line 143,

  u_kln[k,:,n] = u_kn[k,n] + beta_k[k] * (K_k/2.0) * dchi**2 

should be changed to:

  u_kln[k,:,n] = beta_k * ( U_kn[k,n] + (K_k/2.0) * dchi**2 )
@mrshirts
Copy link
Collaborator

Thanks for noticing the problem!

Note that the main repository for the release is in choderalab/pymbar not mrshirts/pymbar. However, the same issue is present there. My personal repository is not necessarily in sync with the main repository!

We never noticed the problem since the provided data is only at 300K, and thus the erroneous code is never actually run. But it certainly would be confusing to users.

I think your solution looks correct, as the beta_k is then indexed by l, so then visits all of the other code. I will have to grab a few moments to look over carefully to see the best way to put it in the code (especially since we don't have actual data to check).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants