
How to use these datasets? #3

Open
kyleabeauchamp opened this issue Nov 26, 2013 · 9 comments

@kyleabeauchamp (Collaborator)

So it seems like for most of these datasets, there's no "right" answer, at least when compared to analytical test cases. That brings up the question of how we can use these tests in an automated test framework.

The second issue I'm seeing is that these tests essentially involve running Python scripts with ~1000 lines of IO, preprocessing, analysis, and output. Those scripts will not be easy to integrate into an automated test framework.

@kyleabeauchamp (Collaborator, Author)

I guess the first thing we should do is figure out how to port the scripts to pymbar 2.0. The easiest way may be for me to write a pymbar 1.0 compatibility object that exactly reproduces the API of pymbar 1.0, but calls pymbar 2.0 code under the hood.

@kyleabeauchamp (Collaborator, Author)

For example, there's the issue of u_kln versus u_kn. It would take considerable time to rewrite all the scripts here to reshape the data into the new format, so a compatibility layer might be key.
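As a concrete illustration, the conversion such a compatibility layer would need is small. This is a hypothetical sketch, not the actual pymbar code; the helper name and the exact index convention of u_kln are assumptions:

```python
import numpy as np

def u_kln_to_u_kn(u_kln, N_k):
    """Flatten a (K, K, N_max) u_kln array into the (K, N_total) u_kn layout.

    Assumed convention: u_kln[k, l, n] is the reduced energy of snapshot n
    drawn from state k, evaluated at state l; N_k[k] counts snapshots from k.
    """
    K = len(N_k)
    u_kn = np.zeros((K, int(np.sum(N_k))))
    start = 0
    for k in range(K):
        n = int(N_k[k])
        # snapshots from state k, evaluated at every state: shape (K, n)
        u_kn[:, start:start + n] = u_kln[k, :, :n]
        start += n
    return u_kn
```

Columns of u_kn are simply the snapshots from each state concatenated in state order, so scripts that build u_kln could keep doing so and convert once before calling the new API.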

@kyleabeauchamp (Collaborator, Author)

I also think we might want to consider looking for simpler test cases where there are unambiguous right answers, either analytical or numerical.

@jchodera (Member)

I would prefer our approach to be:

  • Analyze a dataset to see the problem someone is describing
  • Figure out how to recapitulate that problem in a synthetic dataset
  • Add that synthetic dataset to our tests

As a minimal alternative, we can just make sure the code runs on these datasets, but that is a very low bar.

@jchodera (Member)

Is @mrshirts subscribed here?

@kyleabeauchamp (Collaborator, Author)

Yes

@kyleabeauchamp (Collaborator, Author)

I agree with the synthetic dataset stuff. IMHO I'm just overwhelmed by the idea of us maintaining thousands of lines of user-contributed code as part of our testing protocol.

@jchodera (Member)

On Nov 26, 2013, at 5:05 PM, kyleabeauchamp notifications@github.com wrote:

> I agree with the synthetic dataset stuff, though. IMHO I'm just overwhelmed by the idea of us maintaining thousands of lines of user-contributed code as part of our testing protocol.

I agree completely. There's no way we can possibly do that.

There may still be a few large datasets that we would like the code to work on, or at least give consistent answers on, such as the large trypsin datasets that Michael has generated. But this seems like a lower priority than testing systems with analytical results.

I still need to code up some analytically tractable systems for binding affinity calculations. Those could be included in our tests as well if we feel we need more diversity than just harmonic oscillators.

John

@mrshirts (Collaborator)

Hi, all-

Busy all day with classes and meetings! I'm adding these datasets because they represent hard cases and/or interesting applications that use a lot of data.

In all cases, there is a currently working script that can be run to produce output. So at a high level, one just needs a script that calls those scripts and inspects the output -- the only customizable things are the filenames and the names of the output files. These are not going to be things that are used in nightly regression tests, or even downloaded by most users.

I don't think we want or need to maintain these things, other than perhaps altering the call to pymbar (and I'm happy to do that as long as they are working). They do represent hard problems that we'd like to manage, though. For example, the gas-properties case is a memory hog, and we'd love to reduce that. The 8proteins case is one where the free energy range requires that the weights be stored in the log domain, because otherwise you get exp(large negative number) * exp(large positive number) = 0, since exp(large negative number) = 0 to machine precision.
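The underflow Michael describes is easy to reproduce. A minimal numpy sketch, with illustrative magnitudes chosen for the example (not taken from the 8proteins data):

```python
import numpy as np

# log-weights around -800 underflow to exactly 0.0 when exponentiated in
# double precision, so the naive linear-domain product loses the answer
# even though the true result (~exp(-495)) is perfectly representable.
log_w = np.array([-800.0, -790.0])   # log(weights)
log_a = np.array([ 300.0,  295.0])   # log(values)

naive = np.sum(np.exp(log_w) * np.exp(log_a))   # exp(-800) == 0.0, so this is 0.0

def logsumexp(x):
    """Numerically stable log(sum(exp(x))): shift by the max before exponentiating."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

stable = np.exp(logsumexp(log_w + log_a))       # small but nonzero and finite
```

Adding the logs first and exponentiating once, after subtracting the maximum, is exactly the "store the weights in the log domain" strategy.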

Going back to a question that Kyle asked earlier: I suspect that in the iterative cases, we can probably do the solutions in the exponential domain, and then store in the log domain (though this needs to be tested). So when computing an expectation we would do:

A = \sum_n exp(log W_n + log A_n)

where W_n is the mixture-distribution weight of sample n.

This would incur the cost of the exponentials each time, but at least it's not an iterative cost.

If both the log and exponential versions are stored, one could test at runtime which version to use, provided that test is fast enough. I've defaulted to storing only the log version, but keeping both may not be that costly.

Free energies of unsampled states would be

f_new = -log \sum_n exp(log W_n - u_new,n)

where, in the expectation formula above, A_n has first been transformed to always be greater than 1 (so that log A_n is well defined).
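Both formulas can be sketched with the same stable logsumexp primitive. This is a hypothetical illustration with random data, assuming the log-weights are normalized so that the W_n sum to one:

```python
import numpy as np

def logsumexp(x):
    """Numerically stable log(sum(exp(x)))."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

rng = np.random.default_rng(0)
N = 100
log_W = rng.normal(size=N)
log_W -= logsumexp(log_W)            # normalize so sum_n W_n == 1

# Expectation <A> = sum_n exp(log W_n + log A_n). A_n may be negative, so
# shift it to be > 1 (making log A_n well defined), then undo the shift:
# since the weights sum to one, <A - c> = <A> - c for any constant c.
A = rng.normal(size=N)
shift = A.min() - 1.0                # guarantees A - shift >= 1
expectation = np.exp(logsumexp(log_W + np.log(A - shift))) + shift

# Free energy of an unsampled state: f_new = -log sum_n exp(log W_n - u_new,n)
u_new = rng.normal(size=N)
f_new = -logsumexp(log_W - u_new)
```

For well-scaled inputs like these, both quantities agree with the naive linear-domain sums; the log-domain form only starts to matter when the exponents are extreme, as in the 8proteins case.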

Note that if we keep a legacy routine (of any flavor) that does everything
in the log domain, we can always test new extreme cases easily.

