-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance improvements for fix_ipi #4157
Conversation
I don't know what's going on with the MacOS unit tests, looks like a CI issue perhaps? |
@ceriottm Yes, can you please pull the latest upstream |
@ceriottm there are also some trailing whitespaces in |
…trigger re-compute of the NL
Leftover from debugging output
10f7221
to
2dc5931
Compare
Thanks. I have rebased on master, and fixed (hopefully) the whitespace issues. |
x[i][2] += xhold[i][2]; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could just be missing something, but I don't understand how this scheme works for a general box. Say the box doesn't contain the origin domain->pbc()
would shift
If I understand the intent correctly, what about alternatively shifting xhold
to the nearest periodic image, e.g. something like
if (domain->triclinic) domain->x2lamda(atom->nlocal);
domain->pbc();
domain->reset_box();
if (domain->triclinic) domain->lamda2x(atom->nlocal);
if (neigh->ncalls != 0) {
for (int i = 0; i < nlocal; i++) {
delx = xhold[i][0] - x[i][0];
...
domain->minimum_image(delx, dely, delz);
xhold[i][0] = x[i][0] + delx;
...
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works because xhold
is shifted back to the 000 replica when the neighbor list is updated. So this code makes sure the displacement is has a minimum image convention applied when it's computed in the neighbor list code, without touching that part of the code that is too core to be messed with, IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So what happens if
Since domain->pbc()
acts on position vectors and its behavior depends on the location of the boxlo
, I would not assume feeding it a displacement vector (invariant to origin translations) would produce the desired behavior for all box geometries and displacements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I agree this is cleaner. I have reformulated the code in fix_ipi.cpp
- only difference is I am modifying the coordinates rather than xhold, again following the principle of minimal messing with the core classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, happy to help. Hopefully that works to improve performance!
Waiting for review by @sjplimp |
src/neighbor.cpp
Outdated
{ | ||
// Returns the pointer containing the last positions stored by the NL builder, | ||
// checking it has actually been initialized | ||
if (maxhold == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not clear on this error check. Does fix ipi know or check whether the neighbor class is using the xhold array? That is a neighbor setting which can be turned on/off.
Also, it might be possible, now or in the future, that xhold is NULL and maxhold is 0 if the proc has no atoms.
Seems like it might be better to just return xhold and have the caller (iPI) decide what to do with it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK that is fine with me. I wanted to have kind of a safety check given we're accessing an array that might or might not be initialized but I can def let the caller deal with the checking (and maybe just skip the unwrapping of the coordinates if xhold is not initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify a bit more. If the user sets neigh_modify check no (which used to be the default but is not now), then LAMMPS uses other criteria to determine when to re-neighbor, and the xhold array is not allocated or used.
Also, if xhold is used, then on each reneighboring it is assigned the current atom coords, which will be guaranteed to be inside the simulation box, because PBC are applied first. On every subsequent step (until the next reneighboring), all atoms coords will be "close" to xhold, even if they go slightly outside the box, because PBC are not applied. So unless IPI is applying PBC on those in-between steps to its copy of the LAMMPS atom coords (e.g. to ensure its atom coords are inside the box), I'm not clear why you need to apply minimum-image adjustments to your del xyz.
Also xhold can be reallocated by any proc on any reneighboring step (but not otherwise during a run), so you should invoke get_xhold() at least that often. The public variable Neighbor::ago stores how many elapsed timesteps since the last reneighbor, so some classes use that to check if they need to do something. Note that reneighboring happens in between your fix ipi initial and final integrate methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I now see that you call domain->pbc() every timestep in fix ipi initial_integrate()
The comment says it is in case you also need to invoke irregular(), which is checked for later.
I don't understand the motivation here. However, if xhold does not exist, you cannot "undo" what
the pbc() method did, and I don't think you will successfully restore the atom coord values LAMMPS is expecting. E.g. if a pair style is used later in the same timestep, its pairwise atom distances will be wrong.
Also, if irregular is invoked then it seems like you should be forcing re-neighboring to occur on this timestep.
B/c current neighbor lists are now invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for the detailed summary of the neighboring process. It's very useful because most of the use cases we're aware of might use a small slice of the functionalities of LAMMPS .
To give you a similarly concise view of what happens on the i-PI side, the tricky bit is that (1) there is no guarantee that i-PI will pass coordinates and cell that follow a continuous trajectory, e.g. there might be replica exchange going on and then LAMMPS will be called repeatedly with completely unrelated structures; (2) the data flow is minimal and e.g. i-PI has no way of knowing that LAMMPS folds coordinates when setting xhold.
So the problem we had was that i-PI passes un-folded coordinates. LAMMPS applies PBC and sets xhold. i-PI then runs MD and sends again un-folded coordinates; if we don't fold them, they'll be far off the xhold reference, and if we fold there's no guarantee they'll be folded in the same way. Either way, we essentially trigger a NL rebuild at each step. This PR is trying (and in all our tests succeeds) to avoid this rebuild.
In the current implementation, at each step when LAMMPS gets new positions, if xhold is not set the atoms will just be folded back into the cell at the beginning of the step, otherwise they'll be "unwrapped" relative to xhold. I understand there may be reneighboring settings where this might fail, but I think that also the previous implementation (which was just folding regardless) would fail; if such a case occurs, we may have to look deeper into this; I think the previous implementation (including the use if irregular() ) was done with a lot of help from @akohlmey so I trust it much more than if it were of my own making ^_^
this also allows it to fall-back on do-nothing rather than crash
@ceriottm Hi Michele - thanks for the additional context. If you are convinced this current PR is working as you expect for your typical i-PI use cases, then I think we should go ahead and merge it. So that it make it into the upcoming stable release. One final suggestion - If I understand, you are now using this fix for cases beyond the PIMD usage described in the fix ipi doc page. So I suggest you broaden the doc page to explain it can be used for other kinds of i-PI-based models. If there are examples of this already in the LAMMPS distro (which includes tools/ipi) or on the i-PI doc pages, that would be useful for users to be aware of. After the stable release, I suggest we re-visit how this fix ipi works. I think there are still a few issues that could be more robust with respect to LAMMPS options. Possibly that would enable other use cases of LAMMPS via i-PI. For example, if you just want LAMMPS to compute and return forces for a series of unrelated configurations, I don't think you want to be doing that via intitial_integrate() and final_integrate(). |
Thanks, I agree this is the best course of action for now - this fixes a clear problem, and doesn't affect any of the examples I've seen and tried - that are quite diverse.
This is true. I will edit the doc page, and actually I suggest it might be a good idea to remove i-PI from tools - or at least update it: we are about to release 3.0 and I had completely forgotten we had embedded 1.0 with LAMMPS. It is now also possible to get i-PI from pip, so it doesn't seem a good idea to bundle it with LAMMPS. Let me know what you think and whether you'd like to do remove or update in a separate PR - seems cleaner.
I agree, absolutely. One of our plans for the future is to also include MPI-based communicators. LAMMPS is one of our main clients and it'd be great to coordinate. |
@ceriottm please go to src/ and type
A proposal for that is in PR 4176 |
Thanks @ceriottm - doc page changes look good |
Summary
In the process of benchmarking and speeding up the i-pi engine, we noticed a couple of performance improvements that need to be applied to the LAMMPS client routines, that are in
fix_ipi
.These improvements will also benefit other engines that use the same communication protocol, such as ASE.
The performance issues and the fixes are discussed in #4155, but in short they are due to quirks of TCP/IP that slow down communication of small messages, and to issues in ensuring consistent folding of the atom positions on the client/server sides, that lead to unnecessarily frequent updates of the neighbor list.
Related Issue(s)
This closes #4155, implementing the second solution proposed in there, which after some thinking seems to be the cleanest. The only change outside
fix_ipi.cpp
is adding a getter to access the positions at the last NL update.Author(s)
Michele Ceriotti (EPFL) michele.ceriotti@gmail.com
Davide Tisi (EPFL) davide.tisi@epfl.ch
Licensing
By submitting this pull request, I agree, that my contribution will be included in LAMMPS and redistributed under either the GNU General Public License version 2 (GPL v2) or the GNU Lesser General Public License version 2.1 (LGPL v2.1).
Backward Compatibility
This should change nothing outside the i-PI communication. For i-PI simulations using LAMMPS as the backend, the fact this applies unwrapping to the trajectory introduces small numerical noise, but trajectories should be unchanged for a few hundred steps - and of course statistically equivalent throughout.
Implementation Notes
Every time new configurations are received from i-PI, they are folded back to the image that is closest to the positions at the time the neighbor list was updated. This effectively applies a minimum image convention to the calculation of the atom drift, and ensures that neighbor lists are updated as rarely as possible.
Post Submission Checklist
Further Information, Files, and Links
See i-pi for a discussion of the background and what
fix_ipi
does.