Performance improvements for fix_ipi #4157

ceriottm · 2024-05-02T06:36:54Z

Summary

In the process of benchmarking and speeding up the i-pi engine, we noticed a couple of performance improvements that need to be applied to the LAMMPS client routines, that are in fix_ipi.
These improvements will also benefit other engines that use the same communication protocol, such as ASE.
The performance issues and the fixes are discussed in #4155, but in short they are due to quirks of TCP/IP that slow down communication of small messages, and to issues in ensuring consistent folding of the atom positions on the client/server sides, that lead to unnecessarily frequent updates of the neighbor list.

Related Issue(s)

This closes #4155, implementing the second solution proposed in there, which after some thinking seems to be the cleanest. The only change outside fix_ipi.cpp is adding a getter to access the positions at the last NL update.

Author(s)

Michele Ceriotti (EPFL) michele.ceriotti@gmail.com
Davide Tisi (EPFL) davide.tisi@epfl.ch

Licensing

By submitting this pull request, I agree, that my contribution will be included in LAMMPS and redistributed under either the GNU General Public License version 2 (GPL v2) or the GNU Lesser General Public License version 2.1 (LGPL v2.1).

Backward Compatibility

This should change nothing outside the i-PI communication. For i-PI simulations using LAMMPS as the backend, the fact this applies unwrapping to the trajectory introduces small numerical noise, but trajectories should be unchanged for a few hundred steps - and of course statistically equivalent throughout.

Implementation Notes

Every time new configurations are received from i-PI, they are folded back to the image that is closest to the positions at the time the neighbor list was updated. This effectively applies a minimum image convention to the calculation of the atom drift, and ensures that neighbor lists are updated as rarely as possible.

Post Submission Checklist

The feature or features in this pull request is complete
Licensing information is complete
Corresponding author information is complete
The source code follows the LAMMPS formatting guidelines
Suitable new documentation files and/or updates to the existing docs are included
The added/updated documentation is integrated and tested with the documentation build system
The feature has been verified to work with the conventional build system
The feature has been verified to work with the CMake based build system
A package specific README file has been included or updated

Further Information, Files, and Links

See i-pi for a discussion of the background and what fix_ipi does.

ceriottm · 2024-05-02T09:01:13Z

I don't know what's going on with the MacOS unit tests, looks like a CI issue perhaps?

stanmoore1 · 2024-05-13T18:14:11Z

I don't know what's going on with the MacOS unit tests, looks like a CI issue perhaps?

@ceriottm Yes, can you please pull the latest upstream develop branch here, which should fix the issue? I don't seem to have permissions to modify your branch.

stanmoore1 · 2024-05-13T18:16:39Z

@ceriottm there are also some trailing whitespaces in fix_ipi.cpp, you can either remove them manually, or use make fix-whitespace, see https://docs.lammps.org/Build_development.html.

…trigger re-compute of the NL

Leftover from debugging output

ceriottm · 2024-05-13T20:37:42Z

Thanks. I have rebased on master, and fixed (hopefully) the whitespace issues.

jtclemm · 2024-05-13T21:30:30Z

src/MISC/fix_ipi.cpp

+        x[i][2] += xhold[i][2];
+      }
+    }
+  }


I could just be missing something, but I don't understand how this scheme works for a general box. Say the box doesn't contain the origin $\vec{0}$. If a particle doesn't move, such that $\vec{x} = \vec{x}_{hold}$, then domain->pbc() would shift $(\vec{x} - \vec{x}_{hold}) = \vec{0}$ across periodic images back into the box. Adding $\vec{x}_{hold}$ again would then shift the particle back out of the box and trigger an unnecessary neighbor rebuild since now $\vec{x} \ne \vec{x}_{hold}$.

If I understand the intent correctly, what about alternatively shifting xhold to the nearest periodic image, e.g. something like

if (domain->triclinic) domain->x2lamda(atom->nlocal); domain->pbc(); domain->reset_box(); if (domain->triclinic) domain->lamda2x(atom->nlocal); if (neigh->ncalls != 0) { for (int i = 0; i < nlocal; i++) { delx = xhold[i][0] - x[i][0]; ... domain->minimum_image(delx, dely, delz); xhold[i][0] = x[i][0] + delx; ... } }

This works because xhold is shifted back to the 000 replica when the neighbor list is updated. So this code makes sure the displacement is has a minimum image convention applied when it's computed in the neighbor list code, without touching that part of the code that is too core to be messed with, IMO.

I see. So what happens if $\vec{x} = \vec{x}_{hold}$ (both in the same image) but the box doesn't contain the origin? Does the neighbor list get rebuilt?

Since domain->pbc() acts on position vectors and its behavior depends on the location of the boxlo, I would not assume feeding it a displacement vector (invariant to origin translations) would produce the desired behavior for all box geometries and displacements.

Thanks, I agree this is cleaner. I have reformulated the code in fix_ipi.cpp - only difference is I am modifying the coordinates rather than xhold, again following the principle of minimal messing with the core classes.

Cool, happy to help. Hopefully that works to improve performance!

stanmoore1 · 2024-05-14T19:08:19Z

Waiting for review by @sjplimp

src/neighbor.cpp

sjplimp · 2024-05-21T18:17:38Z

src/neighbor.cpp

+{
+  // Returns the pointer containing the last positions stored by the NL builder,
+  // checking it has actually been initialized
+  if (maxhold == 0) {


I'm not clear on this error check. Does fix ipi know or check whether the neighbor class is using the xhold array? That is a neighbor setting which can be turned on/off.

Also, it might be possible, now or in the future, that xhold is NULL and maxhold is 0 if the proc has no atoms.

Seems like it might be better to just return xhold and have the caller (iPI) decide what to do with it ?

OK that is fine with me. I wanted to have kind of a safety check given we're accessing an array that might or might not be initialized but I can def let the caller deal with the checking (and maybe just skip the unwrapping of the coordinates if xhold is not initialized.

Just to clarify a bit more. If the user sets neigh_modify check no (which used to be the default but is not now), then LAMMPS uses other criteria to determine when to re-neighbor, and the xhold array is not allocated or used.

Also, if xhold is used, then on each reneighboring it is assigned the current atom coords, which will be guaranteed to be inside the simulation box, because PBC are applied first. On every subsequent step (until the next reneighboring), all atoms coords will be "close" to xhold, even if they go slightly outside the box, because PBC are not applied. So unless IPI is applying PBC on those in-between steps to its copy of the LAMMPS atom coords (e.g. to ensure its atom coords are inside the box), I'm not clear why you need to apply minimum-image adjustments to your del xyz.

Also xhold can be reallocated by any proc on any reneighboring step (but not otherwise during a run), so you should invoke get_xhold() at least that often. The public variable Neighbor::ago stores how many elapsed timesteps since the last reneighbor, so some classes use that to check if they need to do something. Note that reneighboring happens in between your fix ipi initial and final integrate methods.

I now see that you call domain->pbc() every timestep in fix ipi initial_integrate()
The comment says it is in case you also need to invoke irregular(), which is checked for later.

I don't understand the motivation here. However, if xhold does not exist, you cannot "undo" what
the pbc() method did, and I don't think you will successfully restore the atom coord values LAMMPS is expecting. E.g. if a pair style is used later in the same timestep, its pairwise atom distances will be wrong.

Also, if irregular is invoked then it seems like you should be forcing re-neighboring to occur on this timestep.
B/c current neighbor lists are now invalid.

Hi, thanks for the detailed summary of the neighboring process. It's very useful because most of the use cases we're aware of might use a small slice of the functionalities of LAMMPS .
To give you a similarly concise view of what happens on the i-PI side, the tricky bit is that (1) there is no guarantee that i-PI will pass coordinates and cell that follow a continuous trajectory, e.g. there might be replica exchange going on and then LAMMPS will be called repeatedly with completely unrelated structures; (2) the data flow is minimal and e.g. i-PI has no way of knowing that LAMMPS folds coordinates when setting xhold.
So the problem we had was that i-PI passes un-folded coordinates. LAMMPS applies PBC and sets xhold. i-PI then runs MD and sends again un-folded coordinates; if we don't fold them, they'll be far off the xhold reference, and if we fold there's no guarantee they'll be folded in the same way. Either way, we essentially trigger a NL rebuild at each step. This PR is trying (and in all our tests succeeds) to avoid this rebuild.
In the current implementation, at each step when LAMMPS gets new positions, if xhold is not set the atoms will just be folded back into the cell at the beginning of the step, otherwise they'll be "unwrapped" relative to xhold. I understand there may be reneighboring settings where this might fail, but I think that also the previous implementation (which was just folding regardless) would fail; if such a case occurs, we may have to look deeper into this; I think the previous implementation (including the use if irregular() ) was done with a lot of help from @akohlmey so I trust it much more than if it were of my own making ^_^

this also allows it to fall-back on do-nothing rather than crash

sjplimp · 2024-05-27T15:48:35Z

@ceriottm Hi Michele - thanks for the additional context. If you are convinced this current PR is working as you expect for your typical i-PI use cases, then I think we should go ahead and merge it. So that it make it into the upcoming stable release.

One final suggestion - If I understand, you are now using this fix for cases beyond the PIMD usage described in the fix ipi doc page. So I suggest you broaden the doc page to explain it can be used for other kinds of i-PI-based models. If there are examples of this already in the LAMMPS distro (which includes tools/ipi) or on the i-PI doc pages, that would be useful for users to be aware of.

After the stable release, I suggest we re-visit how this fix ipi works. I think there are still a few issues that could be more robust with respect to LAMMPS options. Possibly that would enable other use cases of LAMMPS via i-PI. For example, if you just want LAMMPS to compute and return forces for a series of unrelated configurations, I don't think you want to be doing that via intitial_integrate() and final_integrate().

ceriottm · 2024-05-27T21:32:03Z

@ceriottm Hi Michele - thanks for the additional context. If you are convinced this current PR is working as you expect for your typical i-PI use cases, then I think we should go ahead and merge it. So that it make it into the upcoming stable release.

Thanks, I agree this is the best course of action for now - this fixes a clear problem, and doesn't affect any of the examples I've seen and tried - that are quite diverse.

One final suggestion - If I understand, you are now using this fix for cases beyond the PIMD usage described in the fix ipi doc page. So I suggest you broaden the doc page to explain it can be used for other kinds of i-PI-based models. If there are examples of this already in the LAMMPS distro (which includes tools/ipi) or on the i-PI doc pages, that would be useful for users to be aware of.

This is true. I will edit the doc page, and actually I suggest it might be a good idea to remove i-PI from tools - or at least update it: we are about to release 3.0 and I had completely forgotten we had embedded 1.0 with LAMMPS. It is now also possible to get i-PI from pip, so it doesn't seem a good idea to bundle it with LAMMPS. Let me know what you think and whether you'd like to do remove or update in a separate PR - seems cleaner.

After the stable release, I suggest we re-visit how this fix ipi works. I think there are still a few issues that could be more robust with respect to LAMMPS options. Possibly that would enable other use cases of LAMMPS via i-PI. For example, if you just want LAMMPS to compute and return forces for a series of unrelated configurations, I don't think you want to be doing that via intitial_integrate() and final_integrate().

I agree, absolutely. One of our plans for the future is to also include MPI-based communicators. LAMMPS is one of our main clients and it'd be great to coordinate.

akohlmey · 2024-05-27T23:08:30Z

@ceriottm please go to src/ and type make fix-whitespace and commit the resulting changes.

I suggest it might be a good idea to remove i-PI from tools

A proposal for that is in PR 4176

sjplimp · 2024-05-28T15:38:41Z

Thanks @ceriottm - doc page changes look good

ceriottm requested a review from sjplimp as a code owner May 2, 2024 06:36

ceriottm added enhancement neighborlist_feature labels May 2, 2024

stanmoore1 requested review from akohlmey, athomps and sjplimp and removed request for sjplimp May 13, 2024 18:14

stanmoore1 self-assigned this May 13, 2024

stanmoore1 assigned ceriottm and unassigned stanmoore1 May 13, 2024

akohlmey requested review from jtclemm and removed request for akohlmey May 13, 2024 19:59

ceriottm and others added 7 commits May 13, 2024 22:33

Try a (dirty) fix to the i-pi neighbor list update problem

b3fc34f

Minimally-invasive implementation of the ipi-side modification

bb471b1

disable Nagle's algorithm for internet socket

fe19a7e

Just some additional comments, and removed debug output

ad90c98

No need to go through the whole list if one atom has moved enough to …

7b728cd

…trigger re-compute of the NL

Removed iostream import

bc38b55

Leftover from debugging output

Fix whitespace

2dc5931

ceriottm force-pushed the bugfix/ipi-neigh branch from 10f7221 to 2dc5931 Compare May 13, 2024 20:34

jtclemm reviewed May 13, 2024

View reviewed changes

Changed folding logic to use minimum_image rather than pbc

b5ecea5

jtclemm approved these changes May 14, 2024

View reviewed changes

stanmoore1 assigned sjplimp and unassigned ceriottm May 14, 2024

stanmoore1 added the ready label May 14, 2024

sjplimp reviewed May 21, 2024

View reviewed changes

src/neighbor.cpp Outdated Show resolved Hide resolved

sjplimp reviewed May 21, 2024

View reviewed changes

move xhold checks caller-side

8f61bc5

this also allows it to fall-back on do-nothing rather than crash

Updated the documentation for i-PI

aa8cd7a

akohlmey mentioned this pull request May 27, 2024

Remove i-PI distribution from tools folder #4176

Open

5 tasks

Invisible mais penible

3335170

sjplimp approved these changes May 28, 2024

View reviewed changes

stanmoore1 assigned stanmoore1 and unassigned sjplimp May 28, 2024

stanmoore1 merged commit dedcfa1 into lammps:develop May 28, 2024
6 checks passed

Luthaf deleted the bugfix/ipi-neigh branch May 29, 2024 11:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance improvements for fix_ipi #4157

Performance improvements for fix_ipi #4157

ceriottm commented May 2, 2024 •

edited

ceriottm commented May 2, 2024

stanmoore1 commented May 13, 2024

stanmoore1 commented May 13, 2024 •

edited

ceriottm commented May 13, 2024

jtclemm May 13, 2024

ceriottm May 13, 2024

jtclemm May 13, 2024

ceriottm May 14, 2024

jtclemm May 14, 2024

stanmoore1 commented May 14, 2024

sjplimp May 21, 2024

ceriottm May 22, 2024

sjplimp May 22, 2024

sjplimp May 22, 2024

ceriottm May 22, 2024

sjplimp commented May 27, 2024

ceriottm commented May 27, 2024

akohlmey commented May 27, 2024

sjplimp commented May 28, 2024

Performance improvements for fix_ipi #4157

Performance improvements for fix_ipi #4157

Conversation

ceriottm commented May 2, 2024 • edited

ceriottm commented May 2, 2024

stanmoore1 commented May 13, 2024

stanmoore1 commented May 13, 2024 • edited

ceriottm commented May 13, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stanmoore1 commented May 14, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sjplimp commented May 27, 2024

ceriottm commented May 27, 2024

akohlmey commented May 27, 2024

sjplimp commented May 28, 2024

ceriottm commented May 2, 2024 •

edited

stanmoore1 commented May 13, 2024 •

edited