[Draft] MP2-F12 #3110

Draft
wants to merge 37 commits into
base: master

EricaCMitchell
Contributor

@EricaCMitchell EricaCMitchell commented Dec 15, 2023

Description

New feature to compute the explicitly correlated MP2 energy. Only the MP2-F12/3C(FIX) variant is implemented, which has been shown to be the most robust of all versions of MP2-F12. Includes conventional and density-fitted versions of the code, each with a choice between an in-core and a disk implementation. Relies on the tensor library Einsums.

Equations come from:
Werner, Adler, and Manby
Kong, Bischoff, and Valeev
Manby

Example Input

molecule {
0 1
C    0.000000000    0.000000000    0.000000000
H    1.185992116    1.185992116    1.185992116
H    1.185992116   -1.185992116   -1.185992116
H   -1.185992116    1.185992116   -1.185992116
H   -1.185992116   -1.185992116    1.185992116

units bohr
symmetry c1
}

set {
  BASIS cc-pvtz-f12
  FREEZE_CORE true
  E_CONVERGENCE 1.e-10
}

set mp2-f12 {
  CABS_BASIS cc-pvtz-f12-optri
  DF_BASIS_F12 aug-cc-pvtz-ri
  F12_TYPE df
  CABS_SINGLES true
}

energy('mp2-f12')

Timings

Timings and maxvmem are averaged over 50 runs.
The orbital basis set (OBS) is cc-pVTZ-F12 (VTZ-F12) and the CABS is cc-pVTZ-F12-OPTRI. For DF, the auxiliary basis set (AUX) is aug-cc-pVTZ-RI.

Timings and Max RAM Usage for MP2-F12/3C(FIX):

Molecule (VTZ-F12)   NOBS   NCABS   Total (s)   Total (min)   maxvmem (GB)
CH4                   125     239      176.48          2.94         27.554
NH3                   107     198       96.64          1.61         15.060
H2O                    89     157       53.02          0.88          7.828
HF                     71     116       17.96          0.30          4.081

Timings and Max RAM Usage for DF-MP2-F12/3C(FIX):

Molecule (VTZ-F12)   NOBS   NCABS   NAUX   Total (s)   maxvmem (GB)
CH4                   125     239    290        6.24          3.219
NH3                   107     198    244        4.91          2.763
H2O                    89     157    198        2.13          2.282
HF                     71     116    152        1.18          2.128
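
To put the two tables side by side, here is a small sketch that computes the speedup and memory reduction of the DF code over the conventional one. The numbers are copied from the tables above; the dictionary and function names are only illustrative and are not part of the PR.

```python
# (total time in s, maxvmem in GB), copied from the two tables above.
conventional = {
    "CH4": (176.48, 27.554),
    "NH3": (96.64, 15.060),
    "H2O": (53.02, 7.828),
    "HF": (17.96, 4.081),
}
density_fitted = {
    "CH4": (6.24, 3.219),
    "NH3": (4.91, 2.763),
    "H2O": (2.13, 2.282),
    "HF": (1.18, 2.128),
}


def ratios(molecule):
    """Return (time ratio, memory ratio) of conventional over DF."""
    t_conv, m_conv = conventional[molecule]
    t_df, m_df = density_fitted[molecule]
    return t_conv / t_df, m_conv / m_df


for molecule in conventional:
    speedup, memory_ratio = ratios(molecule)
    print(f"{molecule}: DF is {speedup:.1f}x faster and uses "
          f"{memory_ratio:.1f}x less memory")
```

Both ratios grow with system size in this data set, which is consistent with the recommendation to prefer the DF variant.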

VTune Analysis for MP2-F12/3C(FIX)

Memory consumption is highest in form_teints, where the AO ERI allocations are quite large. The largest AO tensor has dimensions (NOBS, NOBS, NRI, NRI); for CH4 that is (125, 125, 364, 364).
image

CPU time all traces back to the form_teints function, specifically to two_body_ao_computer.
image
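
For a rough sense of scale, the size of the largest AO tensor can be estimated directly from its dimensions. The sketch below assumes double-precision storage (8 bytes per element), NRI = NOBS + NCABS (which matches 125 + 239 = 364 for CH4), and GB = 10^9 bytes; the helper name is made up for this illustration.

```python
def tensor_gigabytes(*dims, bytes_per_element=8):
    """Size of a dense tensor in GB (10**9 bytes), assuming doubles by default."""
    count = 1
    for dim in dims:
        count *= dim
    return count * bytes_per_element / 1e9


# (NOBS, NCABS) from the timing tables; NRI = NOBS + NCABS is an assumption
# that reproduces the CH4 example (125 + 239 = 364).
systems = {"CH4": (125, 239), "NH3": (107, 198), "H2O": (89, 157), "HF": (71, 116)}

for name, (nobs, ncabs) in systems.items():
    nri = nobs + ncabs
    size = tensor_gigabytes(nobs, nobs, nri, nri)
    print(f"{name}: ({nobs}, {nobs}, {nri}, {nri}) -> {size:.2f} GB")
```

Under these assumptions, a single (NOBS, NOBS, NRI, NRI) tensor already accounts for a large share of the reported maxvmem for the conventional code.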

User API & Changelog headlines

  • MP2-F12 single-point energy

Dev notes & details

  • Computes in-core MP2-F12/3C(FIX) energy
  • Computes disk MP2-F12/3C(FIX) energy
  • Computes in-core DF-MP2-F12/3C(FIX) energy
  • Computes disk DF-MP2-F12/3C(FIX) energy

Questions

  • I am unsure if I have done the disk implementation correctly.
  • The max RAM usage for the conventional implementation is quite large. I could use some suggestions on how to bring it down.
  • The conventional MP2-F12/3C(FIX) is not as usable as I would like because of its high max RAM usage and slow integral computation. DF-MP2-F12/3C(FIX) is recommended over the conventional version.
  • This version of DF-MP2-F12/3C(FIX) uses a more robust scheme than ORCA and MPQC for the density-fitting.

Checklist

Status

  • Ready for review
  • Ready for merge

@loriab
Member

loriab commented Dec 19, 2023

Thanks for the PR, Erica! I pushed some lines to the Azure CI so that einsums is enabled and your code has a chance of running :-) . It won't always be this ugly -- ultimately Einsums will be req'd. There's also a blas dependency detail (mkl=2022 vs. 2023) I need to work out to get rid of that openblas pkg.

@konpat
Contributor

konpat commented Dec 20, 2023

This might be something dumb on my part, but I cannot get this build to run on our Linux system:

>>> import psi4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mmfs1/home/kjp0013/psi4_erika/psi4/objdir_p4d19/stage/lib/psi4/__init__.py", line 71, in <module>
    from . import core
ImportError: /mmfs1/home/kjp0013/psi4_erika/psi4/objdir_p4d19/stage/lib/psi4/core.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3psi6mp2f126mp2f12ESt10shared_ptrINS_12WavefunctionEERNS_7OptionsE

I installed einsums from conda-forge. What else do I need to do?

@loriab
Member

loriab commented Dec 20, 2023

@konpat you may need to configure with -D ENABLE_Einsums=ON. You can toggle the value in objdir/CMakeCache.txt, then rebuild.

@konpat
Contributor

konpat commented Dec 20, 2023

@loriab this did the trick, thank you!

@konpat
Contributor

konpat commented Feb 13, 2024

Fantastic job on this PR, @EricaCMitchell! It took me a while (I apologize), but I finally translated @mkodrycka 's dispersion-F12 code to an MP2-F12 one and, after some tweaking, was able to reproduce your DF-MP2-F12 correlation energy exactly.

I learned quite a bit in the process: initially, our implementations (both based on the formulas from the same Werner-Adler-Manby paper) gave minimally different results, and I found out that our programmed expressions differ by several terms that vanish in the GBC approximation. This approximation is pretty good but neither one of us makes it explicitly in the implementation (no elements of the Fock matrix are zeroed). I think this is completely OK.

One avenue to possibly speed up the code is fully exploiting the fact that our F12 amplitudes are diagonal and we don't need to compute off-diagonal elements of some matrices. For example, out of the entire B matrix, we only use terms of the form B(i,j,i,j) and B(i,j,j,i). I know computing just the diagonal elements is easier said than done, but I think there is room for speedup there.
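
To make the potential saving concrete, here is a toy sketch of which elements of a four-index B(i,j,k,l) are actually used when only B(i,j,i,j) and B(i,j,j,i) are needed. It uses nested Python lists and made-up values, not the Einsums tensors from the PR, and the function names are hypothetical.

```python
import itertools


def needed_elements(n_occ):
    """Index tuples of the form (i, j, i, j) or (i, j, j, i)."""
    needed = set()
    for i, j in itertools.product(range(n_occ), repeat=2):
        needed.add((i, j, i, j))
        needed.add((i, j, j, i))
    return needed


def extract_diagonal_blocks(B, n_occ):
    """Pull the two n_occ x n_occ slices out of a full four-index B."""
    direct = [[B[i][j][i][j] for j in range(n_occ)] for i in range(n_occ)]
    exchange = [[B[i][j][j][i] for j in range(n_occ)] for i in range(n_occ)]
    return direct, exchange


n_occ = 4  # toy size chosen for illustration only
# Toy tensor whose value encodes its own indices, so slices are easy to check.
B = [[[[1000 * i + 100 * j + 10 * k + l for l in range(n_occ)]
       for k in range(n_occ)]
      for j in range(n_occ)]
     for i in range(n_occ)]

direct, exchange = extract_diagonal_blocks(B, n_occ)
print(f"needed: {len(needed_elements(n_occ))} of {n_occ ** 4} elements")
```

The count of needed elements is 2n^2 - n (the two forms coincide when i = j) against n^4 for the full matrix, so the fraction used shrinks quickly as the number of occupied orbitals grows.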

Finally, I know this was not directly a part of this PR, but do you happen to know the source of the 6-Gaussian fit of the Slater correlation factor? Here's what Psi4 uses for GEM_BETA == 1.0:

std::vector<double> coeffs = {-0.31442480597241274, -0.30369575353387201, -0.16806968430232927,
-0.098115812152857612, -0.060246640234342785, -0.037263541968504843};
std::vector<double> exps = {0.22085085450735284, 1.0040191632019282, 3.6212173098378728,
12.162483236221904, 45.855332448029337, 254.23460688554644};

and this is what Molpro 2012.1 prints out (I don't have a newer version):
Alpha: 0.19532 0.81920 2.85917 9.50073 35.69989 197.79328
Coeff: 0.27070 0.30552 0.18297 0.10986 0.06810 0.04224

This discrepancy, if not removed, does lead to small differences in the final results.
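
A quick way to see how far apart the two fits are is to evaluate both against the Slater function they approximate. The sketch below assumes that each set defines a sum of c_k * exp(-alpha_k * r^2), that the Psi4 set approximates -exp(-r) and the Molpro printout approximates +exp(-r) (inferred only from the signs and from the coefficient sums being close to 1 in magnitude), and that Molpro's "Alpha" and "Coeff" rows are exponents and coefficients. None of this is confirmed by either program's documentation here.

```python
import math

# Values copied from the comment above.
psi4_coeffs = [-0.31442480597241274, -0.30369575353387201, -0.16806968430232927,
               -0.098115812152857612, -0.060246640234342785, -0.037263541968504843]
psi4_exps = [0.22085085450735284, 1.0040191632019282, 3.6212173098378728,
             12.162483236221904, 45.855332448029337, 254.23460688554644]
molpro_coeffs = [0.27070, 0.30552, 0.18297, 0.10986, 0.06810, 0.04224]
molpro_exps = [0.19532, 0.81920, 2.85917, 9.50073, 35.69989, 197.79328]


def gaussian_fit(r, coeffs, exps):
    """Evaluate sum_k c_k * exp(-alpha_k * r**2)."""
    return sum(c * math.exp(-a * r * r) for c, a in zip(coeffs, exps))


print("   r     exp(-r)   Psi4 fit   Molpro fit")
for r in (0.0, 0.5, 1.0, 2.0, 4.0):
    slater = math.exp(-r)
    # Sign flipped for Psi4 under the assumed -exp(-r) convention.
    psi4_value = -gaussian_fit(r, psi4_coeffs, psi4_exps)
    molpro_value = gaussian_fit(r, molpro_coeffs, molpro_exps)
    print(f"{r:4.1f}  {slater:10.6f} {psi4_value:10.6f} {molpro_value:10.6f}")
```

Printing the three columns side by side shows where in r the two parameter sets deviate from each other and from the exact Slater function.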

@EricaCMitchell
Contributor Author

Thank you @konpat for trying out my code! :)

I will look into speeding up the F12 intermediate matrices by computing only the diagonal elements. I am also looking into speeding up the integral overhead by taking advantage of permutational symmetry of the integrals, but that affects the TwoBodyAOInt class as well.
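
As a rough estimate of what permutational symmetry could save, the sketch below counts unique elements of a (NOBS, NOBS, NRI, NRI) tensor. It assumes the tensor is stored as (pq|rs) in chemist's notation with real basis functions, so that it is symmetric under p <-> q and r <-> s; the actual index ordering in the code is not stated here, and the helper names are hypothetical.

```python
def unique_pairs(n):
    """Number of index pairs (p, q) with p <= q among n functions."""
    return n * (n + 1) // 2


def storage_counts(nobs, nri):
    """Return (full, packed) element counts for an (nobs, nobs, nri, nri) tensor.

    'packed' keeps only p <= q and r <= s, which is valid under the assumed
    symmetry (pq|rs) = (qp|rs) = (pq|sr).
    """
    full = nobs * nobs * nri * nri
    packed = unique_pairs(nobs) * unique_pairs(nri)
    return full, packed


full, packed = storage_counts(125, 364)  # CH4 dimensions from the description
print(f"full: {full}, packed: {packed}, ratio: {full / packed:.2f}")
```

Under these assumptions the saving approaches a factor of four for large bases, for both storage and the number of integrals to compute.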

The 6-Gaussian fit comes from Tew and Klopper. These are close to the coefficients and exponents they reported and are the ones given by MPQC4 and ORCA, which use a linear solve similar to Molpro.

Member

@loriab loriab left a comment

Thanks for all your work on this!

I was looking into why Windows was failing on CI, and I think I have a fix. Noticed a couple of other things along the way.

doc/sphinxman/source/bibliography.rst (outdated)
doc/sphinxman/source/mp2f12.rst (outdated)
psi4/driver/driver.py (outdated)
psi4/src/psi4/f12/wrapper.cc (outdated)
psi4/src/core.cc
@EricaCMitchell
Contributor Author

Thanks @loriab for looking into the CI failure!
I'll include all these edits in my next commits, and I am also working on adding some ctests.

I haven't had a lot of time to focus on computing just the diagonal elements of the F12 intermediates, so that's on the back burner.

3 participants