NWChem build takes 10+ hours to complete, and it ignores parallelization options #959

Open
yurivict opened this issue Mar 29, 2024 · 17 comments

Comments

@yurivict
Contributor

yurivict commented Mar 29, 2024

Describe the bug
The build time for the FreeBSD port regressed from ~1.5 hours to many hours for unknown reasons.
It appears to spend a lot of time in bash scripts, and in perl.

Here is the relevant GNU Make issue that I've created: https://savannah.gnu.org/bugs/?65533

It also doesn't build in parallel - make ignores the flag -j8.

The NWChem makefiles have some issues that cause the gmake slowdown.
Also, -jN is ignored: how can the build be made to run in parallel?

Could you please try to build it with gmake-4.4 and see if there's a slowdown?

Version: 7.2.0

Describe settings used
Environment:
NWCHEM_TOP=/usr/ports/science/nwchem/work/nwchem-7.2.0-release/src/.. NWCHEM_MODULES=all NWCHEM_LONG_PATHS=Y NWCHEM_TARGET=LINUX64 USE_INTERNALBLAS=Y EXTERNAL_GA_PATH=/usr/local BLAS_SIZE=4 USE_64TO32=y USE_LIBXC=Y USE_MPI=Y PYTHONVERSION=3.9 NWCHEM_MODULES="all python" F77="gfortran13" F90="gfortran13" FC="gfortran13" FFLAGS="-O -Wl,-rpath=/usr/local/lib/gcc13" F90FLAGS="-O -Wl,-rpath=/usr/local/lib/gcc13" FCFLAGS="-Wl,-rpath=/usr/local/lib/gcc13" PERL_USE_UNSAFE_INC=1 XDG_DATA_HOME=/usr/ports/science/nwchem/work XDG_CONFIG_HOME=/usr/ports/science/nwchem/work XDG_CACHE_HOME=/usr/ports/science/nwchem/work/.cache HOME=/usr/ports/science/nwchem/work PATH=/usr/local/libexec/ccache:/usr/ports/science/nwchem/work/.bin:/home/yuri/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin PKG_CONFIG_LIBDIR=/usr/ports/science/nwchem/work/.pkgconfig:/usr/local/libdata/pkgconfig:/usr/local/share/pkgconfig:/usr/libdata/pkgconfig MK_DEBUG_FILES=no MK_KERNEL_SYMBOLS=no SHELL=/bin/sh NO_LINT=YES ADDR2LINE="/usr/local/bin/addr2line" AR="/usr/local/bin/ar" AS="/usr/local/bin/as" CPPFILT="/usr/local/bin/c++filt" GPROF="/usr/local/bin/gprof" LD="/usr/local/bin/ld" NM="/usr/local/bin/nm" OBJCOPY="/usr/local/bin/objcopy" OBJDUMP="/usr/local/bin/objdump" RANLIB="/usr/local/bin/ranlib" READELF="/usr/local/bin/readelf" SIZE="/usr/local/bin/size" STRINGS="/usr/local/bin/strings" PREFIX=/usr/local LOCALBASE=/usr/local CC="cc" CFLAGS="-O2 -pipe -fstack-protector-strong -fno-strict-aliasing " CPP="cpp" CPPFLAGS="" LDFLAGS=" -Wl,-rpath=/usr/local/lib/gcc13 -L/usr/local/lib/gcc13 -fstack-protector-strong " LIBS="" CXX="c++" CXXFLAGS="-O2 -pipe -fstack-protector-strong -fno-strict-aliasing " CCACHE_DIR="/tmp/.ccache" BSD_INSTALL_PROGRAM="install -s -m 555" BSD_INSTALL_LIB="install -s -m 0644" BSD_INSTALL_SCRIPT="install -m 555" BSD_INSTALL_DATA="install -m 0644" BSD_INSTALL_MAN="install -m 444"

FreeBSD 14

Attach log files
Compilation proceeds very slowly and never finishes. Here is the log from before the build was killed:
https://freebsd.org/~yuri/nwchem-7.2.0_3.log

To Reproduce

  1. Steps to reproduce the behavior: Build with gmake-4.4.1
  2. Attach all the input files required to run: n/a

Expected behavior
Build in a reasonable time.

@edoapra
Collaborator

edoapra commented Mar 29, 2024

Any chance of having the log files posted?

@yurivict
Contributor Author

I attached the log to the original message.
But the log doesn't show any problem, because the slowdown comes from GNU make itself: it now handles variable expansions differently and fires up exponentially more sub-processes.

See the explanation from GNU make people here: https://savannah.gnu.org/bugs/?65533
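
For readers unfamiliar with the mechanism, here is a minimal sketch of how `$(shell ...)` inside a recursively expanded (`=`) make variable multiplies sub-processes: the command is re-run at every reference to the variable. The Makefile, the names `LAZY` and `COUNT_FILE`, and the scratch directory are made up for illustration and are not taken from the NWChem makefiles. The sketch assumes GNU make is installed as `gmake` or `make` and skips itself otherwise.

```shell
# Pick GNU make: `gmake` on FreeBSD, usually `make` on Linux.
if command -v gmake >/dev/null 2>&1; then MAKE_BIN=gmake; else MAKE_BIN=make; fi

lazy_calls=skipped
if "$MAKE_BIN" --version 2>/dev/null | grep -q 'GNU Make'; then
  tmp=$(mktemp -d)
  # Hypothetical Makefile: LAZY uses `=`, so its $(shell ...) runs at every reference.
  cat > "$tmp/Makefile" <<'EOF'
COUNT_FILE := $(CURDIR)/count
LAZY = $(shell echo x >> $(COUNT_FILE); echo v)
all: ; @: $(LAZY) $(LAZY) $(LAZY)
EOF
  "$MAKE_BIN" -s -C "$tmp" >/dev/null
  # Each reference appended one line, so this counts the shell invocations.
  lazy_calls=$(grep -c x "$tmp/count")
  rm -rf "$tmp"
fi
echo "shell invocations for three references: $lazy_calls"
```

Defining the variable with `:=` instead would run the command once, at definition time, no matter how often it is referenced.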

@edoapra
Collaborator

edoapra commented Mar 29, 2024

No sign of slowdowns on Fedora 39 that uses make 4.4.1

export USE_MPI=y
export USE_INTERNALBLAS=1
export BLAS_SIZE=8
export NWCHEM_MODULES=all

@edoapra
Collaborator

edoapra commented Mar 29, 2024

What version of NWChem is used here?
Is it really 7.2.0?
Why aren't you using 7.2.2 ?

@yurivict
Contributor Author

> Why aren't you using 7.2.2 ?

Because of some sort of conflict with GA. There is some error message.
GNU Make maintainers identified some problems in the NWChem makefiles.

@edoapra
Collaborator

edoapra commented Mar 30, 2024

>> Why aren't you using 7.2.2 ?
>
> Because of some sort of conflict with GA. There is some error message. GNU Make maintainers identified some problems in the NWChem makefiles.

Could you post the details of this issue?

@edoapra
Collaborator

edoapra commented Mar 30, 2024

FreeBSD 14.0 seems to have gmake 4.3
How did you install gmake 4.4?

@yurivict
Contributor Author

> FreeBSD 14.0 seems to have gmake 4.3
> How did you install gmake 4.4?

gmake-4.4.1 is the current gmake version on FreeBSD 14.0

If you installed from the 'quarterly' packages (set in /etc/pkg/FreeBSD.conf), you need to change this to 'latest' and run 'pkg upgrade -f'.
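
A minimal sketch of that switch, for illustration only. The helper name `switch_pkg_branch_to_latest` is hypothetical, and the scratch file's contents are an assumed stand-in for the stock config, so check the real file before editing it.

```shell
# Rewrite the branch in a pkg repository config from "quarterly" to "latest".
switch_pkg_branch_to_latest() {
  conf=$1
  sed 's|/quarterly"|/latest"|' "$conf" > "$conf.new" && mv "$conf.new" "$conf"
}

# Demo on a scratch file; its contents are an assumed stand-in for the stock config.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
FreeBSD: {
  url: "pkg+http://pkg.FreeBSD.org/${ABI}/quarterly",
  enabled: yes
}
EOF
switch_pkg_branch_to_latest "$demo_conf"
cat "$demo_conf"

# On a real system (as root), following the comment above:
#   switch_pkg_branch_to_latest /etc/pkg/FreeBSD.conf
#   pkg upgrade -f
```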

@yurivict
Contributor Author

The FreeBSD port builds with BLAS_SIZE=4 (this is probably worse than BLAS_SIZE=8, but anyway).

Maybe I am mistaken, but some sort of size conversions appear to be done extensively during the build (perl scripts and various shell commands are run a lot).

Wild guess, but maybe there is no slowdown for BLAS_SIZE=8, only for BLAS_SIZE=4?

@jeffhammond
Collaborator

BLAS_SIZE=4 requires a Perl script transformation of every source file that contains a BLAS call. It takes forever.

Use BLAS_SIZE=8 and a compatible library to compile faster.
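
To give a rough idea of why a per-file rewrite adds up, here is a toy sketch of the pattern: scan every source file, and rewrite the ones that mention a BLAS routine. This is not the NWChem Perl script. The file names and the `dgemm` to `ygemm` renaming are assumptions chosen for illustration.

```shell
# Scratch tree with one file that calls a BLAS routine and one that does not.
work=$(mktemp -d)
printf '      call dgemm(transa, transb, m, n, k)\n' > "$work/uses_blas.F"
printf '      call util_print(msg)\n' > "$work/no_blas.F"

converted=0
for f in "$work"/*.F; do
  # One grep per file to decide, then one rewrite per matching file.
  if grep -q 'dgemm' "$f"; then
    sed 's/dgemm/ygemm/g' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    converted=$((converted + 1))
  fi
done
echo "converted $converted file(s)"
```

Every file costs at least one extra process, and every match costs more, so across a large source tree the conversion can take longer than the compilation itself.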

@yurivict
Contributor Author

yurivict commented Mar 31, 2024

It calls grep, awk, cut, wc, bash, etc. a lot with BLAS_SIZE=8 too. The build is still very slow with gmake-4.4.1.

In fact, it spends most of the time running grep, awk, cut, wc, bash, etc., and Fortran compilation takes only a small fraction of the time, which makes the build very slow.

@edoapra
Collaborator

edoapra commented Apr 1, 2024

@yurivict Are you using the release tarballs for the FreeBSD builds?
If this is the case, there is no need for the make 64_to_32 step, since the release source tarball has already been processed with make 64_to_32.

@yurivict
Contributor Author

yurivict commented Apr 1, 2024

@edoapra
No, GitHub tarball is used.

@edoapra
Collaborator

edoapra commented Apr 1, 2024

Why?

@edoapra
Collaborator

edoapra commented Apr 5, 2024

The current hotfix/release-7-2-0 branch compiles in a reasonable amount of time with FreeBSD 14.0 and make 4.4 (around 20 minutes for the 64_to_32 step and 30 minutes for the actual compilation).
Issue #960 should be fixed in it too.
Could you give it a try and test it?
https://github.com/nwchemgit/nwchem/tree/hotfix/release-7-2-0

curl -LJO https://github.com/nwchemgit/nwchem/tarball/hotfix/release-7-2-0/
tar xzf nwchemgit-nwchem-v7.2.2-release*gz
rm nwchemgit-nwchem-v7.2.2-release*gz
ln -sf nwchemgit-nwchem-* nwchem-7.2.2
cd nwchem-7.2.2

Keep in mind that this is not a release tarball with the 64_to_32 processing already applied, but just an automated GitHub tarball.
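
When testing, timing the two phases separately makes it easy to compare against the figures above. The helper below is a hypothetical sketch, not part of NWChem. The commented invocations assume the build is driven with `gmake` from the NWChem src directory.

```shell
# Run a command and report its wall-clock time in whole seconds.
time_step() {
  label=$1; shift
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "$label: $((end - start))s (exit $status)"
  return "$status"
}

# Harmless demo so the sketch runs anywhere.
demo_report=$(time_step demo true)
echo "$demo_report"

# Assumed usage from the NWChem src directory (not run here):
#   time_step 64_to_32 gmake 64_to_32
#   time_step compile  gmake
```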

@yurivict
Contributor Author

yurivict commented Apr 5, 2024

I confirm that nwchem now builds faster. On my slow system the build has succeeded in 80 minutes.

Thank you!

@edoapra
Collaborator

edoapra commented Apr 6, 2024

Thank you very much for the feedback.
I might be able to get another patch release going some time in the not so distant future.
