Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CMakeLists.txt: Properly handle cpu flags #533

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

AndrewAmmerlaan
Copy link

Having multiple cpu flags enabled is valid, so we should use if instead of elsif. Otherwise only the first HAVE_* variable is actually respected.

e.g. before this change we get this on Gentoo:

/usr/bin/x86_64-pc-linux-gnu-g++ -march=native -O2 -pipe -frecord-gcc-switches -Wl,-O1 -Wl,--as-needed -Wl,--defsym=__gentoo_check_ldflags__=0   -rdynamic   -fsigned-char  -mavx2 -mcx16 -Wa,-q -mcx16 -std=c++1y -pedantic -Wall -Wextra -Wdisabled-optimization -fno-exceptions -fopenmp src/CMakeFiles/mmseqs.dir/mmseqs.cpp.o -o src/mmseqs  src/libmmseqs-framework.a  src/version/libversion.a  lib/tinyexpr/libtinyexpr.a  -lm  -Wl,-Bstatic  -lzstd  lib/microtar/libmicrotar.a  -Wl,-Bdynamic  -lz  -lbz2 && :

i.e. only avx2 is respected, sse2/4 is ignored.

After change we get:

/usr/bin/x86_64-pc-linux-gnu-g++ -march=native -O2 -pipe -frecord-gcc-switches -Wl,-O1 -Wl,--as-needed -Wl,--defsym=__gentoo_check_ldflags__=0   -rdynamic   -fsigned-char  -mavx2 -mcx16 -Wa,-q -msse4.1 -mcx16 -msse2 -std=c++1y -pedantic -Wall -Wextra -Wdisabled-optimization -fno-exceptions -fopenmp src/CMakeFiles/mmseqs.dir/mmseqs.cpp.o -o src/mmseqs  src/libmmseqs-framework.a  src/version/libversion.a  lib/tinyexpr/libtinyexpr.a  -lm  -Wl,-Bstatic  -lzstd  lib/microtar/libmicrotar.a  -Wl,-Bdynamic  -lz  -lbz2 && :

gentoo/sci#1143

Signed-off-by: Andrew Ammerlaan andrewammerlaan@gentoo.org

gentoo-bot pushed a commit to gentoo/sci that referenced this pull request Feb 13, 2022
Upstream PR is here: soedinglab/MMseqs2#533
See also: #1143

Package-Manager: Portage-3.0.30, Repoman-3.0.3
Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
@milot-mirdita
Copy link
Member

milot-mirdita commented Feb 13, 2022

I thought that -mavx2 would imply (most) lower SSE levels. We also use an SSSE3 instruction in some important place (iirc), so should we also enable that explicitly? (Edit: we use _mm_shuffle_epi8 a lot.)

I don't really like it a lot that with this change -DHAVE_AVX2 -DHAVE_ARM8 would be a possible but nonsensical combination.

BTW Debian uses -DNATIVE_ARCH=0 (https://salsa.debian.org/med-team/mmseqs2/-/blob/master/debian/rules) and compiles multiple SIMD levels and adds a SIMD dispatcher (https://salsa.debian.org/med-team/mmseqs2/-/blob/master/debian/bin/simd-dispatch), if that is in any way useful for you.

@AndrewAmmerlaan
Copy link
Author

We also use an SSSE3 instruction in some important place (iirc), so should we also enable that explicitly?

Yes we should probably have a HAVE_SSE3 option for that as well.

I don't really like it a lot that with this change -DHAVE_AVX2 -DHAVE_ARM8 would be a possible but nonsensical combination.

Well yes, but if the user sets nonsensical flags it is not that big of a problem if the outcome is (expected) nonsense. Is there an easy way we can avoid this?

BTW Debian uses -DNATIVE_ARCH=0 (https://salsa.debian.org/med-team/mmseqs2/-/blob/master/debian/rules) and compiles multiple SIMD levels and adds a SIMD dispatcher (https://salsa.debian.org/med-team/mmseqs2/-/blob/master/debian/bin/simd-dispatch), if that is in any way useful for you.

In Gentoo this is controlled by CPU_FLAGS_X86 variable at build time. In general we don't use dispatchers for this because there isn't really a point if things are built exactly according to the users set preference. The problem here is that the user preference is not respected because the flags are treated as mutually exclusive.

gentoo/sci#1143

Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
@AndrewAmmerlaan
Copy link
Author

I don't really like it a lot that with this change -DHAVE_AVX2 -DHAVE_ARM8 would be a possible but nonsensical combination.

How about now? This should fix the problem but still keep the nonsensical combinations mutually exclusive.

@benjamin-lieser
Copy link
Member

At least on g++ -mavx2 implies all the other flags. See https://stackoverflow.com/questions/77828641/does-mavx2-implies-mavx-and-msse4-2
Because this is true for the actual CPUs, I don't think other compilers would handle this differently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants