Use OMPI without LSF integration on LSF #12556

Closed
robertsawko opened this issue May 17, 2024 · 14 comments

@robertsawko

I believe this should be relatively simple, but I am struggling to find the right combination of switches.

My target application is quite complex: OpenFOAM and ParaView + Catalyst V2 with OSMESA, both using OpenMPI v5. I've used Spack to build it on an x86 RHEL7 cluster. Unfortunately, the Spack OpenMPI package doesn't support the lsf-libdir option, which on this cluster is required to build OpenMPI with LSF integration correctly. So I ended up with my whole stack built but no LSF integration. Lawless land, it seems.

I've already tested my setup on small jobs and now I am about to launch a medium-size job: 512 ranks spanned over 16 nodes, 32 cores per node and 1 rank per physical core.

cat $LSB_DJOB_HOSTFILE | uniq | awk '{print $1 " slots=32 max_slots=32"}' > myhostfile
mpirun \
    -np 512 \
    --hostfile myhostfile \
    --map-by node \
    --rank-by slot \
    --bind-to core \
    --report-bindings \
    --display-map \
    -wdir $CASE_DIR \
    -x PATH \
    -x LIBRARY_PATH \
    -x LD_LIBRARY_PATH \
         hostname
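For reference, the generated myhostfile has one line per node in Open MPI's hostfile format (hostnames below are illustrative):

node001 slots=32 max_slots=32
node002 slots=32 max_slots=32
(and so on for all 16 nodes)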

Despite all this, I still end up seeing the following error.

prterun was unable to launch the specified application as it encountered an
error:

Error: system limit exceeded on number of files that can be open Node:
sqg6e31

when attempting to start process rank 0.

This can be resolved by setting the mca parameter
opal_set_max_sys_limits to 1, increasing your limit descriptor setting
(using limit or ulimit commands), asking the system administrator for
that node to increase the system limit, or by rearranging your
processes to place fewer of them on that node.

Please advise if there's anything I could improve in my mpirun invocation. The displayed mapping looks correct, but clearly something goes very wrong before binding(?).

Also, if you think it's impossible to call mpirun correctly for medium and large jobs without LSF integration, then I am happy to focus on fixing the Spack package instead. I've been meaning to do that for a while.

@rhc54
Contributor

rhc54 commented May 17, 2024

This has nothing to do with mpirun or binding - the error message is quite specific:

Error: system limit exceeded on number of files that can be open Node:
sqg6e31

You need to increase the limit on the number of files that can be open, just like it says:

This can be resolved by setting the mca parameter
opal_set_max_sys_limits to 1, increasing your limit descriptor setting
(using limit or ulimit commands), asking the system administrator for
that node to increase the system limit, or by rearranging your
processes to place fewer of them on that node.
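For concreteness, the first two suggestions amount to something like this in the job script, before mpirun (the value 4096 is purely illustrative and must stay within the hard limit):

ulimit -n          # show the current soft limit on open file descriptors
ulimit -Hn         # show the hard limit
ulimit -n 4096     # raise the soft limit for this shell (illustrative value)

mpirun --mca opal_set_max_sys_limits 1 -np 512 --hostfile myhostfile hostname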

@robertsawko
Author

I have tried setting --mca opal_set_max_sys_limits 1, but that results in exactly the same message. It just occurred to me that all 512 ranks are still trying to start on one node. On the same cluster, when I compiled OpenMPI manually with LSF integration, I didn't get this message, which is why I am going down this rabbit hole.

I am now trying to fix the LSF integration for OMPI in Spack, but strangely, after just adding schedulers=lsf, HDF5 no longer wants to compile, so I am now chasing that route independently as well.

@robertsawko
Author

robertsawko commented May 18, 2024

And I can now also confirm that after adding --with-lsf-libdir to the Spack OpenMPI package.py and compiling with schedulers=lsf, I can just run

...
#BSUB -n 512
#BSUB -R "span[ptile=32] affinity[core(1)]"
...
mpirun hostname

and it runs just fine - no need to set any MCA parameters or change system limits.

I am now trying to fix the packages that broke downstream. I suspect they broke because I haven't properly fixed the Spack package, so it's not clear whether I will succeed, as this breaks every downstream package that depends on MPI.

@wenduwan
Contributor

@robertsawko Please keep us posted on new issues

@robertsawko
Author

Absolutely, I do want to get to the bottom of it. I've only just got access to another LSF cluster - my main one is actually down until tomorrow, possibly later, so I couldn't look into it this week.

@robertsawko
Author

Ah, sorry, I should have said - I started a Spack issue on LSF_LIBDIR here, but last week my LSF cluster also went into a week-long maintenance, so I didn't have a computer to test on.

@rhc54
Contributor

rhc54 commented May 23, 2024

I confess to being puzzled as to how the LSF libdir can impact the MPI stack (outside of mpirun itself). Nothing in MPI depends on or integrates with LSF.

@robertsawko
Author

robertsawko commented May 24, 2024

Thanks, @rhc54 - you may be right; maybe LSF is a red herring... When I compile OMPI manually I add all sorts of switches:

--enable-shared --disable-static \
--enable-mpi-fortran=usempi \
--disable-libompitrace \
--enable-wrapper-rpath \
--with-lsf=${LSF_LIBDIR%%linux*} \
--with-lsf-libdir=${LSF_LIBDIR} \
--with-knem=${knem_dir} \
--with-mxm=/opt/mellanox/mxm \
--with-ucx=$CORE_DIR/ucx/1.4.0

and my Spack spec was pretty basic:

openmpi+internal-pmix fabrics=auto schedulers=lsf

So I need to test that - specifically, whether adding knem and mxm explicitly, rather than relying on auto, makes a difference.
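Something like this, pinning the fabrics explicitly instead of auto (a sketch - whether the openmpi package accepts exactly these variant values on my system still needs checking):

openmpi+internal-pmix fabrics=ucx,knem schedulers=lsf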

@robertsawko
Author

Yes, the LSF integration may be a red herring - I am sorry. It looks like the error is caused by adding the -wdir option; I mis-attributed something again. The actual application I am trying to run is OpenFOAM, and without -wdir I was getting

--> FOAM FATAL ERROR :
     Could not find mandatory etc entry (mode=ugo)
     'controlDict'

which I misread as the job being in the wrong directory. Now I can see clearly that all ranks were indeed starting in the correct working directory, and that this error has something to do with the environment. For instance, here they're discussing it in the context of a container and running as root. There are a few posts with people trying to run as root, but that's not the case for me, so I am not sure what I've done wrong here.

I am checking this more carefully now.

@robertsawko
Author

robertsawko commented May 28, 2024

Hmm... it looks like the source of my problem is some inconsistency in the environment. If I run without -x flags, non-launch nodes are unaware of Spack. If I add the basic ones like PATH and LD_LIBRARY_PATH, OpenFOAM thinks I am running as root or something equivalent. I am trying to devise a sensible wrapper...

@robertsawko
Author

After many trials and not so many tribulations, I managed to produce a wrapper which reproduces the Spack environment consistently across all nodes. I am happy for this to be closed, but could you please advise whether there's a better way to propagate the environment across all nodes? Maybe the LSF integration was doing just that?

When I was running this:

source /path/to/spack/share/spack/setup-env.sh
spack env activate openfoam_w_catalyst
mpirun \
    -np 512 \
    --hostfile myhostfile \
    --map-by node \
    --rank-by slot \
    --bind-to core \
        myApp

nothing but the launch node would know about my Spack environment. Naively adding PATH and LD_LIBRARY_PATH produced the confusion about input files, which led me to further confusion with the -wdir option and the supposed "too many open files" error.
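For completeness, the wrapper is essentially the following (a minimal sketch; the Spack path and environment name are the ones from my setup above, and myApp stands in for the real binary):

#!/bin/bash
# Recreate the Spack environment on this node, then exec the real
# application so the wrapper doesn't linger as an extra process.
source /path/to/spack/share/spack/setup-env.sh
spack env activate openfoam_w_catalyst
exec myApp "$@"

mpirun then launches the wrapper script in place of myApp.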

@robertsawko
Author

robertsawko commented May 28, 2024

Sorry, one more comment, as it is pertinent to my original question. I've run some more scripts and I can confirm that with LSF integration the launch node's environment is fully reproduced on all other nodes, whereas without LSF integration PATH et al. are set to system defaults. So this has been the source of my misery all along.

@rhc54
Contributor

rhc54 commented May 28, 2024

LSF automatically forwards your entire environment. However, ssh does not - so when launching via ssh, your environment will not get forwarded. Easiest way around that is to add the key envars to your login shell script (e.g., .bashrc).

Trying to forward the entire environment under ssh would be problematic as there are limits to the size of the overall ssh string. So the only alternative solution is to ask that the user specify which envars should be forwarded.
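With mpirun that means listing the variables via the -x flag, as in the invocations above, e.g.:

mpirun -np 512 --hostfile myhostfile -x PATH -x LD_LIBRARY_PATH myApp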

@robertsawko
Author

robertsawko commented May 29, 2024

I am going to close this, as it is really a solved problem (the wrapper), and fix the Spack package separately.
