3rd-party: bump openpmix submodule #12532
The failure looks real. @rhc54 This happens on the openpmix master branch. Does it ring a bell?
bot:aws:retest
Not really - I just tested it on my Docker cluster and it works fine.
If you can configure with |
Also, I don't really understand that output. What does it mean that the |
@rhc54 Thanks Ralph. I need to poke this more. Will let you know.
Rather odd: if I build with --enable-debug and run mpi4py by hand, the singleton case doesn't fail.
Is it |
I found a bug in PRRTE's modex procedure - it might be contributing here (enabling debug would have made a difference to what I found)? Anyway, you might need to pull that around. Also did a little cleanup in PMIx, though I'm not sure if that would contribute to what you are seeing either. Should have included the relevant links: |
Test new patches from openpmix project
Fix for singletons is here: openpmix/openpmix#3345
Switched to Ralph's pmix branch for testing.
FWIW: I believe some of these tests are failing because you are patching the local 3rd-party code instead of advancing a submodule pointer. The problem is that the tests are trying to recursively clone the repo in your branch, and they cannot do that if the branch isn't tied to a commit. If you are pointing your submodule at my branch, be aware that my branch gets deleted once the PR is committed. Best to just wait for commit and then advance the submodule to the head of master.
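For reference, the submodule-pointer approach described above can be sketched as the git command sequence below. This is a minimal sketch under stated assumptions, not the project's documented procedure: the submodule path `3rd-party/openpmix`, the helper name `submodule_bump_commands`, and the placeholder SHA are all illustrative and do not come from this thread. The helper only builds the commands; it runs nothing.

```python
import shlex
from typing import List


def submodule_bump_commands(path: str, sha: str, message: str) -> List[List[str]]:
    """Return the git commands that move a submodule pointer to `sha`.

    Hypothetical helper for illustration: it builds the commands and
    does not execute them.
    """
    return [
        # Fetch the submodule's upstream so the target commit exists locally.
        ["git", "-C", path, "fetch", "origin"],
        # Check out the exact commit the superproject should point at.
        ["git", "-C", path, "checkout", sha],
        # Stage the updated submodule pointer in the superproject.
        ["git", "add", path],
        # Commit with a Signed-off-by trailer (-s).
        ["git", "commit", "-s", "-m", message],
    ]


if __name__ == "__main__":
    # Placeholder values (assumptions); substitute a commit that is
    # already merged in the upstream repository.
    commands = submodule_bump_commands(
        "3rd-party/openpmix",
        "<upstream-commit-sha>",
        "3rd-party: bump openpmix submodule",
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Because the resulting commit records a SHA rather than local patches, a recursive clone of the branch should be able to resolve the submodule, provided that SHA stays reachable upstream. A SHA that lives only on a branch that is later deleted may stop resolving, which is why waiting for the upstream commit is the safer choice.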
I don't remember seeing this though.
Not familiar with that function, but I can try to take a look. If it involves |
Signed-off-by: Wenduo Wang <wenduwan@amazon.com>
@rhc54 IIRC you mentioned some WIP in pmix that might fix the create group failure. Is that something we can test?
AWS CI also failed. Many OMB/IMB benchmarks did not start.
Yes, the failures are due to a group construct issue. The MPI create-from-group(s) methods use the PMIx group construct/destruct methods.
Yeah, I mentioned this at the RM meeting earlier this week. It will take me a while to fix - got a lot going on right now. Issue isn't in PMIx, but rather in PRRTE.
Sorry, Ralph is correct. He said the issue was in PRRTE.
advance sha to e32e0179bc. related to open-mpi#12532 Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Closing in favor of #12565
Track upstream master branch