Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BTL_OFI_BLACKLIST #6734

Closed
wants to merge 2 commits into from

Conversation

guserav
Copy link
Contributor

@guserav guserav commented Jun 3, 2019

Don’t initialize OFI BTL if only PSM2 provider is available (override with MCA parameter)

Co-authored-by: jlbyrne-hpe <john.l.byrne@hpe.com>
Signed-off-by: guserav <erik.zeiske@web.de>
@ompiteam-bot
Copy link

Can one of the admins verify this patch?

@bwbarrett
Copy link
Member

ok to test

@bwbarrett
Copy link
Member

Can you add some color as to why the OFI BTL shouldn't use the psm2 provider? We should have a note either in the commit log or in a comment about why it is excluded.

I'm also not sure customers are going to love having to exclude both MTL and BTL providers. Maybe that's ok, but we should think about it.

Copy link
Member

@jsquyres jsquyres left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a user supplies --mca btl_ofi_provider_include BLAH on the command line? It does not look like you are checking the source of the include / exclude lists to see if the user supplied "include" but not "exclude" (and therefore the exclude list should be ignored), ...etc.

EDIT: Fixed mtl_ofi... from original text to be btl_ofi..., which is what I originally intended.

opal/mca/btl/ofi/btl_ofi_component.c Outdated Show resolved Hide resolved
opal/mca/btl/ofi/btl_ofi_component.c Outdated Show resolved Hide resolved
Mainly to satisfy the coding style.

Signed-off-by: guserav <erik.zeiske@web.de>
@guserav guserav force-pushed the btl_ofi_prov_exclude_default branch from 878fc75 to 36fa464 Compare June 4, 2019 17:04
@jlbyrne-hpe
Copy link

Can you add some color as to why the OFI BTL shouldn't use the psm2 provider? We should have a note either in the commit log or in a comment about why it is excluded.

There is no support in a release for one-sided using ofi. The OFI BTL was added to allow one-sided support for the 4.0.0 release and it was removed from the release during testing when some versions of psm2 broke things. See "OFI issues on Open MPI v4.0.0rc1 (Ralph H Castain) 20 Sep 2018" on the devel list.

The hope is to enable one-sided support for some providers without causing trouble for others without doing a lot more work. It is suggested in the email above that a common OFI component, but this is a lot more work. The goal of the patch is to enable one-sided support for some providers with minimal effort without breaking anything.

I'm also not sure customers are going to love having to exclude both MTL and BTL providers. Maybe that's ok, but we should think about it.

I don't love having separate MTL/BTL include/exclude parameters, but it may not be possible to avoid having two sets of parameters. My understanding is that enabling one-sided for psm2 negatively impacts two-sided performance. Given that the use of MPI one-sided is rare, we may need one-sided include/exclude parameters whatever we do. (See my final question below.)

What happens if a user supplies --mca mtl_ofi_provider_include BLAH on the command line? It does not look like you are checking the source of the include / exclude lists to see if the user supplied "include" but not "exclude" (and therefore the exclude list should be ignored), ...etc.

Despite the comment, the behavior is not exactly as with mtl. Including a provider will override it on the exclusion list, but the exclusion list is not just ignored as in mtl. And it was deliberately chosen not to make the include a list, but keep it as a single provider.

However, as I noted above, one-sided is rare. Maybe the correct approach to this is for BTL OFI to be disabled unless a provider is specified instead of trying to enable it automatically if ofi is present?

@jsquyres
Copy link
Member

jsquyres commented Jun 5, 2019

@jlbyrne-hpe Forgive me -- I mis-typed: I meant to say --mca btl_ofi_provider_include BLAH (btl, not mtl).

My concern is that the help messages that the include/exclude params are mutually exclusive, but the code does not treat them that way. Put differently: what happens if the user specifies both btl_ofi_provider_include and btl_ofi_provider_exclude? It's tremendously confusing, but it can happen (e.g., if someone sets one in a system-wide default and then tries to set the other on the command line). The intent for these include/exclude params is that it should only be permissible that one of them is set, and therefore it is obvious exactly what behavior the user should expect.

Make sense?

If MTL OFI exhibiting this same behavior, then it is also incorrect and should be fixed.

@jlbyrne-hpe
Copy link

@jsquyres Understood.

@hppritcha
Copy link
Member

bot:ompi:retest

@jsquyres
Copy link
Member

jsquyres commented Jun 10, 2019

@guserav @jlbyrne-hpe Let me take a step back and ask: what is the exact problem you are trying to solve here?

The title is "Add BTL_OFI_BLACKLIST", but there is no BTL_OF_BLACKLIST term or variable anywhere here. It's the btl_ofi_provider_exclude list. Also, the description of the PR ("Don’t initialize OFI BTL if only PSM2 provider is available (override with MCA parameter)") does not match what the contents of the PR do (i.e., you added a btl_ofi_provider_exclude MCA param and added a whole bunch of OFI providers in there by default).

Can you specify exactly what problem you are trying to solve? Is there something wrong with the psm2 OFI provider and/or the OFI BTL? If so, have you talked to Intel about it?

@jlbyrne-hpe
Copy link

@jsquyres

HPE worked together with an Intel intern and Howard Pritchard to develop the original BTL OFI which was tested on psm2, gni, and our in-progress zhpe provider. It was proposed and accepted into Open MPI and all seemed fine until the 4.0 release tests when it broke existing psm2 installations. I do not recall the exact issue, but the contention was that it the problems occurred only with certain versions of the Libfabric/OPA libraries, If this is true, the solution is to update things, but if you don't care about one-sided, why should you have to update things that are working? Of course, by this time, the intern had left and there were other priorities, so things were left as they were.

So the problem we are trying to solve is how to get an OIpen MPI release which will support one-sided for OFI providers without breaking existing working two-sided installations. It was suggested by Howard that an exclusion list wiith psm2 on it by default might be the easiest way forward.

The use of the term "blacklist" in the PR was perhaps an unfortunate choice of words.

@jsquyres
Copy link
Member

This seems like a big hammer to fix an OFI-version specific problem. It also restricts quite a bit more than just the psm2 OFI provider, which is neither what the PR title nor commit message says.

Given that your goal is to restrict behavior on Intel hardware, I think Intel needs to be involved in the discussion.

@jlbyrne-hpe
Copy link

@jsquyres Actually, given the existence of BTL OFI in the tree, I consider it to be a low-effort fix to enable a feature a few users will care about while not breaking things for the majority of users who don't. In the 4.0.x branch, a system with psm2 and Ethernet would perform one-sided over Ethernet; on master with this patch the expectation for psm2 will still use Ethernet by default, but it can be overridden and other providers shouldn't have to worry.

As to the other items on the exclusion list, this is the exclusion list from MTL OFI with psm2 added. In the absence of a common OFI component, a common OFI header would be useful to sharing things like this. We can certainly address any documentation issues you desire.

I do think that given the lack of use of one-sided, the best choice is off-by-default, though.

@hppritcha
Copy link
Member

Well we analyzed this problem extensively on a call today with Intel, HPE, and LANL participants and determined that this blacklist approach to working around PSM2 libfabric provider limitations will not work, at least not while the PSM2 MTL is active. The underlying issue is that using PSM2 through libfabric and directly doesn't work. Although it appears to be okay to call PSM2_Init more than once, the same is not true for PSM2_finalize.

what we are noticing is that the blacklist approach works if one selects the OFI MTL. It does not work if one goes with the default of PSM2 MTL. Even if one excludes the PSM2 provider in a blacklist for the OFI BTL, the very fact that one calls fi_getinfo and the PSM2 provider is present in the libfabric is sufficient to trigger the use of PSM2_finalize with libfabric. This results in messages like this:

hpp@gr0770[]:/usr/projects/hpctools/hpp/ompi/examples> (guserav-btl_ofi_prov_exclude_default *)mpirun -np 4 ./hello_c
Hey initializing psm2_init
Hey initializing psm2_init
Hey initializing psm2_init
Hey initializing psm2_init
Hello, world, I am 0 of 4, (Open MPI v4.1.0a1, package: Open MPI hpp@gr-fey1.lanl.gov Distribution, ident: 4.1.0a1, repo rev: v2.x-dev-7008-gf1b2a096, Unreleased developer copy, 149)
Hello, world, I am 3 of 4, (Open MPI v4.1.0a1, package: Open MPI hpp@gr-fey1.lanl.gov Distribution, ident: 4.1.0a1, repo rev: v2.x-dev-7008-gf1b2a096, Unreleased developer copy, 149)
Hello, world, I am 1 of 4, (Open MPI v4.1.0a1, package: Open MPI hpp@gr-fey1.lanl.gov Distribution, ident: 4.1.0a1, repo rev: v2.x-dev-7008-gf1b2a096, Unreleased developer copy, 149)
Hello, world, I am 2 of 4, (Open MPI v4.1.0a1, package: Open MPI hpp@gr-fey1.lanl.gov Distribution, ident: 4.1.0a1, repo rev: v2.x-dev-7008-gf1b2a096, Unreleased developer copy, 149)
psmx2 init count 1
HEY WERE SHUTTING DOWN
gr0770.localdomain.114696PSM2 has not been initialized
psmx2 init count 1
HEY WERE SHUTTING DOWN
gr0770.localdomain.114697PSM2 has not been initialized
psmx2 init count 1
HEY WERE SHUTTING DOWN
gr0770.localdomain.114695PSM2 has not been initialized
psmx2 init count 1
HEY WERE SHUTTING DOWN
gr0770.localdomain.114694PSM2 has not been initialized
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[13889,1],3]
  Exit code:    255
--------------------------------------------------------------------------

Although it might be useful to include a blacklist in the OFI BTL, it won't solve this PSM2 provider/PSM2 being used as the MTL problem. So closing this PR and open a new one when a better solution is found.

@hppritcha hppritcha closed this Jul 1, 2019
@BrendanCunningham
Copy link
Member

Built with ompi/master HEAD commit 5d51b23.

I needed to configure ompi/master with '--disable-picky' to build with the PR commits. I did not need to use '--disable-picky' without the PR commits. Is this expected?

@rhc54
Copy link
Contributor

rhc54 commented Jul 18, 2019

No - developers should never --disable-picky as we want code to compile cleanly. That is why we configure --enable-picky by default when we see you are in a git repo checkout and not a tarball.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants