comm/cid: use ibcast to distribute result in intercomm case #2061

Merged
merged 1 commit into open-mpi:master on Sep 7, 2016

Conversation

hjelmn
Member

@hjelmn hjelmn commented Sep 7, 2016

This commit updates the intercomm allgather to do a local comm bcast
as the final step. This should resolve a hang seen in intercomm
tests.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>

@hjelmn
Member Author

hjelmn commented Sep 7, 2016

@ggouaillardet This should resolve the hang. Can you verify?

@ggouaillardet
Contributor

Thanks, will try.

Btw, why do we iallreduce?
Could we ireduce with root=0 instead?

@ggouaillardet
Contributor

I think I understand better now ...
An ibcast on the intercomm basically cannot work since it is not bidirectional, hence the iallgatherv, but with scount=1 on the roots and scount=0 elsewhere, and rcounts[0]=1 and rcounts[i>0]=0 everywhere.
These settings were lost during the refactor.
Anyway, that was overkill, since an ibcast on the local comm is just what is needed.
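For illustration, here is a minimal sketch of the parameter pattern described above (not the removed code itself; the helper name, `intercomm`, and `remote_size` are placeholders):

```c
#include <mpi.h>

/* Sketch only: an iallgatherv over the inter-communicator in which only the
 * group leaders (local rank 0) contribute one element and every process keeps
 * just the remote leader's slot.  All names here are illustrative. */
static void leader_only_iallgatherv(const int *sendval, int *recvbuf,
                                    int remote_size, int local_rank,
                                    int *rcounts, int *disps,
                                    MPI_Comm intercomm, MPI_Request *req)
{
    int scount = (0 == local_rank) ? 1 : 0;   /* scount=1 on roots, 0 elsewhere */

    for (int i = 0; i < remote_size; ++i) {
        rcounts[i] = (0 == i) ? 1 : 0;        /* rcounts[0]=1, rcounts[i>0]=0 */
        disps[i]   = 0;
    }

    MPI_Iallgatherv(sendval, scount, MPI_INT,
                    recvbuf, rcounts, disps, MPI_INT, intercomm, req);
}
```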

I am confident ireduce can be used instead of iallreduce, so we only need to allocate tmpbuf on the roots.
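A sketch of that suggestion, assuming the reduce runs on the local (intra) communicator with the leader at rank 0; the MPI_MAX op and the function name are placeholders, and the temporary buffer is only allocated where the result lands:

```c
#include <mpi.h>
#include <stdlib.h>

/* Sketch only: nonblocking reduce to the local leader (rank 0), allocating
 * tmpbuf only on the root.  The recvbuf argument is ignored on non-root
 * ranks; the caller must wait on *req before reading tmpbuf. */
static int *ireduce_to_leader(const int *inbuf, int count,
                              MPI_Comm local_comm, MPI_Request *req)
{
    int rank;
    int *tmpbuf = NULL;

    MPI_Comm_rank(local_comm, &rank);
    if (0 == rank) {
        tmpbuf = malloc(count * sizeof(int));   /* result needed only on root */
    }

    MPI_Ireduce(inbuf, tmpbuf, count, MPI_INT, MPI_MAX, 0, local_comm, req);
    return tmpbuf;
}
```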

Also, the allreduce_fn functions take a count parameter, but they are only ever invoked with count=1, so we might want to simplify that too.

I will test all these tomorrow

@hjelmn
Member Author

hjelmn commented Sep 7, 2016

@ggouaillardet I agree. I think the count was used at some point in the past. I agree about the reduce vs allreduce. Will make that change.

@bosilca
Member

bosilca commented Sep 7, 2016

I don't think that using bcast on an inter-communicator is equivalent to using allgatherv. The result of this operation is that only one group will get the data.

@hjelmn
Member Author

hjelmn commented Sep 7, 2016

@bosilca We first do a reduce on the intercommunicator. This gives the leaders (rank 0 in each comm) the result from the other group. Then we exchange the info between the leaders and do the reduction. At this point both leaders have the same result. They then bcast to their local groups.
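For readers following along, a hedged blocking sketch of this pattern (placeholder names, not the actual comm_cid.c code, and MPI_MAX stands in for whichever operation is being agreed on):

```c
#include <mpi.h>

/* Hypothetical helper illustrating the agreement pattern described above.
 * `local_comm` is the intra-communicator kept alongside the inter-communicator
 * `intercomm`; both names are placeholders, not Open MPI internals. */
static int agree_max(int my_val, MPI_Comm local_comm, MPI_Comm intercomm)
{
    int local_rank, local_result = 0, remote_result = 0, agreed = 0;

    MPI_Comm_rank(local_comm, &local_rank);

    /* Step 1: reduce each group's contribution to its local leader (rank 0). */
    MPI_Reduce(&my_val, &local_result, 1, MPI_INT, MPI_MAX, 0, local_comm);

    if (0 == local_rank) {
        /* Step 2: the leaders exchange partial results over the
         * inter-communicator; the peer is rank 0 of the remote group. */
        MPI_Sendrecv(&local_result, 1, MPI_INT, 0, 0,
                     &remote_result, 1, MPI_INT, 0, 0,
                     intercomm, MPI_STATUS_IGNORE);
        /* Step 3: both leaders apply the same operation, so they agree. */
        agreed = local_result > remote_result ? local_result : remote_result;
    }

    /* Step 4: each leader broadcasts the agreed value within its own group. */
    MPI_Bcast(&agreed, 1, MPI_INT, 0, local_comm);
    return agreed;
}
```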

@hjelmn
Member Author

hjelmn commented Sep 7, 2016

@bosilca Just to be clear the ibcast is on the intra-communicator we keep with every inter-communicator.

@bosilca
Member

bosilca commented Sep 7, 2016

@hjelmn the bcast on an intercommunicator is not symmetric. Only one of the leaders (the process providing MPI_ROOT as the root) will broadcast its data to the remote group.
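For reference, a minimal sketch of the root semantics being described (the assumption that the root is rank 0 of its group is only for illustration):

```c
#include <mpi.h>

/* Sketch only: broadcast on an INTER-communicator.  Only the group that
 * contains the root sends; the other group receives. */
static void intercomm_bcast(int *buf, int in_root_group, int local_rank,
                            MPI_Comm intercomm)
{
    if (in_root_group) {
        /* The actual root passes MPI_ROOT; every other process in the root's
         * group passes MPI_PROC_NULL. */
        int root = (0 == local_rank) ? MPI_ROOT : MPI_PROC_NULL;
        MPI_Bcast(buf, 1, MPI_INT, root, intercomm);
    } else {
        /* The remote group passes the root's rank in the other group. */
        MPI_Bcast(buf, 1, MPI_INT, 0, intercomm);
    }
}
```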

@bosilca
Member

bosilca commented Sep 7, 2016

If the ibcast is done on an intra-communicator, then the result should be correct.

@bosilca
Member

bosilca commented Sep 7, 2016

If we have an intra-communicator, then why all the pain of dealing with the reduce on an inter-communicator?

@hjelmn
Member Author

hjelmn commented Sep 7, 2016

@bosilca Not sure why the code was so complicated before. Once we added the c_local_comm we should have updated the algorithm to do what this PR makes it do. This is much less complicated than we made it out to be :).

@hjelmn
Member Author

hjelmn commented Sep 7, 2016

@ggouaillardet Never mind about not using reduce. I forgot that I had already changed the initial reduce to be on the intra-communicator. Made the change from allreduce to reduce.

This commit updates the intercomm allgather to do a local comm bcast
as the final step. This should resolve a hang seen in intercomm
tests.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
@hjelmn
Member Author

hjelmn commented Sep 7, 2016

Might go ahead and merge this before 9pm EDT to get it into MTT tonight.

@hjelmn hjelmn merged commit 63d73a5 into open-mpi:master Sep 7, 2016
@ggouaillardet
Contributor

It works great for me, thanks!

@rhc54
Contributor

rhc54 commented Sep 8, 2016

I don't see any change in MTT, though - still a ton of hangs. However, the tests that are now hanging are comm_dup.

ggouaillardet added a commit that referenced this pull request Sep 9, 2016
use MPI_MIN instead of MPI_MAX when appropriate, otherwise
a currently used CID can be reused, and bad things will likely happen.

Refs #2061
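A hedged illustration of the reasoning in that commit (the flag and its meaning are assumptions for the example, not the actual comm_cid.c variables): when each process contributes a flag that is 1 only if a candidate CID is free locally, MPI_MIN accepts the CID only when it is free everywhere, whereas MPI_MAX would accept it as soon as any one process reports it free.

```c
#include <mpi.h>

/* Illustration only: agree that a candidate CID is unused on ALL ranks.
 * `free_locally` is a hypothetical flag: 1 = free on this rank, 0 = in use. */
static int cid_free_everywhere(int free_locally, MPI_Comm comm)
{
    int free_everywhere = 0;

    /* MPI_MIN yields 1 only if every rank reported 1; MPI_MAX here could
     * return 1 while the CID is still in use somewhere, letting a live CID
     * be reused. */
    MPI_Allreduce(&free_locally, &free_everywhere, 1, MPI_INT, MPI_MIN, comm);
    return free_everywhere;
}
```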
@ggouaillardet
Contributor

3b968ec fixes MPI_Comm_dup_c.
(That was a separate issue; if it did not occur earlier, it is probably because it was hidden by the bug fixed in this PR.)
I am now running some other tests from the intel_tests suite to see if I hit some more bugs.

@rhc54
Contributor

rhc54 commented Sep 9, 2016

Just wondering - are these some of the same problems we are seeing in 2.x? I don't know if these fixes need to go over or not.

@ggouaillardet
Contributor

These two issues came from the refactor, which was not brought to v2.x.
I'll check v2.x now; maybe something was backported that should not have been ...

@ggouaillardet
Contributor

Several hangs occur with the *_stride* collective tests from the ibm test suite. Strictly speaking, these are not hangs, since the tests complete in less than 10 minutes, but they take more than 3 minutes, so MTT considers them hangs.

That should be fixed by open-mpi/ompi-tests@53ad6f22e1f9ceda580a06bf8e00324a5f15c6c0
(I simply #ifdef'ed out sanity checks that are very time consuming since they access strided data.)

alltoallvw_zero from the ibm test suite (and maybe some more) will be fixed when open-mpi/ompi-release#1368 is merged.

bosilca pushed a commit to bosilca/ompi that referenced this pull request Sep 15, 2016
use MPI_MIN instead of MPI_MAX when appropriate, otherwise
a currently used CID can be reused, and bad things will likely happen.

Refs open-mpi#2061
hjelmn pushed a commit to hjelmn/ompi that referenced this pull request Oct 12, 2016
use MPI_MIN instead of MPI_MAX when appropriate, otherwise
a currently used CID can be reused, and bad things will likely happen.

Refs open-mpi#2061

(cherry picked from commit 3b968ec)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
hjelmn pushed a commit to hjelmn/ompi that referenced this pull request Nov 7, 2016
use MPI_MIN instead of MPI_MAX when appropriate, otherwise
a currently used CID can be reused, and bad things will likely happen.

Refs open-mpi#2061

(cherry picked from commit 3b968ec)
hjelmn pushed a commit to hjelmn/ompi that referenced this pull request Nov 7, 2016
use MPI_MIN instead of MPI_MAX when appropriate, otherwise
a currently used CID can be reused, and bad things will likely happen.

Refs open-mpi#2061

(cherry picked from commit 3b968ec)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>