Hang in clock_gettime() during Bcast #3445

Closed
junjieqian opened this issue May 4, 2017 · 8 comments
junjieqian commented May 4, 2017

Thank you for taking the time to submit an issue!

Background information

Open MPI hangs in clock_gettime() during Bcast().

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

v1.10.3

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Built from a distribution tarball.

Please describe the system on which you are running

  • Operating system/version: Ubuntu 14.04 in Docker container
  • Computer hardware: GPU/CPU
  • Network type: IB

Details of the problem

Open MPI hangs in clock_gettime(). It happens from time to time, and most of the affected jobs run on the same machine. The hang can last for hours or never resolve.

This issue is similar to #99, which seems to have been resolved.
The stack trace is as follows:

#0  0x00007fffa47f7b19 in clock_gettime ()
#1  0x00007f48297f485d in __GI___clock_gettime (clock_id=<optimized out>, tp=<optimized out>) at ../sysdeps/unix/clock_gettime.c:115
#2  0x00007f4817e61931 in opal_timer_base_get_usec_clock_gettime () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/libopen-pal.so.13
#3  0x00007f4817de1689 in opal_progress () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/libopen-pal.so.13
#4  0x00007f482a3153e5 in ompi_request_default_wait () from /usr/local/mpi/lib/libmpi.so.12
#5  0x00007f480df51990 in ompi_coll_tuned_bcast_intra_generic () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_coll_tuned.so
#6  0x00007f480df51e67 in ompi_coll_tuned_bcast_intra_binomial () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_coll_tuned.so
#7  0x00007f480df4676c in ompi_coll_tuned_bcast_intra_dec_fixed () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_coll_tuned.so
#8  0x00007f482a329700 in PMPI_Bcast () from /usr/local/mpi/lib/libmpi.so.12
#9  0x0000000000bd29d6 in ...::Bcast (this=this@entry=0x17a7a00, buffer=buffer@entry=0x7f47ccd4f710, count=64, datatype=0x7f482a5928a0 <ompi_mpi_char>,
    root=0) at .../MPIWrapper.cpp:853
jsquyres (Member) commented May 4, 2017

Do you know if the hang itself is in clock_gettime(), or is Open MPI simply calling clock_gettime() all the time? (we use clock_gettime() as part of our internal progress engine)

Does the problem happen in v1.10.6? Or v2.1.0?

junjieqian (Author) commented May 4, 2017

Hi @jsquyres, thank you very much for your attention and quick response! The hang appears to be inside Open MPI; I got additional stack traces from the other ranks, as follows.

#0  0x00007f7a6ecc390f in mca_btl_sm_component_progress () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_btl_sm.so
#1  0x00007f7a774c365a in opal_progress () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/libopen-pal.so.13
#2  0x00007f7a899f73e5 in ompi_request_default_wait () from /usr/local/mpi/lib/libmpi.so.12
#3  0x00007f7a6d633990 in ompi_coll_tuned_bcast_intra_generic () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_coll_tuned.so
#4  0x00007f7a6d633e67 in ompi_coll_tuned_bcast_intra_binomial () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_coll_tuned.so
#5  0x00007f7a6d62876c in ompi_coll_tuned_bcast_intra_dec_fixed () from /usr/local/openmpi-1.10.3-cuda-8.0/lib/openmpi/mca_coll_tuned.so
#6  0x00007f7a89a0b700 in PMPI_Bcast () from /usr/local/mpi/lib/libmpi.so.12
#7  0x0000000000bd29d6 in ...MPIWrapperMpi::Bcast (this=this@entry=0x22dfa00, buffer=buffer@entry=0x7f7a2c002190, count=64, datatype=0x7f7a89c748a0 <ompi_mpi_char>,
    root=0) at ...MPIWrapper.cpp:853

bosilca (Member) commented May 4, 2017

So the hang is not in clock_gettime, but in opal_progress, which waits for the completion of the requests generated during the bcast. How many processes are involved in your parallel application? Can you check the stacks of all processes to make sure they all reached the same MPI_Bcast?

junjieqian (Author) commented May 4, 2017

Hi @bosilca, there are 8 processes in total. Seven processes are stuck in clock_gettime(), and one is in mca_btl_sm_component_progress().

Can you check the stacks of all processes to make sure they all reached the same MPI_Bcast?

Do you mean there should be a barrier before calling MPI_Bcast? Currently, one rank does some extra work while the others go to MPI_Bcast directly; could that be the problem, given that the hang happens randomly?
The stack traces show that rank 0 (the one doing the extra work) is stuck in mca_btl_sm_component_progress(), while the other ranks are in clock_gettime().

bosilca (Member) commented May 4, 2017

They cannot be stuck in clock_gettime. When you stop a process, it may happen to be in clock_gettime at that instant, but that particular function does not block. In fact, if you look at the stack trace, you can see that you are in opal_progress, which loops around polling the network and calling clock_gettime until messages are received. Thus, I assume the culprit is that an expected message never arrives, so your process appears blocked in opal_progress (and therefore in clock_gettime).
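
To illustrate the point, here is a minimal, self-contained sketch of a polling progress loop. It is not Open MPI's actual implementation, and the helper functions are hypothetical stand-ins for transport polling and request state. A backtrace sampled from such a loop frequently lands in clock_gettime() even though nothing is blocked inside that call.

/* Minimal sketch, NOT Open MPI's code: a progress loop that spins,
 * polling the transports and reading the clock, until a request
 * completes. */
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-ins for BTL polling and request state. */
static int pending = 1000000;                         /* pretend work remains */
static bool poll_transports(void)     { return --pending % 3 == 0; }
static bool request_is_complete(void) { return pending <= 0; }

static void wait_for_completion(void)
{
    struct timespec now;
    while (!request_is_complete()) {
        poll_transports();                    /* shared memory, TCP, IB ...  */
        clock_gettime(CLOCK_MONOTONIC, &now); /* timestamp used for timeouts */
    }
}

int main(void)
{
    wait_for_completion();
    puts("request completed");
    return 0;
}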

There is no need for a barrier before the bcast, but all processes on the communicator on which the bcast is called must call MPI_Bcast. I just wanted to make sure this is indeed the case. What really matters is whether there is an MPI_Bcast on the stack trace, not which function the processes happen to be in when you sample them.
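
Concretely, the required pattern might look like the following standalone sketch (illustrative only, not the reporter's MPIWrapper code): every rank in the communicator calls MPI_Bcast with a matching root, count, and datatype, and no barrier is needed even though the root does extra work first.

/* Illustrative standalone example: all ranks in MPI_COMM_WORLD call
 * MPI_Bcast; the root may do extra work first and no barrier is needed.
 * Build with e.g. "mpicc bcast_demo.c -o bcast_demo" (file name is
 * arbitrary) and launch with mpirun. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank;
    char buffer[64] = "not yet set";

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        snprintf(buffer, sizeof(buffer), "hello from root");
        sleep(1);                 /* root does some extra work first */
    }

    /* Every rank on the communicator must make this matching call. */
    MPI_Bcast(buffer, 64, MPI_CHAR, 0, MPI_COMM_WORLD);

    printf("rank %d received: %s\n", rank, buffer);
    MPI_Finalize();
    return 0;
}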

junjieqian (Author) commented:

@bosilca, thank you for the explanation! I double-checked the stack traces of all the ranks, and they all show PMPI_Bcast () from /usr/local/mpi/lib/libmpi.so.12.

jsquyres (Member) commented May 4, 2017

I would encourage you to upgrade your version of Open MPI to at least the latest in the v1.10 series (i.e., v1.10.6) to see if this bug has already been fixed. If possible, you might want to upgrade to Open MPI v2.1.0.

ggouaillardet (Contributor) commented:

@junjieqian the hang could occur when MPI_Bcast internally tries to establish a btl/tcp connection, which could be blocked by a firewall since you are using Docker.
Can you double-check that no firewall is running in your containers?
Assuming your IP interface is eth0, what if you run

mpirun --mca btl_tcp_if_include eth0 --mca coll ^tuned ...
