OpenMPI Freezes during multi-threaded communication #10324
As an update, I've also tried Open MPI 5.0rc6 with the same results: it freezes in the same spot.
I cannot reproduce with main (v2.x-dev-9807-g0d126c6405) compiled in release mode. I increased the number of threads (THREAD_COUNT) up to 16 and ran each test 100 times. Everything went smoothly.
@bosilca Thanks for taking a look! On my end I have to let the test run for 20-100K iterations before it freezes. Each run takes a different amount of time to freeze, but it's usually within that range, or about 5-20 min of testing. Also, it only happens when running across three nodes that communicate over LAN/WAN, as opposed to using vader, sm, or some other same-node communication method. Thanks!
Let's try something else before we go to such testing lengths. Assuming you are using the OB1 PML with the tcp and self BTLs (this will force all non-local communications over TCP by preventing the sm BTL from being used), you should be able to dump the status of the sending and receiving queues. I suggest that once you hit the deadlock, you attach to your application with gdb and call mca_pml_ob1_dump(comm, 1), where comm is MPI_COMM_WORLD in your example (or maybe ompi_mpi_comm_world.comm). You should do this on all 3 of your processes and post the output (maybe as a gist).
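(As a sketch of that suggestion, with the PID as a placeholder, the gdb session could look like this:)

```
$ gdb -p <pid-of-hung-rank>
(gdb) call mca_pml_ob1_dump(ompi_mpi_comm_world.comm, 1)
(gdb) detach
```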
@bosilca Specifying those parameters […]. As a point of clarification on getting a dump of the queues: does that command need to be run on the main MPI thread, or can it also be called from one of the pthreads? As of right now I'm getting a SIGSEGV when calling the function on what I presume to be one of the pthreads my program created, with GDB attached to one of the processes.
You can call it in any context; gdb will execute it in its own context. I think the segfault is because you are missing the second argument to the call.
I tried mixing up the order of the arguments.
@bosilca Do you have any thoughts on next steps to debug this issue? Could the thread synchronization be causing problems? Maybe switching away from barriers? I can also trigger the issue and then give you SSH access to the node I'm testing on, if that's helpful. Thanks!
I am able to reproduce this issue on a single node when running with […]
With Open MPI built in debug mode, I get an assertion every couple of runs:
That's with […]
I can't replicate this on a setup with a single IP address. This, as well as the assert location reported by @devreal, might indicate that the issue is in the code handling multiple interfaces. To check this, let's make sure we restrict the BTL TCP flow to a single interface (127.0.0.1 on a single node). What do you get if you run […]?
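(A sketch of such a restricted run, assuming the loopback interface is named `lo` and using the `btl_tcp_if_include` MCA parameter together with the PML/BTL selection suggested above:)

```
mpirun -np 3 --mca pml ob1 --mca btl tcp,self --mca btl_tcp_if_include lo ./comm-test
```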
I tried to replicate on the same hardware as @devreal, but so far without success. I will talk with him tomorrow to see if he can share a setup where I can dig a little more into the reasons behind that abort (the abort signals that we are trying to open the connection between a pair of processes multiple times, clearly a bad thing).
I think @bosilca is right: I have two interfaces (one being the wifi and one being a docker device). If I a) run with […]
Thanks for all the help on this issue, everyone! I'm wondering if you (@devreal) could describe your setup a little bit more? Using the command below I still see the freeze happening. I re-compiled OMPI with […]
Thanks!
Here is my network config:
As I said earlier, excluding the […]
@devreal Interesting, I am not able to get the issue to go away even when restricting the network card usage as you described. Are you able to provide some more details about the operating system and machine (VM or physical) that you are running the test on? Maybe my environment is introducing some other conflict. Also, how often did you generally see the assertion error when running the test program? I've run it in my environment ~10 times and haven't seen any assertion errors yet. This is with MPI built using the […] Thanks!
The assertion only triggers in a scenario with multiple IP network interfaces; as soon as you restrict to a single interface you should never see the assert. Back to the test: I've run the test hundreds of times, and other members of the team did similar tests, but we were unable to make it deadlock. This might indicate we are chasing the wrong lead and the problem is not in the communication layer. Going back to your original bug description, it was mentioned that the 3rd process was sitting idle. That's not something OMPI does in any MPI call, so maybe we can get some info from there. Can you please post the stack trace of the last process?
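(A non-invasive way to capture that, sketched here with a placeholder PID, is to batch-run gdb against the idle rank and dump all thread backtraces:)

```
$ gdb -p <pid-of-idle-rank> -batch -ex "thread apply all bt"
```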
@IanSaucy Sorry for the 2-year wait. Are you still working on this?
It looks like this issue is expecting a response, but hasn't gotten one yet. If there are no responses in the next 2 weeks, we'll assume that the issue has been abandoned and will close it.
It looks like this issue is expecting a response, but hasn't gotten one yet. If there are no responses in the next 2 weeks, we'll assume that the issue has been abandoned and will close it.
Per the above comment, it has been a month with no reply on this issue, so it looks like it has been abandoned. I'm going to close it. If I'm wrong and this issue is not abandoned, please feel free to re-open it. Thank you!
Background information
I am working on a project where we use multiple pthreads within a single MPI process, allowing each pthread to communicate through MPI. We have noticed that OMPI sometimes freezes, causing a single thread to spin at 100% CPU utilization while no forward progress is made. We were able to build a minimal example that demonstrates the issue. Unfortunately it is still non-deterministic and takes some run time to trigger.
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
We're using Open MPI v4.1.2, installed from a source tarball.
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
We have the following version of GCC:
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Please describe the system on which you are running
Details of the problem
We have three machines running on AWS (m6a.8xlarge, 64 cores), all connected via SSH. Our code runs as three MPI processes, one on each machine. Each machine runs a configurable number of threads; our test case uses 8. These threads communicate with their respective threads on the other nodes in a ring pattern, and they synchronize their work using pthread barriers to keep our virtual work units in step. Each thread communicates with its counterpart threads on the other nodes using message tags, as sketched below.
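(The full test program is attached further below; purely as a reference for the pattern just described, a minimal sketch could look like the following. THREAD_COUNT, worker, and the iteration count are illustrative, not taken from the original code.)

```c
#include <mpi.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define THREAD_COUNT 8   /* illustrative; the original test also used 8 */

static int rank, size;
static pthread_barrier_t barrier;

/* Each pthread exchanges a message with its counterpart thread on the
 * neighboring ranks in a ring, using its thread index as the tag, then
 * waits at a barrier so all threads of this process stay in step. */
static void *worker(void *arg)
{
    int tid = (int)(intptr_t)arg;
    int next = (rank + 1) % size;
    int prev = (rank + size - 1) % size;
    int sendbuf = rank, recvbuf = -1;

    for (int iter = 0; iter < 100000; iter++) {
        MPI_Sendrecv(&sendbuf, 1, MPI_INT, next, tid,
                     &recvbuf, 1, MPI_INT, prev, tid,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        pthread_barrier_wait(&barrier);
    }
    return NULL;
}

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not provided\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    pthread_t threads[THREAD_COUNT];
    pthread_barrier_init(&barrier, NULL, THREAD_COUNT);
    for (int i = 0; i < THREAD_COUNT; i++)
        pthread_create(&threads[i], NULL, worker, (void *)(intptr_t)i);
    for (int i = 0; i < THREAD_COUNT; i++)
        pthread_join(threads[i], NULL);
    pthread_barrier_destroy(&barrier);

    MPI_Finalize();
    return 0;
}
```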
Typically the issue arises within about 5 min, though sometimes closer to 30 min, at which point the entire program freezes and never makes progress. We have tried running the same test co-located on a single node, but the issue does not seem to arise there; we only see it when running on multiple machines.
This is the specific command to run our program:
mpirun -np 3 --host one,two,three ./comm-test
The typical behavior is that two PIDs are pegged at 100% CPU while the others sit idle. I've attached the backtraces of the two processes pegged at 100% CPU and of one that sits idle. The backtraces are identical across all nodes.
We've also verified that Open MPI supports multiple threads via the following:
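(For reference, one way to perform this check, assuming `ompi_info` from the same installation is on the PATH, is:)

```
$ ompi_info | grep -i thread
```

and confirm that MPI_THREAD_MULTIPLE is reported as supported.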
Thanks in advance for any help.
Thread 1 at 100% CPU
Thread 2 at 100% CPU
The other threads that are not consuming CPU time have the following backtrace:
Below is our minimal example of the issue:
Makefile: