Hang when using polling fd #1492

Closed
artpol84 opened this issue May 10, 2017 · 8 comments

@artpol84
Contributor

I'm seeing hangs when using my custom-made latency test that was derived from ucp_hello_world.c:
https://github.com/artpol84/poc/tree/ucx_hang_demo/ucx/latency

The command line is the same; here is how to reproduce the issue:

server: ./launch.sh ./ucp_latency -f
client: ./launch.sh ./ucp_latency -f -n vegas06 -p 13337

Since the communication is symmetric, the hang may occur on either side.
In one particular case, the sender side is not hung, but is waiting for the message send to complete:

(gdb) bt
#0  0x00007ffff6516213 in pthread_spin_lock () from /usr/lib64/libpthread.so.0
#1  0x00007ffff562b9e6 in mlx5_poll_cq () from /usr/lib64/libmlx5-rdmav2.so
#2  0x00007ffff72a0e33 in ibv_poll_cq (wc=0x7fffffffd888, num_entries=<optimized out>, cq=<optimized out>)
    at /usr/include/infiniband/verbs.h:1360
#3  uct_ib_poll_cq (wcs=0x7fffffffd888, count=<synthetic pointer>, cq=<optimized out>)
    at /hpc/local/benchmarks/hpcx_install_Tuesday/src/hpcx-gcc-redhat7.2/ucx-master/src/uct/ib/base/ib_device.h:270
#4  uct_ud_verbs_iface_poll_rx (is_async=0, iface=0x6bd020) at ib/ud/verbs/ud_verbs.c:320
#5  uct_ud_verbs_iface_progress (arg=0x6bd020) at ib/ud/verbs/ud_verbs.c:368
#6  0x00007ffff727d68e in ucs_callbackq_dispatch (cbq=0x648f18, cbq=0x648f18)
    at /hpc/local/benchmarks/hpcx_install_Tuesday/src/hpcx-gcc-redhat7.2/ucx-master/src/ucs/datastruct/callbackq.inl:39
#7  uct_worker_progress (worker=0x648f10) at base/uct_md.c:233
#8  0x00007ffff7bc71dd in ucp_worker_progress (worker=0x656030) at core/ucp_worker.c:719
#9  0x0000000000401a77 in wait (ucp_worker=0x656030, context=0x6e1748) at ucp_latency.c:123
#10 0x0000000000401ec3 in send_msg (ucp_worker=0x656030, server_ep=0x611280, msg_len=262144) at ucp_latency.c:284
#11 0x0000000000401f35 in pong_ping (ucp_worker=0x656030, ep=0x611280) at ucp_latency.c:301
#12 0x000000000040215e in run_ucx_client (ucp_worker=0x656030) at ucp_latency.c:372
#13 0x0000000000402553 in run_test (server=0x7fffffffe34e "vegas06", ucp_worker=0x656030) at ucp_latency.c:486
#14 0x00000000004029e4 in main (argc=6, argv=0x7fffffffdfe8) at ucp_latency.c:617

Receiver side (the one that actually hangs):

(gdb) bt
#0  0x00007ffff75bfd13 in epoll_wait () from /usr/lib64/libc.so.6
#1  0x0000000000401b5d in test_poll_wait (ucp_worker=0x61f2d0) at ucp_latency.c:161
#2  0x0000000000401bd4 in recv_msg (ucp_worker=0x61f2d0, server_ep=0x6248e0) at ucp_latency.c:194
#3  0x0000000000401f73 in ping_pong (ucp_worker=0x61f2d0, ep=0x6248e0, len=262144) at ucp_latency.c:307
#4  0x000000000040247c in run_ucx_server (ucp_worker=0x61f2d0) at ucp_latency.c:470
#5  0x0000000000402561 in run_test (server=0x0, ucp_worker=0x61f2d0) at ucp_latency.c:488
#6  0x00000000004029e4 in main (argc=2, argv=0x7fffffffe018) at ucp_latency.c:617

If I manually interrupt the poll, fake the return value, and force execution to proceed to processing, I can see that a receive event has actually occurred, but the poll wasn't interrupted for some reason (or *_arm didn't capture the already-pending event):

(gdb) fini
Run till exit from #0  0x00007ffff75bfd13 in epoll_wait () from /usr/lib64/libc.so.6
0x0000000000401b5d in test_poll_wait (ucp_worker=0x61f2d0) at ucp_latency.c:161
161             ret = epoll_wait(epoll_fd_local, &ev, 1, -1);
(gdb) n
162         } while ((ret == -1) && (errno == EINTR));
(gdb) set ret = 0
(gdb) n
164         ret = UCS_OK;
(gdb)
167         close(epoll_fd_local);
(gdb)
169         return ret;
(gdb)
170     }
(gdb)
recv_msg (ucp_worker=0x61f2d0, server_ep=0x6248e0) at ucp_latency.c:195
195                 if (status != UCS_OK) {
(gdb)
201             ucp_worker_progress(ucp_worker);
(gdb)
204             msg_tag = ucp_tag_probe_nb(ucp_worker, tag, tag_mask, 1, &info_tag);
(gdb)
205         } while (msg_tag == NULL);
(gdb)
207         if( same_buf ) {
(gdb) p msg_tag
$1 = (ucp_tag_message_h) 0x7ffff49ed2a9

If I introduce an artificial delay on the sender side:
https://github.com/artpol84/poc/blob/ucx_hang_demo/ucx/latency/ucp_latency.c#L270
the hang goes away.

@shamisp shamisp added this to the v1.2 - release milestone May 10, 2017
@yosefe yosefe added the Bug label May 11, 2017
@evgeny-leksikov
Contributor

Since it reproduces only for large messages, it looks like a side effect of the RNDV protocol: the receiver catches only the event for the RTS packet, which is processed earlier than the UCX request is completed.

@yosefe yosefe modified the milestones: v1.2 - release, v1.3 May 15, 2017
evgeny-leksikov pushed a commit to evgeny-leksikov/ucx that referenced this issue Jun 2, 2017
evgeny-leksikov added a commit to evgeny-leksikov/ucx that referenced this issue Jun 2, 2017
evgeny-leksikov added a commit to evgeny-leksikov/ucx that referenced this issue Jun 4, 2017
evgeny-leksikov added a commit to evgeny-leksikov/ucx that referenced this issue Jun 4, 2017
@yosefe
Contributor

yosefe commented Jun 17, 2017

@artpol84 @evgeny-leksikov the problem is with the test (which was modeled after the buggy ucp_hello_world example):
the last thing done before going to sleep should be a ucp_tag_probe_nb() call. Otherwise, a message might already have been received by ucp_worker_progress(), but the test would still go to sleep.
See the documentation for ucp_worker_arm(), particularly that "check_for_events" is done before arming the worker and going to sleep.
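For reference, here is a minimal sketch of the loop shape being described. It assumes the worker's event fd (obtained via ucp_worker_get_efd()) has already been registered with epoll_fd; the helper name wait_for_tag and its parameters are hypothetical, not taken from the example code:

#include <ucp/api/ucp.h>
#include <sys/epoll.h>
#include <errno.h>

/* Hypothetical helper: block until a message matching tag/tag_mask can be
 * probed. Assumes epoll_fd already watches the fd returned by
 * ucp_worker_get_efd(). */
static ucp_tag_message_h wait_for_tag(ucp_worker_h worker, int epoll_fd,
                                      ucp_tag_t tag, ucp_tag_t tag_mask,
                                      ucp_tag_recv_info_t *info)
{
    ucp_tag_message_h msg;
    struct epoll_event ev;
    ucs_status_t status;
    int ret;

    for (;;) {
        ucp_worker_progress(worker);

        /* Probe *before* sleeping: the message may already be sitting on
         * the unexpected queue, in which case no further event will ever
         * wake us up. */
        msg = ucp_tag_probe_nb(worker, tag, tag_mask, 1, info);
        if (msg != NULL) {
            return msg;
        }

        /* Arm the worker. UCS_ERR_BUSY means events are pending, so we
         * must progress and probe again instead of sleeping. */
        status = ucp_worker_arm(worker);
        if (status == UCS_ERR_BUSY) {
            continue;
        } else if (status != UCS_OK) {
            return NULL; /* real code should handle the error */
        }

        /* Only now is it safe to block. */
        do {
            ret = epoll_wait(epoll_fd, &ev, 1, -1);
        } while ((ret == -1) && (errno == EINTR));
    }
}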

@artpol84
Contributor Author

@yosefe I'm not sure I understand 100%. Will the buggy hello world example be fixed, so that I can see the fix in the code?

@yosefe
Contributor

yosefe commented Jun 18, 2017

Yes, the fix is to probe for the message before going to sleep.

@artpol84
Contributor Author

Going to sleep = calling epoll?
So we call the probe in between arming and epoll?

@yosefe
Contributor

yosefe commented Jun 18, 2017

@artpol84 yes, the probe needs to be called before epoll; it can be before or after arm, as long as a successful probe at that point means we do not go to sleep.

@artpol84
Contributor Author

I see what you are saying. In the original ucp_hello_world.c example, the do-while loop skips the very first probe:
https://github.com/openucx/ucx/blob/master/test/examples/ucp_hello_world.c#L195:L212
Indeed, the same is true in my latency test (see the sketch below). Thanks for pointing it out.
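To make the difference concrete, here is a paraphrase of the two loop shapes (test_poll_wait() being the arm + epoll_wait helper from the backtraces above; tag, tag_mask, and info_tag as in the example):

/* Buggy shape: the first iteration sleeps before any probe, so a message
 * that already arrived during an earlier ucp_worker_progress() call is
 * never noticed. */
do {
    test_poll_wait(ucp_worker);       /* arm + epoll_wait: may block forever */
    ucp_worker_progress(ucp_worker);
    msg_tag = ucp_tag_probe_nb(ucp_worker, tag, tag_mask, 1, &info_tag);
} while (msg_tag == NULL);

/* Fixed shape: probe first, and sleep only when the probe came up empty. */
for (;;) {
    msg_tag = ucp_tag_probe_nb(ucp_worker, tag, tag_mask, 1, &info_tag);
    if (msg_tag != NULL) {
        break;
    }
    test_poll_wait(ucp_worker);
    ucp_worker_progress(ucp_worker);
}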

The reason I asked twice was that I had the impression the probe should be done before arm (at least that made more sense to me), while from your original note I got the feeling that it was mandatory to call the probe after arm. Now I see that my understanding was correct.

@yosefe
Contributor

yosefe commented Jun 18, 2017

@artpol84 exactly. Just to stress the problem: the wait() for the send request may call ucp_worker_progress(), which might already pick up a receive completion and put the message on the unexpected queue. Then we go to sleep and never wake up (since the message was already received).
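To illustrate, a hypothetical interleaving on the side that ends up hanging (function names taken from the backtraces above):

wait(send_request)
  ucp_worker_progress()    <--  consumes the incoming receive completion;
                                the message lands on the unexpected queue
recv_msg()
  test_poll_wait()
    ucp_worker_arm()            no further event will be generated
    epoll_wait(..., -1)         blocks forever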

@yosefe yosefe closed this as completed Jun 19, 2017
artpol84 added a commit to artpol84/poc that referenced this issue Jun 29, 2017