segv on the sendself test with cuda #2316

Closed
alinask opened this issue Feb 15, 2018 · 3 comments

alinask commented Feb 15, 2018

The sendself test segfaults (SIGSEGV) when run on the vulcan hosts with CUDA enabled.

Nodes: vulcan x2 (ppn=48(x2), nodelist=vulcan[03-04])
Module: hpcx-gcc-cuda

Command lines to reproduce (three variants that differ in UCX_TLS: rc_x, dc with UCX_IB_SL=1, and rc):

/hpc/local/benchmarks/hpcx_install_2018-02-14/hpcx-gcc-redhat7.4/ompi-v3.1.x/bin/mpirun -np 96 --display-map -mca btl self --tag-output --timestamp-output -mca pml ucx -mca coll '^hcoll' --bind-to hwthread -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_IB_GID_INDEX=0 -x UCX_IB_REG_METHODS=rcache,direct -x UCX_TLS=rc_x,cuda_copy,gdr_copy -mca opal_pmix_base_async_modex 0 -mca mpi_add_procs_cutoff 100000 --map-by node /hpc/mtr_scrap/users/mtt/scratch/ucx_ompi/20180214_233132_9872_141709_vulcan03/installs/lVNE/tests/mpich_tests/mpich-mellanox.git/test/mpi/pt2pt/sendself

/hpc/local/benchmarks/hpcx_install_2018-02-14/hpcx-gcc-redhat7.4/ompi-v3.1.x/bin/mpirun -np 96 --display-map -mca btl self --tag-output --timestamp-output -mca pml ucx -mca coll '^hcoll' --bind-to hwthread -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_IB_GID_INDEX=0 -x UCX_IB_REG_METHODS=rcache,direct -x UCX_TLS=dc,cuda_copy,gdr_copy -x UCX_IB_SL=1 -mca opal_pmix_base_async_modex 0 -mca mpi_add_procs_cutoff 100000 --map-by node /hpc/mtr_scrap/users/mtt/scratch/ucx_ompi/20180214_233132_9872_141709_vulcan03/installs/lVNE/tests/mpich_tests/mpich-mellanox.git/test/mpi/pt2pt/sendself

/hpc/local/benchmarks/hpcx_install_2018-02-14/hpcx-gcc-redhat7.4/ompi-v3.1.x/bin/mpirun -np 96 --display-map -mca btl self --tag-output --timestamp-output -mca pml ucx -mca coll '^hcoll' --bind-to hwthread -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_IB_GID_INDEX=0 -x UCX_IB_REG_METHODS=rcache,direct -x UCX_TLS=rc,cuda_copy,gdr_copy -mca opal_pmix_base_async_modex 0 -mca mpi_add_procs_cutoff 100000 --map-by node /hpc/mtr_scrap/users/mtt/scratch/ucx_ompi/20180214_233132_9872_141709_vulcan03/installs/lVNE/tests/mpich_tests/mpich-mellanox.git/test/mpi/pt2pt/sendself
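
The commands above pass --tag-output and --timestamp-output, so every output line carries a timestamp and a [job,rank] tag, and lines from different ranks interleave (two tagged records can even end up fused on one physical line). The following is a minimal sketch, not part of the original report, for regrouping such output per rank and listing each rank's fault address. The helper names `group_by_rank` and `fault_addresses` are hypothetical, and the regular expressions assume exactly the prefix and "Caught signal" formats seen in this log.

```python
import re
from collections import defaultdict

# Prefix as it appears in this log with --timestamp-output and --tag-output,
# e.g. "Thu Feb 15 07:24:35 2018[1,9]<stderr>:" (format assumed from the log).
PREFIX = re.compile(
    r"[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d \d{4}"
    r"\[(\d+),(\d+)\]<(stdout|stderr)>:"
)

# UCX signal-handler line, e.g.
# "Caught signal 11 (Segmentation fault: ... at address 0x3)".
FAULT = re.compile(r"Caught signal (\d+) \(.*at address (0x[0-9a-f]+)\)")


def group_by_rank(text):
    """Return {rank: [payload, ...]}, keeping each rank's lines in log order.

    A physical line may hold several tagged records; each one is split out
    and assigned to the rank named in its own prefix.
    """
    by_rank = defaultdict(list)
    for raw in text.splitlines():
        matches = list(PREFIX.finditer(raw))
        for i, m in enumerate(matches):
            end = matches[i + 1].start() if i + 1 < len(matches) else len(raw)
            by_rank[int(m.group(2))].append(raw[m.end():end])
    return dict(by_rank)


def fault_addresses(by_rank):
    """Return {rank: faulting address string} for ranks that reported a signal."""
    faults = {}
    for rank, lines in by_rank.items():
        for line in lines:
            m = FAULT.search(line)
            if m:
                faults[rank] = m.group(2)
    return faults
```

Passing the captured mpirun output as one string to `group_by_rank` yields one contiguous list of lines per rank, which makes it easier to read each rank's source snippet and backtrace together.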

Thu Feb 15 07:24:35 2018[1,9]<stderr>:[vulcan04:12958:0:12958] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1000000010)
Thu Feb 15 07:24:35 2018[1,3]<stderr>:[vulcan04:12955:0:12955] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1000000010)
Thu Feb 15 07:24:35 2018[1,9]<stderr>:
Thu Feb 15 07:24:35 2018[1,9]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:35 2018[1,9]<stderr>:      ...
Thu Feb 15 07:24:35 2018[1,9]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:35 2018[1,9]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:35 2018[1,9]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:35 2018[1,9]<stderr>:==>   511 }
Thu Feb 15 07:24:35 2018[1,9]<stderr>:      512 
Thu Feb 15 07:24:35 2018[1,9]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:35 2018[1,9]<stderr>:      514 {
Thu Feb 15 07:24:35 2018[1,9]<stderr>:
Thu Feb 15 07:24:35 2018[1,21]<stderr>:[vulcan04:12967:0:12967] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1000000010)
Thu Feb 15 07:24:35 2018[1,3]<stderr>:
Thu Feb 15 07:24:35 2018[1,3]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:35 2018[1,3]<stderr>:      ...
Thu Feb 15 07:24:35 2018[1,3]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:35 2018[1,3]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:35 2018[1,3]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:35 2018[1,3]<stderr>:==>   511 }
Thu Feb 15 07:24:35 2018[1,3]<stderr>:      512 
Thu Feb 15 07:24:35 2018[1,3]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:35 2018[1,3]<stderr>:      514 {
Thu Feb 15 07:24:35 2018[1,3]<stderr>:
Thu Feb 15 07:24:35 2018[1,9]<stderr>:==== backtrace ====
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:35 2018[1,9]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:35 2018[1,9]<stderr>:===================
Thu Feb 15 07:24:35 2018[1,21]<stderr>:
Thu Feb 15 07:24:35 2018[1,21]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:35 2018[1,21]<stderr>:      ...
Thu Feb 15 07:24:35 2018[1,21]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:35 2018[1,21]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:35 2018[1,21]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:35 2018[1,21]<stderr>:==>   511 }
Thu Feb 15 07:24:35 2018[1,21]<stderr>:      512 
Thu Feb 15 07:24:35 2018[1,21]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:35 2018[1,21]<stderr>:      514 {
Thu Feb 15 07:24:35 2018[1,21]<stderr>:
Thu Feb 15 07:24:35 2018[1,3]<stderr>:==== backtrace ====
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:35 2018[1,3]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:35 2018[1,3]<stderr>:===================
Thu Feb 15 07:24:35 2018[1,21]<stderr>:==== backtrace ====
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:35 2018[1,21]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:35 2018[1,21]<stderr>:===================
Thu Feb 15 07:24:35 2018[1,25]<stderr>:[vulcan04:12975:0:12975] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1000000010)
Thu Feb 15 07:24:35 2018[1,25]<stderr>:
Thu Feb 15 07:24:35 2018[1,25]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:35 2018[1,25]<stderr>:      ...
Thu Feb 15 07:24:35 2018[1,25]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:35 2018[1,25]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:35 2018[1,25]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:35 2018[1,25]<stderr>:==>   511 }
Thu Feb 15 07:24:35 2018[1,25]<stderr>:      512 
Thu Feb 15 07:24:35 2018[1,25]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:35 2018[1,25]<stderr>:      514 {
Thu Feb 15 07:24:35 2018[1,25]<stderr>:
Thu Feb 15 07:24:35 2018[1,25]<stderr>:==== backtrace ====
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:35 2018[1,25]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:35 2018[1,25]<stderr>:===================
Thu Feb 15 07:24:36 2018[1,38]<stderr>:[vulcan03:3923 :0:3923] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x3)
Thu Feb 15 07:24:36 2018[1,38]<stderr>:
Thu Feb 15 07:24:36 2018[1,38]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:36 2018[1,38]<stderr>:      ...
Thu Feb 15 07:24:36 2018[1,38]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:36 2018[1,38]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:36 2018[1,38]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:36 2018[1,38]<stderr>:==>   511 }
Thu Feb 15 07:24:36 2018[1,38]<stderr>:      512 
Thu Feb 15 07:24:36 2018[1,38]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:36 2018[1,38]<stderr>:      514 {
Thu Feb 15 07:24:36 2018[1,38]<stderr>:
Thu Feb 15 07:24:37 2018[1,38]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,38]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,38]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,65]<stderr>:[vulcan04:13068:0:13068] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1b933c)
Thu Feb 15 07:24:37 2018[1,65]<stderr>:
Thu Feb 15 07:24:37 2018[1,65]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,65]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,65]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,65]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,65]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,65]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,65]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,65]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,65]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,65]<stderr>:
Thu Feb 15 07:24:37 2018[1,79]<stderr>:[vulcan04:13087:0:13087] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xf4f83)
Thu Feb 15 07:24:37 2018[1,77]<stderr>:[vulcan04:13084:0:13084] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1d9ccc)
Thu Feb 15 07:24:37 2018[1,65]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,65]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,65]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,81]<stderr>:[vulcan04:13090:0:13090] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xeb213)
Thu Feb 15 07:24:37 2018[1,51]<stderr>:[vulcan04:13042:0:13042] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xcdfc5)
Thu Feb 15 07:24:37 2018[1,75]<stderr>:[vulcan04:13081:0:13081] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1c5c2c)
Thu Feb 15 07:24:37 2018[1,53]<stderr>:[vulcan04:13048:0:13048] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x41bd0)
Thu Feb 15 07:24:37 2018[1,39]<stderr>:[vulcan04:13004:0:13004] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1000000010)
Thu Feb 15 07:24:37 2018[1,61]<stderr>:[vulcan04:13062:0:13062] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x37239)
Thu Feb 15 07:24:37 2018[1,41]<stderr>:[vulcan04:13005:0:13005] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1000000010)
Thu Feb 15 07:24:37 2018[1,83]<stderr>:[vulcan04:13092:0:13092] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x169f2a)
Thu Feb 15 07:24:37 2018[1,63]<stderr>:[vulcan04:13063:0:13063] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x20f58c)
Thu Feb 15 07:24:37 2018[1,77]<stderr>:
Thu Feb 15 07:24:37 2018[1,77]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,77]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,79]<stderr>:
Thu Feb 15 07:24:37 2018[1,79]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,79]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,79]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,79]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,79]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,77]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,77]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,77]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,77]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,77]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,77]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,77]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,77]<stderr>:
Thu Feb 15 07:24:37 2018[1,79]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,79]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,79]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,79]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,79]<stderr>:
Thu Feb 15 07:24:37 2018[1,81]<stderr>:
Thu Feb 15 07:24:37 2018[1,81]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,81]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,81]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,81]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,81]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,81]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,81]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,81]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,81]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,81]<stderr>:
Thu Feb 15 07:24:37 2018[1,75]<stderr>:
Thu Feb 15 07:24:37 2018[1,75]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,75]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,51]<stderr>:
Thu Feb 15 07:24:37 2018[1,51]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,51]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,51]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,51]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,51]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,51]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,51]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,51]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,51]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,51]<stderr>:
Thu Feb 15 07:24:37 2018[1,75]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,75]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,75]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,75]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,75]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,75]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,75]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,75]<stderr>:
Thu Feb 15 07:24:37 2018[1,53]<stderr>:
Thu Feb 15 07:24:37 2018[1,53]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,53]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,53]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,53]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,53]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,53]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,53]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,53]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,53]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,53]<stderr>:
Thu Feb 15 07:24:37 2018[1,39]<stderr>:
Thu Feb 15 07:24:37 2018[1,39]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:37 2018[1,39]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,39]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:37 2018[1,39]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:37 2018[1,39]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:37 2018[1,39]<stderr>:==>   511 }
Thu Feb 15 07:24:37 2018[1,39]<stderr>:      512 
Thu Feb 15 07:24:37 2018[1,39]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,39]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,39]<stderr>:
Thu Feb 15 07:24:37 2018[1,41]<stderr>:
Thu Feb 15 07:24:37 2018[1,41]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:37 2018[1,41]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,41]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:37 2018[1,41]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:37 2018[1,41]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:37 2018[1,41]<stderr>:==>   511 }
Thu Feb 15 07:24:37 2018[1,41]<stderr>:      512 
Thu Feb 15 07:24:37 2018[1,41]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,41]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,41]<stderr>:
Thu Feb 15 07:24:37 2018[1,63]<stderr>:
Thu Feb 15 07:24:37 2018[1,63]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,63]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,61]<stderr>:
Thu Feb 15 07:24:37 2018[1,61]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,61]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,63]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,63]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,63]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,61]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,61]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,61]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,61]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,61]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,61]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,61]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,63]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,63]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,63]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,63]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,63]<stderr>:
Thu Feb 15 07:24:37 2018[1,61]<stderr>:
Thu Feb 15 07:24:37 2018[1,51]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,51]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,51]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,75]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,75]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,75]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,53]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,53]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,53]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,77]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,77]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,77]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,79]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,79]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,79]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,39]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,39]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,39]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,41]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,41]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,41]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,81]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,81]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,81]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,83]<stderr>:
Thu Feb 15 07:24:37 2018[1,83]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,83]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,83]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,83]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,83]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,83]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,83]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,83]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,83]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,83]<stderr>:
Thu Feb 15 07:24:37 2018[1,63]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,63]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,63]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,61]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,61]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,61]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,83]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,83]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,83]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,64]<stderr>:[vulcan03:3979 :0:3979] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x100)
Thu Feb 15 07:24:37 2018[1,4]<stderr>:[vulcan03:3882 :0:3882] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x3)
Thu Feb 15 07:24:37 2018[1,64]<stderr>:
Thu Feb 15 07:24:37 2018[1,64]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_test() ]
Thu Feb 15 07:24:37 2018[1,64]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,64]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,64]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,64]<stderr>:      515     return uct_ep->iface->ops.ep_destroy ==
Thu Feb 15 07:24:37 2018[1,64]<stderr>:==>   516                     UCS_CLASS_DELETE_FUNC_NAME(ucp_wireup_ep_t);
Thu Feb 15 07:24:37 2018[1,64]<stderr>:      517 }
Thu Feb 15 07:24:37 2018[1,64]<stderr>:      518 
Thu Feb 15 07:24:37 2018[1,64]<stderr>:      519 int ucp_wireup_ep_is_owner(uct_ep_h uct_ep, uct_ep_h owned_ep)
Thu Feb 15 07:24:37 2018[1,64]<stderr>:
Thu Feb 15 07:24:37 2018[1,42]<stderr>:[vulcan03:3926 :0:3926] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x3)
Thu Feb 15 07:24:37 2018[1,6]<stderr>:[vulcan03:3883 :0:3883] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x3)
Thu Feb 15 07:24:37 2018[1,32]<stderr>:[vulcan03:3915 :0:3915] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x3)
Thu Feb 15 07:24:37 2018[1,4]<stderr>:
Thu Feb 15 07:24:37 2018[1,4]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:37 2018[1,4]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,4]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:37 2018[1,4]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:37 2018[1,4]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:37 2018[1,4]<stderr>:==>   511 }
Thu Feb 15 07:24:37 2018[1,4]<stderr>:      512 
Thu Feb 15 07:24:37 2018[1,4]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,4]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,4]<stderr>:
Thu Feb 15 07:24:37 2018[1,64]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 0 0x000000000002f20a ucp_wireup_ep_test()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:516
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,64]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,64]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,42]<stderr>:
Thu Feb 15 07:24:37 2018[1,42]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:37 2018[1,42]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,32]<stderr>:
Thu Feb 15 07:24:37 2018[1,32]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:37 2018[1,32]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,42]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:37 2018[1,42]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:37 2018[1,42]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:37 2018[1,42]<stderr>:==>   511 }
Thu Feb 15 07:24:37 2018[1,42]<stderr>:      512 
Thu Feb 15 07:24:37 2018[1,42]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,42]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,42]<stderr>:
Thu Feb 15 07:24:37 2018[1,32]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:37 2018[1,32]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:37 2018[1,32]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:37 2018[1,32]<stderr>:==>   511 }
Thu Feb 15 07:24:37 2018[1,32]<stderr>:      512 
Thu Feb 15 07:24:37 2018[1,32]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,32]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,32]<stderr>:
Thu Feb 15 07:24:37 2018[1,6]<stderr>:
Thu Feb 15 07:24:37 2018[1,6]<stderr>:/hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c: [ ucp_wireup_ep_remote_connected() ]
Thu Feb 15 07:24:37 2018[1,6]<stderr>:      ...
Thu Feb 15 07:24:37 2018[1,6]<stderr>:      508                                       ucp_wireup_ep_progress, wireup_ep, 0,
Thu Feb 15 07:24:37 2018[1,6]<stderr>:      509                                       &wireup_ep->progress_id);
Thu Feb 15 07:24:37 2018[1,6]<stderr>:      510     ucp_worker_signal_internal(ucp_ep->worker);
Thu Feb 15 07:24:37 2018[1,6]<stderr>:==>   511 }
Thu Feb 15 07:24:37 2018[1,6]<stderr>:      512 
Thu Feb 15 07:24:37 2018[1,6]<stderr>:      513 int ucp_wireup_ep_test(uct_ep_h uct_ep)
Thu Feb 15 07:24:37 2018[1,6]<stderr>:      514 {
Thu Feb 15 07:24:37 2018[1,6]<stderr>:
Thu Feb 15 07:24:37 2018[1,4]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,4]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,4]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,42]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,42]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,42]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,32]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,32]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,32]<stderr>:===================
Thu Feb 15 07:24:37 2018[1,6]<stderr>:==== backtrace ====
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 0 0x000000000002f200 ucp_wireup_ep_remote_connected()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:511
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 1 0x000000000002f229 ucp_wireup_ep_get_aux_rsc_index()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup_ep.c:354
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 2 0x0000000000030e88 ucp_wireup_send_request()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/wireup/wireup.c:692
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 3 0x00000000000242ed ucp_ep_connect_remote()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/proto/proto.h:66
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 4 0x0000000000028be0 ucp_tag_send_req()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ucx-v1.3.x/src/ucp/tag/tag_send.c:79
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 5 0x0000000000005561 mca_pml_ucx_send_nbr()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mca/pml/ucx/pml_ucx.c:757
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 6 0x000000000006d46d PMPI_Send()  /hpc/local/benchmarks/hpcx_install_2018-02-14/src/hpcx-gcc-redhat7.4/ompi-v3.1.x/ompi/mpi/c/profile/psend.c:78
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 7 0x000000000040276c main()  ???:0
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 8 0x0000000000021c05 __libc_start_main()  ???:0
Thu Feb 15 07:24:37 2018[1,6]<stderr>: 9 0x0000000000402579 _start()  ???:0
Thu Feb 15 07:24:37 2018[1,6]<stderr>:===================
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 9 with PID 12958 on node vulcan04 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
+ rc=139
+ exit 139

MTT link:
http://hpcweb.lab.mtl.com//hpc/mtr_scrap/users/mtt/scratch/ucx_ompi/20180214_233132_9872_141709_vulcan03/html/test_stdout_edsx1i.txt

@bureddy
Contributor

bureddy commented Feb 15, 2018

The self ucp ep's dest_uuid clashed with the internal mem type ep.

@yosefe
Contributor

yosefe commented Feb 16, 2018

@bureddy is this fixed by #2317 and #2319?

@bureddy
Contributor

bureddy commented Feb 16, 2018

@yosefe yes

@yosefe yosefe closed this as completed Feb 16, 2018