UCP/EP: Cleanup proto reqs on failure #5821

Merged
merged 1 commit into from
Mar 31, 2021

dmitrygx
Member

@dmitrygx dmitrygx commented Oct 21, 2020

What

Clean up proto reqs on failure

Why ?

To fix a leak of UCP requests that are not UCT-managed (i.e. neither an in-progress send nor queued as pending)

How ?

Use the UCP EP extension to save submitted UCP requests (only TAG sync send and TAG/AM RNDV) in an hlist
Purge them from the hlist when the UCP EP fails
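
The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the types and functions (`endpoint_t`, `request_t`, `ep_track`, `ep_untrack`, `ep_purge`) are hypothetical stand-ins, not the UCX API, and a plain doubly linked list takes the place of the hlist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes and types, for illustration only (not UCX). */
typedef enum { ST_OK = 0, ST_INPROGRESS = 1, ST_EP_FAILED = -1 } status_t;

typedef struct request {
    struct request *prev, *next; /* links in the endpoint's tracking list */
    status_t        status;
} request_t;

typedef struct endpoint {
    request_t *head; /* protocol requests submitted but not yet completed */
    request_t *tail;
} endpoint_t;

/* Track a submitted protocol request on its endpoint (append to the tail). */
static void ep_track(endpoint_t *ep, request_t *req)
{
    req->status = ST_INPROGRESS;
    req->next   = NULL;
    req->prev   = ep->tail;
    if (ep->tail != NULL) {
        ep->tail->next = req;
    } else {
        ep->head = req;
    }
    ep->tail = req;
}

/* Stop tracking a request, e.g. when it completes normally. */
static void ep_untrack(endpoint_t *ep, request_t *req)
{
    if (req->prev != NULL) {
        req->prev->next = req->next;
    } else {
        ep->head = req->next;
    }
    if (req->next != NULL) {
        req->next->prev = req->prev;
    } else {
        ep->tail = req->prev;
    }
    req->prev = req->next = NULL;
}

/* On endpoint failure, complete every tracked request with an error status.
 * Returns the number of purged requests. */
static int ep_purge(endpoint_t *ep, status_t status)
{
    int purged = 0;

    while (ep->head != NULL) {
        request_t *req = ep->head;
        ep_untrack(ep, req);
        req->status = status;
        ++purged;
    }
    return purged;
}
```

Requests that complete normally are removed from the list first, so a later purge only touches the ones that would otherwise leak.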

@dmitrygx dmitrygx force-pushed the topic/ucp/req_proto_err_handling branch from 5a5a837 to 6565d36 Compare October 21, 2020 17:34
@dmitrygx dmitrygx changed the title UCP/EP: Clean up proto reqs (send only) on failure UCP/EP: Clean up proto reqs on failure Oct 27, 2020
@dmitrygx dmitrygx force-pushed the topic/ucp/req_proto_err_handling branch from 1875428 to f09147b Compare October 27, 2020 14:52
@dmitrygx dmitrygx changed the title UCP/EP: Clean up proto reqs on failure UCP/EP: Cleanup proto reqs on failure Oct 27, 2020
@dmitrygx dmitrygx force-pushed the topic/ucp/req_proto_err_handling branch 3 times, most recently from 69d47f5 to 592b80a Compare October 27, 2020 21:17
@dmitrygx dmitrygx force-pushed the topic/ucp/req_proto_err_handling branch 7 times, most recently from eee98d7 to 91014aa Compare November 12, 2020 15:53
@dmitrygx
Member Author

@brminich @hoopoepg @evgeny-leksikov could you review pls?

Comment on lines 2320 to 2327
/* it means that purging started from a request responsible
* for sending RTR, so a request responsible for copying
* data from staging buffer is a receive request */
req->super_req->recv.remaining -= req->recv.length;
} else {
/* it means that purging started from a request responsible
* for sending RTR, so a request responsible for copying
* data from staging buffer is a send request */
Contributor

pls unite to single comment just pointing to difference.

Member Author

done

@@ -50,17 +50,17 @@


/* defined as a macro to print the call site */
#define ucp_request_get(_worker) \
({ \
static inline ucp_request_t* ucp_request_get(ucp_worker_h _worker) \
Contributor

  • UCS_F_ALWAYS_INLINE
  • pls remove ending \s, no need in func

Contributor

and move function below defines

Contributor

also func names do not start with _

Member Author

it is leftover from debugging
rolled back to the previous state

ucs_hlist_add_tail(&flush_state->reqs,
&req->list_elem);
ucs_trace_req("added flush request %p to ep remote completion "
" queueu with sn %d",
Contributor

typo queueu and extra space

Member Author

fixed

ucp_rndv_rtr_pack,
sizeof(ucp_rndv_rtr_hdr_t) +
packed_rkey_size);
if ((status != UCS_OK) && (status != UCS_ERR_NO_RESOURCE)) {
Contributor

ucs_unlikely

Member Author

done
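
For readers unfamiliar with the reviewer's request, a branch hint of this kind is typically built on a compiler builtin. The sketch below assumes GCC or Clang and uses a hypothetical `my_unlikely` macro and made-up status codes; it is not the UCX definition of `ucs_unlikely`.

```c
#include <assert.h>

/* Hypothetical stand-in for a branch-prediction hint macro.
 * __builtin_expect is a GCC/Clang builtin; other compilers get a no-op. */
#if defined(__GNUC__)
#define my_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define my_unlikely(x) (x)
#endif

/* Made-up status codes for illustration only. */
enum { MY_OK = 0, MY_ERR_NO_RESOURCE = -2, MY_ERR_IO = -3 };

/* Returns 1 when the status is a hard error, i.e. neither success nor a
 * retryable "no resource" condition. The hint marks that path as cold. */
static int is_fatal(int status)
{
    if (my_unlikely((status != MY_OK) && (status != MY_ERR_NO_RESOURCE))) {
        return 1;
    }
    return 0;
}
```

The hint does not change the result of the condition; it only tells the compiler which path to lay out as the fast path.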

@@ -821,6 +802,10 @@ ucp_rndv_recv_frag_put_mem_type(ucp_request_t *rreq, ucp_request_t *rndv_req,
ucp_ep_use_indirect_id(freq->send.ep));
}

ucs_assert(freq->send.ep != NULL);
ucs_hlist_add_tail(&ucp_ep_ext_gen(freq->send.ep)->proto_reqs,
Contributor

why do we need to save fragments on the list for put? should not transport report failure? otherwise the request should stay on pending, right?

Member Author

this is PUT operation from HOST staging buffer to GPU device buffer after receiving ATP from a peer on each fragment or after successful GET operation in pipelined RNDV
yes, we don't need this, since it is a local operation

removed

@@ -48,7 +48,7 @@ UCS_TEST_F(test_obj_size, size) {
#else
EXPECTED_SIZE(ucp_ep_t, 64);
/* TODO reduce request size to 240 or less after removing old protocols state */
-EXPECTED_SIZE(ucp_request_t, 296);
+EXPECTED_SIZE(ucp_request_t, 312);
Contributor

no way to avoid this?

Member Author

I did two PRs to decrease UCP request size from 320 bytes to 296 bytes: #5829 and #5825
I have the privilege to use at least 16 bytes :)
unfortunately, no way to reduce this - I tried to refactor UCP request send part, but it has ucp_wireup_msg_t in the union that could be reduced and it is our limit...

#include "ucp_worker.h"
#include "ucp_am.h"
Contributor

why it moved?

Member Author

by mistake

Comment on lines 147 to 148
ucs_trace_req("added flush request %p to ep remote completion "
" queueu with sn %d",
Contributor

there are 2 spaces in target string completion++queueU

Member Author

fixed

Contributor

@brminich brminich left a comment

Can you please add some tests to make sure the functionality is covered?

static void ucp_ep_req_purge(ucp_request_t *req, ucs_status_t status,
int recursive)
{
if (req->flags & (UCP_REQUEST_FLAG_SEND_AM |
Contributor

minor: seems to fit to 80 syms
(same below)

Member Author

done
but it doesn't fit 80 syms below

/* don't release rndv request in case of success, since it was sent to
* a peer as a remote request ID, and we will use the req to track an
* user-exposed receive request (and request for copying data from staging
* buffer in case of fragmented RNDV) */
Contributor

minor: maybe use pipelined instead of fragmented to avoid confusion with multirail?

Member Author

done

@@ -1297,6 +1282,16 @@ UCS_PROFILE_FUNC(ucs_status_t, ucp_rndv_rts_handler,
}
}

static void ucp_rndv_ats_complete(ucp_request_t *sreq, ucs_status_t status)
Contributor

no need for separate func, which is called just once

Member Author

fixed

@dmitrygx
Member Author

bot:pipe:retest

@dmitrygx
Member Author

@evgeny-leksikov @hoopoepg @brminich is it ok now?

@dmitrygx
Member Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dmitrygx
Member Author

failure is #5840

@dmitrygx dmitrygx force-pushed the topic/ucp/req_proto_err_handling branch from c5c92c5 to c26fca8 Compare March 17, 2021 20:44
@dmitrygx dmitrygx removed the WIP-DNM Work in progress / Do not review label Mar 17, 2021
@dmitrygx dmitrygx force-pushed the topic/ucp/req_proto_err_handling branch 9 times, most recently from 6eeb26c to 4488275 Compare March 24, 2021 15:48
@dmitrygx
Member Author

@hoopoepg @brminich could you review pls?

Comment on lines +2676 to +2683
if (req->id != UCP_REQUEST_ID_INVALID) {
ucp_request_id_release(req);
}
Contributor

is it possible some of the protocols cleanup flow would try to release the request id?

Member Author

nice catch, Yossi
currently, we don't have such protocols, but they could be implemented in the future
fixed this comment by moving checking/releasing of request ID at the end of the function, but need to implement some hack (save and remove RELEASE flag before calling request complete procedure)

Contributor

maybe "extract" the id from the request to a local variable?

Member Author

then protocols which expect that request ID is valid will fail ucp_request_id_release()

Member Author

decided to not do it for now
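
The "save and remove RELEASE flag" idea mentioned earlier in this thread might look roughly like the sketch below. It is an assumption-laden illustration, not the merged code: `req_t`, `REQ_FLAG_RELEASED`, `REQ_ID_INVALID` and all functions here are hypothetical, and "freeing" is modeled by setting a field so the ordering can be checked.

```c
#include <assert.h>

#define REQ_FLAG_RELEASED 0x1u /* hypothetical: owner already released the request */
#define REQ_ID_INVALID    0u   /* hypothetical "no ID" value */

typedef struct {
    unsigned flags;
    unsigned id;
    int      completed;
    int      freed; /* models returning the request to its pool */
} req_t;

static void req_free(req_t *req)
{
    req->freed = 1;
}

/* Releasing the ID touches the request, so it must not be freed yet. */
static void req_id_release(req_t *req)
{
    assert(!req->freed);
    req->id = REQ_ID_INVALID;
}

/* Completion frees the request immediately if it was already released. */
static void req_complete(req_t *req)
{
    req->completed = 1;
    if (req->flags & REQ_FLAG_RELEASED) {
        req_free(req);
    }
}

/* Purge path: hide the RELEASED flag during completion so the request stays
 * alive, release the ID afterwards, then free it if it had been released. */
static void req_purge(req_t *req)
{
    unsigned released = req->flags & REQ_FLAG_RELEASED;

    req->flags &= ~REQ_FLAG_RELEASED;
    req_complete(req);
    if (req->id != REQ_ID_INVALID) {
        req_id_release(req);
    }
    if (released) {
        req_free(req);
    }
}
```

If a completion callback had already released the ID, the check against `REQ_ID_INVALID` keeps the purge path from releasing it a second time.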

Comment on lines 2640 to 2736
if ((wireup_ep != NULL) && ucp_wireup_ep_test(wireup_ep)) {
/* flush state is not valid yet */
return;
}
Contributor

can we check some ep flags instead? a more "direct" check if flush state is valid or not?

@@ -839,8 +847,11 @@ ucp_request_id_release(ucp_request_t *req)
ucp_request_id_reset(req);

ucs_assert(req->send.ep != NULL);
ucs_hlist_del(&ucp_ep_ext_gen(req->send.ep)->proto_reqs,
&req->send.list_elem);
if (ucs_unlikely(ucp_ep_config(req->send.ep)->key.err_mode ==
Contributor

can we unite it with the "if" inside ucs_ptr_map_del() which checks for direct/indirect id? to avoid extra branch on fast path?

Member Author

but user can set UCX_PROTO_INDIRECT_ID=n in case of error handling too and vice versa (UCX_PROTO_INDIRECT_ID=y in case of non-error handling)

Contributor

yes, i see the problem, need to discuss this offline
for now, i would add a comment in the code that we should optimize this
BTW it happens only for eager/sync, rndv, and sw-rma flow, right?

Member Author

> for now, i would add a comment in the code that we should optimize this

added the comment

> BTW it happens only for eager/sync, rndv, and sw-rma flow, right?

yes, it is right

Comment on lines 2677 to 2679
/* Mark release to be not released from the ucp_request_complete() procedure
* to be able do some actions with this request (e.g. check and release
* request ID) after completing an operation.
Contributor

i think it's better to release the request id for each protocol explicitly. after we find which protocol is it according to req flags, we can decide whether need to release the id or not. and potentially unite code with normal-flow completion and id release.

Member Author

do you mean to release ID from protocol completion callback?
but it is not possible sometimes, because some of them, e.g. RNDV PUT rely on the fact that request ID will be released upon receiving RTR packet, since ID is no longer needed

Contributor

I mean from the if/else/else ... code below
is it possible some protocols release the id inside the completion callback?

Member Author

> is it possible some protocols release the id inside the completion callback?

no, currently we don't have such protocols.
but the existing protocols (or new ones) can be updated to release the ID inside its completion callback

@yosefe
Contributor

yosefe commented Mar 29, 2021

error seems relevant:

2021-03-29T12:10:16.1473466Z -------------------------------------------------------
2021-03-29T12:10:16.1474321Z  T E S T S
2021-03-29T12:10:16.1475403Z -------------------------------------------------------
2021-03-29T12:10:16.8047278Z Running org.openucx.jucx.UcpMemoryTest
2021-03-29T12:10:17.6701981Z [swx-rdmz-ucx-gpu-02:18494:0:18514]    ucp_ep.inl:147  Assertion `ep->flags & UCP_EP_FLAG_FLUSH_STATE_VALID' failed
2021-03-29T12:10:17.6875275Z /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_ep.inl: [ ucp_ep_flush_state() ]
2021-03-29T12:10:17.6877076Z       ...
2021-03-29T12:10:17.6877838Z       144 
2021-03-29T12:10:17.6878777Z       145 static UCS_F_ALWAYS_INLINE ucp_ep_flush_state_t* ucp_ep_flush_state(ucp_ep_h ep)
2021-03-29T12:10:17.6879718Z       146 {
2021-03-29T12:10:17.6881071Z ==>   147     ucs_assert(ep->flags & UCP_EP_FLAG_FLUSH_STATE_VALID);
2021-03-29T12:10:17.6882431Z       148     ucs_assert(!(ep->flags & UCP_EP_FLAG_ON_MATCH_CTX));
2021-03-29T12:10:17.6883770Z       149     ucs_assert(!(ep->flags & UCP_EP_FLAG_CLOSE_REQ_VALID));
2021-03-29T12:10:17.6885046Z       150     return &ucp_ep_ext_gen(ep)->flush_state;
2021-03-29T12:10:17.7043666Z ==== backtrace (tid:  18514) ====
2021-03-29T12:10:17.7046765Z  0 0x000000000003338e ucp_ep_flush_state()  /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_ep.inl:147
2021-03-29T12:10:17.7048525Z  1 0x000000000003338e ucp_ep_reqs_purge()  /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_ep.c:2734
2021-03-29T12:10:17.7050164Z  2 0x000000000003384c ucp_ep_destroy_base()  /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_ep.c:263
2021-03-29T12:10:17.7051683Z  3 0x000000000003384c ucp_ep_remove_ref()  /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_ep.c:281
2021-03-29T12:10:17.7053139Z  4 0x000000000003410a ucp_worker_destroy_mem_type_endpoints()  /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_ep.c:477
2021-03-29T12:10:17.7054582Z  5 0x000000000004713e ucp_worker_destroy()  /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_worker.c:2372
2021-03-29T12:10:17.7056324Z  6 0x000000000004713e ucp_worker_close_cms()  /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_worker.c:1393
2021-03-29T12:10:17.7057735Z  7 0x000000000004713e ucp_worker_destroy()  /scrap/azure/agent-08/AZP_WORKSPACE/1/s/src/ucp/core/ucp_worker.c:2373
2021-03-29T12:10:17.7058604Z =================================
2021-03-29T12:10:17.7059760Z [swx-rdmz-ucx-gpu-02:18494:0:18514] Process frozen...
2021-03-29T13:02:59.2572990Z ##[error]The operation was canceled.

Member Author

@dmitrygx dmitrygx left a comment

> error seems relevant:

yes, need to fix flush_state

Comment on lines 2640 to 2736
if ((wireup_ep != NULL) && ucp_wireup_ep_test(wireup_ep)) {
/* flush state is not valid yet */
return;
}
Member Author

removed - I found that it isn't needed anymore

@@ -846,7 +851,7 @@ UCS_PROFILE_FUNC_VOID(ucp_rndv_recv_frag_put_completion, (self),
rreq->recv.remaining, freq->send.length);
rreq->recv.remaining -= freq->send.length;
if (rreq->recv.remaining == 0) {
ucp_rndv_recv_req_complete(rreq, UCS_OK);
ucp_request_complete_and_dereg_recv_rndv(rreq, UCS_OK);
Member Author

done

* flush state won't be used (flush state is considered as valid, when
* EP doesn't exist on matching context and remote EP ID is set) */
ucp_ep_update_flags(ep, 0,
UCP_EP_FLAG_ON_MATCH_CTX | UCP_EP_FLAG_REMOTE_ID);
Contributor

why do we set remote id here?

Member Author

we release the flag here instead to not use flush state

} else {
ucp_ep_update_remote_id(ep, msg->src_ep_id);
Contributor

why needed?

Member Author

now, returned back - but changed other places
we rely on the fact that if REMOTE_ID is set then FLUSH state is valid
so, changed it to check FLUSH_STATE/ON_MATCH_CTX flags when updating remote ID

@@ -232,6 +232,7 @@ static inline void ucp_ep_flush_state_reset(ucp_ep_h ep)
((flush_state->send_sn == 0) &&
(flush_state->cmpl_sn == 0) &&
ucs_hlist_is_empty(&flush_state->reqs)));
ucs_assert(ep->flags & UCP_EP_FLAG_REMOTE_ID);
Contributor

why need this assertion?

Member Author

just to make sure that we reset flush state after updating a remote ID
so, we can rely on the flag

I think it is better to make updating remote ID dependent on flush state or matching context flags - done it

Contributor

@yosefe yosefe Mar 30, 2021

i think better not introduce such dependency in this PR, since it can require more substantial refactoring

Member Author

> i think better not introduce such dependency in this PR, since it can require more substantial refactoring

I have already done it :)
just need to change the sequence

@dmitrygx
Member Author

@yosefe @brminich could you review pls? your comments were fixed

@dmitrygx
Member Author

failure is #6568

@dmitrygx dmitrygx force-pushed the topic/ucp/req_proto_err_handling branch from 59d2e6a to dc909eb Compare March 30, 2021 19:46
@dmitrygx dmitrygx force-pushed the topic/ucp/req_proto_err_handling branch from dc909eb to d8b364b Compare March 31, 2021 08:26
@yosefe yosefe merged commit 0477cce into openucx:master Mar 31, 2021