-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UCP/CORE: Complete outstanding RNDV reqs after all UCT lanes are destroyed #227
UCP/CORE: Complete outstanding RNDV reqs after all UCT lanes are destroyed #227
Conversation
6881b25
to
2f82417
Compare
src/ucp/tag/rndv.c
Outdated
@@ -202,6 +202,15 @@ size_t ucp_tag_rndv_rts_pack(void *dest, void *arg) | |||
return sizeof(*rndv_rts_hdr) + packed_rkey_size; | |||
} | |||
|
|||
static void ucp_rndv_req_cancel(ucp_request_t *sreq, ucs_status_t status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smith like ucp_rndv_req_add_to_cancel_list
src/ucp/tag/rndv.c
Outdated
@@ -234,7 +243,7 @@ UCS_PROFILE_FUNC(ucs_status_t, ucp_proto_progress_rndv_rts, (self), | |||
return UCS_ERR_NO_RESOURCE; | |||
} else { | |||
ucs_assert(UCS_STATUS_IS_ERR(status)); | |||
ucp_rndv_complete_send(sreq, status, "rts_cancel"); | |||
ucp_rndv_req_cancel(sreq, status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert
ee93157
to
c6ecd5c
Compare
@evgeny-leksikov @brminich could you review pls? |
@yosefe could you review pls? |
src/ucp/core/ucp_request.c
Outdated
@@ -490,7 +490,7 @@ void ucp_request_handle_send_error(ucp_request_t *req, ucs_status_t status) | |||
} | |||
} else { | |||
if (req->flags & UCP_REQUEST_FLAG_SEND_RNDV) { | |||
ucp_rndv_complete_send(req, UCS_ERR_CANCELED, "rndv_flush"); | |||
ucp_rndv_req_add_to_cancel_list(req, status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if i could not send an RTS, why not release it immediately?
the receiver should not be aware of the rndv operation
src/ucp/tag/rndv.c
Outdated
@@ -202,6 +202,19 @@ size_t ucp_tag_rndv_rts_pack(void *dest, void *arg) | |||
return sizeof(*rndv_rts_hdr) + packed_rkey_size; | |||
} | |||
|
|||
void ucp_rndv_req_add_to_cancel_list(ucp_request_t *sreq, ucs_status_t status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think better rename to
ucp_rndv_req_add_to_canceled_list
src/ucp/tag/rndv.c
Outdated
ucp_request_send(sreq, 0); | ||
} | ||
if (sreq->flags & UCP_REQUEST_FLAG_RNDV_RTS_SENT) { | ||
ucp_rndv_req_add_to_cancel_list(sreq, UCS_ERR_CANCELED); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like we don't really send cancel now?
5a153cb
to
6a4677f
Compare
the MTT/io-demo passed successfully 20 iterations during the night with the latest changes. |
passed 20 more iterations |
6a4677f
to
9de9fbc
Compare
@yosefe thanks for the review! squashed |
Fixes:
found by NVDA/MLNX MTT (internal link)