UCP/CORE/RNDV/GTEST: Handle status from AM/TAG RNDV RTS/data correctly #6163
src/ucp/rndv/rndv.c
Outdated
} else if (status == UCP_STATUS_PENDING_SWITCH) {
    status = UCS_OK;

ret_status = ucp_am_bcopy_handle_status_from_pending(self, !single, 0, 0,
This is the only place where no_complete is passed to ucp_am_bcopy_handle_status_from_pending.
I'd remove the changes in ucp_am_bcopy_handle_status_from_pending and rewrite the code here without it.
done
test/gtest/ucp/test_ucp_sockaddr.cc
Outdated
@@ -1137,6 +1134,16 @@ class test_ucp_sockaddr_protocols : public test_ucp_sockaddr {
        ucp_request_release(sreq);
    }

    void extra_send_before_disconenct(entity &e, const std::string &send_buf,
disconnect
fixed
test/gtest/ucp/test_ucp_sockaddr.cc
Outdated
           "RNDV_THRESH=0", "RNDV_SCHEME=get_zcopy")
{
    test_tag_send_recv(64 * UCS_KBYTE, false, false);
}

UCS_TEST_P(test_ucp_sockaddr_protocols_err, tag_rndv_unexp_get_scheme,
           "RNDV_THRESH=0", "RNDV_SCHEME=get_zcopy")
What is the difference between these tests?
When we ask for GET_ZCOPY but the transport doesn't support it (e.g. TCP), it falls back to RNDV AM.
The idea is to cover as many cases as possible.
these two tests are identical (this one and the previous one)
fixed
test/gtest/ucp/test_ucp_sockaddr.cc
Outdated
    UCP_INSTANTIATE_TEST_CASE_TLS(_test_case, rcx, "rc_x") \
    UCP_INSTANTIATE_TEST_CASE_TLS(_test_case, ib, "ib") \
    UCP_INSTANTIATE_TEST_CASE_TLS(_test_case, tcp, "tcp") \
    UCP_INSTANTIATE_TEST_CASE_TLS(_test_case, dcudx, "dc_x,ud," UCP_TEST_GPU_COPY_TLS) \
can we reuse UCP_INSTANTIATE_TEST_CASE_GPU_AWARE?
unfortunately, they have different sets of TLs instantiated
is it possible to unite some common parts?
done
@yosefe is it ok to have a UCP_INSTANTIATE_TEST_CASE_TLS_GPU_AWARE macro to instantiate a test case with GPU support?
I didn't find a better way to reuse the UCP_INSTANTIATE_TEST_CASE_GPU_AWARE macro for the CM instantiation macro.
@yosefe could you review pls?
@yosefe ok to merge?
What
Handle status from AM/TAG RNDV RTS/data correctly in UCP progress functions.
Why ?
If sending AM/TAG RNDV RTS/data fails in a progress function (i.e. the status is neither UCS_OK nor UCS_ERR_NO_RESOURCE), the UCP request has to be completed with that status, but UCS_OK should be returned from the function to satisfy the ucp_request_try_send() expectation that progress functions return UCS_OK, UCS_INPROGRESS, or UCS_ERR_NO_RESOURCE.
How ?
ucp_rndv_rts_handle_status_from_pending() to handle status returned from RTS send progress function.
status returned for sending AM/TAG RNDV RTS packet.
ucp_am_bcopy_handle_status_from_pending() w/o completing a request.
ucp_proto_progress_am_single().
test_ucp_sockaddr_protocols_err to reproduce the bug fixed in this PR:
test_ucp_sockaddr_protocols_err to reproduce the bug fixed in UCP/PROTO: Handle AM short failure correctly #6157 PR:
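As an illustration of the rule stated under "Why ?" (this is not the code from this PR), the status mapping could be sketched roughly as follows. Everything here is hypothetical: the sketch_* names, the enum values, and the request struct are stand-ins invented for the example and are not UCX definitions. What happens to a request on a successful send is protocol-specific and is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the UCX status codes named in the description.
 * The numeric values are illustrative and not taken from UCX headers. */
typedef enum {
    SKETCH_OK              = 0,
    SKETCH_INPROGRESS      = 1,
    SKETCH_ERR_NO_RESOURCE = -1,
    SKETCH_ERR_SEND_FAILED = -2 /* placeholder for any other error */
} sketch_status_t;

/* Hypothetical request: records whether and with which status it completed. */
typedef struct {
    int             completed;
    sketch_status_t status;
} sketch_request_t;

/* Map the status of a send attempt to the value a progress function returns. */
sketch_status_t sketch_handle_status_from_pending(sketch_request_t *req,
                                                  sketch_status_t send_status)
{
    assert(req != NULL);

    switch (send_status) {
    case SKETCH_OK:
    case SKETCH_INPROGRESS:
    case SKETCH_ERR_NO_RESOURCE:
        /* Already one of the statuses the caller expects: pass it through. */
        return send_status;
    default:
        /* Any other error: complete the request with that error, then
         * report SKETCH_OK, which is within the set the caller expects. */
        req->completed = 1;
        req->status    = send_status;
        return SKETCH_OK;
    }
}
```

In this sketch the error is never lost: it is carried by the completed request, while the return value stays within the set the caller is described as expecting.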