validate rma lane if rkey is not needed for mem type #2995
Test FAILed.
Test FAILed.
@@ -40,6 +42,8 @@ void ucp_rkey_packed_copy(ucp_context_h context, ucp_md_map_t md_map,
    *(ucp_md_map_t*)p = md_map;
    p += sizeof(ucp_md_map_t);

    *((uint8_t *)p++) = mem_type;
pls use uct_memory_type_t instead of uint8_t for casting and sizeof where relevant
@evgeny-leksikov uct_memory_type_t is defined as an enum. I wanted to save on the packed rkey size by casting it to uint8_t.
OK, then pls check whether we have a static assert for UCT_MD_MEM_TYPE_LAST <= 255 or similar
added assert while packing rkey
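The exchange above (pack the enum as one byte, guarded by an assert) could look roughly like the sketch below. All `demo_`/`DEMO_` names are hypothetical stand-ins for the UCX types, not the code in this PR; it pairs a compile-time check on the enum range with a runtime assert at pack time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for uct_memory_type_t; the values are assumptions. */
typedef enum {
    DEMO_MEM_TYPE_HOST = 0,
    DEMO_MEM_TYPE_CUDA,
    DEMO_MEM_TYPE_LAST
} demo_memory_type_t;

/* Hypothetical stand-in for ucp_md_map_t. */
typedef uint64_t demo_md_map_t;

/* Compile-time guard (C11): every enumerator must fit in the packed byte. */
_Static_assert(DEMO_MEM_TYPE_LAST <= UINT8_MAX,
               "memory type does not fit in uint8_t");

/* Pack md_map followed by a one-byte memory type; returns bytes written. */
static size_t demo_rkey_pack_header(void *buffer, demo_md_map_t md_map,
                                    demo_memory_type_t mem_type)
{
    uint8_t *p = buffer;

    /* Runtime guard against an out-of-range value being truncated. */
    assert((unsigned)mem_type < DEMO_MEM_TYPE_LAST);

    memcpy(p, &md_map, sizeof(md_map)); /* memcpy avoids unaligned stores */
    p   += sizeof(md_map);
    *p++ = (uint8_t)mem_type;

    return (size_t)(p - (uint8_t *)buffer);
}
```

The static assert catches a growing enum at build time, while the runtime assert only covers a corrupted or uninitialized value in debug builds.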
Test FAILed.
Test FAILed.
Test FAILed.
bot:mlx:retest
Test PASSed.
Test FAILed.
@@ -13,7 +13,10 @@
#include <inttypes.h>


static ucp_md_map_t ucp_mem_dummy_buffer = 0;
static struct {
    ucp_md_map_t md_map;
indent on column
src/ucp/core/ucp_rkey.c
Outdated
static struct {
    ucp_md_map_t md_map;
    uint8_t mem_type;
} UCS_S_PACKED ucp_mem_dummy_buffer = {0, 0};
maybe use UCT_MEM_TYPE_xx constant instead of 0?
src/ucp/core/ucp_rkey.c
Outdated
if ((md_index != UCP_NULL_RESOURCE) &&
    (!(context->tl_mds[md_index].attr.cap.flags & UCT_MD_FLAG_NEED_RKEY)))
    (!(md_attr->cap.flags & UCT_MD_FLAG_NEED_RKEY)) &&
    (!rkey || (md_attr->cap.mem_type == mem_type &&
pls add ( ) around the internal conditions
also, maybe use a helper bool variable? this is hard to parse.
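One way the helper-bool suggestion could be applied is sketched below. The struct and macro names are hypothetical stand-ins (the real code reads `md_attr->cap.flags` and `md_attr->cap.mem_type`), so treat this as an illustration of the shape of the condition rather than the change that was merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for UCT_MD_FLAG_NEED_RKEY. */
#define DEMO_MD_FLAG_NEED_RKEY (1u << 0)

typedef enum { DEMO_MEM_HOST, DEMO_MEM_CUDA } demo_mem_type_t;

/* Hypothetical, flattened view of the memory domain attributes. */
typedef struct {
    uint32_t        flags;
    demo_mem_type_t mem_type;
} demo_md_attr_t;

/* Hypothetical rkey carrying the remote memory type. */
typedef struct {
    demo_mem_type_t mem_type;
} demo_rkey_t;

/* True if the lane may be used with an invalid (absent) rkey. */
static bool demo_lane_ok_without_rkey(const demo_md_attr_t *md_attr,
                                      const demo_rkey_t *rkey,
                                      demo_mem_type_t mem_type)
{
    assert(md_attr != NULL);

    /* Each sub-condition gets a name, so the final test reads in one line. */
    bool need_rkey      = (md_attr->flags & DEMO_MD_FLAG_NEED_RKEY) != 0;
    bool mem_type_match = (rkey == NULL) ||
                          ((md_attr->mem_type == mem_type) &&
                           (md_attr->mem_type == rkey->mem_type));

    return !need_rkey && mem_type_match;
}
```

Short-circuit evaluation keeps `rkey->mem_type` from being read when `rkey` is NULL, which is the case the original condition allows through.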
Force-pushed from 4b22ebb to 8276428
Test FAILed.
Test PASSed.
Test PASSed.
Test PASSed.
src/ucp/core/ucp_rkey.c
Outdated
{
    /* Lane does not need rkey, can use the lane with invalid rkey */
    *uct_rkey_p = UCT_INVALID_RKEY;
    return lane;
    if (!rkey || ((md_attr->cap.mem_type == mem_type) &&
why do we need to check rkey == NULL? we already assume in this function that rkey != NULL, in line 393
Currently, an RKEY is not needed for the pipeline staging protocol on the local mem type (cuda) endpoint's rma bw lane (only cuda_copy), which uses zcopy ops to move data from GPU to CPU memory. So, instead of creating a dummy key pack/unpack, I passed NULL here (https://github.com/openucx/ucx/pull/2995/files#diff-f89c8c0721aa5b2b55347e1612b8395bR880).
src/ucp/core/ucp_rkey.c
Outdated
    *uct_rkey_p = UCT_INVALID_RKEY;
    return lane;
    if (!rkey || ((md_attr->cap.mem_type == mem_type) &&
                  (md_attr->cap.mem_type == rkey->mem_type))) {
minor: can compare rkey->mem_type to mem_type to save pointer dereference
Test PASSed.
Test PASSed.
What
Fix RMA lane selection for CUDA mem type when there is a lane that does not require an RKEY (e.g. CMA).
Why ?
When KNEM is not present, CMA is selected for RMA, and CMA does not need an RKEY. When an RKEY is not needed, it is hard to determine deterministically whether the lane can do RMA to a specific mem type. With the current mem type protocols it is not possible to support all combinations of src and dst mem type ({(host, host), (host, cuda), (cuda, host), (cuda, cuda)}) in this scenario.
Fixes #2919
How ?
Pack mem_type info into the RKEY and validate the RMA lane's mem type against both the remote and local memory types.
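
As a rough illustration of the approach described above, the sketch below picks the first RMA lane that either has a packed key for its remote memory domain or needs no key and matches both memory types. It is a simplified model under stated assumptions: every `demo_`/`DEMO_` name is hypothetical, and the real lane selection in ucp_rkey.c handles more cases than shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MD_FLAG_NEED_RKEY (1u << 0) /* stand-in for UCT_MD_FLAG_NEED_RKEY */
#define DEMO_NULL_LANE         (-1)      /* hypothetical "no lane" marker */

typedef enum { DEMO_MEM_HOST = 0, DEMO_MEM_CUDA = 1 } demo_mem_type_t;

typedef struct {
    uint32_t        md_flags;     /* capability flags of the lane's memory domain */
    demo_mem_type_t md_mem_type;  /* memory type that domain can access */
    unsigned        dst_md_index; /* remote memory domain index (assumed < 64) */
} demo_lane_t;

typedef struct {
    uint64_t        md_map;   /* bit i set: a key for remote md i was packed */
    demo_mem_type_t mem_type; /* remote memory type carried in the rkey */
} demo_rkey_t;

/* Return the index of the first usable lane, or DEMO_NULL_LANE. */
static int demo_select_rma_lane(const demo_lane_t *lanes, int num_lanes,
                                const demo_rkey_t *rkey,
                                demo_mem_type_t local_mem_type)
{
    assert(rkey != NULL);

    for (int i = 0; i < num_lanes; ++i) {
        const demo_lane_t *lane = &lanes[i];
        int has_key   = (int)((rkey->md_map >> lane->dst_md_index) & 1);
        int need_rkey = (lane->md_flags & DEMO_MD_FLAG_NEED_RKEY) != 0;
        /* A keyless lane is valid only if it can reach both buffers. */
        int type_ok   = (lane->md_mem_type == local_mem_type) &&
                        (rkey->mem_type == local_mem_type);

        if (has_key || (!need_rkey && type_ok)) {
            return i;
        }
    }
    return DEMO_NULL_LANE;
}
```

With a host-only keyless lane such as CMA, a (host, host) transfer selects that lane, while any combination involving cuda falls through to DEMO_NULL_LANE so the caller can choose another protocol.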