-
Notifications
You must be signed in to change notification settings - Fork 376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More API updates for 2.0 #10317
base: main
Are you sure you want to change the base?
More API updates for 2.0 #10317
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stopping review to move discussion to PR
This change is kind of like memory windows, but not. It requires the provider support 64-bit keys (or smaller). And it's actually adding 2 concepts together. FI_MR_SINGLE_USE is really independent and could apply to existing registrations. There's no way for a provider to indicate that it supports single use regions, outside of just trying the flag in a call and seeing if it works. I don't know if this should be a domain capability, but maybe... The dynamic keys are basically trying to create a new MR object that references the same pinned pages. That's equivalent to a memory window. However, MWs are created by posting to a QP/EP, whereas, this is a MR/domain operation. In either case, the user should be given a fid_mr structure here, not a u64 key. That allows integration with the other fi_mr calls (map, unmap, bind, refresh, enable). This is probably doable by adding fid_mr to the fi_mr_attr, uhm, somewhere. The user can just create/destroy the extra MRs through the existing calls, with the same capabilities/restrictions that the provider has for other MRs (user or provider selected keys). I don't know if we would actually need a new capability in the latter case. A provider could always perform a second registration, in which case, saving on the page pinning is simply an optimization. |
@shefty Thanks for the feedback. Yes, single use and dynamic key assignment did come as two separate issues but I combined them here with the assumption that single use could be simpler with the key instead of the entire MR. I agree that this mostly equivalent to memory windows. One consideration about the bulk key allocations is that it may have the advantage of avoiding kernel involvement when the key is assigned (in case the provider doesn't allow user selected keys). But how much can be gained from that is unclear. I am going to rework the patch, maybe along the line of getting closer to a MW-like approach. |
Bulk key allocation could be defined as an attribute when the original MR is created. I agree that a single use MR is less useful than a single use window/key. |
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
PR updated. The first three commits cover the rework of the original dynamic MR proposal. Four more 2.0 API related commits are added. PR title updated to reflect the new scope. |
PR updated to address comments. |
@shefty Do you see any more changes needed? |
man/fi_mr.3.md
Outdated
## base_mr | ||
|
||
If non-NULL, the base_mr field is used to specify the MR from which a | ||
sub-MR is to be created. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You describe a sub-mr in the commit message, but it's not defined in the man pages. So a user doesn't understand what this is. From the viewpoint of the API, the user only knows of an MR. Note that it's possible to derive a new MR from a MR that itself is derived from another MR.
I would rework the text.
If non-NULL, creates a new memory region that is associated with a previously created region.
The new region must be fully contained within the existing region; however, new region has
its own access rights. The following attributes are inherited by the new region, and as a result,
are ignored when creating the new region:
auth_key_size, auth_key, iface, device, hmem_data, page_size, reserved_key_count
(what attributes are actually inherited, versus which may be specified?).
Associated memory regions may be used to shared hardware resources, such as virtual to physical
address translations and page pinning. This can improve performance when creating and destroying
sub-regions that need different access rights.
Use of this field requires API version X.Y.
blah blah
man/fi_mr.3.md
Outdated
|
||
The base_mr field must be NULL if the FI_MR_DMABUF flag is set. | ||
|
||
## reserved_key_count |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rename. From the viewpoint of the application, this is the number of associated memory regions. The API also uses 'cnt' instead of 'count'.
sub_mr_cnt ?
assoc_mr_cnt ?
bound_mr_cnt ?
linked_mr_cnt ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly. One could reserve less keys and when used out, revert back to regular key allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is about the name. The application is creating an associated MR (not a key). What, if anything, the provider pre-allocates isn't defined. It could pre-allocate the fid_mr's. I excluded using the term 'max' as part of the name because it may or may not be a max.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, now I see your point. Since application does not need to know the pre-allocation details, it makes sense to make it a hint for intended usage instead.
man/fi_mr.3.md
Outdated
memory region is automatically invalidated and can no longer be used for | ||
remote access. This flag is intended to be used when registering a | ||
sub-MR to get a single use key of the base MR. However, it can be used | ||
with any memory region. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop text referring to sub-mr. It applies to any MR, no need to call out an intent.
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Currently FI_MULTI_RECV is effectively only defined for untagged message only. Simply expanding the definition to tagged message would cause difficulties in either provider support or discovery. Define FI_TAGGED_MULTI_RECV to indicate that multi recv is supported in tagged message as well. This is only used as a capability bit. The op flag and cq flag continues to use FI_MULTI_RECV. Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
FI_DIRECTED_RECV covers both untagged and tagged message. However, the most often used case is for tagged message. Having a saparate bit for tagged message allows the provider to optimize non-tagged messsage implementation while maintain support directed recv over tagged message. Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
The new bit FI_EXACT_DIRECTED_RECV is similar to FI_DIRECTED_RECV, but requires exact source address. I.e., the wildcard address FI_ADDR_UNSPEC is not allowed. It can be used alone, or be used together with FI_DIRECTED_RECV or FI_TAGGED_DIRECTED_RECV as a modifier. Not allowing wildcard source address allows the provider to better optmize the receive handling. Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Memory registration consists of two parts: map/pin the memory for local access and export with a key for remote access. The first part is usually heavyweight and requries kernel involvement. The second part is less expensive and can be further separated into key allocation and key assignment. Key allocation may needs kernel involvement, but key assignment can be done in user space. Here sub-MR is introduced as a way to allow separattion of the forementioned two parts, and key reservation is added to further optimize sub-MR creation. A sub-MR is created from an existing MR (the base MR). It inherits the memory mapping/pinning of the base MR but has its own access key. The address range exposed can be same as the the base MR or a subpart of that. The access rights can be different, too. Now the base MR can be created with a few extra keys reserved. These reserved keys will be automatically used for sub-MR registration. This only applies to FI_MR_PROV_KEY mode. Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
@shefty PR updated based on your feedback w/ some tweak. I added definition of sub-MR and bass MR to the beginning and kept the use of such terminology later. |
man/fi_msg.3.md
Outdated
@@ -173,6 +173,11 @@ to write CQ entries for all successful completions. See the flags | |||
discussion below for more details. The requested message size that | |||
can be used with fi_inject is limited by inject_size. | |||
|
|||
If FI_HMEM is enabled, the fi_inject call can only accept buffer with | |||
iface equal to FI_HMEM_SYSTEM if the provider requires FI_MR_VIRT_ADDR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean FI_MR_HMEM instead of FI_MR_VIRT_ADDR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Blame the co-pilot ...
Add text to clarify that only FI_HMEM_SYSTEM is allowed for inject calls if FI_MR_HMEM is required. Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Reworked the dynamic MR commit and expand to include more pending 2.0 API changes:
·