Support for VA to PA changes #10062

iziemba · 2024-06-04T16:20:53Z

The following series of commits updates documentation and defines new fields to enable use cases in which the VA to PA mapping may change while memory is registered.

iziemba · 2024-06-04T16:23:50Z

This PR is not ready to land. At this point, I am looking for feedback on the proposed changes.

include/rdma/fi_domain.h

man/fi_mr.3.md

Clarify the behavior of FI_MR_HMEM being unset. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

iziemba · 2024-06-04T22:07:03Z

Ready for round 2.

shefty

thanks

man/fi_mr.3.md

prov/cxi/src/cxip_mr.c

Previously, the collective and local data transfer operations did not allow for apps to pass in a desc unless explicit memory registration is required. This change allows apps to always pass in a desc. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

iziemba · 2024-06-06T14:24:54Z

Ready for round 3.

man/fi_mr.3.md

swelch

This looks like a good solution for ODP.

man/fi_mr.3.md

shefty

thanks! changes look good to me

man/fi_mr.3.md

iziemba · 2024-06-10T18:20:22Z

I'll have all comments addressed by the middle of this week.

iziemba · 2024-06-10T19:51:53Z

One question I have: do we need to address whether a provider can support FI_MR_ALLOCATED = 0 on a per-HMEM-iface basis?

shefty · 2024-06-10T20:48:17Z

We could restrict FI_MR_ALLOCATED to memory allocated using system calls only. FI_MR_HMEM overrides FI_MR_LOCAL for HMEM. It could also override the FI_MR_ALLOCATED bit for HMEM.

iziemba · 2024-06-10T21:13:43Z

So FI_MR_HMEM = 1 and FI_MR_ALLOCATED = 0 results in VA <-> PA migration for FI_HMEM_SYSTEM only? The issue with this is that it now forces other FI_MR_HMEM behavior, like explicit memory registration for local operations, right?

Thinking out loud... CUDA, ROCR, and ZE device drivers all support an invalidation mechanism. This opens the door for implementing ODP even for device memory. We could have a bitmask to denote which HMEM ifaces support VA <-> PA changes with FI_MR_ALLOCATED = 0?

shefty · 2024-06-10T21:36:39Z

In general, to keep things reasonable for a user, if a provider can't support all features for all types of memory that an app can use, my vote is for the provider to fall back and disable a feature entirely. The permutations simply become untenable.

I don't want apps to deal with: "I can support X with system memory. And that GPU is okay. But that GPU isn't; it's too old. And that NIC memory, I can do, but only if you touch the memory using system calls. And NVMe is okay, but only if it's off the PCI bus, not attached to CXL..."
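As an illustration of the all-or-nothing fallback described above, here is a minimal sketch in C. It assumes a provider tracks, as bitmasks, which memory interfaces an app may use and which ones can handle a given feature. Every name in it (`sketch_iface`, `SKETCH_IFACE_BIT`, `sketch_feature_enabled`, `sketch_fallback_demo`) is hypothetical and is not part of libfabric.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for memory interface types. These are NOT the
 * libfabric enum fi_hmem_iface values. */
enum sketch_iface {
    SKETCH_IFACE_SYSTEM = 0,
    SKETCH_IFACE_GPU_A,
    SKETCH_IFACE_GPU_B
};

#define SKETCH_IFACE_BIT(i) (UINT64_C(1) << (i))

/* All-or-nothing policy: the feature is enabled only if every interface
 * the app may use is also capable of it. If any usable interface lacks
 * support, the feature is disabled for all of them. */
bool sketch_feature_enabled(uint64_t usable_ifaces, uint64_t capable_ifaces)
{
    return (usable_ifaces & ~capable_ifaces) == 0;
}

/* Small usage example built on assertions. */
void sketch_fallback_demo(void)
{
    uint64_t capable = SKETCH_IFACE_BIT(SKETCH_IFACE_SYSTEM) |
                       SKETCH_IFACE_BIT(SKETCH_IFACE_GPU_A);

    /* App uses only system memory and GPU A: feature stays enabled. */
    assert(sketch_feature_enabled(capable, capable));

    /* App also uses GPU B, which lacks support: disable entirely. */
    assert(!sketch_feature_enabled(
        capable | SKETCH_IFACE_BIT(SKETCH_IFACE_GPU_B), capable));
}
```

The point of the single boolean result is that the app sees one answer per feature rather than one answer per memory type.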

iziemba · 2024-06-10T22:07:40Z

I agree the permutations become untenable.

How about updating the FI_MR_ALLOCATED = 0 with something like the following:

If FI_MR_ALLOCATED = 0 and FI_HMEM is supported, the ability for the VA to PA mapping to change extends to HMEM interfaces as well. If a provider cannot support VA to PA changes for a given HMEM iface, the provider should support a reasonable fallback or the operation should fail.

shefty · 2024-06-10T22:37:26Z

I'm good with that update.

FI_MR_ALLOCATED set documentation is updated to state that the VA to PA mapping must not change while memory is still registered. FI_MR_ALLOCATED unset documentation is added stating that the VA to PA mapping may change. This behavior can be used, for example, to support system memory to device memory page migration. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

Support fi_mr_refresh without FI_MR_ALLOCATED set. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

Page size allows applications to optionally notify the provider of the page size to be used for an MR allocation. Typically, providers can select the optimal page size. In cases where a VA range has zero pages backing it, the provider may not know the optimal page size during registration. Rather than always use a less efficient page size, allow apps to specify the page size to be used. If page size is zero, the provider will select the page size. If non-zero, page size must be a page size supported by the OS. If a specific page size is specified for a memory region during creation, all pages later associated with the region must be of the given size. Attaching a memory page of a different size to a region may result in failed transfers to or from the region. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
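The page size rules in the commit message above can be sketched as a pair of checks. This is only an illustration: `sketch_page_size_ok`, `sketch_attach_ok`, and `sketch_page_size_demo` are hypothetical helpers, not libfabric functions, and the list of OS-supported sizes is assumed to be supplied by the caller. Treating a zero region page size as "no constraint on attached pages" is also an assumption, since the commit message only covers the non-zero case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check of a requested MR page size.
 * Zero means "let the provider choose"; any other value must match one
 * of the page sizes the OS supports (list supplied by the caller). */
bool sketch_page_size_ok(size_t page_size, const size_t *os_sizes, size_t n)
{
    if (page_size == 0)
        return true;
    for (size_t i = 0; i < n; i++) {
        if (os_sizes[i] == page_size)
            return true;
    }
    return false;
}

/* Hypothetical check applied when a page is later associated with a
 * region. If a specific size was fixed at creation, the page must match.
 * Assumption: a region created with page size zero imposes no constraint. */
bool sketch_attach_ok(size_t region_page_size, size_t attached_page_size)
{
    return region_page_size == 0 || region_page_size == attached_page_size;
}

/* Small usage example with example sizes (4 KiB and 2 MiB). */
void sketch_page_size_demo(void)
{
    const size_t os_sizes[] = { 4096, 2097152 };

    assert(sketch_page_size_ok(0, os_sizes, 2));       /* provider picks */
    assert(sketch_page_size_ok(2097152, os_sizes, 2)); /* supported size */
    assert(!sketch_page_size_ok(8192, os_sizes, 2));   /* unsupported    */

    assert(sketch_attach_ok(2097152, 2097152));
    assert(!sketch_attach_ok(2097152, 4096)); /* mismatch may fail transfers */
}
```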

iziemba · 2024-06-12T19:10:16Z

Comments addressed and the following added:

If FI_MR_ALLOCATED = 0 and FI_HMEM is supported, the ability for the VA to PA mapping to change extends to HMEM interfaces as well. If a provider cannot support VA to PA changes for a given HMEM iface, the provider should support a reasonable fallback or the operation should fail.

iziemba requested review from shefty and j-xiong June 4, 2024 16:20

iziemba commented Jun 4, 2024

include/rdma/fi_domain.h (outdated, resolved)

chuckfossen reviewed Jun 4, 2024

man/fi_mr.3.md (outdated, resolved)

man/fi_mr.3.md (outdated, resolved)

shefty reviewed Jun 4, 2024

iziemba force-pushed the mr_updates branch from 604b01f to f603585 Compare June 4, 2024 21:53

man/fi_mr: Improve FI_MR_HMEM documentation

f66b70b

Clarify the behavior of FI_MR_HMEM being unset. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

iziemba force-pushed the mr_updates branch from f603585 to 479c71b Compare June 4, 2024 22:05

iziemba requested review from shefty and chuckfossen June 4, 2024 22:06

shefty reviewed Jun 4, 2024

man/fi_mr.3.md (resolved)

prov/cxi/src/cxip_mr.c (outdated, resolved)

man/fi_mr: Support optional MR desc

bcf40da

Previously, the collective and local data transfer operations did not allow for apps to pass in a desc unless explicit memory registration is required. This change allows apps to always pass in a desc. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

iziemba force-pushed the mr_updates branch from 479c71b to 4cbdd8a Compare June 6, 2024 14:23

iziemba requested a review from shefty June 6, 2024 14:24

chuckfossen reviewed Jun 6, 2024

man/fi_mr.3.md (outdated, resolved)

swelch approved these changes Jun 6, 2024

chuckfossen reviewed Jun 6, 2024

man/fi_mr.3.md (resolved)

shefty approved these changes Jun 6, 2024

aingerson reviewed Jun 6, 2024

man/fi_mr.3.md (resolved)

iziemba added 2 commits June 12, 2024 14:08

man/fi_mr: Extend fi_mr_refresh support

1d708ad

Support fi_mr_refresh without FI_MR_ALLOCATED set. Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>

iziemba force-pushed the mr_updates branch from 4cbdd8a to 9a07f12 Compare June 12, 2024 19:09

j-xiong approved these changes Jun 12, 2024

j-xiong merged commit 03a2ea2 into ofiwg:main Jun 14, 2024
13 checks passed