
docs/cuda: reword cuda-aware support of communication APIs #12137

Closed (wanted to merge 1 commit)

Conversation

wenduwan (Contributor):

This change also reorganizes the paragraphs and removes duplicate content.

For testing, I did a local make dist and inspected the HTML file.

Signed-off-by: Wenduo Wang <wenduwan@amazon.com>
requirements to use GPUDirect support on Intel Omni-Path. The minimum
PSM2 build version required is `PSM2 10.2.175
<https://github.com/01org/opa-psm2/releases/tag/PSM2_10.2-175>`_.
How do I run Open MPI with CUDA applications?
Member:


My understanding is that we try to get away from FAQ-style headers toward headers describing the content of the next subsection. E.g., this header should probably be something like

"Running CUDA applications with Open MPI"
(or similar)

(Same for a number of other headers; I'm not pointing each of them out.)


Which MPI APIs work with CUDA-aware UCX?
----------------------------------------
UCX and UCC support CUDA-aware blocking reduction collective APIs:
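
(A minimal sketch of what such a call looks like, assuming an Open MPI build with CUDA-aware UCX; the single-element reduction and variable names are illustrative, not taken from the docs under review.)

```c
/* Minimal sketch: a blocking CUDA-aware MPI_Allreduce on a device
 * buffer. Assumes a CUDA-aware Open MPI build (e.g. with UCX);
 * error checks are omitted for brevity. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double *d_val;
    double one = 1.0;
    cudaMalloc((void **)&d_val, sizeof(double));
    cudaMemcpy(d_val, &one, sizeof(double), cudaMemcpyHostToDevice);

    /* The device pointer is passed directly to MPI; a CUDA-aware
     * transport moves the data without explicit host staging. */
    MPI_Allreduce(MPI_IN_PLACE, d_val, 1, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);

    cudaFree(d_val);
    MPI_Finalize();
    return 0;
}
```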
Member:


I thought (actually I am pretty sure) that UCC also supports non-blocking reductions for GPU buffers.


Which MPI APIs do NOT work with CUDA-aware UCX?
-----------------------------------------------
However, the following APIs do not support GPU buffers:
Member:


Is this really correct? At least with recent UCX versions, I think some/most of these functions should work.

wenduwan (Contributor, Author):


@bosilca @janjust Do you know if UCC currently supports non-blocking collectives with CUDA buffers? I tried single-node and, from what I saw, it worked.

Also, could you help double-check the statement about UCX CUDA support status?

Contributor:


UCC supports both blocking and non-blocking collectives with CUDA buffers.
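
(If that claim holds, a non-blocking variant would look like the sketch below. MPI_Iallreduce itself is standard MPI; whether the device buffer is accepted depends on the CUDA-aware build, as discussed here.)

```c
/* Minimal sketch: a non-blocking CUDA-aware MPI_Iallreduce on a
 * device buffer, assuming the UCC support described above.
 * Error checks omitted for brevity. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double *d_val;
    double one = 1.0;
    cudaMalloc((void **)&d_val, sizeof(double));
    cudaMemcpy(d_val, &one, sizeof(double), cudaMemcpyHostToDevice);

    MPI_Request req;
    MPI_Iallreduce(MPI_IN_PLACE, d_val, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* ... overlap other work here ... */

    /* The device buffer must not be touched until the request completes. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    cudaFree(d_val);
    MPI_Finalize();
    return 0;
}
```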

Contributor:


I think UCX supports CUDA buffers for most of these; I'm not sure about cuda-ipc, but osc ucx does not, I think. I have to double-check, but I'm pretty sure we have issues in accumulate.
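
(For anyone hitting the accumulate issue mentioned here, a hypothetical workaround is to stage the origin data through host memory first; accumulate_from_device below is an illustrative helper, not an Open MPI API.)

```c
/* Hypothetical workaround sketch: stage device data through a host
 * buffer before MPI_Accumulate, in case osc/ucx rejects CUDA
 * pointers for one-sided operations (per the discussion above). */
#include <stdlib.h>
#include <mpi.h>
#include <cuda_runtime.h>

void accumulate_from_device(const double *d_origin, int count,
                            int target_rank, MPI_Aint target_disp,
                            MPI_Win win)
{
    /* Copy the device data to a temporary host buffer. */
    double *h_stage = malloc(count * sizeof(double));
    cudaMemcpy(h_stage, d_origin, count * sizeof(double),
               cudaMemcpyDeviceToHost);

    /* Accumulate from the host buffer, which osc/ucx handles. */
    MPI_Win_lock(MPI_LOCK_SHARED, target_rank, 0, win);
    MPI_Accumulate(h_stage, count, MPI_DOUBLE, target_rank,
                   target_disp, count, MPI_DOUBLE, MPI_SUM, win);
    MPI_Win_unlock(target_rank, win);

    free(h_stage);
}
```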

wenduwan (Contributor, Author):


Thanks Tommy! I will update the UCC part first.


OFI support for CUDA
Member:


Might want to add something about which versions of libfabric to use. Maybe say that the latest version of libfabric is recommended, or at least something newer than 1.9?
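
(A small sketch of how an application could check the libfabric version at runtime, following the "newer than 1.9" suggestion above; fi_version() and the FI_* macros are part of the public libfabric API in <rdma/fabric.h>.)

```c
/* Minimal sketch: report the libfabric version and warn if it is
 * older than 1.9, per the suggestion above. */
#include <stdio.h>
#include <rdma/fabric.h>

int main(void)
{
    uint32_t ver = fi_version();
    printf("libfabric %u.%u\n", FI_MAJOR(ver), FI_MINOR(ver));

    if (FI_VERSION_LT(ver, FI_VERSION(1, 9)))
        printf("warning: libfabric older than 1.9; "
               "CUDA support may be limited\n");
    return 0;
}
```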

wenduwan (Contributor, Author):

Sorry, I won't have time to work on this. Closing...

wenduwan closed this on Sep 10, 2024.