
docs/cuda: reword cuda-aware support of communication APIs #12137

Closed (wanted to merge 1 commit)

Conversation

wenduwan (Contributor):

This change also reorganizes the paragraphs and removes duplicate content.

For testing, I did a local make dist and inspected the HTML file.

Signed-off-by: Wenduo Wang <wenduwan@amazon.com>
requirements to use GPUDirect support on Intel Omni-Path. The minimum
PSM2 build version required is `PSM2 10.2.175
<https://github.com/01org/opa-psm2/releases/tag/PSM2_10.2-175>`_.
How do I run Open MPI with CUDA applications?
Member:


My understanding is that we try to get away from FAQ-style headers toward headers describing the content of the next subsection. E.g., this header should probably be something like

"Running CUDA applications with Open MPI"
(or similar)

(Same for a number of other headers; I'm not pointing each of them out.)


Which MPI APIs work with CUDA-aware UCX?
----------------------------------------
UCX and UCC support CUDA-aware blocking reduction collective APIs:
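
(A minimal sketch of what such a call looks like, assuming an Open MPI build with CUDA-aware UCX; the single-element reduction and variable names are illustrative, not taken from the docs under review.)

```c
/* Minimal sketch: a blocking CUDA-aware MPI_Allreduce on a device
 * buffer. Assumes a CUDA-aware Open MPI build (e.g. with UCX);
 * error checks are omitted for brevity. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double *d_val;
    double one = 1.0;
    cudaMalloc((void **)&d_val, sizeof(double));
    cudaMemcpy(d_val, &one, sizeof(double), cudaMemcpyHostToDevice);

    /* The device pointer is passed directly to MPI; a CUDA-aware
     * transport moves the data without explicit host staging. */
    MPI_Allreduce(MPI_IN_PLACE, d_val, 1, MPI_DOUBLE, MPI_SUM,
                  MPI_COMM_WORLD);

    cudaFree(d_val);
    MPI_Finalize();
    return 0;
}
```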
Member:


I thought (actually I am pretty sure) that UCC also supports non-blocking reductions for GPU buffers.


Which MPI APIs do NOT work with CUDA-aware UCX?
-----------------------------------------------
However, the following APIs do not support GPU buffers:
Member:


Is this really correct? At least with recent UCX versions, I think some/most of these functions should work.

wenduwan (Contributor, Author):


@bosilca @janjust Do you know if UCC currently supports non-blocking collectives with CUDA buffers? I tried single-node and, from what I saw, it worked.

Also, could you help double-check the statement about UCX CUDA support status?

Contributor:


UCC supports both blocking and non-blocking collectives with CUDA buffers.
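
(If that claim holds, a non-blocking variant would look like the sketch below. MPI_Iallreduce itself is standard MPI; whether the device buffer is accepted depends on the CUDA-aware build, as discussed here.)

```c
/* Minimal sketch: a non-blocking CUDA-aware MPI_Iallreduce on a
 * device buffer, assuming the UCC support described above.
 * Error checks omitted for brevity. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double *d_val;
    double one = 1.0;
    cudaMalloc((void **)&d_val, sizeof(double));
    cudaMemcpy(d_val, &one, sizeof(double), cudaMemcpyHostToDevice);

    MPI_Request req;
    MPI_Iallreduce(MPI_IN_PLACE, d_val, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* ... overlap other work here ... */

    /* The device buffer must not be touched until the request completes. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    cudaFree(d_val);
    MPI_Finalize();
    return 0;
}
```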

Contributor:


I think UCX supports CUDA buffers for most of these; I'm not sure about cuda-ipc, but osc ucx does not, I think. I have to double-check, but I'm pretty sure we have issues in accumulate.
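
(For anyone hitting the accumulate issue mentioned here, a hypothetical workaround is to stage the origin data through host memory first; accumulate_from_device below is an illustrative helper, not an Open MPI API.)

```c
/* Hypothetical workaround sketch: stage device data through a host
 * buffer before MPI_Accumulate, in case osc/ucx rejects CUDA
 * pointers for one-sided operations (per the discussion above). */
#include <stdlib.h>
#include <mpi.h>
#include <cuda_runtime.h>

void accumulate_from_device(const double *d_origin, int count,
                            int target_rank, MPI_Aint target_disp,
                            MPI_Win win)
{
    /* Copy the device data to a temporary host buffer. */
    double *h_stage = malloc(count * sizeof(double));
    cudaMemcpy(h_stage, d_origin, count * sizeof(double),
               cudaMemcpyDeviceToHost);

    /* Accumulate from the host buffer, which osc/ucx handles. */
    MPI_Win_lock(MPI_LOCK_SHARED, target_rank, 0, win);
    MPI_Accumulate(h_stage, count, MPI_DOUBLE, target_rank,
                   target_disp, count, MPI_DOUBLE, MPI_SUM, win);
    MPI_Win_unlock(target_rank, win);

    free(h_stage);
}
```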

wenduwan (Contributor, Author):


Thanks Tommy! I will update the UCC part first.


OFI support for CUDA
Member:


Might want to add something about which versions of libfabric to use. Maybe say that the latest version of libfabric is recommended, or at least something newer than 1.9?
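
(A small sketch of how an application could check the libfabric version at runtime, following the "newer than 1.9" suggestion above; fi_version() and the FI_* macros are part of the public libfabric API in <rdma/fabric.h>.)

```c
/* Minimal sketch: report the libfabric version and warn if it is
 * older than 1.9, per the suggestion above. */
#include <stdio.h>
#include <rdma/fabric.h>

int main(void)
{
    uint32_t ver = fi_version();
    printf("libfabric %u.%u\n", FI_MAJOR(ver), FI_MINOR(ver));

    if (FI_VERSION_LT(ver, FI_VERSION(1, 9)))
        printf("warning: libfabric older than 1.9; "
               "CUDA support may be limited\n");
    return 0;
}
```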

wenduwan (Contributor, Author):

Sorry, I won't have time to work on this. Closing...

wenduwan closed this on Sep 10, 2024.