--with-cuda fails to find libcuda.so #12509

Closed
BKitor opened this issue May 1, 2024 · 4 comments

Comments

@BKitor
Contributor

BKitor commented May 1, 2024

Open MPI can fail to find libcuda.so, in which case it builds without the opal accelerator cuda component even though --with-cuda/--with-cuda-libdir is specified.
This was already reported as a bug in #12264 and fixed in #12382, but the bug persists.
I noticed it with a v5.0.3 tarball, and have been able to reproduce it on master.

Details of the problem

user@bigtwin1d:~/bkitor/bk_share/ompi_builds/ompi[master]$ ./build/bin/ompi_info | grep 'Configure command'
  Configure command line: '--prefix=/home/user/bkitor/bk_share/ompi_builds/ompi/build' '--with-cuda=/usr/local/cuda' '--with-ofi=/usr/local'
user@bigtwin1d:~/bkitor/bk_share/ompi_builds/ompi[master]$ ./build/bin/ompi_info | grep 'MCA accelerator'
         MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.1.0)
user@bigtwin1d:~/bkitor/bk_share/ompi_builds/ompi[master]$ ./build/bin/ompi_info | grep 'Configure command'
  Configure command line: '--prefix=/home/user/bkitor/bk_share/ompi_builds/ompi/build' '--with-cuda=/usr/local/cuda' '--with-cuda-libdir=/usr/local/cuda' '--with-ofi=/usr/local'
user@bigtwin1d:~/bkitor/bk_share/ompi_builds/ompi[master]$ ./build/bin/ompi_info | grep 'MCA accelerator'
         MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.1.0)

The crux of the issue is that /usr/local/cuda is a symlink, and the find command in opal_check_cuda.m4 won't follow a symlinked starting directory by default, so it never descends into the CUDA tree.
Adding the -H flag to that find call should fix the issue.
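
For anyone who wants to see the symlink behaviour in isolation, here is a minimal sketch. The directory layout (a temporary `cuda-12.4` tree with a `cuda` symlink pointing at it) and the `-name 'libcuda.so*'` pattern are assumptions made up for illustration. They are not taken from opal_check_cuda.m4, whose exact find invocation is not quoted in this issue.

```shell
# Hypothetical layout standing in for a CUDA install; these paths are
# illustrative only and do not come from the issue or from configure.
tmp=$(mktemp -d)
mkdir -p "$tmp/cuda-12.4/lib64/stubs"
touch "$tmp/cuda-12.4/lib64/stubs/libcuda.so"
ln -s "$tmp/cuda-12.4" "$tmp/cuda"   # plays the role of /usr/local/cuda

# Default behaviour: a symlink given as the starting point is examined
# as a link and not followed, so nothing below it is searched.
without_H=$(find "$tmp/cuda" -name 'libcuda.so*')

# -H: follow symlinks named on the command line (but not symlinks
# encountered later during the traversal).
with_H=$(find -H "$tmp/cuda" -name 'libcuda.so*')

echo "without -H: ${without_H:-<nothing found>}"
echo "with -H:    ${with_H:-<nothing found>}"

rm -rf "$tmp"
```

In this setup the first search finds nothing, while the second reports the stub library under the symlinked path, which matches the symptom described above.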

@jsquyres
Member

jsquyres commented May 2, 2024

This is a different issue from the one reported in #12264, but it produces the same user-visible symptom: the CUDA libraries are not found.

Thanks for reporting, and thanks for submitting the PR!

@janjust
Contributor

janjust commented May 2, 2024

Fixed with #12510.

@janjust janjust closed this as completed May 2, 2024
@jsquyres
Member

jsquyres commented May 2, 2024

@janjust Do you want to wait until this is fixed in v5.0.x?

@janjust
Contributor

janjust commented May 2, 2024

I already have the cherry-pick up: #12513
