Really need a complete guide to build gdrcopy in the official nvidia docker container #238

Open
goooxu opened this issue Oct 22, 2022 · 2 comments

@goooxu

goooxu commented Oct 22, 2022

If you just follow the instructions in README.md, you'll find that a lot of things don't work. I tried for hours without success.

My Docker image is nvidia/cuda:11.8.0-devel-ubuntu22.04.

@goooxu
Author

goooxu commented Oct 23, 2022

After a whole night of trying, I finally have a working set of steps to share:

  • Starting a gdrcopy-capable container
KERNEL_VERSION=$(uname -r)
docker run --gpus=all -it \
  -v /lib/modules/${KERNEL_VERSION}:/lib/modules/${KERNEL_VERSION} \
  -v /usr/src/linux-headers-${KERNEL_VERSION%-*}:/usr/src/linux-headers-${KERNEL_VERSION%-*} \
  -v /usr/src/linux-headers-${KERNEL_VERSION}:/usr/src/linux-headers-${KERNEL_VERSION} \
  --privileged nvidia/cuda:11.8.0-devel-ubuntu22.04
  • Compile, install, and load the kernel module inside the container
apt update --fix-missing
apt install git check pkg-config nvidia-dkms-<your-nvidia-driver-version>
git clone https://github.com/NVIDIA/gdrcopy
cd gdrcopy
make prefix=/ all install
./insmod.sh
  • Check that it works
sanity
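
For anyone unsure what the ${KERNEL_VERSION%-*} expansion in the docker run command does, here is a small sketch. The helper name header_mounts and the kernel string 5.15.0-52-generic are only illustrative assumptions; substitute the output of uname -r on your host. On Ubuntu the flavored headers directory (for example, ending in -generic) typically links into the unflavored one, which is presumably why both are mounted.

```shell
#!/bin/sh
# Sketch only: header_mounts is a hypothetical helper, not part of gdrcopy or Docker.
# It prints the three -v bind mounts used in the docker run command above
# for a given kernel release string (default: the running kernel).
header_mounts() {
    kver="${1:-$(uname -r)}"
    base="${kver%-*}"   # strip the shortest trailing "-suffix", e.g. "-generic"
    for dir in "/lib/modules/$kver" \
               "/usr/src/linux-headers-$base" \
               "/usr/src/linux-headers-$kver"; do
        printf '%s %s:%s\n' -v "$dir" "$dir"
    done
}

# Example with an assumed Ubuntu-style kernel release string.
header_mounts 5.15.0-52-generic
```

If one of the printed directories does not exist on the host, the matching headers package is probably not installed there, and the module build inside the container is likely to fail.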

@pakmarkthub
Collaborator

Hi @goooxu,

We don't have official documentation for installing GDRCopy in Docker, but I can provide some guidelines here.
GDRCopy consists of two main components: 1) the gdrdrv driver and 2) libgdrapi. The driver needs to be installed on the host (outside the container). After installation, you should see /dev/gdrdrv on the host. Mount /dev/gdrdrv into your container, then check inside the container that /dev/gdrdrv is present and refers to /dev/gdrdrv on the host.
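
As a rough sketch of that flow (not an official procedure): one common way to expose a host device node to a container is Docker's --device flag, which is assumed here. The helper names check_gdrdrv and print_run_cmd are hypothetical, and the default image is simply the one mentioned earlier in this thread.

```shell
#!/bin/sh
# Sketch only: check_gdrdrv and print_run_cmd are hypothetical helper names.

# Succeeds if the given path (default /dev/gdrdrv) exists as a character device.
# Run it on the host after installing gdrdrv, and again inside the container.
check_gdrdrv() {
    dev="${1:-/dev/gdrdrv}"
    if [ -c "$dev" ]; then
        echo "found $dev"
    else
        echo "missing $dev" >&2
        return 1
    fi
}

# Prints (does not run) a docker command that passes the host device through.
# Assumption: --device is used instead of --privileged to expose /dev/gdrdrv.
print_run_cmd() {
    image="${1:-nvidia/cuda:11.8.0-devel-ubuntu22.04}"
    echo "docker run --gpus=all --device=/dev/gdrdrv -it $image"
}
```

On the host, `check_gdrdrv && print_run_cmd` prints a command to review before running it. Inside the container, calling `check_gdrdrv` again confirms that the device node is visible.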

For libgdrapi, you only need to install it in your container; there is no need to install it on the host.

The steps you provided work as well. You should be able to install gdrdrv from inside the container because you run it in privileged mode. However, you may want to avoid doing this in shared environments.
