
Python bindings for cuda_async_memory_resource #718

Merged
merged 21 commits into from
Mar 3, 2021

Conversation

shwina
Contributor

@shwina shwina commented Mar 1, 2021

Closes #701.

@shwina shwina requested a review from a team as a code owner March 1, 2021 21:52
@github-actions github-actions bot added the Python Related to RMM Python API label Mar 1, 2021
cdef class CudaAsyncMemoryResource(DeviceMemoryResource):
    def __cinit__(self, device=None):
        self.c_obj.reset(
            new cuda_async_memory_resource()
Contributor Author

@shwina shwina Mar 1, 2021

What should failure look like here?

  • Should we just let the C++ error propagate up and expose that directly?
  • Do we want to wrap this call in a try..except and re-raise with more information?
  • Do we want to call driverGetVersion() and duplicate the check for 11.2 in C++ and Python?
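The second option above could look something like the following. This is only a sketch: the wrapper name is hypothetical, and it assumes the propagated C++ error surfaces as a RuntimeError, which is an assumption rather than RMM's actual behavior.

```python
# Sketch of option 2: let the C++ error propagate, but catch it at the
# Python layer and re-raise with more context. The error type
# (RuntimeError) and the function name are illustrative assumptions.

def construct_with_context(ctor):
    """Call `ctor` and re-raise any failure with a CUDA version hint."""
    try:
        return ctor()
    except RuntimeError as err:
        raise RuntimeError(
            "Failed to create cuda_async_memory_resource; "
            "cudaMallocAsync requires CUDA 11.2 or newer. "
            f"Original error: {err}"
        ) from err
```

On success the constructor's result is returned unchanged, so the wrapper is transparent in the common case.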

Contributor

What does the C++ error look like if someone tries to create this on CUDA 11.0?


Contributor

We may want to improve the message here: https://github.com/rapidsai/rmm/blob/branch-0.19/include/rmm/mr/device/cuda_async_memory_resource.hpp#L53 to say that it was compiled without support instead of just the generic error message

Contributor Author

I think this part of the macro deals specifically with CUDA version < 11.2 -- @harrism any thoughts here on a possibly more informative error message? This will directly be propagated up to Python users.

Member

"cudaMallocAsync not supported by the version of the CUDA Toolkit used for compilation"? I don't want to say "... used to compile RMM" since RMM is header-only.

Contributor Author

Improved the error message based on your suggestion.

Member

You changed the wrong error message. :)

Co-authored-by: Keith Kraus <kkraus@nvidia.com>
@kkraus14 kkraus14 added feature request New feature or request non-breaking Non-breaking change labels Mar 1, 2021
Member

@jakirkham jakirkham left a comment

Thanks Ashwin! 😄 Had a couple of questions below 🙂

python/rmm/_lib/memory_resource.pyx (outdated, resolved)
python/setup.py (outdated, resolved)
@github-actions github-actions bot added the cpp Pertains to C++ code label Mar 2, 2021
@shwina shwina requested a review from a team as a code owner March 2, 2021 21:14
@shwina shwina requested review from rongou and cwharris March 2, 2021 21:14
Member

@jakirkham jakirkham left a comment

LGTM. Thanks Ashwin! 😄



@pytest.mark.skipif(
    rmm._cuda.gpu.runtimeGetVersion() < 11020,
Contributor

I think technically we need to check both the runtime and driver version here. Someone could use a newer runtime with an older driver, for example, where the call would exist but would error at runtime.

Contributor Author

I'm a bit confused but happy to make the change.

Contributor

cudaMallocAsync depends on having both libcudart >= 11.2 and libcuda >= 11.2. If say you have libcudart == 11.2 and libcuda == 11.0, then https://github.com/rapidsai/rmm/blob/branch-0.19/include/rmm/mr/device/cuda_async_memory_resource.hpp#L49 would error at runtime that there isn't a new enough driver for the feature. If you had libcudart == 11.0 and libcuda == 11.2, then https://github.com/rapidsai/rmm/blob/branch-0.19/include/rmm/mr/device/cuda_async_memory_resource.hpp#L49 would error with an invalid DeviceAttribute since it doesn't exist in libcudart 11.0.
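The two failure modes described above can be folded into one skip condition for the test. A minimal sketch, assuming CUDA's usual version encoding of 1000 * major + 10 * minor (so 11.2 is 11020); the helper name is hypothetical and not part of rmm:

```python
# Hypothetical helper for the test's skipif condition: cudaMallocAsync
# is only usable when BOTH the runtime (libcudart) and the driver
# (libcuda) report version >= 11.2.

CUDA_11_2 = 11020  # 1000 * major + 10 * minor

def cuda_async_supported(runtime_version: int, driver_version: int) -> bool:
    return runtime_version >= CUDA_11_2 and driver_version >= CUDA_11_2
```

With this, a runtime of 11.2 paired with an 11.0 driver (or vice versa) is reported as unsupported, covering both mismatch directions described above.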

include/rmm/mr/device/cuda_async_memory_resource.hpp (outdated, resolved)
include/rmm/mr/device/cuda_async_memory_resource.hpp (outdated, resolved)
cdef class CudaAsyncMemoryResource(DeviceMemoryResource):
    def __cinit__(self, device=None):
        self.c_obj.reset(
            new cuda_async_memory_resource()
Member

You changed the wrong error message. :)

@kkraus14
Contributor

kkraus14 commented Mar 3, 2021

rerun tests

@kkraus14
Contributor

kkraus14 commented Mar 3, 2021

@gpucibot merge

@mike-wendt
Contributor

rerun tests

Comment on lines +80 to +97
for pxd_basename in files_to_preprocess:
    pxi_basename = os.path.splitext(pxd_basename)[0] + ".pxi"
    if CUDA_VERSION in cuda_version_to_pxi_dir:
        pxi_pathname = os.path.join(
            cwd,
            "rmm/_cuda",
            cuda_version_to_pxi_dir[CUDA_VERSION],
            pxi_basename,
        )
        pxd_pathname = os.path.join(cwd, "rmm/_cuda", pxd_basename)
        try:
            if filecmp.cmp(pxi_pathname, pxd_pathname):
                # files are the same, no need to copy
                continue
        except FileNotFoundError:
            # pxd_pathname doesn't exist yet
            pass
        shutil.copyfile(pxi_pathname, pxd_pathname)
Contributor

Can we move the cuda version check outside of the loop and invert it to reduce nesting?

if CUDA_VERSION not in cuda_version_to_pxi_dir:
    raise TypeError(f"{CUDA_VERSION} is not supported.")

Contributor

@cwharris cwharris Mar 3, 2021

That would mean we always check, regardless of how many files we have to preprocess, so that might need to be accounted for. For example: if len(files_to_preprocess) and CUDA_VERSION not in cuda_version_to_pxi_dir
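Combining the two suggestions could look like the sketch below. The helper name is hypothetical; the parameters correspond to CUDA_VERSION, cuda_version_to_pxi_dir, and files_to_preprocess from the snippet above.

```python
def validate_cuda_version(cuda_version, cuda_version_to_pxi_dir,
                          files_to_preprocess):
    """Raise early for an unsupported CUDA version, but only when
    there is actually preprocessing work to do (per the point above
    about an empty files_to_preprocess list)."""
    if files_to_preprocess and cuda_version not in cuda_version_to_pxi_dir:
        raise TypeError(f"{cuda_version} is not supported.")
```

The check then runs once before the copy loop, which can drop its inner version test and one level of nesting.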

Contributor

Agreed that this is low hanging fruit to fix and we may as well tackle it now.

Contributor

Whoops, this merged before fixing this. Will raise an issue to tackle it in a follow-up.

Contributor Author

Ah sorry. I'll put in one tomorrow.

@kkraus14
Contributor

kkraus14 commented Mar 3, 2021

rerun tests

@kkraus14
Contributor

kkraus14 commented Mar 3, 2021

rerun tests

@rapids-bot rapids-bot bot merged commit 3b4a555 into rapidsai:branch-0.19 Mar 3, 2021
Labels
cpp Pertains to C++ code feature request New feature or request non-breaking Non-breaking change Python Related to RMM Python API
Development

Successfully merging this pull request may close these issues.

[FEA] Add Python bindings for cuda_async_memory_resource
6 participants