Ensure UpstreamResourceAdaptor is not cleared by the Python GC (#1170)
Closes #1169.

Essentially, we are running into the situation described in https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#disabling-cycle-breaking-tp-clear with `UpstreamResourceAdaptor`.

The solution is to prevent clearing of `UpstreamResourceAdaptor` objects by decorating them with `no_gc_clear`.
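The failure mode can be pictured with a toy model in plain Python. This is only a sketch: `ToyUpstream`, `ToyAdaptor`, `clear`, `dealloc`, and `collect` are hypothetical stand-ins for the C-level `tp_clear` and `tp_dealloc` slots, not RMM or Cython APIs, and an `AttributeError` stands in for the segfault.

```python
class ToyUpstream:
    """Hypothetical stand-in for the upstream memory resource."""

    def __init__(self):
        self.released = []

    def deallocate(self, label):
        self.released.append(label)


class ToyAdaptor:
    """Hypothetical stand-in for an adaptor that needs its upstream at teardown."""

    def __init__(self, upstream, no_gc_clear):
        self.upstream = upstream
        self.no_gc_clear = no_gc_clear

    def clear(self):
        # Models tp_clear: the GC drops references held by the object,
        # unless clearing has been disabled for this type.
        if not self.no_gc_clear:
            self.upstream = None

    def dealloc(self):
        # Models tp_dealloc: teardown still needs the upstream resource.
        self.upstream.deallocate("pool")


def collect(adaptor):
    # Models the order in which the cyclic GC acts: clear first, dealloc later.
    adaptor.clear()
    try:
        adaptor.dealloc()
    except AttributeError:
        return "crash"
    return "ok"


upstream = ToyUpstream()
cleared_result = collect(ToyAdaptor(upstream, no_gc_clear=False))
protected_result = collect(ToyAdaptor(upstream, no_gc_clear=True))
```

In this model only the adaptor that opts out of clearing still has its upstream when teardown runs.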

Cython calls out the following:

> If you use no_gc_clear, it is important that any given reference cycle contains at least one object without no_gc_clear. Otherwise, the cycle cannot be broken, which is a memory leak.

The other object in RMM that we mark `@no_gc_clear` is `DeviceBuffer`, and a `DeviceBuffer` can keep a reference to an `UpstreamResourceAdaptor`. However, an `UpstreamResourceAdaptor` cannot keep a reference to a `DeviceBuffer`, so, as far as I can tell, instances of the two cannot form a reference cycle.
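For background on why a cycle matters at all: the tests in this change build one by putting the object in a list that contains itself. A minimal pure-Python sketch of that pattern follows, assuming CPython; the `Resource` class and `make_cycle` helper are hypothetical and do not involve RMM. It shows that an object caught in such a cycle outlives `del` and is only reclaimed once the cyclic garbage collector runs.

```python
import gc
import weakref


class Resource:
    """Hypothetical placeholder for an object held in a reference cycle."""


def make_cycle():
    res = Resource()
    cycle = [res]
    cycle.append(cycle)  # the list now references itself
    # Return only a weak reference, so nothing outside the cycle keeps `res` alive.
    return weakref.ref(res)


was_enabled = gc.isenabled()
gc.disable()  # keep automatic collection from interfering with the demo
try:
    ref = make_cycle()
    # Reference counting alone cannot free the self-referencing list.
    alive_before_collect = ref() is not None
    gc.collect()
    # The cyclic collector has now broken the cycle and freed the object.
    alive_after_collect = ref() is not None
finally:
    if was_enabled:
        gc.enable()
```

For extension types, the collector breaks such cycles by calling `tp_clear`, which is the step `no_gc_clear` disables.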

Authors:
  - Ashwin Srinath (https://github.com/shwina)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Mark Harris (https://github.com/harrism)

URL: #1170
shwina committed Dec 19, 2022
1 parent a53ce95 commit 741a1df
Showing 2 changed files with 26 additions and 6 deletions.
3 changes: 3 additions & 0 deletions python/rmm/_lib/memory_resource.pyx
@@ -16,6 +16,7 @@ import os
 import warnings
 from collections import defaultdict

+cimport cython
 from cython.operator cimport dereference as deref
 from libc.stdint cimport int8_t, int64_t, uintptr_t
 from libcpp cimport bool
@@ -228,6 +229,8 @@ cdef class DeviceMemoryResource:
         self.c_obj.get().deallocate(<void*>(ptr), nbytes)


+# See the note about `no_gc_clear` in `device_buffer.pyx`.
+@cython.no_gc_clear
 cdef class UpstreamResourceAdaptor(DeviceMemoryResource):

     def __cinit__(self, DeviceMemoryResource upstream_mr, *args, **kwargs):
29 changes: 23 additions & 6 deletions python/rmm/tests/test_rmm.py
@@ -725,6 +725,13 @@ def callback(nbytes: int) -> bool:


 def test_dev_buf_circle_ref_dealloc():
+    # This test creates a reference cycle containing a `DeviceBuffer`
+    # and ensures that the garbage collector does not clear it, i.e.,
+    # that the GC does not remove all references to other Python
+    # objects from it. The `DeviceBuffer` needs to keep its reference
+    # to the `DeviceMemoryResource` that was used to create it in
+    # order to be cleaned up properly. See GH #931.
+
     rmm.mr.set_current_device_resource(rmm.mr.CudaMemoryResource())

     dbuf1 = rmm.DeviceBuffer(size=1_000_000)
@@ -734,17 +741,27 @@ def test_dev_buf_circle_ref_dealloc():
     l1.append(l1)

     # due to the reference cycle, the device buffer doesn't actually get
-    # cleaned up until later, when we invoke `gc.collect()`:
+    # cleaned up until after `gc.collect()` is called.
     del dbuf1, l1

     rmm.mr.set_current_device_resource(rmm.mr.CudaMemoryResource())

-    # by now, the only remaining reference to the *original* memory
-    # resource should be in `dbuf1`. However, the cyclic garbage collector
-    # will eliminate that reference when it clears the object via its
-    # `tp_clear` method. Later, when `tp_dealloc` attempts to actually
-    # deallocate `dbuf1` (which needs the MR alive), a segfault occurs.
+    # test that after the call to `gc.collect()`, the `DeviceBuffer`
+    # is deallocated successfully (i.e., without a segfault).
     gc.collect()


+def test_upstream_mr_circle_ref_dealloc():
+    # This test is just like the one above, except it tests that
+    # instances of `UpstreamResourceAdaptor` (such as
+    # `PoolMemoryResource`) are not cleared by the GC.
+
+    rmm.mr.set_current_device_resource(rmm.mr.CudaMemoryResource())
+    mr = rmm.mr.PoolMemoryResource(rmm.mr.get_current_device_resource())
+    l1 = [mr]
+    l1.append(l1)
+    del mr, l1
+    rmm.mr.set_current_device_resource(rmm.mr.CudaMemoryResource())
+    gc.collect()