Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] rmm unconditionally imports both torch and cupy if they are in the environment #1211

Closed
wence- opened this issue Feb 16, 2023 · 1 comment · Fixed by #1221
Closed
Assignees
Labels
bug Something isn't working Python Related to RMM Python API

Comments

@wence-
Copy link
Contributor

wence- commented Feb 16, 2023

Describe the bug

As investigation in rapidsai/cudf#627, it turns out that RMM always imports cupy and torch, even if the user is never going to use them. This is expensive in the sense that it makes "import rmm" take more than half a second (even though the package itself is very small).

It would be nice to load those packages lazily/optionally.

Happy to take this on.

@wence- wence- added ? - Needs Triage Need team to review and classify bug Something isn't working labels Feb 16, 2023
@harrism
Copy link
Member

harrism commented Feb 22, 2023

Great find. That would be great, thanks!

@harrism harrism added Python Related to RMM Python API and removed ? - Needs Triage Need team to review and classify labels Feb 22, 2023
rapids-bot bot pushed a commit that referenced this issue Feb 27, 2023
…1221)

RMM provides callbacks to configure third-party libraries to use RMM
for memory allocation.

Previously, these were defined in the top-level package, but that
requires (potentially expensive) import of the package we're providing
a hook for, since typically we must import that package to define the
callback. This makes importing RMM expensive. To avoid this, move the
callbacks into (not imported by default) sub-modules in
`rmm.allocators`. So, if we want to configure the CuPy allocator, we
now import `rmm_cupy_allocator` from `rmm.allocators.cupy`, and don't
pay the price of importing pytorch.

This change **deprecates** the use of the allocator callbacks in the
top-level `rmm` module in favour of explicit imports from the relevant
`rmm.allocators.XXX` sub-module.

Before these changes, a sampling trace of `import rmm` with
pyinstrument shows:
    
    $ pyinstrument -i 0.01 importrmm.py

      _     ._   __/__   _ _  _  _ _/_   Recorded: 10:19:56  Samples:  67
     /_//_/// /_\ / //_// / //_'/ //     Duration: 0.839     CPU time: 0.837
    /   _/                      v4.4.0

    Program: importrmm.py

    0.839 <module>  importrmm.py:1
    └─ 0.839 <module>  rmm/__init__.py:1
       ├─ 0.315 <module>  rmm/allocators/torch.py:1
       │  └─ 0.315 <module>  torch/__init__.py:1
       │        [96 frames hidden]  torch, <built-in>, enum, inspect, tok...
       ├─ 0.297 <module>  rmm/mr.py:1
       │  └─ 0.297 <module>  rmm/_lib/__init__.py:1
       │     ├─ 0.216 <module>  numba/__init__.py:1
       │     │     [140 frames hidden]  numba, abc, <built-in>, importlib, em...
       │     ├─ 0.040 <module>  numba/cuda/__init__.py:1
       │     │     [34 frames hidden]  numba, asyncio, ssl, <built-in>, re, ...
       │     ├─ 0.030 __new__  enum.py:180
       │     │     [5 frames hidden]  enum, <built-in>
       │     └─ 0.011 [self]  None
       └─ 0.227 <module>  rmm/allocators/cupy.py:1
          └─ 0.227 <module>  cupy/__init__.py:1
                [123 frames hidden]  cupy, pytest, _pytest, attr, <built-i...

That is, almost a full second to import things, most of which is spent
importing pytorch and cupy. These modules are not needed in normal
usage of RMM, so we can defer the imports. Numba is a little bit
trickier, but we can also defer up-front imports, with a final result
that after these changes the same `import rmm` call takes just a tenth
of a second:

    $ pyinstrument -i 0.01 importrmm.py

      _     ._   __/__   _ _  _  _ _/_   Recorded: 10:37:40  Samples:  9
     /_//_/// /_\ / //_// / //_'/ //     Duration: 0.099     CPU time: 0.099
    /   _/                      v4.4.0

    Program: importrmm.py

    0.099 <module>  importrmm.py:1
    └─ 0.099 <module>  rmm/__init__.py:1
       └─ 0.099 <module>  rmm/mr.py:1
          └─ 0.099 <module>  rmm/_lib/__init__.py:1
             ├─ 0.059 <module>  numpy/__init__.py:1
             │     [31 frames hidden]  numpy, re, sre_compile, <built-in>, s...
             ├─ 0.020 __new__  enum.py:180
             │     [2 frames hidden]  enum
             ├─ 0.010 <module>  ctypes/__init__.py:1
             │     [3 frames hidden]  ctypes, <built-in>
             └─ 0.010 _EnumDict.__setitem__  enum.py:89
                   [3 frames hidden]  enum

Closes #1211.

Authors:
  - Lawrence Mitchell (https://github.com/wence-)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Bradley Dice (https://github.com/bdice)

URL: #1221
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Python Related to RMM Python API
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants