[RELEASE] rmm v23.04 #1240

Merged
merged 30 commits into main from branch-23.04
Apr 12, 2023
GPUtester
Contributor

❄️ Code freeze for branch-23.04 and v23.04 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-23.04 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-23.04 into main for the release

raydouglass and others added 30 commits January 23, 2023 10:34
Forward-merge branch-23.02 to branch-23.04
Forward-merge branch-23.02 to branch-23.04
Forward-merge branch-23.02 to branch-23.04
Forward-merge branch-23.02 to branch-23.04
Moves date information from the version to the build string. This will help with installing PR artifacts and nightly builds locally and in downstream CI workflows. cc: @ajschmidt8

Authors:
  - Bradley Dice (https://github.com/bdice)
  - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)

URL: #1195
This PR updates the branch reference used for our shared workflows.

Authors:
  - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)

URL: #1203
This PR adds a less verbose [trap method](https://github.com/rapidsai/cugraph/blob/f2b081075704aabc789603e14ce552eac3fbe692/ci/test.sh#L19) for error handling, to help ensure that we capture all potential error codes in our test scripts. It works by:

- setting an environment variable, EXITCODE, with a default value of 0
- setting a trap statement triggered by ERR signals which will set EXITCODE=1 when any commands return a non-zero exit code
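
The trap itself is a Bash feature, but the bookkeeping it performs is easy to see in isolation. The Python sketch below is only an analogy, not the script from this PR: it runs several commands, keeps going after a failure, and reports a single exit code at the end. The helper name `run_all` and the sample commands are hypothetical.

```python
import subprocess
import sys


def run_all(commands):
    """Run every command; return 1 if any of them failed, else 0."""
    exitcode = 0  # mirrors EXITCODE=0 at the top of the shell script
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # mirrors the ERR trap: remember the failure, keep running
            exitcode = 1
    return exitcode


if __name__ == "__main__":
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    # The middle command fails, but the last one still runs.
    print(run_all([ok, bad, ok]))
```

The point of the pattern is that one failing test command does not hide the results of the commands that follow it, while the overall job still ends with a non-zero status.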


cc @ajschmidt8

Authors:
  - Ajay Thorve (https://github.com/AjayThorve)
  - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1204
Forward-merge branch-23.02 to branch-23.04
Updates to `spdlog>=1.11.0` and `fmt>=9.1.0`. Also resolves some issues with spdlog in the librmm conda packages. Thanks @robertmaynard for helping advise me on this PR.

**We need to test this downstream before merging.** Perhaps with cuML or some other library.

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Keith Kraus (https://github.com/kkraus14)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Mark Harris (https://github.com/harrism)
  - Keith Kraus (https://github.com/kkraus14)
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #1177
PR #1177 was merged a little too early: as soon as CI passed, the existing `/merge` comment and sufficient approvals triggered the merge. This reverts a temporary change to the rapids-cmake repo that is no longer needed because rapidsai/rapids-cmake#368 has been merged.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)

URL: #1209
This PR replaces usage of versioneer with hard-coded version numbers in setup.py and __init__.py. Since rmm needs to manage versions across a wide range of file types (CMake, C++, Sphinx and doxygen docs, etc.), versioneer cannot be relied on as a single source of truth and therefore does not allow us to single-source our versioning to the Git repo as is intended. Additionally, since the primary means of installing rmm is via conda packages (or now, pip packages), information from the package manager tends to be far more informative than the version strings for troubleshooting and debugging purposes. Conversely, the nonstandard version strings that versioneer produces tend to be problematic for other tools, which at best will ignore such versions but at worst will simply fail.

Relies on rapidsai/shared-workflows#38

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Sevag H (https://github.com/sevagh)

URL: #1190
This PR configures the branch workflow to skip the docs job during nightly runs.

Authors:
  - Jake Awe (https://github.com/AyodeAwe)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1215
…1212)

The package name defined in setup.py needs to be modified for wheels to reflect the CUDA version that the wheel was built for. Currently that modification is done via an environment variable that is pulled in setup.py code. This changeset replaces that approach with a direct modification using a script (similar to what is done for versions in #1190) to facilitate moving towards static project metadata specification via pyproject.toml.

This PR depends on rapidsai/shared-workflows#45.
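
A minimal sketch of what such a rewrite could look like, assuming the name is declared as `name="rmm"` in setup.py and that the CUDA suffix has the form `-cu11`. The helper `add_cuda_suffix`, the file layout, and the suffix format are assumptions for illustration; the script actually used in CI is not shown in this description.

```python
import re


def add_cuda_suffix(text, package, suffix):
    """Rewrite name="<package>" to name="<package><suffix>" in file text."""
    # \b keeps the pattern from matching keys such as `package_name`.
    pattern = rf'(\bname\s*=\s*")({re.escape(package)})(")'
    new_text, count = re.subn(
        pattern,
        lambda m: f"{m.group(1)}{m.group(2)}{suffix}{m.group(3)}",
        text,
    )
    if count != 1:
        raise ValueError(f"expected exactly one name entry, found {count}")
    return new_text


if __name__ == "__main__":
    source = 'setup(\n    name="rmm",\n    version="23.04.00",\n)\n'
    print(add_cuda_suffix(source, "rmm", "-cu11"))
```

Failing loudly when the pattern does not match exactly once is a deliberate choice here: a silent no-op would produce a wheel with the wrong name.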

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Sevag H (https://github.com/sevagh)
  - Ashwin Srinath (https://github.com/shwina)

URL: #1212
…1214)

Stop explicitly requesting the "manual" stage when running pre-commit hooks as part of the ci/check_style.sh script.

Authors:
  - Carl Simon Adorf (https://github.com/csadorf)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1214
* Adds `tomli` dependency for Python pre-3.11
* Moves package data handling to `MANIFEST.in`
* Symlinks and packages README
* Specifies build backend & moves to newer `setuptools` backend
* Pulls all metadata into `pyproject.toml`
* Migrates `setup.cfg` content to `pyproject.toml`
* Simplifies `setup.py` to the minimal amount of logic necessary
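
For context on the first bullet: `tomllib` joined the standard library in Python 3.11, and `tomli` is the backport with the same interface for older interpreters. A common pattern, shown here as a generic sketch rather than code taken from rmm, is a conditional import. The sample TOML content is made up for illustration.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Pythons

# Hypothetical metadata, only to show that parsing works the same either way.
SAMPLE = """
[project]
name = "example"
version = "0.1.0"
"""

metadata = tomllib.loads(SAMPLE)
print(metadata["project"]["name"])
```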

Authors:
  - https://github.com/jakirkham
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1151
For pure setuptools-based builds, the `include_package_data` option historically defaulted to False. With newer projects, i.e. with projects containing a pyproject.toml file, it defaults to True. However, for projects using scikit-build this setting must be provided explicitly to the `setup` function (not in the config file) to allow scikit-build to intercept the argument and use it in the preprocessing it does before invoking `setuptools.setup`.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - https://github.com/jakirkham

URL: #1218
There were a couple of issues that accidentally made it through in #1151.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #1226
This feature is enabled by the most recent release of dfg. This change also allows us to disable the dfg CI check, as it is now redundant.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #1217
…1221)

RMM provides callbacks to configure third-party libraries to use RMM
for memory allocation.

Previously, these were defined in the top-level package, but that
requires (potentially expensive) import of the package we're providing
a hook for, since typically we must import that package to define the
callback. This makes importing RMM expensive. To avoid this, move the
callbacks into (not imported by default) sub-modules in
`rmm.allocators`. So, if we want to configure the CuPy allocator, we
now import `rmm_cupy_allocator` from `rmm.allocators.cupy`, and don't
pay the price of importing pytorch.

This change **deprecates** the use of the allocator callbacks in the
top-level `rmm` module in favour of explicit imports from the relevant
`rmm.allocators.XXX` sub-module.
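
One common way to keep old top-level names working while deferring an expensive import is a module-level `__getattr__` (PEP 562). The sketch below builds a tiny in-memory package to show the idea; `demo_pkg`, `fancy_allocator`, and the warning text are hypothetical, and this is not necessarily how rmm implements the deprecation.

```python
import importlib
import sys
import types
import warnings

# Hypothetical in-memory package so the sketch runs without real dependencies.
pkg = types.ModuleType("demo_pkg")
pkg.__path__ = []  # an empty __path__ marks the module as a package
allocators = types.ModuleType("demo_pkg.allocators")
allocators.__path__ = []
fancy = types.ModuleType("demo_pkg.allocators.fancy")
fancy.fancy_allocator = lambda nbytes: f"allocated {nbytes} bytes"
for module in (pkg, allocators, fancy):
    sys.modules[module.__name__] = module

# Old top-level name -> submodule that now owns it.
_MOVED = {"fancy_allocator": "demo_pkg.allocators.fancy"}


def _deprecated_getattr(name):
    """PEP 562 hook: called only when normal attribute lookup on pkg fails."""
    if name in _MOVED:
        warnings.warn(
            f"demo_pkg.{name} is deprecated; import it from {_MOVED[name]}",
            FutureWarning,
            stacklevel=2,
        )
        # In a real package this is where the submodule (and its heavy
        # dependency) would first be imported; here it is registered above.
        return getattr(importlib.import_module(_MOVED[name]), name)
    raise AttributeError(f"module 'demo_pkg' has no attribute {name!r}")


pkg.__getattr__ = _deprecated_getattr

if __name__ == "__main__":
    # Old-style access still works, but emits a FutureWarning.
    print(pkg.fancy_allocator(16))
```

Code that follows the new convention imports directly from the sub-module, in RMM's case `from rmm.allocators.cupy import rmm_cupy_allocator`, and never goes through the deprecated path.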

Before these changes, a sampling trace of `import rmm` with
pyinstrument shows:
    
    $ pyinstrument -i 0.01 importrmm.py

      _     ._   __/__   _ _  _  _ _/_   Recorded: 10:19:56  Samples:  67
     /_//_/// /_\ / //_// / //_'/ //     Duration: 0.839     CPU time: 0.837
    /   _/                      v4.4.0

    Program: importrmm.py

    0.839 <module>  importrmm.py:1
    └─ 0.839 <module>  rmm/__init__.py:1
       ├─ 0.315 <module>  rmm/allocators/torch.py:1
       │  └─ 0.315 <module>  torch/__init__.py:1
       │        [96 frames hidden]  torch, <built-in>, enum, inspect, tok...
       ├─ 0.297 <module>  rmm/mr.py:1
       │  └─ 0.297 <module>  rmm/_lib/__init__.py:1
       │     ├─ 0.216 <module>  numba/__init__.py:1
       │     │     [140 frames hidden]  numba, abc, <built-in>, importlib, em...
       │     ├─ 0.040 <module>  numba/cuda/__init__.py:1
       │     │     [34 frames hidden]  numba, asyncio, ssl, <built-in>, re, ...
       │     ├─ 0.030 __new__  enum.py:180
       │     │     [5 frames hidden]  enum, <built-in>
       │     └─ 0.011 [self]  None
       └─ 0.227 <module>  rmm/allocators/cupy.py:1
          └─ 0.227 <module>  cupy/__init__.py:1
                [123 frames hidden]  cupy, pytest, _pytest, attr, <built-i...

That is, almost a full second to import things, most of which is spent
importing pytorch and cupy. These modules are not needed in normal
usage of RMM, so we can defer the imports. Numba is a little bit
trickier, but we can also defer up-front imports, with a final result
that after these changes the same `import rmm` call takes just a tenth
of a second:

    $ pyinstrument -i 0.01 importrmm.py

      _     ._   __/__   _ _  _  _ _/_   Recorded: 10:37:40  Samples:  9
     /_//_/// /_\ / //_// / //_'/ //     Duration: 0.099     CPU time: 0.099
    /   _/                      v4.4.0

    Program: importrmm.py

    0.099 <module>  importrmm.py:1
    └─ 0.099 <module>  rmm/__init__.py:1
       └─ 0.099 <module>  rmm/mr.py:1
          └─ 0.099 <module>  rmm/_lib/__init__.py:1
             ├─ 0.059 <module>  numpy/__init__.py:1
             │     [31 frames hidden]  numpy, re, sre_compile, <built-in>, s...
             ├─ 0.020 __new__  enum.py:180
             │     [2 frames hidden]  enum
             ├─ 0.010 <module>  ctypes/__init__.py:1
             │     [3 frames hidden]  ctypes, <built-in>
             └─ 0.010 _EnumDict.__setitem__  enum.py:89
                   [3 frames hidden]  enum
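
For a quick check without a profiler, a wall-clock measurement of a first-time import gives a comparable number. This is a minimal sketch: the helper `time_import` is hypothetical, and it times a standard-library module only so that it runs anywhere. Substitute `rmm` for the module name in an environment where RMM is installed.

```python
import importlib
import sys
import time


def time_import(module_name):
    """Return (seconds, was_cached) for importing module_name."""
    # A module already in sys.modules is returned from the cache, so the
    # measurement is only meaningful for the first import in a process.
    was_cached = module_name in sys.modules
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    return elapsed, was_cached


if __name__ == "__main__":
    seconds, cached = time_import("json")
    print(f"import json: {seconds:.4f}s (cached: {cached})")
```

CPython's built-in `python -X importtime -c "import rmm"` is another option when a per-module breakdown is wanted.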

Closes #1211.

Authors:
  - Lawrence Mitchell (https://github.com/wence-)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Bradley Dice (https://github.com/bdice)

URL: #1221
This is a small cleanup PR to remove a pickle compatibility layer for Python < 3.8.

Authors:
  - Bradley Dice (https://github.com/bdice)
  - https://github.com/jakirkham

Approvers:
  - https://github.com/jakirkham

URL: #1224
We need to update the version regex in update-version.sh for the pyproject.toml changes in #1151.
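
To illustrate the kind of pattern involved, here is a Python sketch. It assumes the version appears in `pyproject.toml` as a line of the form `version = "23.04.00"`; that layout and the helper name `bump_version` are assumptions, not the actual contents of update-version.sh.

```python
import re

# Matches a start-of-line `version = "X.Y.Z"` entry, capturing the quotes
# and key so that only the value between them is replaced.
VERSION_LINE = re.compile(r'^(version\s*=\s*")[^"]*(")', re.MULTILINE)


def bump_version(toml_text, new_version):
    """Replace the first `version = "..."` line with new_version."""
    return VERSION_LINE.sub(
        lambda m: f"{m.group(1)}{new_version}{m.group(2)}",
        toml_text,
        count=1,
    )


if __name__ == "__main__":
    text = '[project]\nname = "rmm"\nversion = "23.02.00"\n'
    print(bump_version(text, "23.04.00"))
```

Anchoring the pattern at the start of a line keeps it from touching other keys, such as `requires-python`, that merely contain a version-like string.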

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ray Douglass (https://github.com/raydouglass)

URL: #1227
…1100)

The `DeviceBuffer.__cinit__` method syncs the default stream to ensure that access to the `.ptr` and `.size` attributes of the underlying `rmm::device_buffer` is safe.

However, the staticmethod `DeviceBuffer.c_from_unique_ptr` does not. This PR adds a stream sync to it.

Authors:
  - Ashwin Srinath (https://github.com/shwina)
  - https://github.com/jakirkham

Approvers:
  - Mark Harris (https://github.com/harrism)
  - https://github.com/jakirkham

URL: #1100
This PR updates builds to use GCC 11.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Mark Harris (https://github.com/harrism)

URL: #1228
…#1230)

This PR ensures that the `AWS_SESSION_TOKEN` and `SCCACHE_S3_USE_SSL` environment variables are passed to our conda build process.

`AWS_SESSION_TOKEN` is necessary in order to support using temporary credentials via AWS STS (we recently adopted this method in CI).

`SCCACHE_S3_USE_SSL` has been reported to increase cache performance for S3.

This PR also alphabetizes the `script_env` lists.

Authors:
  - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)

URL: #1230
Following the example of rapidsai/cudf#12097, this PR adds [codespell](https://github.com/codespell-project/codespell) as a linter for rmm.

Note: I have not included a section in the CONTRIBUTING.md about how to use this (as was done in cudf's PR) because I plan to overhaul the contributing guides for all RAPIDS repos in the near term, and have a single source in docs.rapids.ai with common information about linters used in RAPIDS.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Rong Ou (https://github.com/rongou)
  - Ben Frederickson (https://github.com/benfred)
  - Mark Harris (https://github.com/harrism)

URL: #1231
… for wheels (#1233)

Using MANIFEST.in currently runs into a pretty nasty scikit-build bug (scikit-build/scikit-build#886) that results in any file included by the manifest being copied from the install tree back into the source tree whenever an in place build occurs after an install, overwriting any local changes. We need an alternative approach to ensure that all necessary files are included in built packages. There are two types:
- sdists: scikit-build automatically generates a manifest during sdist generation if we don't provide one, and that manifest is reliably complete. It contains all files needed for a source build up to the rmm C++ code (which has always been true and is something we can come back to improving later if desired).
- wheels: The autogenerated manifest is not used during wheel generation because the manifest generation hook is not invoked during wheel builds, so to include data in the wheels we must provide the `package_data` argument to `setup`. In this case we do not need to include CMake or pyx files because the wheel does not need to be buildable from source; it only needs pxd files for other packages to cimport if desired.
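
As a rough illustration of the wheel case, the sketch below computes a `package_data`-style mapping that includes only `.pxd` files. It uses a throwaway directory tree and a hypothetical helper, `pxd_package_data`; the file names are illustrative, and the real `setup.py` may construct the mapping differently.

```python
import tempfile
from pathlib import Path


def pxd_package_data(source_root):
    """Map each package that ships .pxd files to the glob pattern '*.pxd'."""
    data = {}
    for init in sorted(Path(source_root).rglob("__init__.py")):
        package_dir = init.parent
        if any(package_dir.glob("*.pxd")):
            dotted = ".".join(package_dir.relative_to(source_root).parts)
            data[dotted] = ["*.pxd"]
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lib = Path(tmp) / "rmm" / "_lib"
        lib.mkdir(parents=True)
        (Path(tmp) / "rmm" / "__init__.py").touch()
        for name in ("__init__.py", "device_buffer.pxd", "device_buffer.pyx"):
            (lib / name).touch()
        # Only the .pxd pattern is listed; .pyx sources are left out.
        print(pxd_package_data(tmp))
```

The resulting dictionary, package names mapped to lists of glob patterns, has the shape that the `package_data` argument to `setup` expects.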

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #1233
Converts librmm to use `rapids-cmake`'s new GPU-aware parallel testing feature, which allows tests to run across all the GPUs on a machine without oversubscription.

This will allow developers to run `ctest -j<N>`, and ctest will work out how many tests it can run in parallel on the current machine given the available GPUs (currently 4 tests per GPU).
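
As a back-of-the-envelope model of the scheduling described above, and not how ctest or rapids-cmake compute it internally, the number of tests running at once is bounded both by the `-j` value and by the GPU slots available. The function name is hypothetical, and the default of 4 slots per GPU is taken from the description.

```python
def max_concurrent_tests(jobs, num_gpus, tests_per_gpu=4):
    """Upper bound on simultaneously running GPU tests in this simple model."""
    if jobs < 1 or num_gpus < 0:
        raise ValueError("jobs must be >= 1 and num_gpus must be >= 0")
    return min(jobs, num_gpus * tests_per_gpu)


if __name__ == "__main__":
    # With 2 GPUs, `ctest -j16` is capped at 8 concurrent tests in this model.
    print(max_concurrent_tests(jobs=16, num_gpus=2))
```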

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Mark Harris (https://github.com/harrism)
  - Ray Douglass (https://github.com/raydouglass)

URL: #1183
This PR removes modification of the `__init__.py::version` attribute that occurs during the wheel build process. See rapidsai/ops#2592 for more information.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Sevag H (https://github.com/sevagh)

URL: #1236
This PR updates dependencies.yaml to also generate the relevant dependency sections of pyproject.toml.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #1219
@GPUtester GPUtester requested review from a team as code owners March 30, 2023 14:54
@github-actions github-actions bot added the ci, CMake, conda, cpp (Pertains to C++ code), and Python (Related to RMM Python API) labels Mar 30, 2023
@raydouglass raydouglass merged commit fd39993 into main Apr 12, 2023