Releases: nv-legate/cunumeric

v24.06.01

11 Sep 20:36
427da00

This is a patch release, and includes the following fixes:

x86 conda packages with multi-node support (based on UCX) are available at https://anaconda.org/legate/cunumeric.

Documentation for this release can be found at https://docs.nvidia.com/cunumeric/24.06/.

v24.06.00

03 Jul 22:35
510e24a

This release ports cuNumeric to the C++-based Legate-Core. Additionally, it includes the following new features:

  • np.linalg.qr, np.linalg.svd (single-GPU support only)
  • "where" argument for unary operations
  • np.select
  • np.flipud, np.fliplr
  • np.cov
  • np.load (initial, unoptimized implementation)
  • np.average
  • np.logical_and.reduce, np.logical_or.reduce
  • np.digitize
  • np.diff
  • np.linalg.cholesky, np.linalg.solve (multi-GPU support, based on cuSolverMp -- not included in conda packages, requires a manual build)
  • C++-based ndarray class (experimental support)
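
A few of the array routines listed above can be sketched as follows. This is an illustrative example written against plain NumPy; it assumes the cuNumeric versions accept the same arguments when the import is swapped for `import cunumeric as np`, which the release notes do not spell out for every option used here (for instance the `weights` argument).

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended drop-in swap

x = np.array([1.0, 4.0, 9.0, 16.0])

# "where" argument on a unary operation: only masked-in positions are computed,
# the rest keep whatever `out` already held.
roots = np.zeros_like(x)
np.sqrt(x, out=roots, where=x > 4.0)

# np.select: pick a label per element from the first condition that matches.
labels = np.select([x < 4.0, x < 10.0], [0, 1], default=2)

# np.digitize: index of the bin each value falls into.
bin_ids = np.digitize(x, [0.0, 5.0, 10.0])

# np.diff: differences between consecutive elements.
steps = np.diff(x)

# np.average with weights: (1*4 + 4*3 + 9*2 + 16*1) / 10
weighted = np.average(x, weights=[4, 3, 2, 1])

# np.flipud / np.fliplr: reverse the rows / columns of a 2-D array.
m = np.arange(6).reshape(2, 3)
rows_reversed = np.flipud(m)
cols_reversed = np.fliplr(m)
```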

x86 conda packages with multi-node support (based on UCX) are available at https://anaconda.org/legate/cunumeric.

Documentation for this release can be found at https://docs.nvidia.com/cunumeric/24.06/.

Known issues

Including the nvidia conda channel in an environment with cunumeric may end up pulling cutensor 2.0, even though the cunumeric packages explicitly request cutensor 1.7. This can cause error messages like this:

OSError: libcutensor.so.1: cannot open shared object file: No such file or directory

This is not an issue with cuNumeric, but with incorrect constraints on the cutensor packages on the nvidia channel. Please avoid including the nvidia conda channel in any conda environment including cunumeric.
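
One way to check which cutensor library an environment can actually load is a small `ctypes` probe like the sketch below. The helper name `can_load` is hypothetical, and the `libcutensor.so.2` file name for cutensor 2.0 is an assumption; only `libcutensor.so.1` appears in the error message above.

```python
import ctypes


def can_load(soname: str) -> bool:
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


# libcutensor.so.1 is the name from the error message; libcutensor.so.2 is an
# assumed name for the cutensor 2.0 library.
for soname in ("libcutensor.so.1", "libcutensor.so.2"):
    status = "loadable" if can_load(soname) else "not found"
    print(f"{soname}: {status}")
```

If only the second name loads, the environment most likely picked up cutensor 2.0 and should be recreated without the nvidia channel.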

v23.11.00

21 Nov 01:47
d91f17c

This release contains performance improvements to the variance operation, and a multi-dimensional Cholesky implementation.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

🐛 Bug Fixes

📖 Documentation

Full Changelog: v23.09.00...v23.11.00

v23.09.00

03 Oct 15:23
e66a063

This release adds support for the quantile API, and includes some performance and documentation improvements (notably a "Best Practices" guide).
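
As a rough illustration of the quantile API, the sketch below uses plain NumPy and assumes that `np.quantile` in cuNumeric accepts the same `q` and `axis` arguments once the import is swapped.

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended drop-in swap

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Several quantiles at once, using the default linear interpolation.
quartiles = np.quantile(data, [0.25, 0.5, 0.75])

# Quantile along an axis: the median of each column of a 3x4 table.
table = np.arange(12.0).reshape(3, 4)
col_medians = np.quantile(table, 0.5, axis=0)
```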

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

🛠️ Improvements

📖 Documentation

🐛 Bug Fixes

New Contributors

Full Changelog: v23.07.00...v23.09.00

v23.07.00

25 Jul 04:51
d413db2

This release adds support for histogram, broadcast* and various nan* APIs. It also includes performance improvements to the FFT functions and cleanups in ufunc support.
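
The sketch below shows what these API families look like in plain NumPy. The particular `broadcast*` and `nan*` functions chosen are illustrative assumptions; the release notes do not list exactly which ones were added.

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended drop-in swap

# histogram: counts per bin plus the bin edges.
samples = np.array([0.5, 1.5, 1.7, 2.2])
counts, edges = np.histogram(samples, bins=3, range=(0.0, 3.0))

# nan* reductions: ignore NaN entries instead of propagating them.
with_gaps = np.array([1.0, np.nan, 3.0])
total = np.nansum(with_gaps)
mean = np.nanmean(with_gaps)

# broadcast*: expand arrays to a common shape without writing loops.
row = np.array([1, 2, 3])
tiled = np.broadcast_to(row, (2, 3))
col, wide = np.broadcast_arrays(np.array([[1], [2]]), np.array([10, 20, 30]))
```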

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

🛠️ Improvements

📖 Documentation

  • Note new minimum CUDA requirements for conda packages by @manopapad in #875

🐛 Bug Fixes

New Contributors

Full Changelog: v23.03.00...v23.07.00

v23.03.00

15 Mar 20:02
9ac887b

This is the beta release of cuNumeric.

This release is focused on bug fixes, code clean-up and documentation updates, in preparation for entering beta status.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🐛 Bug Fixes

🛠️ Improvements

📖 Documentation

Full Changelog: v23.01.00...v23.03.00

v23.01.00

31 Jan 03:38
2455b55

This release introduces support for the put and putmask operations, adds an optimized implementation for the common case of advanced indexing using a single (possibly broadcasted) boolean array, includes more information in the tags of unary/binary operations on profiles (for easier cross-referencing with the source script), and adds some small improvements to OpenMP execution.
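
For readers unfamiliar with these operations, here is a minimal sketch in plain NumPy. It assumes cuNumeric mirrors the same call signatures; the optimized boolean-indexing path itself is internal and not visible at this level.

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended drop-in swap

a = np.arange(6)

# put: write values at flat indices, in place.
np.put(a, [0, 2], [-1, -2])

# putmask: overwrite every element where the mask is True, in place.
np.putmask(a, a > 3, 0)

# Advanced indexing with a single boolean array of the same shape.
b = np.arange(12).reshape(3, 4)
evens = b[b % 2 == 0]

# A 1-D boolean array selecting whole rows (the mask applies along axis 0).
row_mask = np.array([True, False, True])
picked_rows = b[row_mask]
```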

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🐛 Bug Fixes

🚀 New Features

🛠️ Improvements

Full Changelog: v22.10.00...v23.01.00

v22.10.00

13 Oct 23:53
81ad156

The biggest change in Release 22.10 is a new build infrastructure using CMake and scikit-build. The new build system brings several benefits including robust build dependency tracking and compliance with Python site-packages. This release includes several new search and indexing operators, fixes for several performance and correctness bugs, and provenance tracking for top-level and ndarray routines in execution profiles.

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

What's Changed

🚀 New Features

  • Argwhere and flatnonzero by @mfoerste4 in #525
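
A short sketch of the two search routines, written against plain NumPy on the assumption that cuNumeric returns the same results:

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended drop-in swap

grid = np.array([[0, 3],
                 [5, 0]])

# argwhere: one (row, col) index pair per non-zero element.
coords = np.argwhere(grid)

# flatnonzero: positions of non-zero elements in the flattened array.
flat_positions = np.flatnonzero(grid)
```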

🛠️ Improvements

  • adding support for array shape () passed as an index argument in advanced indexing by @ipdemes in #486
  • Refactor test driver for cpu/gpu sharding by @bryevdv in #451
  • Collate test output to allow workers > 1 with verbose output by @bryevdv in #507
  • Ensure test.py --use flag fully overrides USE_* envvars by @manopapad in #524
  • Enhance two integration tests by @robinw0928 in #511
  • Add typing to array.py by @bryevdv in #478
  • Update test runner for osx by @bryevdv in #529
  • Don't blindly trust user-supplied bincount.minlength by @manopapad in #523
  • Make reduced-precision cuBLAS mode opt-in by @manopapad in #519
  • Fix reciprocal tests for zero values and improve test value customization (#467) by @marcinz in #537
  • Refactor test runner to support more pinning options by @bryevdv in #535
  • Remove dead code in bincount by @magnatelee in #546
  • Make the validation condition for random distributions lenient by @magnatelee in #550
  • src/cunumeric: handle high number of bins in GPU bincount by @rohany in #526
  • Construct NumPy arrays correctly from 0D deferred arrays backed by region fields by @magnatelee in #551
  • Collect test failure details at the end by @bryevdv in #556
  • Simplify some thunk conversion helpers by @manopapad in #553
  • Fix a compiler warning by @magnatelee in #555
  • Add option to disable CPU pinning in tests by @bryevdv in #558
  • Use the new mapper registration to enable detailed mapper logging by @magnatelee in #570
  • src/cunumeric/search: make nonzero not always allocate SYS_MEM buffers by @rohany in #572
  • add negative test case in test_array_split.py by @xialu00 in #545
  • add some test cases for test_arg_reduce.py by @xialu00 in #575
  • Testcase-add test cases for test_flip and test_indices by @xialu00 in #579
  • Refactor scalar reductions to use common execution policy by @jjwilke in #573
  • Sanitize k for the eye operator by @magnatelee in #586
  • Add CMake build for C++ and scikit-build infrastructure for Python package installation by @jjwilke in #514
  • Enhance test_block.py and test_eye.py by @robinw0928 in #578
  • Testcase add test cases for test_fill.py and test_ndim.py by @xialu00 in #588
  • Remove run dependency on curand by @marcinz in #520
  • Use Legion Fills when possible by @manopapad in #604
  • Support building with GASNet-Ex and MPI backends by @manopapad in #610
  • Provenance tracking for cuNumeric operators by @magnatelee in #596
  • Fix tests utils to make --directory work correctly. by @robinw0928 in #592
  • Fix a compiler warning by @magnatelee in #594
  • Enhance test_diag_indices.py and test_flatten.py. by @robinw0928 in #609
  • cuNumeric doesn't need nested provenance tracking by @magnatelee in #617
  • Add RuntimeError exception to legate.time by @robinw0928 in #618
  • Stop instantiating min and max reduction ops for complex types by @magnatelee in #621
  • Mark temporary conversion outputs as linear for eager storage recycling by @magnatelee in #608
  • Make the negative test on fill robust across Python versions by @magnatelee in #619
  • Enhance mask_indices and move_axis by @robinw0928 in #622
  • src/cunumeric/matrix: stop including coll.h in solve_template.inl by @rohany in #620

🐛 Bug Fixes

  • Fix performance bugs in scalar reductions by @magnatelee in #509
  • Don't use internal LAPACK function names by @manopapad in #522
  • Bug fixes for advanced indexing by @magnatelee in #532
  • Handle the case where LAPACK_*potrf is a macro, not a function by @manopapad in #527
  • fix mypy issue w/ np methods by @bryevdv in #542
  • Fix buggy complex-to-bool conversions and add correctness tests for astype by @magnatelee in #549
  • fixing advanced indexing operation for empty arrays by @ipdemes in #504
  • Do not link curand by @marcinz in #541
  • Fixing issues with advanced_indexing_kernel by @ipdemes in #557
  • fixing another corner case for advanced indexing by @ipdemes in #554
  • Fix OSX test shard generation by @bryevdv in #563
  • fix error print in test_unary_ufunc by @jjwilke in #566
  • Add NAN handling to convert() needed for some prefix routines with integer outputs. by @rkarim2 in #502
  • Fixing logic for slicing by @ipdemes in #574
  • Fix linalg.solve when inputs are scalars by @magnatelee in #585
  • Allow casting in cn.dot, to match numpy's behavior by @manopapad in #598
  • Add linalg.solve to the cmake build by @magnatelee in #603
  • Invoke eye with read-write privilege, not write-discard by @manopapad in #616
  • Fix a bug in scalar reduction launching kernels with empty domains by @magnatelee in #606

📖 Documentation

  • Added note to prefix documentation for corner cases where cunumeric results can diverge from numpy by @rkarim2 in #528
  • updating documentation by @ipdemes in #614
  • Add missing docs symlink by @bryevdv in #635

v22.08.00

09 Aug 03:38
ece6585

Release 22.08.00 features a variety of random distribution implementations (backed by cuRAND), distributed prefix scan operators, and a complete implementation of sorting for multi-node multi-CPU execution. This release also includes several quality-of-life changes and bug fixes, including type annotations for all but one Python module, improvements to the parallel test driver, fixes for several operators when inputs are empty, and proper handling of ndarrays passed as array sizes or indices.
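
The prefix scan and sorting operators correspond to the familiar NumPy calls sketched below. The example uses plain NumPy and assumes the cuNumeric equivalents share these names and results; it says nothing about how the work is distributed across nodes.

```python
import numpy as np  # assumption: `import cunumeric as np` is the intended drop-in swap

v = np.array([3, 1, 2])

# Prefix scans: running sum and running product.
running_sum = np.cumsum(v)
running_prod = np.cumprod(v)

# Sorting: sorted values, and the index order that would sort the array.
ordered = np.sort(v)
order = np.argsort(v)
```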

Conda packages for this release are available at https://anaconda.org/legate/cunumeric.

New Features

Improvements

Bug Fixes

Documentation

New Contributors

Full Changelog: v22.05.02...v22.08.00

v22.05.02

21 Jun 10:52
8b163e6

This hotfix release fixes issues in conda recipes.

What's Changed

  • Cherry pick: Update conda requirements (#383) by @marcinz in #406
  • Cherry pick: Set cuda virtual package as hard run requirement for conda gpu package (#398) by @marcinz in #407
  • Cherry pick: Fix nargs for report:dump-csv (#400) by @marcinz in #408
  • Re-freezing conda compiler versions by @m3vaz in #415

Full Changelog: v22.05.01...v22.05.02