Merge branch '2306_update' into main

leofang committed Jul 13, 2023
2 parents 2cc68ed + ddb9f92 commit d3bfec1
Showing 59 changed files with 5,837 additions and 751 deletions.
2 changes: 1 addition & 1 deletion benchmarks/README.md
@@ -12,7 +12,7 @@ You can install all optional dependencies via
```
pip install .[all]
```
if running outside of the [cuQuantum Appliance container](https://docs.nvidia.com/cuda/cuquantum/appliance/index.html).
if running outside of the [cuQuantum Appliance container](https://docs.nvidia.com/cuda/cuquantum/latest/appliance/index.html).

**Note: You may have to build `qsimcirq` and `qiskit-aer` with GPU support from source.**

4 changes: 2 additions & 2 deletions extra/custatevec/README.md
@@ -1,6 +1,6 @@
## MPI Comm Plugin Extension

The first version of [multi-node state vector simulator](https://docs.nvidia.com/cuda/cuquantum/appliance/qiskit.html) has been released in cuQuantum Appliance 22.11. It currently supports a limited set of versions of OpenMPI and MPICH. Other MPI libraries are supported by using an extension module called as External CommPlugin.
The first version of [multi-node state vector simulator](https://docs.nvidia.com/cuda/cuquantum/latest/appliance/qiskit.html) has been released in cuQuantum Appliance 22.11. It currently supports a limited set of versions of Open MPI and MPICH. Other MPI libraries are supported by using an extension module (called External CommPlugin).
External CommPlugin is a small shared object that wraps MPI functions. A user builds their own external CommPlugin linked against the MPI library of their choice, producing a shared object. Then, by specifying the appropriate simulator options, the compiled shared object is dynamically loaded and used for inter-process communication via that MPI library.

## Prerequisite
@@ -25,7 +25,7 @@ $ ls -l

## Simulator options

The custom Comm Plugin object is selected by [cusvaer options](https://docs.nvidia.com/cuda/cuquantum/appliance/cusvaer.html#commplugin), `cusvaer_comm_plugin_type` and `cusvaer_comm_plugin_soname`.
The custom Comm Plugin object is selected by [cusvaer options](https://docs.nvidia.com/cuda/cuquantum/latest/appliance/cusvaer.html#commplugin), `cusvaer_comm_plugin_type` and `cusvaer_comm_plugin_soname`.

- `cusvaer_comm_plugin_type`: The value is `cusvaer.CommPluginType.EXTERNAL`
- `cusvaer_comm_plugin_soname`: The name of the shared object of an external CommPlugin
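As an illustrative sketch, the two options above might be collected into a backend options mapping. Note that `cusvaer` itself ships only inside the cuQuantum Appliance, and `libmpi_commplugin.so` is a hypothetical soname for a user-built plugin, so this is not a runnable end-to-end example:

```python
# Illustrative only: the cusvaer backend is available inside the cuQuantum
# Appliance. Option names follow the cusvaer docs; the soname below is a
# hypothetical, user-built external CommPlugin.
options = {
    "cusvaer_comm_plugin_type": "EXTERNAL",  # i.e. cusvaer.CommPluginType.EXTERNAL
    "cusvaer_comm_plugin_soname": "libmpi_commplugin.so",
}
```

These options would then be forwarded to the simulator (for example, via the backend's `set_options` mechanism in the Appliance's Qiskit integration).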
33 changes: 17 additions & 16 deletions python/README.md
@@ -2,7 +2,7 @@

## Documentation

Please visit the [NVIDIA cuQuantum Python documentation](https://docs.nvidia.com/cuda/cuquantum/python).
Please visit the [NVIDIA cuQuantum Python documentation](https://docs.nvidia.com/cuda/cuquantum/latest/python).


## Installation
@@ -13,9 +13,10 @@ If you already have a Conda environment set up, it is the easiest to install cuQ
```
conda install -c conda-forge cuquantum-python
```
The Conda solver will install all required dependencies for you.

**Note**: Currently CUDA 12 support is pending the NVIDIA-led community effort ([conda-forge/staged-recipes#21382](https://github.com/conda-forge/staged-recipes/issues/21382)). Once conda-forge supports CUDA 12 we will make compatible conda packages available.
The Conda solver will install all required dependencies for you. If you need to select a particular CUDA version, say CUDA 12.0, please issue the following command:
```
conda install -c conda-forge cuquantum-python cuda-version=12.0
```

### Install cuQuantum Python from PyPI

@@ -26,12 +27,12 @@ you can also install cuQuantum Python this way:
pip install cuquantum-python-cuXX
```
with `XX` being `11` (for CUDA 11) or `12` (for CUDA 12).
The `pip` solver will also install all dependencies, with the exception of CuPy, for you (including both cuTENSOR and cuQuantum wheels). Please follow
[CuPy's installation guide](https://docs.cupy.dev/en/stable/install.html).
The `pip` solver will also install all required dependencies for you (including both cuTENSOR and cuQuantum wheels).

Notes:

- Users can install cuQuantum Python using `pip install cuquantum-python`, which will attempt to detect the current CUDA environment and choose the appropriate wheel to install. In the event of detection failure, CUDA 11 is assumed. This is subject to change in the future. Installing wheels with the `-cuXX` suffix is encouraged.
- Users can install cuQuantum Python using `pip install --no-cache-dir cuquantum-python`, which will attempt to detect the current CUDA environment and choose the appropriate wheel to install. In the event of detection failure, CUDA 11 is assumed. This is subject to change in the future. Installing wheels with the `-cuXX` suffix is encouraged. `--no-cache-dir` is required when using `pip` 23.1+.
- CuPy also uses a similar auto-detection mechanism to determine the correct wheel to install. If in doubt, or if installing `cuquantum-python-cu11`, please follow [CuPy's installation guide](https://docs.cupy.dev/en/stable/install.html) and install it manually.
- To manually manage all Python dependencies, append `--no-deps` to `pip install` to bypass the `pip` solver, see below.

### Building and installing cuQuantum Python from source
@@ -41,10 +42,10 @@ Notes:
The build-time dependencies of the cuQuantum Python package include:

* CUDA Toolkit 11.x or 12.x
* cuStateVec 1.3.0+
* cuTensorNet 2.1.0+
* cuStateVec 1.4.0+
* cuTensorNet 2.2.0+
* cuTENSOR 1.6.1+
* Python 3.8+
* Python 3.9+
* Cython >=0.29.22,<3
* pip 21.3.1+
* [packaging](https://packaging.pypa.io/en/latest/)
@@ -84,12 +85,12 @@ Runtime dependencies of the cuQuantum Python package include:
* An NVIDIA GPU with compute capability 7.0+
* Driver: Linux (450.80.02+ for CUDA 11, 525.60.13+ for CUDA 12)
* CUDA Toolkit 11.x or 12.x
* cuStateVec 1.3.0+
* cuTensorNet 2.1.0+
* cuStateVec 1.4.0+
* cuTensorNet 2.2.0+
* cuTENSOR 1.6.1+
* Python 3.8+
* NumPy v1.19+
* CuPy v9.5.0+ (see [installation guide](https://docs.cupy.dev/en/stable/install.html))
* Python 3.9+
* NumPy v1.21+
* CuPy v10.0.0+ (see [installation guide](https://docs.cupy.dev/en/stable/install.html))
* PyTorch v1.10+ (optional, see [installation guide](https://pytorch.org/get-started/locally/))
* Qiskit v0.24.0+ (optional, see [installation guide](https://qiskit.org/documentation/getting_started.html))
* Cirq v0.6.0+ (optional, see [installation guide](https://quantumai.google/cirq/install))
@@ -100,7 +101,7 @@ If you install everything from conda-forge, all the required dependencies are ta
If you install the pip wheels, CuPy, cuTENSOR, and cuQuantum are installed for you (but not the CUDA Toolkit
or the driver; please make sure the CUDA libraries are visible through your `LD_LIBRARY_PATH`).

If you build cuQuantum Python from source, please make sure the paths to the CUDA, cuQuantum, and cuTENSOR libraries are added
If you build cuQuantum Python from source, please make sure that the paths to the CUDA, cuQuantum, and cuTENSOR libraries are added
to your `LD_LIBRARY_PATH` environment variable, and that a compatible CuPy is installed.

Known issues:
4 changes: 2 additions & 2 deletions python/builder/pep517.py
@@ -30,8 +30,8 @@ def get_requires_for_build_wheel(config_settings=None):
# set up version constraints: note that CalVer like 22.03 is normalized to
# 22.3 by setuptools, so we must follow the same practice in the constraints;
# also, we don't need the patch number here
cuqnt_require = [f'custatevec-cu{utils.cuda_major_ver}~=1.3', # ">=1.3.0,<2"
f'cutensornet-cu{utils.cuda_major_ver}~=2.1', # ">=2.1.0,<3"
cuqnt_require = [f'custatevec-cu{utils.cuda_major_ver}~=1.4', # ">=1.4.0,<2"
f'cutensornet-cu{utils.cuda_major_ver}~=2.2', # ">=2.2.0,<3"
]

return _build_meta.get_requires_for_build_wheel(config_settings) + cuqnt_require
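The comment in this hunk notes that setuptools normalizes CalVer versions such as `22.03` to `22.3`, which is why the constraints are written without leading zeros. A minimal sketch of that normalization (mimicking the behavior described in the comment, not the actual setuptools code):

```python
def normalize_calver(version: str) -> str:
    """Mimic the normalization noted in the comment above: each numeric
    segment drops its leading zeros, so '22.03' becomes '22.3'.
    Sketch only, not the setuptools implementation."""
    return ".".join(str(int(part)) for part in version.split("."))

print(normalize_calver("22.03"))  # -> 22.3
```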
47 changes: 6 additions & 41 deletions python/builder/utils.py
@@ -3,7 +3,6 @@
# SPDX-License-Identifier: BSD-3-Clause

import os
import platform
import re
import site
import sys
@@ -57,13 +56,10 @@ def check_cuda_version():
# We support CUDA 11/12 starting 23.03
cuda_ver = check_cuda_version()
if cuda_ver == '11.0':
cutensor_ver = cuda_ver
cuda_major_ver = '11'
elif '11.0' < cuda_ver < '12.0':
cutensor_ver = '11'
cuda_major_ver = '11'
elif '12.0' <= cuda_ver < '13.0':
cutensor_ver = '12'
cuda_major_ver = '12'
else:
raise RuntimeError(f"Unsupported CUDA version: {cuda_ver}")
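With the `cutensor_ver` assignments removed, the branch logic above reduces to a simple mapping from CUDA version to wheel major suffix. A self-contained sketch, mirroring the diff's string comparisons (which, like the original, rely on lexicographic comparison of `'major.minor'` strings):

```python
def cuda_major(cuda_ver: str) -> str:
    """Map a 'major.minor' CUDA version string to the wheel's major suffix.

    Mirrors the pruned comparisons in builder/utils.py after this change;
    note the lexicographic string comparison assumes a 'XX.Y' format.
    """
    if '11.0' <= cuda_ver < '12.0':
        return '11'
    elif '12.0' <= cuda_ver < '13.0':
        return '12'
    raise RuntimeError(f"Unsupported CUDA version: {cuda_ver}")
```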
@@ -79,20 +75,6 @@ def run(self):
building_wheel = True
super().run()

def finalize_options(self):
super().finalize_options()
self.root_is_pure = False

def get_tag(self):
# hack: passing --build-options in cmdline no longer works with PEP 517 backend,
# so we need to overwrite --plat-name here
# refs:
# - https://github.com/pypa/build/issues/480
# - https://github.com/scikit-build/ninja-python-distributions/pull/85
impl_tag, abi_tag, _ = super().get_tag()
plat_tag = f"manylinux2014_{platform.machine()}"
return impl_tag, abi_tag, plat_tag


class build_ext(_build_ext):

@@ -131,28 +113,14 @@ def _set_library_roots(self):
else:
cutensornet_root = cuquantum_root

# search order:
# 1. installed "cutensor" package
# 2. env var
for path in py_paths:
path = os.path.join(path, 'cutensor')
if os.path.isdir(os.path.join(path, 'include')):
cutensor_root = path
break
else:
try:
cutensor_root = os.environ['CUTENSOR_ROOT']
except KeyError as e:
raise RuntimeError('cuTENSOR is not found, please set $CUTENSOR_ROOT') from e

return custatevec_root, cutensornet_root, cutensor_root
return custatevec_root, cutensornet_root

def _prep_includes_libs_rpaths(self):
"""
Set global vars cusv_incl_dir, cutn_incl_dir, cusv_lib_dir, cutn_lib_dir,
cusv_lib, cutn_lib, and extra_linker_flags.
"""
custatevec_root, cutensornet_root, cutensor_root = self._set_library_roots()
custatevec_root, cutensornet_root = self._set_library_roots()

global cusv_incl_dir, cutn_incl_dir
cusv_incl_dir = [os.path.join(cuda_path, 'include'),
@@ -165,22 +133,20 @@ def _prep_includes_libs_rpaths(self):
cusv_lib_dir = [os.path.join(custatevec_root, 'lib'),
os.path.join(custatevec_root, 'lib64')]
cutn_lib_dir = [os.path.join(cutensornet_root, 'lib'),
os.path.join(cutensornet_root, 'lib64'),
os.path.join(cutensor_root, 'lib'), # wheel
os.path.join(cutensor_root, 'lib', cutensor_ver)] # tarball
os.path.join(cutensornet_root, 'lib64')]

global cusv_lib, cutn_lib, extra_linker_flags
if not building_wheel:
# Note: with PEP-517 the editable mode would not build a wheel for installation
# (and we purposely do not support PEP-660).
cusv_lib = ['custatevec']
cutn_lib = ['cutensornet', 'cutensor']
cutn_lib = ['cutensornet']
extra_linker_flags = []
else:
# Note: soname = library major version
# We don't need to link to cuBLAS/cuSOLVER at build time (TODO: perhaps cuTENSOR too...?)
# We don't need to link to cuBLAS/cuSOLVER/cuTensor at build time
cusv_lib = [':libcustatevec.so.1']
cutn_lib = [':libcutensornet.so.2', ':libcutensor.so.1']
cutn_lib = [':libcutensornet.so.2']
# The rpaths must be adjusted given the following full-wheel installation:
# - cuquantum-python: site-packages/cuquantum/{custatevec, cutensornet}/ [=$ORIGIN]
# - cusv & cutn: site-packages/cuquantum/lib/
@@ -201,7 +167,6 @@
print("CUDA path:", cuda_path)
print("cuStateVec path:", custatevec_root)
print("cuTensorNet path:", cutensornet_root)
print("cuTENSOR path:", cutensor_root)
print("*"*80+"\n")

def build_extension(self, ext):
8 changes: 8 additions & 0 deletions python/cuquantum/__init__.py
@@ -16,9 +16,14 @@
custatevec.Pauli,
custatevec.MatrixLayout,
custatevec.MatrixType,
custatevec.MatrixMapType,
custatevec.Collapse,
custatevec.SamplerOutput,
custatevec.DeviceNetworkType,
cutensornet.NetworkAttribute,
custatevec.CommunicatorType,
custatevec.DataTransferType,
custatevec.StateVectorType,
cutensornet.ContractionOptimizerInfoAttribute,
cutensornet.ContractionOptimizerConfigAttribute,
cutensornet.ContractionAutotunePreferenceAttribute,
@@ -32,6 +37,9 @@
cutensornet.TensorSVDPartition,
cutensornet.TensorSVDInfoAttribute,
cutensornet.GateSplitAlgo,
cutensornet.StatePurity,
cutensornet.MarginalAttribute,
cutensornet.SamplerAttribute,
):
cutensornet._internal.enum_utils.add_enum_class_doc(enum, chomp="_ATTRIBUTE|_PREFERENCE_ATTRIBUTE")

2 changes: 1 addition & 1 deletion python/cuquantum/_version.py
@@ -5,4 +5,4 @@
# Note: cuQuantum Python follows the cuQuantum SDK version, which is now
# switched to YY.MM and is different from individual libraries' (semantic)
# versioning scheme.
__version__ = '23.03.0'
__version__ = '23.06.0'
22 changes: 18 additions & 4 deletions python/cuquantum/custatevec/custatevec.pxd
@@ -14,6 +14,14 @@ from cuquantum.utils cimport (DataType, DeviceAllocType, DeviceFreeType, int2,


cdef extern from '<custatevec.h>' nogil:
# cuStateVec consts
const int CUSTATEVEC_VER_MAJOR
const int CUSTATEVEC_VER_MINOR
const int CUSTATEVEC_VER_PATCH
const int CUSTATEVEC_VERSION
const int CUSTATEVEC_ALLOCATOR_NAME_LEN
const int CUSTATEVEC_MAX_SEGMENT_MASK_SIZE

# cuStateVec types
ctypedef void* _Handle 'custatevecHandle_t'
ctypedef int64_t _Index 'custatevecIndex_t'
@@ -24,10 +32,7 @@ cdef extern from '<custatevec.h>' nogil:
void* ctx
DeviceAllocType device_alloc
DeviceFreeType device_free

# Cython limitation: cannot use C defines in declaring a static array,
# so we just have to hard-code CUSTATEVEC_ALLOCATOR_NAME_LEN here...
char name[64]
char name[CUSTATEVEC_ALLOCATOR_NAME_LEN]
ctypedef void(*LoggerCallbackData 'custatevecLoggerCallbackData_t')(
int32_t logLevel,
const char* functionName,
@@ -69,6 +74,10 @@ cdef extern from '<custatevec.h>' nogil:
CUSTATEVEC_MATRIX_TYPE_UNITARY
CUSTATEVEC_MATRIX_TYPE_HERMITIAN

ctypedef enum _MatrixMapType 'custatevecMatrixMapType_t':
CUSTATEVEC_MATRIX_MAP_TYPE_BROADCAST
CUSTATEVEC_MATRIX_MAP_TYPE_MATRIX_INDEXED

ctypedef enum _CollapseOp 'custatevecCollapseOp_t':
CUSTATEVEC_COLLAPSE_NONE
CUSTATEVEC_COLLAPSE_NORMALIZE_AND_ZERO
@@ -92,6 +101,11 @@ cdef extern from '<custatevec.h>' nogil:
CUSTATEVEC_DATA_TRANSFER_TYPE_RECV
CUSTATEVEC_DATA_TRANSFER_TYPE_SEND_RECV

ctypedef enum _StateVectorType 'custatevecStateVectorType_t':
CUSTATEVEC_STATE_VECTOR_TYPE_ZERO
CUSTATEVEC_STATE_VECTOR_TYPE_UNIFORM
CUSTATEVEC_STATE_VECTOR_TYPE_GHZ
CUSTATEVEC_STATE_VECTOR_TYPE_W

# cuStateVec consts
int CUSTATEVEC_VER_MAJOR