[BUG]: possible performance regression in points_in_polygon() #1413

Closed
jameslamb opened this issue Jul 22, 2024 · 5 comments · Fixed by #1418
Labels
bug Something isn't working

Comments

jameslamb (Member) commented Jul 22, 2024

Version

24.08

On which installation method(s) does this occur?

Conda

Describe the issue

See the write-up at #1407 (comment).

Since around July 12, 2024, the nyc_taxi_years_correlation.ipynb notebook has been taking several hours to complete (on v24.08, using 24.08 cudf and other RAPIDS nightlies). Prior to that, on the exact same hardware, it completed in under 8 minutes.

I was able to reproduce this interactively, on a machine with 8 V100s and CUDA 12.2.

I strongly suspect that this indicates a performance regression, maybe of the form "some change(s) in cudf cause a cuspatial codepath that could previously execute on the GPU to fall back to the CPU", although I don't have profiling output to provide as evidence.
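One way to start gathering that evidence is to profile the host side of a single call. The following is a minimal sketch, not a definitive diagnosis tool: `workload` and `profile_call` are hypothetical names, and `workload` is only a stand-in for the real `cuspatial.point_in_polygon(pickup2016, zone)` call so that the snippet runs without a GPU. `cProfile` only sees Python-level (host) time, so it cannot show GPU kernels, but a report dominated by Python functions would be consistent with a CPU fallback.

```python
import cProfile
import io
import pstats
import time


def workload():
    # Hypothetical stand-in for the slow call. In the real session this would be
    # something like: cuspatial.point_in_polygon(pickup2016, zone)
    time.sleep(0.05)
    return sum(i * i for i in range(10_000))


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    # Render the `top` most expensive entries, sorted by cumulative time.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


result, report = profile_call(workload)
print(report)
```

If the report shows most of the cumulative time inside Python-level functions rather than a single native call, that would point toward host-side work; confirming what happens on the GPU itself would need a GPU-aware profiler such as Nsight Systems.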

Minimum reproducible example

From #1407 (comment).

Download the input data.

if [ ! -f "tzones_lonlat.json" ]; then
    curl "https://data.cityofnewyork.us/api/geospatial/d3c5-ddgc?method=export&format=GeoJSON" -o tzones_lonlat.json;
else
    echo "tzones_lonlat.json found";
fi
if [ ! -f "taxi2016.csv" ]; then
    curl https://storage.googleapis.com/anaconda-public-data/nyc-taxi/csv/2016/yellow_tripdata_2016-01.csv -o taxi2016.csv;
else
    echo "taxi2016.csv found";
fi   

Then, in a Python 3.11 session (with v24.08 of cuspatial and all its RAPIDS dependencies), run:

import cuspatial
import geopandas as gpd
import cudf
import numpy as np

taxi2016 = cudf.read_csv("taxi2016.csv")
tzones = gpd.GeoDataFrame.from_file('tzones_lonlat.json')
taxi_zones = cuspatial.from_geopandas(tzones).geometry
taxi_zone_rings = cuspatial.GeoSeries.from_polygons_xy(
    taxi_zones.polygons.xy,
    taxi_zones.polygons.ring_offset,
    taxi_zones.polygons.part_offset,
    cudf.Series(range(len(taxi_zones.polygons.part_offset)))
)

def make_geoseries_from_lonlat(lon, lat):
    lonlat = cudf.DataFrame({"lon": lon, "lat": lat}).interleave_columns()
    return cuspatial.GeoSeries.from_points_xy(lonlat)

pickup2016 = make_geoseries_from_lonlat(taxi2016['pickup_longitude'] , taxi2016['pickup_latitude'])
dropoff2016 = make_geoseries_from_lonlat(taxi2016['dropoff_longitude'] , taxi2016['dropoff_latitude'])

pip_iterations = list(np.arange(0, 263, 31))
pip_iterations.append(263)
print(pip_iterations)

taxi2016['PULocationID'] = 264
taxi2016['DOLocationID'] = 264

start = pip_iterations[0]
end = pip_iterations[1]

zone = taxi_zone_rings[start:end]

# find all pickups in that zone
pickups = cuspatial.point_in_polygon(pickup2016, zone)
print(pickups)
print("---")
dropoffs = cuspatial.point_in_polygon(dropoff2016, zone)
print(dropoffs)

That one combination of polygons completed successfully, but took 21 minutes to complete. The 2 point_in_polygon() calls accounted for around 20 of those 21 minutes.
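A per-call breakdown like this can be captured with a small timing helper. This is a sketch under assumptions: `timed` and `timings` are hypothetical names, the `time.sleep` calls are stand-ins for the two `cuspatial.point_in_polygon` calls so the snippet runs anywhere, and it assumes each call blocks until its result is ready (printing the result, as the reproducer does, would also force that).

```python
import time
from contextlib import contextmanager

# Maps a label to the wall-clock seconds spent inside the matching block.
timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


with timed("pickups"):
    # Stand-in for: pickups = cuspatial.point_in_polygon(pickup2016, zone)
    time.sleep(0.01)

with timed("dropoffs"):
    # Stand-in for: dropoffs = cuspatial.point_in_polygon(dropoff2016, zone)
    time.sleep(0.01)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.2f} s")
```

Running the same helper against a known-good set of nightlies and a slow one would give directly comparable numbers for each call.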

And in the notebook, 9 such combinations are processed (the 10 boundaries below delimit 9 consecutive slices).

"pip_iterations = list(np.arange(0, 263, 31))\n",
"pip_iterations.append(263)\n",

[0, 31, 62, 93, 124, 155, 186, 217, 248, 263]

"for i in range(len(pip_iterations)-1):\n",
" start = pip_iterations[i]\n",
" end = pip_iterations[i+1]\n",

So conservatively, it might take 3.5 hours for the notebook to finish in my setup. And that's making a LOT of assumptions.
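The arithmetic behind an estimate like this can be sketched as follows. It assumes, purely for illustration, that every slice takes about the 21 minutes observed for the first one; `minutes_per_slice` is a hypothetical name, and the plain `range` call mirrors the `np.arange(0, 263, 31)` used in the notebook. Note that the boundary list has 10 entries, so the notebook's loop visits 9 slices.

```python
# Rebuild the slice boundaries used by the notebook (pure-Python equivalent
# of list(np.arange(0, 263, 31)) followed by append(263)).
boundaries = list(range(0, 263, 31))
boundaries.append(263)

# Consecutive (start, end) pairs, matching the notebook's
# `for i in range(len(pip_iterations)-1)` loop.
slices = list(zip(boundaries[:-1], boundaries[1:]))

# Assumption: every slice costs roughly what the first one did.
minutes_per_slice = 21
total_minutes = len(slices) * minutes_per_slice

print(f"{len(boundaries)} boundaries -> {len(slices)} slices")
print(f"estimated total: {total_minutes} minutes (~{total_minutes / 60:.2f} hours)")
```

Under that assumption the total comes to a little over 3 hours, in the same range as the estimate above. The last slice covers only 15 zones, so the real per-slice time is unlikely to be uniform.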

Relevant log output

N/A

Environment details

Both these environments:

  • CI conda-notebooks-tests environment with V100s (CUDA 12.2)
  • Ubuntu 22.04 box with 8 V100s (CUDA 12.2)

Using cudf (and other RAPIDS dependencies) nightly conda packages as of July 12, 2024.

Output of 'conda info', 'conda env export', and 'nvidia-smi':
     active environment : test
    active env location : /opt/conda/envs/test
            shell level : 1
       user config file : /github/home/.condarc
 populated config files : /opt/conda/.condarc
          conda version : 24.5.0
    conda-build version : 24.5.1
         python version : 3.11.9.final.0
                 solver : libmamba (default)
       virtual packages : __archspec=1=broadwell
                          __conda=24.5.0=0
                          __cuda=12.4=0
                          __glibc=2.35=0
                          __linux=5.4.0=0
                          __unix=0=0
       base environment : /opt/conda  (writable)
      conda av data dir : /opt/conda/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/rapidsai/linux-64
                          https://conda.anaconda.org/rapidsai/noarch
                          https://conda.anaconda.org/rapidsai-nightly/linux-64
                          https://conda.anaconda.org/rapidsai-nightly/noarch
                          https://conda.anaconda.org/dask/label/dev/linux-64
                          https://conda.anaconda.org/dask/label/dev/noarch
                          https://conda.anaconda.org/pytorch/linux-64
                          https://conda.anaconda.org/pytorch/noarch
                          https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://conda.anaconda.org/nvidia/linux-64
                          https://conda.anaconda.org/nvidia/noarch
          package cache : /opt/conda/pkgs
                          /github/home/.conda/pkgs
       envs directories : /opt/conda/envs
                          /github/home/.conda/envs
               platform : linux-64
             user-agent : conda/24.5.0 requests/2.32.3 CPython/3.11.9 Linux/5.4.0-177-generic ubuntu/22.04.4 glibc/2.35 solver/libmamba conda-libmamba-solver/24.1.0 libmambapy/1.5.8
                UID:GID : 0:0
             netrc file : None
           offline mode : False

==> /opt/conda/.condarc <==
auto_update_conda: False
channels:
  - rapidsai
  - rapidsai-nightly
  - dask/label/dev
  - pytorch
  - conda-forge
  - nvidia
always_yes: True
number_channel_notices: 0
conda-build:
  set_build_id: False
  root_dir: /tmp/conda-bld-workspace
  output_folder: /tmp/conda-bld-output

==> envvars <==
allow_softlinks: False

# packages in environment at /opt/conda/envs/test:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
anyio                     4.4.0              pyhd8ed1ab_0    conda-forge
argon2-cffi               23.1.0             pyhd8ed1ab_0    conda-forge
argon2-cffi-bindings      21.2.0          py311h459d7ec_4    conda-forge
arrow                     1.3.0              pyhd8ed1ab_0    conda-forge
asttokens                 2.4.1              pyhd8ed1ab_0    conda-forge
async-lru                 2.0.4              pyhd8ed1ab_0    conda-forge
attrs                     23.2.0             pyh71513ae_0    conda-forge
aws-c-auth                0.7.22              hbd3ac97_10    conda-forge
aws-c-cal                 0.7.1                h87b94db_1    conda-forge
aws-c-common              0.9.23               h4ab18f5_0    conda-forge
aws-c-compression         0.2.18               he027950_7    conda-forge
aws-c-event-stream        0.4.2               h7671281_15    conda-forge
aws-c-http                0.8.2                he17ee6b_6    conda-forge
aws-c-io                  0.14.10              h826b7d6_1    conda-forge
aws-c-mqtt                0.10.4               hcd6a914_8    conda-forge
aws-c-s3                  0.6.0                h365ddd8_2    conda-forge
aws-c-sdkutils            0.1.16               he027950_3    conda-forge
aws-checksums             0.1.18               he027950_7    conda-forge
aws-crt-cpp               0.27.3               hda66527_2    conda-forge
aws-sdk-cpp               1.11.329             h46c3b66_9    conda-forge
azure-core-cpp            1.12.0               h830ed8b_0    conda-forge
azure-identity-cpp        1.8.0                hdb0d106_1    conda-forge
azure-storage-blobs-cpp   12.11.0              ha67cba7_1    conda-forge
azure-storage-common-cpp  12.6.0               he3f277c_1    conda-forge
azure-storage-files-datalake-cpp 12.10.0              h29b5301_1    conda-forge
babel                     2.14.0             pyhd8ed1ab_0    conda-forge
beautifulsoup4            4.12.3             pyha770c72_0    conda-forge
bleach                    6.1.0              pyhd8ed1ab_0    conda-forge
blosc                     1.21.6               hef167b5_0    conda-forge
bokeh                     3.5.0              pyhd8ed1ab_0    conda-forge
branca                    0.7.2              pyhd8ed1ab_0    conda-forge
brotli                    1.1.0                hd590300_1    conda-forge
brotli-bin                1.1.0                hd590300_1    conda-forge
brotli-python             1.1.0           py311hb755f60_1    conda-forge
bzip2                     1.0.8                h4bc722e_7    conda-forge
c-ares                    1.32.2               h4bc722e_0    conda-forge
ca-certificates           2024.7.4             hbcca054_0    conda-forge
cached-property           1.5.2                hd8ed1ab_1    conda-forge
cached_property           1.5.2              pyha770c72_1    conda-forge
cachetools                5.4.0              pyhd8ed1ab_0    conda-forge
cairo                     1.18.0               h3faef2a_0    conda-forge
certifi                   2024.7.4           pyhd8ed1ab_0    conda-forge
cffi                      1.16.0          py311hb3a22ac_0    conda-forge
cfitsio                   4.3.1                hbdc6101_0    conda-forge
charset-normalizer        3.3.2              pyhd8ed1ab_0    conda-forge
click                     8.1.7           unix_pyh707e725_0    conda-forge
click-plugins             1.1.1                      py_0    conda-forge
cligj                     0.7.2              pyhd8ed1ab_1    conda-forge
cloudpickle               3.0.0              pyhd8ed1ab_0    conda-forge
comm                      0.2.2              pyhd8ed1ab_0    conda-forge
contourpy                 1.2.1           py311h9547e67_0    conda-forge
cuda-cccl_linux-64        12.2.140             ha770c72_0    conda-forge
cuda-crt-dev_linux-64     12.2.140             ha770c72_1    conda-forge
cuda-crt-tools            12.2.140             ha770c72_1    conda-forge
cuda-cudart               12.2.140             hd3aeb46_0    conda-forge
cuda-cudart-dev           12.2.140             hd3aeb46_0    conda-forge
cuda-cudart-dev_linux-64  12.2.140             h59595ed_0    conda-forge
cuda-cudart-static        12.2.140             hd3aeb46_0    conda-forge
cuda-cudart-static_linux-64 12.2.140             h59595ed_0    conda-forge
cuda-cudart_linux-64      12.2.140             h59595ed_0    conda-forge
cuda-nvcc-dev_linux-64    12.2.140             ha770c72_1    conda-forge
cuda-nvcc-impl            12.2.140             hd3aeb46_1    conda-forge
cuda-nvcc-tools           12.2.140             hd3aeb46_1    conda-forge
cuda-nvrtc                12.2.140             hd3aeb46_0    conda-forge
cuda-nvvm-dev_linux-64    12.2.140             ha770c72_1    conda-forge
cuda-nvvm-impl            12.2.140             h59595ed_1    conda-forge
cuda-nvvm-tools           12.2.140             h59595ed_1    conda-forge
cuda-profiler-api         12.2.140             ha770c72_0    conda-forge
cuda-python               12.5.0          py311h817de4b_1    conda-forge
cuda-version              12.2                 he2b69de_3    conda-forge
cudf                      24.08.00a322    cuda12_py311_240717_g093bcc94cc_322    rapidsai-nightly
cuml                      24.08.00a35     cuda12_py311_240716_g98721e239_35    rapidsai-nightly
cuproj                    24.08.00a20     cuda12_py311_240717_ga2d8ce19_20    file:///tmp/python_channel
cupy                      13.2.0          py311he5a987b_0    conda-forge
cupy-core                 13.2.0          py311h3bdf873_0    conda-forge
curl                      8.8.0                he654da7_1    conda-forge
cuspatial                 24.08.00a20     cuda12_py311_240717_ga2d8ce19_20    file:///tmp/python_channel
cycler                    0.12.1             pyhd8ed1ab_0    conda-forge
cytoolz                   0.12.3          py311h459d7ec_0    conda-forge
dask                      2024.7.0           pyhd8ed1ab_0    conda-forge
dask-core                 2024.7.0           pyhd8ed1ab_0    conda-forge
dask-cuda                 24.08.00a12     py311_240717_gc31aaac_12    rapidsai-nightly
dask-cudf                 24.08.00a322    cuda12_py311_240717_g093bcc94cc_322    rapidsai-nightly
dask-expr                 1.1.7              pyhd8ed1ab_0    conda-forge
debugpy                   1.8.2           py311h4332511_0    conda-forge
decorator                 5.1.1              pyhd8ed1ab_0    conda-forge
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
distributed               2024.7.0           pyhd8ed1ab_0    conda-forge
distributed-ucxx          0.39.00a        py3.11_240717_g36284cb_10    rapidsai-nightly
dlpack                    0.8                  h59595ed_3    conda-forge
entrypoints               0.4                pyhd8ed1ab_0    conda-forge
exceptiongroup            1.2.2              pyhd8ed1ab_0    conda-forge
executing                 2.0.1              pyhd8ed1ab_0    conda-forge
expat                     2.6.2                h59595ed_0    conda-forge
fastrlock                 0.8.2           py311hb755f60_2    conda-forge
fiona                     1.9.5           py311hf8e0aa6_2    conda-forge
fmt                       10.2.1               h00ab1b0_0    conda-forge
folium                    0.17.0             pyhd8ed1ab_0    conda-forge
font-ttf-dejavu-sans-mono 2.37                 hab24e00_0    conda-forge
font-ttf-inconsolata      3.000                h77eed37_0    conda-forge
font-ttf-source-code-pro  2.038                h77eed37_0    conda-forge
font-ttf-ubuntu           0.83                 h77eed37_2    conda-forge
fontconfig                2.14.2               h14ed4e7_0    conda-forge
fonts-conda-ecosystem     1                             0    conda-forge
fonts-conda-forge         1                             0    conda-forge
fonttools                 4.53.1          py311h61187de_0    conda-forge
fqdn                      1.5.1              pyhd8ed1ab_0    conda-forge
freetype                  2.12.1               h267a509_2    conda-forge
freexl                    2.0.0                h743c826_0    conda-forge
fsspec                    2024.6.1           pyhff2d567_0    conda-forge
gdal                      3.8.1           py311h39b4e0e_3    conda-forge
geopandas                 0.14.4             pyhd8ed1ab_0    conda-forge
geopandas-base            0.14.4             pyha770c72_0    conda-forge
geos                      3.12.1               h59595ed_0    conda-forge
geotiff                   1.7.1               hf074850_14    conda-forge
gettext                   0.22.5               h59595ed_2    conda-forge
gettext-tools             0.22.5               h59595ed_2    conda-forge
gflags                    2.2.2             he1b5a44_1004    conda-forge
giflib                    5.2.2                hd590300_0    conda-forge
glog                      0.7.1                hbabe93e_0    conda-forge
h11                       0.14.0             pyhd8ed1ab_0    conda-forge
h2                        4.1.0              pyhd8ed1ab_0    conda-forge
hdf4                      4.2.15               h2a13503_7    conda-forge
hdf5                      1.14.3          nompi_hdf9ad27_105    conda-forge
hpack                     4.0.0              pyh9f0ad1d_0    conda-forge
httpcore                  1.0.5              pyhd8ed1ab_0    conda-forge
httpx                     0.27.0             pyhd8ed1ab_0    conda-forge
hyperframe                6.0.1              pyhd8ed1ab_0    conda-forge
icu                       73.2                 h59595ed_0    conda-forge
idna                      3.7                pyhd8ed1ab_0    conda-forge
imagecodecs-lite          2019.12.3       py311h18e1886_8    conda-forge
imageio                   2.34.2             pyh12aca89_0    conda-forge
importlib-metadata        8.0.0              pyha770c72_0    conda-forge
importlib_metadata        8.0.0                hd8ed1ab_0    conda-forge
importlib_resources       6.4.0              pyhd8ed1ab_0    conda-forge
ipykernel                 6.29.5             pyh3099207_0    conda-forge
ipython                   8.26.0             pyh707e725_0    conda-forge
ipywidgets                8.1.3              pyhd8ed1ab_0    conda-forge
isoduration               20.11.0            pyhd8ed1ab_0    conda-forge
jedi                      0.19.1             pyhd8ed1ab_0    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
joblib                    1.4.2              pyhd8ed1ab_0    conda-forge
json-c                    0.17                 h1220068_1    conda-forge
json5                     0.9.25             pyhd8ed1ab_0    conda-forge
jsonpointer               3.0.0           py311h38be061_0    conda-forge
jsonschema                4.23.0             pyhd8ed1ab_0    conda-forge
jsonschema-specifications 2023.12.1          pyhd8ed1ab_0    conda-forge
jsonschema-with-format-nongpl 4.23.0               hd8ed1ab_0    conda-forge
jupyter-lsp               2.2.5              pyhd8ed1ab_0    conda-forge
jupyter_client            8.6.2              pyhd8ed1ab_0    conda-forge
jupyter_core              5.7.2           py311h38be061_0    conda-forge
jupyter_events            0.10.0             pyhd8ed1ab_0    conda-forge
jupyter_server            2.14.2             pyhd8ed1ab_0    conda-forge
jupyter_server_terminals  0.5.3              pyhd8ed1ab_0    conda-forge
jupyterlab                4.2.3              pyhd8ed1ab_0    conda-forge
jupyterlab_pygments       0.3.0              pyhd8ed1ab_1    conda-forge
jupyterlab_server         2.27.3             pyhd8ed1ab_0    conda-forge
jupyterlab_widgets        3.0.11             pyhd8ed1ab_0    conda-forge
kealib                    1.5.3                hee9dde6_1    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.5           py311h9547e67_1    conda-forge
krb5                      1.21.3               h659f571_0    conda-forge
lazy_loader               0.4                pyhd8ed1ab_0    conda-forge
lcms2                     2.16                 hb7c19ff_0    conda-forge
ld_impl_linux-64          2.40                 hf3520f5_7    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libabseil                 20240116.2      cxx17_he02047a_1    conda-forge
libaec                    1.1.3                h59595ed_0    conda-forge
libarchive                3.7.4                hfca40fe_0    conda-forge
libarrow                  16.1.0          h34456a7_14_cpu    conda-forge
libarrow-acero            16.1.0          he02047a_14_cpu    conda-forge
libarrow-dataset          16.1.0          he02047a_14_cpu    conda-forge
libarrow-substrait        16.1.0          hc9a23c6_14_cpu    conda-forge
libasprintf               0.22.5               h661eb56_2    conda-forge
libasprintf-devel         0.22.5               h661eb56_2    conda-forge
libblas                   3.9.0           22_linux64_openblas    conda-forge
libbrotlicommon           1.1.0                hd590300_1    conda-forge
libbrotlidec              1.1.0                hd590300_1    conda-forge
libbrotlienc              1.1.0                hd590300_1    conda-forge
libcblas                  3.9.0           22_linux64_openblas    conda-forge
libcrc32c                 1.1.2                h9c3ff4c_0    conda-forge
libcublas                 12.2.5.6             hd3aeb46_0    conda-forge
libcublas-dev             12.2.5.6             hd3aeb46_0    conda-forge
libcudf                   24.08.00a322    cuda12_240717_g093bcc94cc_322    rapidsai-nightly
libcufft                  11.0.8.103           hd3aeb46_0    conda-forge
libcufile                 1.7.2.10             hd3aeb46_0    conda-forge
libcufile-dev             1.7.2.10             hd3aeb46_0    conda-forge
libcuml                   24.08.00a35     cuda12_240716_g98721e239_35    rapidsai-nightly
libcumlprims              24.08.00a       cuda12_240717_g6a1017c_7    rapidsai-nightly
libcurand                 10.3.3.141           hd3aeb46_0    conda-forge
libcurand-dev             10.3.3.141           hd3aeb46_0    conda-forge
libcurl                   8.8.0                hca28451_1    conda-forge
libcusolver               11.5.2.141           hd3aeb46_0    conda-forge
libcusolver-dev           11.5.2.141           hd3aeb46_0    conda-forge
libcusparse               12.1.2.141           hd3aeb46_0    conda-forge
libcusparse-dev           12.1.2.141           hd3aeb46_0    conda-forge
libcuspatial              24.08.00a20     cuda12_240717_ga2d8ce19_20    file:///tmp/cpp_channel
libdeflate                1.19                 hd590300_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 hd590300_2    conda-forge
libevent                  2.1.12               hf998b51_1    conda-forge
libexpat                  2.6.2                h59595ed_0    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 14.1.0               h77fa898_0    conda-forge
libgdal                   3.8.1                h4b8bffa_3    conda-forge
libgettextpo              0.22.5               h59595ed_2    conda-forge
libgettextpo-devel        0.22.5               h59595ed_2    conda-forge
libgfortran-ng            14.1.0               h69a702a_0    conda-forge
libgfortran5              14.1.0               hc5f4f2c_0    conda-forge
libglib                   2.78.4               h783c2da_0    conda-forge
libgomp                   14.1.0               h77fa898_0    conda-forge
libgoogle-cloud           2.26.0               h26d7fe4_0    conda-forge
libgoogle-cloud-storage   2.26.0               ha262f82_0    conda-forge
libgrpc                   1.62.2               h15f2491_0    conda-forge
libiconv                  1.17                 hd590300_2    conda-forge
libjpeg-turbo             3.0.0                hd590300_1    conda-forge
libkml                    1.3.0             hbbc8833_1020    conda-forge
libkvikio                 24.08.00a       cuda12_240717_gab3778c_18    rapidsai-nightly
liblapack                 3.9.0           22_linux64_openblas    conda-forge
libllvm14                 14.0.6               hcd5def8_4    conda-forge
libnetcdf                 4.9.2           nompi_h135f659_114    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnl                     3.9.0                hd590300_0    conda-forge
libnsl                    2.0.1                hd590300_0    conda-forge
libnvjitlink              12.2.140             hd3aeb46_0    conda-forge
libopenblas               0.3.27          pthreads_hac2b453_1    conda-forge
libparquet                16.1.0          h9e5060d_14_cpu    conda-forge
libpng                    1.6.43               h2797004_0    conda-forge
libpq                     16.3                 ha72fbe1_0    conda-forge
libprotobuf               4.25.3               h08a7969_0    conda-forge
libraft                   24.08.00a43     cuda12_240717_gab5e1287_43    rapidsai-nightly
libraft-headers           24.08.00a43     cuda12_240717_gab5e1287_43    rapidsai-nightly
libraft-headers-only      24.08.00a43     cuda12_240717_gab5e1287_43    rapidsai-nightly
libre2-11                 2023.09.01           h5a48ba9_2    conda-forge
librmm                    24.08.00a27     cuda12_240717_gf91ca6f2_27    rapidsai-nightly
librttopo                 1.1.0               h8917695_15    conda-forge
libsodium                 1.0.18               h36c2ea0_1    conda-forge
libspatialindex           2.0.0                he02047a_0    conda-forge
libspatialite             5.1.0                h72606ae_3    conda-forge
libsqlite                 3.46.0               hde9e2c9_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              14.1.0               hc0a3c3a_0    conda-forge
libthrift                 0.19.0               hb90f79a_1    conda-forge
libtiff                   4.6.0                ha9c0a0a_2    conda-forge
libucxx                   0.39.00a        cuda12_240717_g36284cb_10    rapidsai-nightly
libutf8proc               2.8.0                h166bdaf_0    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libwebp-base              1.4.0                hd590300_0    conda-forge
libxcb                    1.15                 h0b41bf4_0    conda-forge
libxcrypt                 4.4.36               hd590300_1    conda-forge
libxml2                   2.12.7               h4c95cb1_3    conda-forge
libzip                    1.10.1               h2629f0a_3    conda-forge
libzlib                   1.3.1                h4ab18f5_1    conda-forge
llvmlite                  0.43.0          py311hbde99c3_0    conda-forge
locket                    1.0.0              pyhd8ed1ab_0    conda-forge
lz4                       4.3.3           py311h38e4bf4_0    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
lzo                       2.10              hd590300_1001    conda-forge
mapclassify               2.6.1              pyhd8ed1ab_0    conda-forge
markdown-it-py            3.0.0              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.5           py311h459d7ec_0    conda-forge
matplotlib-base           3.9.1           py311hffb96ce_0    conda-forge
matplotlib-inline         0.1.7              pyhd8ed1ab_0    conda-forge
mdurl                     0.1.2              pyhd8ed1ab_0    conda-forge
minizip                   4.0.7                h401b404_0    conda-forge
mistune                   3.0.2              pyhd8ed1ab_0    conda-forge
msgpack-python            1.0.8           py311h52f7536_0    conda-forge
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
nbclient                  0.10.0             pyhd8ed1ab_0    conda-forge
nbconvert-core            7.16.4             pyhd8ed1ab_1    conda-forge
nbformat                  5.10.4             pyhd8ed1ab_0    conda-forge
nccl                      2.22.3.1             hbc370b7_0    conda-forge
ncurses                   6.5                  h59595ed_0    conda-forge
nest-asyncio              1.6.0              pyhd8ed1ab_0    conda-forge
networkx                  3.3                pyhd8ed1ab_1    conda-forge
notebook                  7.2.1              pyhd8ed1ab_0    conda-forge
notebook-shim             0.2.4              pyhd8ed1ab_0    conda-forge
nspr                      4.35                 h27087fc_0    conda-forge
nss                       3.102                h593d115_0    conda-forge
numba                     0.60.0          py311h4bc866e_0    conda-forge
numpy                     1.26.4          py311h64a7726_0    conda-forge
nvcomp                    3.0.6                h10b603f_0    conda-forge
nvtx                      0.2.10          py311h459d7ec_0    conda-forge
openjpeg                  2.5.2                h488ebb8_0    conda-forge
openssl                   3.3.1                h4bc722e_2    conda-forge
orc                       2.0.1                h17fec99_1    conda-forge
overrides                 7.7.0              pyhd8ed1ab_0    conda-forge
packaging                 24.1               pyhd8ed1ab_0    conda-forge
pandas                    2.2.2           py311h14de704_1    conda-forge
pandocfilters             1.5.0              pyhd8ed1ab_0    conda-forge
parso                     0.8.4              pyhd8ed1ab_0    conda-forge
partd                     1.4.2              pyhd8ed1ab_0    conda-forge
pcre2                     10.42                hcad00b1_0    conda-forge
pexpect                   4.9.0              pyhd8ed1ab_0    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    10.3.0          py311h18e6fac_0    conda-forge
pip                       24.0               pyhd8ed1ab_0    conda-forge
pixman                    0.43.2               h59595ed_0    conda-forge
pkgutil-resolve-name      1.3.10             pyhd8ed1ab_1    conda-forge
platformdirs              4.2.2              pyhd8ed1ab_0    conda-forge
poppler                   23.12.0              h590f24d_0    conda-forge
poppler-data              0.4.12               hd8ed1ab_0    conda-forge
postgresql                16.3                 h8e811e2_0    conda-forge
proj                      9.3.0                h1d62c97_2    conda-forge
prometheus_client         0.20.0             pyhd8ed1ab_0    conda-forge
prompt-toolkit            3.0.47             pyha770c72_0    conda-forge
psutil                    6.0.0           py311h331c9d8_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
pure_eval                 0.2.2              pyhd8ed1ab_0    conda-forge
pyarrow                   16.1.0          py311hbd00459_4    conda-forge
pyarrow-core              16.1.0          py311h8c3dac4_4_cpu    conda-forge
pyarrow-hotfix            0.6                pyhd8ed1ab_0    conda-forge
pycparser                 2.22               pyhd8ed1ab_0    conda-forge
pydeck                    0.8.0              pyhd8ed1ab_0    conda-forge
pygments                  2.18.0             pyhd8ed1ab_0    conda-forge
pylibraft                 24.08.00a43     cuda12_py311_240717_gab5e1287_43    rapidsai-nightly
pynvjitlink               0.3.0           py311hd269673_0    rapidsai
pynvml                    11.4.1             pyhd8ed1ab_0    conda-forge
pyparsing                 3.1.2              pyhd8ed1ab_0    conda-forge
pyproj                    3.6.1           py311h1facc83_4    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.11.9          hb806964_0_cpython    conda-forge
python-dateutil           2.9.0              pyhd8ed1ab_0    conda-forge
python-fastjsonschema     2.20.0             pyhd8ed1ab_0    conda-forge
python-json-logger        2.0.7              pyhd8ed1ab_0    conda-forge
python-tzdata             2024.1             pyhd8ed1ab_0    conda-forge
python_abi                3.11                    4_cp311    conda-forge
pytz                      2024.1             pyhd8ed1ab_0    conda-forge
pywavelets                1.6.0           py311h18e1886_0    conda-forge
pyyaml                    6.0.1           py311h459d7ec_1    conda-forge
pyzmq                     26.0.3          py311h08a0b41_0    conda-forge
qhull                     2020.2               h434a139_5    conda-forge
raft-dask                 24.08.00a43     cuda12_py311_240717_gab5e1287_43    rapidsai-nightly
rapids-dask-dependency    24.08.00a5                 py_0    rapidsai-nightly
rdma-core                 52.0                 he02047a_0    conda-forge
re2                       2023.09.01           h7f4b329_2    conda-forge
readline                  8.2                  h8228510_1    conda-forge
referencing               0.35.1             pyhd8ed1ab_0    conda-forge
requests                  2.32.3             pyhd8ed1ab_0    conda-forge
rfc3339-validator         0.1.4              pyhd8ed1ab_0    conda-forge
rfc3986-validator         0.1.1              pyh9f0ad1d_0    conda-forge
rich                      13.7.1             pyhd8ed1ab_0    conda-forge
rmm                       24.08.00a27     cuda12_py311_240717_gf91ca6f2_27    rapidsai-nightly
rpds-py                   0.19.0          py311hb3a8bbb_0    conda-forge
rtree                     1.3.0           py311h51bcefd_1    conda-forge
s2n                       1.4.17               he19d79f_0    conda-forge
scikit-image              0.20.0          py311h2872171_1    conda-forge
scikit-learn              1.5.1           py311hd632256_0    conda-forge
scipy                     1.14.0          py311h517d4fd_1    conda-forge
send2trash                1.8.3              pyh0d859eb_0    conda-forge
setuptools                70.3.0             pyhd8ed1ab_0    conda-forge
shapely                   2.0.4           py311h0bed3d6_1    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
snappy                    1.2.1                ha2e4443_0    conda-forge
sniffio                   1.3.1              pyhd8ed1ab_0    conda-forge
sortedcontainers          2.4.0              pyhd8ed1ab_0    conda-forge
soupsieve                 2.5                pyhd8ed1ab_1    conda-forge
spdlog                    1.12.0               hd2e6256_2    conda-forge
sqlite                    3.46.0               h6d4b2fc_0    conda-forge
stack_data                0.6.2              pyhd8ed1ab_0    conda-forge
tblib                     3.0.0              pyhd8ed1ab_0    conda-forge
terminado                 0.18.1             pyh0d859eb_0    conda-forge
threadpoolctl             3.5.0              pyhc1e730c_0    conda-forge
tifffile                  2020.6.3                   py_0    conda-forge
tiledb                    2.18.2               h99f50a1_1    conda-forge
tinycss2                  1.3.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.13          noxft_h4845f30_101    conda-forge
tomli                     2.0.1              pyhd8ed1ab_0    conda-forge
toolz                     0.12.1             pyhd8ed1ab_0    conda-forge
tornado                   6.4.1           py311h331c9d8_0    conda-forge
traitlets                 5.14.3             pyhd8ed1ab_0    conda-forge
treelite                  4.2.1           py311he8f9275_0    conda-forge
types-python-dateutil     2.9.0.20240316     pyhd8ed1ab_0    conda-forge
typing-extensions         4.12.2               hd8ed1ab_0    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
typing_utils              0.1.0              pyhd8ed1ab_0    conda-forge
tzcode                    2024a                h3f72095_0    conda-forge
tzdata                    2024a                h0c530f3_0    conda-forge
ucx                       1.15.0               hda83522_8    conda-forge
ucx-py                    0.39.00a7       py311_240717_g3741610_7    rapidsai-nightly
ucxx                      0.39.00a        cuda12_py3.11_240717_g36284cb_10    rapidsai-nightly
uri-template              1.3.0              pyhd8ed1ab_0    conda-forge
uriparser                 0.9.8                hac33072_0    conda-forge
urllib3                   2.2.2              pyhd8ed1ab_1    conda-forge
wcwidth                   0.2.13             pyhd8ed1ab_0    conda-forge
webcolors                 24.6.0             pyhd8ed1ab_0    conda-forge
webencodings              0.5.1              pyhd8ed1ab_2    conda-forge
websocket-client          1.8.0              pyhd8ed1ab_0    conda-forge
wheel                     0.43.0             pyhd8ed1ab_1    conda-forge
widgetsnbextension        4.0.11             pyhd8ed1ab_0    conda-forge
xerces-c                  3.2.5                hac6953d_0    conda-forge
xorg-kbproto              1.0.7             h7f98852_1002    conda-forge
xorg-libice               1.1.1                hd590300_0    conda-forge
xorg-libsm                1.2.4                h7391055_0    conda-forge
xorg-libx11               1.8.9                h8ee46fc_0    conda-forge
xorg-libxau               1.0.11               hd590300_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xorg-libxext              1.3.4                h0b41bf4_2    conda-forge
xorg-libxrender           0.9.11               hd590300_0    conda-forge
xorg-renderproto          0.11.1            h7f98852_1002    conda-forge
xorg-xextproto            7.3.0             h0b41bf4_1003    conda-forge
xorg-xproto               7.0.31            h7f98852_1007    conda-forge
xyzservices               2024.6.0           pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zeromq                    4.3.5                h75354e8_4    conda-forge
zict                      3.0.0              pyhd8ed1ab_0    conda-forge
zipp                      3.19.2             pyhd8ed1ab_0    conda-forge
zlib                      1.3.1                h4ab18f5_1    conda-forge
zstandard                 0.23.0          py311h5cd10c7_0    conda-forge
zstd                      1.5.6                ha6fb4c9_0    conda-forge
/__w/cuspatial/cuspatial/notebooks /__w/cuspatial/cuspatial
Wed Jul 17 15:37:47 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla V100-PCIE-32GB           Off |   00000000:85:00.0 Off |                    0 |
| N/A   24C    P0             24W /  250W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

(example build link)

Other/Misc.

Other symptoms that led to this were documented in #1406.

That was closed by just skipping the most expensive notebooks, in #1407.

@harrism
Member

harrism commented Jul 22, 2024

@trxcllnt recently modified point_in_polygon. Could those changes have caused this?

@jameslamb
Member Author

Are you referring to #1381?

It could be related, but I don't think it'd be the root cause by itself. Those changes were made 2+ months ago, and as recently as #1404 (2 weeks ago), the conda-notebook-tests CI job here was completing in around 9 minutes (build link).

@isVoid
Contributor

isVoid commented Jul 22, 2024

Also that PR modified the quadtree PiP algo, but the algo in question here is the non-quadtree version.

@harrism
Member

harrism commented Jul 30, 2024

I did some profiling using py-spy. This is not a complete profile: I only ran it for about 4.5 minutes, using py-spy top -- python test.py (test.py contains the code above).

Collecting samples from 'python test.py' (python v3.10.14)
Total Samples 38284
GIL: 100.00%, Active: 100.00%, Threads: 1

  %Own   %Total  OwnTime  TotalTime  Function (filename:line)                                                                                                                                                                                        
 40.00%  79.00%   159.3s    272.1s   compute_index (numba/misc/dummyarray.py:111)
 18.00%  39.00%   57.19s    112.8s   <genexpr> (numba/misc/dummyarray.py:111)
 21.00%  21.00%   55.60s    55.60s   get_offset (numba/misc/dummyarray.py:83)
  8.00%   8.00%   20.65s    20.65s   iter_contiguous_extent (numba/misc/dummyarray.py:275)
  0.00%   0.00%   17.83s    17.83s   iter_contiguous_extent (numba/misc/dummyarray.py:270)
 10.00%  89.00%   15.99s    166.7s   iter_contiguous_extent (numba/misc/dummyarray.py:274)
  0.00%   0.00%   15.00s    136.3s   iter_contiguous_extent (numba/misc/dummyarray.py:269)
  0.00%   0.00%    8.25s     8.25s   iter_contiguous_extent (numba/misc/dummyarray.py:268)
  2.00%   2.00%    8.06s     8.06s   iter_contiguous_extent (numba/misc/dummyarray.py:273)
  0.00% 100.00%    6.88s    375.5s   __getitem__ (numba/cuda/cudadrv/devicearray.py:630)
  0.00%   0.00%    5.59s    170.6s   __getitem__ (numba/misc/dummyarray.py:239)
  1.00% 100.00%    2.61s    198.0s   _do_getitem (numba/cuda/cudadrv/devicearray.py:642)
  0.00%   0.00%    2.58s    165.0s   reshape (numba/misc/dummyarray.py:351)
  0.00%   0.00%    2.32s     2.33s   read_csv (cudf/io/csv.py:96)
  0.00%   0.00%   0.800s     2.61s   _call_with_frames_removed (<frozen importlib._bootstrap>:241)
  0.00%   0.00%   0.210s    0.210s   point_in_polygon (cuspatial/core/spatial/join.py:82)
  0.00%   0.00%   0.180s    0.180s   _compile_bytecode (<frozen importlib._bootstrap_external>:672)
  0.00%   0.00%   0.150s    0.180s   inner (contextlib.py:79)
  0.00%   0.00%   0.140s    0.140s   append (numba/core/byteflow.py:1743)
  0.00%   0.00%   0.130s    0.130s   __init__ (fiona/collection.py:243)
  0.00%   0.00%   0.130s    0.130s   <listcomp> (shapely/geometry/polygon.py:91)

Nearly all the time is spent in Numba. I also used py-spy to record an SVG flame graph (running it for only about a minute), which gives an idea of where Numba is being called from.

profile
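To put a number on "nearly all the time is spent in Numba", here is a minimal sketch that sums the OwnTime column of the py-spy top table above by top-level package. The rows are copied from that output. The helper name `own_time_by_package` and the regular expression are illustrative assumptions, not part of py-spy, and the sketch assumes every row has the `function (file:line)` shape shown above.

```python
import re
from collections import defaultdict

# Rows copied from the py-spy top output above (header line omitted).
PY_SPY_TOP = """\
 40.00%  79.00%   159.3s    272.1s   compute_index (numba/misc/dummyarray.py:111)
 18.00%  39.00%   57.19s    112.8s   <genexpr> (numba/misc/dummyarray.py:111)
 21.00%  21.00%   55.60s    55.60s   get_offset (numba/misc/dummyarray.py:83)
  8.00%   8.00%   20.65s    20.65s   iter_contiguous_extent (numba/misc/dummyarray.py:275)
  0.00%   0.00%   17.83s    17.83s   iter_contiguous_extent (numba/misc/dummyarray.py:270)
 10.00%  89.00%   15.99s    166.7s   iter_contiguous_extent (numba/misc/dummyarray.py:274)
  0.00%   0.00%   15.00s    136.3s   iter_contiguous_extent (numba/misc/dummyarray.py:269)
  0.00%   0.00%    8.25s     8.25s   iter_contiguous_extent (numba/misc/dummyarray.py:268)
  2.00%   2.00%    8.06s     8.06s   iter_contiguous_extent (numba/misc/dummyarray.py:273)
  0.00% 100.00%    6.88s    375.5s   __getitem__ (numba/cuda/cudadrv/devicearray.py:630)
  0.00%   0.00%    5.59s    170.6s   __getitem__ (numba/misc/dummyarray.py:239)
  1.00% 100.00%    2.61s    198.0s   _do_getitem (numba/cuda/cudadrv/devicearray.py:642)
  0.00%   0.00%    2.58s    165.0s   reshape (numba/misc/dummyarray.py:351)
  0.00%   0.00%    2.32s     2.33s   read_csv (cudf/io/csv.py:96)
  0.00%   0.00%   0.800s     2.61s   _call_with_frames_removed (<frozen importlib._bootstrap>:241)
  0.00%   0.00%   0.210s    0.210s   point_in_polygon (cuspatial/core/spatial/join.py:82)
  0.00%   0.00%   0.180s    0.180s   _compile_bytecode (<frozen importlib._bootstrap_external>:672)
  0.00%   0.00%   0.150s    0.180s   inner (contextlib.py:79)
  0.00%   0.00%   0.140s    0.140s   append (numba/core/byteflow.py:1743)
  0.00%   0.00%   0.130s    0.130s   __init__ (fiona/collection.py:243)
  0.00%   0.00%   0.130s    0.130s   <listcomp> (shapely/geometry/polygon.py:91)
"""

# Assumed row shape: %Own  %Total  OwnTime  TotalTime  function (file:line)
ROW = re.compile(
    r"^\s*[\d.]+%\s+[\d.]+%\s+([\d.]+)s\s+[\d.]+s\s+.+? \((.+):\d+\)\s*$"
)


def own_time_by_package(text):
    """Sum OwnTime (seconds) per top-level path component of each row."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # skip headers and blank lines
        own_seconds = float(match.group(1))
        package = match.group(2).split("/", 1)[0]
        totals[package] += own_seconds
    return dict(totals)


totals = own_time_by_package(PY_SPY_TOP)
numba_share = totals["numba"] / sum(totals.values())
print(f"numba share of listed OwnTime: {numba_share:.1%}")
```

This only covers the rows that py-spy top happened to display, so the resulting share is a rough indication rather than a full profile.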

@harrism
Member

harrism commented Jul 30, 2024

@mroeschke, since you have touched a lot of places in cuSpatial and cuDF recently, can you tell us whether this code is now running in Numba when it didn't before? That could explain the huge performance regression we are seeing.

@rapids-bot closed this as completed in fe3b0c9 on Jul 31, 2024
raydouglass pushed a commit to rapidsai/cudf that referenced this issue Aug 1, 2024
…16436)

#16277 removed a universal cast to
a `cupy.array` in `_from_array`. Although the typing suggested this
method should only accept `np.ndarray` or `cupy.ndarray`, this method is
called on any object implementing the `__cuda_array_interface__` or
`__array_interface__` (e.g. `numba.DeviceArray`), which caused a
performance regression in cuspatial:
rapidsai/cuspatial#1413

closes #16434


```python
In [1]: import cupy, numba.cuda

In [2]: import cudf

In [3]: cupy_array = cupy.ones((10_000, 100))

In [4]: %timeit cudf.DataFrame(cupy_array)
3.88 ms ± 52 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [5]: %timeit cudf.DataFrame(numba.cuda.to_device(cupy_array))
3.99 ms ± 35.4 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

---------

Co-authored-by: Bradley Dice <bdice@bradleydice.com>