Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes for pandas 2, latest cudf, and wheel building #4144

Merged
merged 9 commits into from
Feb 6, 2024

Conversation

vyasr
Copy link
Contributor

@vyasr vyasr commented Feb 5, 2024

This PR contains a number of different fixes currently required to get cugraph tests passing:

  • There are two main changes for pandas 2 compatibility:
    • pandas renamed DataFrame.applymap to DataFrame.map so creating the renumbering map with a column map caused problems for attribute-based column access renumber_map.map. Those columns are now renamed to renumber_map.
    • Empty columns now default to str rather than float, so tests that assumed we could access the values as cupy arrays failed because cudf's string columns cannot be converted to cupy arrays. These columns are now always cast to float in the tests before the cupy conversion.
  • cugraph-dgl and cugraph-pyg's wheel builds were not downloading the latest cugraph/pylibcugraph wheels to run tests. As a result, the above pandas 2 fixes didn't take when running the dgl and pyg tests. I updated the wheel building scripts to account for this discrepancy.
  • Move chars column to parent data buffer in strings column cudf#14202 made a breaking change to how characters are encoded in strings columns in cudf, which broke cugraph_etl. This PR fixes the code that depended on the old APIs.

This code also includes a small patch to the cugraph_etl CMake so that it exports the correct package name (previously it was using cugraph).

@vyasr vyasr added bug Something isn't working non-breaking Non-breaking change labels Feb 5, 2024
@vyasr vyasr self-assigned this Feb 5, 2024
@vyasr vyasr requested a review from a team as a code owner February 5, 2024 19:07
@github-actions github-actions bot added the python label Feb 5, 2024
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me. We’ll want to get this merged soon to unblock CUDA 12.2 work on branch-24.04.

@jakirkham
Copy link
Member

It looks like the ARM builds are failing. Do we know what is going on there?

@bdice
Copy link
Contributor

bdice commented Feb 5, 2024

Both ARM and x86-64 builds are failing. cuGraph is relying on functions from libcudf that were deprecated in 24.02 and removed in 24.04: rapidsai/cudf#14848

@vyasr
Copy link
Contributor Author

vyasr commented Feb 5, 2024

Yeah I started a Slack thread to get help on those fixes.

@vyasr vyasr requested a review from a team as a code owner February 5, 2024 23:27
@github-actions github-actions bot added the CMake label Feb 5, 2024
@vyasr vyasr mentioned this pull request Feb 5, 2024
@vyasr vyasr changed the title Fix pandas 2 compatibility Fixes for pandas 2, latest cudf, and wheel building Feb 6, 2024
@vyasr vyasr added breaking Breaking change and removed non-breaking Non-breaking change labels Feb 6, 2024
Copy link
Collaborator

@ChuckHastings ChuckHastings left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

Copy link
Contributor

@rlratzel rlratzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for taking care of this!

@@ -11,6 +11,11 @@ python_package_name=$(echo ${package_name}|sed 's/-/_/g')
mkdir -p ./dist
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

# Download wheels built during this job.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this. It's unfortunate that we have to do it this way vs. something closer to what we do for conda testing (add a channel pointing to the local builds). I haven't verified one way or the other, but I'm curious if we have similar problems in our other wheel test environments, seems like we would.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I set up wheels originally I did this for every package that needs it. I know cugraph added a bunch of wheels afterwards, so it's probably worth an audit.

I don't think this is all that different from the conda case. The main difference is that there are multiple install lines. That was initially there to avoid needing to use --pre to force installation of pre-releases, but actually with the newer versioning strategies that I put in place in wheels that probably won't be an issue any more, i.e. it's untested but we ought to be able to use a single install command here now.

@rlratzel
Copy link
Contributor

rlratzel commented Feb 6, 2024

/merge

@rapids-bot rapids-bot bot merged commit bf5aa60 into rapidsai:branch-24.04 Feb 6, 2024
118 checks passed
@vyasr vyasr deleted the fix/pandas_2_compat branch February 6, 2024 21:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change bug Something isn't working ci CMake cuGraph python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants