Fix Dask estimators serialization prior to training #6065

Open
wants to merge 3 commits into
base: branch-24.10
Conversation

viclafargue
Contributor

Partially answers #6046

@viclafargue viclafargue requested a review from a team as a code owner September 10, 2024 08:31
@github-actions github-actions bot added the Cython / Python Cython or Python issue label Sep 10, 2024
Member

@divyegala divyegala left a comment

Can we add a tiny test to make sure that serialization and deserialization both succeed for untrained models? Maybe also train the model after deserialization.
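A minimal sketch of the round trip such a test could exercise, using only the standard library. `ToyEstimator` and `roundtrip_then_train` are hypothetical stand-ins and not part of cuML; a real test would presumably use a Dask estimator and a `client` fixture instead.

```python
import pickle


class ToyEstimator:
    """Hypothetical stand-in for an estimator; not a cuML class."""

    def __init__(self, verbose=False):
        self.verbose = verbose
        self.coef_ = None  # stays None until fit() is called

    def fit(self, xs, ys):
        # Least-squares slope for a line through the origin.
        self.coef_ = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        return self

    def predict(self, xs):
        return [self.coef_ * x for x in xs]


def roundtrip_then_train(model, xs, ys):
    # Serialize and deserialize the untrained model.
    restored = pickle.loads(pickle.dumps(model))
    # The restored copy should still be untrained at this point.
    assert restored.coef_ is None
    # Training after deserialization should work as usual.
    restored.fit(xs, ys)
    return restored


restored = roundtrip_then_train(
    ToyEstimator(verbose=True), [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
)
```

The helper checks three things in order: the untrained object pickles, the restored copy keeps its constructor arguments, and the restored copy can still be trained.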

@dantegd
Member

dantegd commented Sep 25, 2024

@viclafargue seems like the test ran into an issue in the pytest in some jobs:

=================================== FAILURES ===================================
________________________ test_serialize_before_training ________________________

client = <Client: 'tcp://127.0.0.1:45687' processes=1 threads=1, memory=251.77 GiB>

    def test_serialize_before_training(client):
        X, y = make_regression(n_samples=1000, n_features=20, random_state=0)
        X, y = da.from_array(X), da.from_array(y)
    
        model = LinearRegression(client=client)
>       pickled_model = pickle.dumps(model)

test_dask_serialization.py:90: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <cuml.dask.linear_model.linear_regression.LinearRegression object at 0x7fca696b4510>

    def __getstate__(self):
>       internal_model = self._get_internal_model().result()
E       AttributeError: 'NoneType' object has no attribute 'result'

/opt/conda/envs/test/lib/python3.11/site-packages/cuml/dask/common/base.py:60: AttributeError
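The traceback shows `_get_internal_model()` returning `None` for an untrained estimator, so calling `.result()` on it raises the `AttributeError`. Below is a minimal sketch of one way to guard against that, assuming the internal model is either `None` or a future-like object. `SketchEstimator` and its attributes are hypothetical stand-ins, and this is not necessarily the change made in this PR.

```python
import pickle
from concurrent.futures import Future


class SketchEstimator:
    """Hypothetical stand-in mimicking the shape of the failing code path."""

    def __init__(self):
        self.internal_model = None  # no future exists before training

    def _get_internal_model(self):
        return self.internal_model

    def fit(self, value):
        # Pretend training produced a result wrapped in a future.
        future = Future()
        future.set_result({"coef": value})
        self.internal_model = future
        return self

    def __getstate__(self):
        state = self.__dict__.copy()
        future = self._get_internal_model()
        # Guard: only resolve the future when one exists.
        state["internal_model"] = None if future is None else future.result()
        return state

    def __setstate__(self, state):
        resolved = state.pop("internal_model")
        self.__dict__.update(state)
        if resolved is None:
            self.internal_model = None
        else:
            # Re-wrap the resolved value so later calls still see a future.
            future = Future()
            future.set_result(resolved)
            self.internal_model = future


untrained = pickle.loads(pickle.dumps(SketchEstimator()))
trained = pickle.loads(pickle.dumps(SketchEstimator().fit(2.0)))
```

In this sketch, the untrained copy comes back with no internal model, while the trained copy comes back with its resolved result re-wrapped in a completed future.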

@viclafargue
Contributor Author

> @viclafargue seems like the test ran into an issue in the pytest in some jobs:

That's really strange. I might be missing something, but could this be an issue with the CI itself?

@codecov-commenter

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Project coverage is 67.06%. Comparing base (f818527) to head (809daea).
Report is 1 commit behind head on branch-24.10.

Files with missing lines Patch % Lines
python/cuml/cuml/dask/common/base.py 0.00% 3 Missing ⚠️
Additional details and impacted files
@@               Coverage Diff                @@
##           branch-24.10    #6065      +/-   ##
================================================
- Coverage         68.23%   67.06%   -1.18%     
================================================
  Files               195      195              
  Lines             12922    12924       +2     
================================================
- Hits               8817     8667     -150     
- Misses             4105     4257     +152     


@viclafargue viclafargue added bug Something isn't working non-breaking Non-breaking change labels Sep 27, 2024
Labels
bug Something isn't working Cython / Python Cython or Python issue non-breaking Non-breaking change
4 participants