[BUG] n_neighbors should be smaller than the graph degree computed by nn descent #6091

Open
thorstenwagner opened this issue Oct 1, 2024 · 0 comments
Labels: ? - Needs Triage (Need team to review and classify), bug (Something isn't working)

thorstenwagner commented Oct 1, 2024

I updated to the latest cuml (from 23.12). I'm fitting a UMAP to a dataset with 32 features and 400k samples.

With 23.12 I did that with n_neighbors=200 and n_components=2 and it worked. With the latest version (24.08) I get:

Traceback (most recent call last):
  File "/mnt/data/twagner/Projects/TomoTwin/results/test_runs/test.py", line 11, in <module>
    reducer.fit(np.random.randn(52000,32))
  File "/opt/user_software/miniconda3_envs/tomotwin2/lib/python3.11/site-packages/cuml/internals/api_decorators.py", line 188, in wrapper
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/user_software/miniconda3_envs/tomotwin2/lib/python3.11/site-packages/cuml/internals/api_decorators.py", line 393, in dispatch
    return self.dispatch_func(func_name, gpu_func, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/user_software/miniconda3_envs/tomotwin2/lib/python3.11/site-packages/cuml/internals/api_decorators.py", line 190, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "base.pyx", line 687, in cuml.internals.base.UniversalBase.dispatch_func
  File "umap.pyx", line 668, in cuml.manifold.umap.UMAP.fit
RuntimeError: RAFT failure at file=/opt/conda/conda-bld/work/cpp/src/umap/knn_graph/algo.cuh line=115: n_neighbors should be smaller than the graph degree computed by nn descent
Obtained 25 stack frames

The largest n_neighbors value at which it starts working is 64, which seems to be the default graph degree according to this documentation: https://docs.rapids.ai/api/cuvs/stable/cpp_api/neighbors_nn_descent/

Here is a script to reproduce the issue:

import cuml
import numpy as np
reducer = cuml.UMAP(
    n_neighbors=200,
    n_components=2,
    n_epochs=None,  # means automatic selection
    min_dist=0.0,
    random_state=19,
    metric="euclidean"
)
reducer.fit(np.random.randn(400000,32))
print("Done")

Interestingly, when I reduce the number of samples from 400k to 50k it also works.
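
One possible reading of these two observations, offered as an assumption rather than a confirmed diagnosis: newer cuml versions may build the k-NN graph with NN-descent for larger inputs and fall back to brute force for smaller ones, and NN-descent would then cap n_neighbors at its graph degree. The sketch below encodes that guess in plain Python. The 50,000-row threshold, the `build_algo` keyword and its `"brute_force_knn"` value are my understanding of the 24.08 UMAP documentation and should be checked against the installed version; `suggest_umap_build_options` is a hypothetical helper, not part of cuml.

```python
# Assumptions (not confirmed by the maintainers in this issue):
# - NN-descent's default graph degree is 64 (per the linked cuVS docs).
# - With automatic algorithm selection, inputs of up to 50,000 rows use
#   brute-force k-NN, and larger inputs use NN-descent.
ASSUMED_DEFAULT_GRAPH_DEGREE = 64
ASSUMED_BRUTE_FORCE_MAX_ROWS = 50_000


def suggest_umap_build_options(n_samples: int, n_neighbors: int) -> dict:
    """Hypothetical helper: extra keyword arguments for cuml.UMAP.

    Returns an empty dict when the assumed defaults should already work,
    otherwise requests brute-force k-NN so that n_neighbors is not limited
    by the NN-descent graph degree.
    """
    if n_samples <= ASSUMED_BRUTE_FORCE_MAX_ROWS:
        # Small input: assumed to take the brute-force path already.
        return {}
    if n_neighbors <= ASSUMED_DEFAULT_GRAPH_DEGREE:
        # NN-descent's default graph is assumed wide enough for this request.
        return {}
    # Large input and many neighbors: sidestep NN-descent (assumed keyword).
    return {"build_algo": "brute_force_knn"}


if __name__ == "__main__":
    # The failing configuration from this report.
    print(suggest_umap_build_options(400_000, 200))
    # The two configurations reported to work.
    print(suggest_umap_build_options(50_000, 200))
    print(suggest_umap_build_options(400_000, 64))
```

If those assumptions hold, the returned dict could be unpacked into the constructor from the script above, for example `cuml.UMAP(n_neighbors=200, n_components=2, **suggest_umap_build_options(400_000, 200))`. Brute force is likely slower on 400k samples than NN-descent; raising the NN-descent graph degree instead may also be possible, but I have not verified which option name controls it.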

Any ideas what I'm doing wrong?
