Arbitrary precision Mallows Model under Hamming distance + solved numpy float type deprecation bug #3

Open
wants to merge 3 commits into
base: master

@alopezrivera alopezrivera commented May 1, 2024

Hi!

First of all, thank you for your work! This package has been really useful for understanding distance-based statistical models for permutations.

Arbitrary precision sampling for Mallows Model under Hamming distance

This PR implements arbitrary precision math (using the Python mpmath library) in the sampling of permutations using the Mallows Model under the Hamming distance. This makes it possible to sample very long permutations without running into the overflows of fixed-precision numeric types, which result in 0s or NaNs appearing in the calculation of the number of permutations at large Hamming distances.
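To illustrate the overflow, here is a minimal sketch that counts the permutations at a given Hamming distance from the identity and shows that the count stops fitting in a double-precision float well before 5000 elements. It is not the package's code: the function names are hypothetical, and it uses Python's exact built-in integers instead of mpmath to stay dependency-free.

```python
import math

def derangements(d):
    """Exact number of permutations of d items that leave no item in place."""
    if d == 0:
        return 1
    prev, curr = 1, 0  # D(0), D(1)
    for k in range(2, d + 1):
        # Recurrence: D(k) = (k - 1) * (D(k - 1) + D(k - 2))
        prev, curr = curr, (k - 1) * (curr + prev)
    return curr

def count_at_distance(n, d):
    """Permutations of n items at Hamming distance d from the identity."""
    # Choose which d positions move, then derange those positions.
    return math.comb(n, d) * derangements(d)

# Exact integer, roughly 200!/e, far above the largest finite double.
exact = count_at_distance(200, 200)

try:
    float(exact)
    fits_in_float = True
except OverflowError:
    fits_in_float = False
```

Here `fits_in_float` ends up `False` for only 200 elements. In fixed-precision array arithmetic the same quantity would typically become `inf`, and ratios of such values can then turn into NaNs.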

I am working on large Traveling-Salesman-like problems (thousands of destinations) where it's interesting to sample around known "good" solutions (see RAAN walks in multi-target trajectory optimization for spacecraft), so this capability has been quite useful.

I have tested permutations of up to 5000 elements. Sampling 10000 permutations of 5000 elements takes approximately 1 minute. You can see the resulting distance histogram and the code used to generate it below.
[Figure: histogram of the Hamming distances of the sampled permutations]

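import numpy as np
import matplotlib.pyplot as plt

import mallows_hamming as mh

theta = 7             # dispersion parameter of the Mallows Model
n_samples = 10000     # number of permutations to draw
problem_size = 5000   # length of each permutation

# precision is the arbitrary-precision setting introduced in this PR
sample = mh.sample(n=problem_size, m=n_samples, theta=theta, precision=1000)
# Hamming distance of each sampled permutation
distances = np.array([mh.distance(perm) for perm in sample])

# One unit-width bin centred on each integer distance
bins = np.arange(-0.5, problem_size + 1, 1)
plt.hist(distances, bins=bins, alpha=0.25, label='Arbitrary precision')
plt.legend(loc='best')
plt.show()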
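For intuition about the shape of such a histogram, the distribution over distances can be sketched directly. The snippet below assumes the standard Mallows form, where a permutation at distance d has probability proportional to exp(-theta * d), so the probability of distance d is proportional to the number of permutations at that distance times exp(-theta * d). The function name is hypothetical and this is not the implementation in this PR; it works in log-space with exact integer counts instead of mpmath.

```python
import math
import random

def distance_probabilities(n, theta):
    """P(d) proportional to (#permutations at distance d) * exp(-theta * d)."""
    # Exact derangement numbers D(0..n): D(k) = (k - 1) * (D(k-1) + D(k-2)).
    der = [1, 0]
    for k in range(2, n + 1):
        der.append((k - 1) * (der[k - 1] + der[k - 2]))
    # Log-weights keep the huge counts out of floating point.
    log_w = []
    for d in range(n + 1):
        count = math.comb(n, d) * der[d]
        log_w.append(math.log(count) - theta * d if count else -math.inf)
    top = max(log_w)
    # Shift by the largest log-weight before exponentiating, then normalise.
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]

probs = distance_probabilities(50, 2.0)
rng = random.Random(0)
# Draw distances only; building a permutation at each distance is a separate step.
distances = rng.choices(range(51), weights=probs, k=1000)
```

Distance 1 never appears, since a permutation cannot differ from the identity in exactly one position.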

As expected, for smaller permutations the new sample function returns output identical to that of the original one:
[Figure: comparison of the outputs of the original and new sample functions for smaller permutations]

Bug fix: Solved NumPy float type deprecation bug

This PR replaces np.float with np.float64 in the code to fix the following error, which is raised because the deprecated alias was removed in NumPy 1.24:

AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'cfloat'?
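The kind of replacement involved can be sketched as follows. The array and variable names are made up for illustration and are not taken from the package.

```python
import numpy as np

# Before (raises AttributeError on NumPy versions without the alias):
#     weights = np.zeros(4, dtype=np.float)

# After: name the NumPy scalar type explicitly.
weights = np.zeros(4, dtype=np.float64)

# The builtin float maps to the same dtype, so behaviour is unchanged.
same = np.zeros(4, dtype=float)
dtypes_match = weights.dtype == same.dtype
```

Both spellings give a float64 array, so the change should not alter any numerical results.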

Cheers!

@alopezrivera alopezrivera changed the title Solved numpy float type deprecation bug Arbitrary precision Mallows Model under Hamming distance + solved numpy float type deprecation bug May 2, 2024