Using different metrics in k-medoids (sklearn), it does not work for all #62

Open
cmlp1957 opened this issue Apr 28, 2020 · 1 comment


cmlp1957 commented Apr 28, 2020

Hi,

First of all, I would like to say that it is great to have k-medoids available for scikit-learn (via scikit-learn-extra). Many thanks for the effort.

I have only managed to use k-medoids with ‘cosine’ as the metric. When I use any other metric, I get a warning like the following:

c:\users\cmlp\appdata\local\programs\python\python37\lib\site-packages\sklearn_extra\cluster\_k_medoids.py:235: UserWarning: Cluster 2 is empty! self.labels_[self.medoid_indices_[2]] may not be labeled with its corresponding cluster (2).
  "its corresponding cluster ({k}).".format(k=k)

I tried the following metrics:

metrics = ['braycurtis', 'canberra', 'chebyshev', 'cityblock', 'correlation', 'cosine', 'dice', 'euclidean',
'hamming', 'jaccard', 'kulsinski', 'matching', 'minkowski', 'rogerstanimoto',
'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', 'yule']

But for some reason I am not able to generate the clusters using these other metrics. Since it works smoothly with the 'cosine' metric, I have the impression that the problem is not related to the data. Where could the problem be, then?

All the features I am using are categorical except two, which are floats. Before running k-medoids, I one-hot encoded the categorical features and applied min-max rescaling to the float variables.
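
For reference, the preprocessing described above can be sketched roughly as follows. The data is synthetic (the column counts, category values, and sample size are hypothetical stand-ins, not the real dataset), and only a handful of the metrics are tried. Checking the distance matrix with scikit-learn's `pairwise_distances` is an assumption about where one might start looking, not a confirmed diagnosis of the warning.

```python
import warnings

import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

rng = np.random.default_rng(0)
n_samples = 60

# Hypothetical stand-in data: three categorical columns and two float columns.
categorical = rng.integers(0, 4, size=(n_samples, 3)).astype(str)
floats = rng.normal(size=(n_samples, 2))

# One-hot encode the categorical columns, min-max rescale the float columns.
onehot = OneHotEncoder().fit_transform(categorical).toarray()
scaled = MinMaxScaler().fit_transform(floats)
X = np.hstack([onehot, scaled])

# A small subset of the metrics listed above (chosen for illustration only).
metrics = ["cityblock", "cosine", "euclidean", "correlation", "jaccard", "yule"]

# For each metric, count non-finite entries (NaN or inf) in the distance matrix.
# A value of None means the metric could not be computed in this environment.
report = {}
for metric in metrics:
    try:
        with warnings.catch_warnings():
            # Silence data-conversion warnings raised for some metrics.
            warnings.simplefilter("ignore")
            D = pairwise_distances(X, metric=metric)
    except Exception:
        report[metric] = None
        continue
    report[metric] = int((~np.isfinite(D)).sum())

for metric, n_bad in report.items():
    print(f"{metric}: non-finite distances = {n_bad}")
```

If a metric reports non-finite distances on the real data, or cannot be computed at all, that would be a natural candidate to include in a minimal example.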

Many thanks for your time,

Casiano Manrique

Contributor

rth commented Jun 26, 2020

Could you provide a minimal example? I have tried some of those metrics (including cityblock, canberra, minkowski, sqeuclidean) on plot_kmedoids_digits.py and they seem to be working fine.
