Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CLARANS and FastCLARANS #6

Open
kno10 opened this issue Dec 11, 2023 · 0 comments
Open

Add CLARANS and FastCLARANS #6

kno10 opened this issue Dec 11, 2023 · 0 comments
Labels
enhancement New feature or request needs funding Issues that would need funding to be completed

Comments

@kno10
Copy link
Owner

kno10 commented Dec 11, 2023

As these methods require distance functions and a data matrix, it makes sense to first implement CLARA #5

CLARANS modifies PAM with a random sampling follows:

  • choose a subset of medoid + non-medoid pairs uniformly
  • compute the best swap in this subset only
  • perform the best swap

hence it approximates PAM.

We can easily (c.f., the FastPAM papers) integrate the FastPAM optimizations:

FastCLARANS:

  • choose a uniform subset of non-medoids
  • compute the loss regarding all medoids in O(N) per non-medoid using the FastPAM optimizations

It makes less sense to also integrate the "Eager" optimizations - eventually we can just run FasterPAM on shuffled input with a limit on the number of distance computations to achieve a similar approximation. Hence I do not believe CLARANS and FastCLARANS are of much practical relevance anymore.

The FastCLARANS reference implementation in Java is in ELKI: FastCLARANS but this contains data structures optimized for Java that we can do differently in Rust without the same overhead.

@kno10 kno10 added enhancement New feature or request help wanted Extra attention is needed needs funding Issues that would need funding to be completed labels Dec 11, 2023
@kno10 kno10 removed the help wanted Extra attention is needed label Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request needs funding Issues that would need funding to be completed
Projects
None yet
Development

No branches or pull requests

1 participant