The HDBSCAN algorithm doesn't naturally have a predict method for determining the cluster assignment of a new point because, strictly speaking, a new point could change the clustering of all the rest of the data. In Tribuo (and also in the Python scikit-learn-contrib HDBSCAN implementation) the prediction method is approximate and based on a nearest neighbour search over the keypoints. It would be possible to export that nearest neighbour search into ONNX, but we've not done that yet for any of Tribuo's nearest neighbour prediction models (K-NN, K-Means, HDBSCAN). Also, as ONNX doesn't naturally have a nearest neighbour op, the export would bake in an exhaustive search of the keypoints (e.g. in the way this scikit-learn K-NN ONNX converter does: https://github.com/onnx/sklearn-onnx/blob/main/skl2onnx/operator_converters/nearest_neighbours.py#L64). We'd accept contributions to add that kind of export support; otherwise it's on the backlog of ONNX model export features and we'll get to it at some point in the future.
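To make the idea of an exhaustive keypoint search concrete, here is a minimal sketch in plain Python. It is not Tribuo's or scikit-learn-contrib's actual code: the function name `predict_cluster`, the `(cluster_id, coordinates)` keypoint layout, and the use of Euclidean distance are all assumptions made for illustration, and any noise or outlier handling a real implementation performs is left out.

```python
import math


def predict_cluster(point, keypoints):
    """Assign `point` to the cluster of its closest keypoint.

    `keypoints` is a hypothetical list of (cluster_id, coordinates) pairs.
    Every keypoint is compared against the point, which is the kind of
    exhaustive search an ONNX export would have to bake in.
    """
    best_id = None
    best_dist = math.inf
    for cluster_id, coords in keypoints:
        # Euclidean distance is an assumption; other metrics could be swapped in.
        dist = math.dist(point, coords)
        if dist < best_dist:
            best_id, best_dist = cluster_id, dist
    return best_id, best_dist


# Two hypothetical keypoints, one per cluster.
example_keypoints = [(0, (0.0, 0.0)), (1, (10.0, 10.0))]
cluster_id, distance = predict_cluster((1.0, 1.0), example_keypoints)
```

The cost is linear in the number of keypoints for every prediction, which is acceptable when the keypoint set is small but is the reason a dedicated nearest neighbour op would be preferable.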
Tribuo is unlikely to ever support importing an HDBSCAN model into the HdbscanModel class. However, with a small amount of additional work we could support loading in an ONNX clustering model (currently Tribuo is missing the ClusterID version of this output adaptor class, which would be straightforward to add).
Ask the question
Is it possible to export an HDBSCAN model via ONNX? No such functionality exists in Python as far as I am aware.
Is your question about a specific ML algorithm or approach?
This is about clustering, specifically the HDBSCAN method
Is your question about a specific Tribuo class?
HdbscanModel.java
System details
Additional context
It would be great if I could get some general pointers on how export/import of HDBSCAN could be achieved.