-
Personally, I've found that HDBSCAN produces good results because it places outliers in the -1 topic. River, however, lacks this feature. Is there any way to ignore outliers when using River?
Replies: 1 comment 1 reply
-
The difficulty here is the definition of an outlier. In online clustering, a point that looks like an outlier at timestep t might no longer be one at timestep t+10, once the model has seen much more data. The reverse is also possible. I believe some algorithms in River take outliers into account, like DenStream, but I have not used it myself yet. Some tweaking would probably be necessary there to generate the `-1` clusters.
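As a rough sketch of the kind of tweaking I mean (this is not a River feature, just a heuristic layered on top): a wrapper can count how many points each cluster id has received so far and relabel anything in a still-small cluster as `-1`. The class name `OutlierAwareWrapper`, the `min_cluster_size` parameter, and the scikit-learn-style `partial_fit`/`labels_` interface are all assumptions for illustration. `ToyClusterer` is only a stand-in so the snippet runs without River installed; in practice you would pass a River clusterer, which exposes the same `learn_one`/`predict_one` methods.

```python
from collections import Counter


class ToyClusterer:
    """Stand-in with the learn_one/predict_one shape of a River clusterer.

    Not a real algorithm: it buckets points by their first feature.
    """

    def learn_one(self, x):
        pass  # a real online clusterer would update its state here

    def predict_one(self, x):
        return int(x[0] // 10)


class OutlierAwareWrapper:
    """Hypothetical helper: relabels members of small clusters as -1."""

    def __init__(self, model, min_cluster_size=3):
        self.model = model
        self.min_cluster_size = min_cluster_size
        self.counts = Counter()  # points seen per cluster id, across batches

    def partial_fit(self, X):
        # River models consume one sample at a time, as a dict of features.
        samples = [dict(enumerate(row)) for row in X]
        for x in samples:
            self.model.learn_one(x)

        raw_labels = [self.model.predict_one(x) for x in samples]
        self.counts.update(raw_labels)

        # Assumption: cluster ids stay stable between batches. Some online
        # algorithms renumber clusters, which would break this counting.
        self.labels_ = [
            label if self.counts[label] >= self.min_cluster_size else -1
            for label in raw_labels
        ]
        return self


wrapper = OutlierAwareWrapper(ToyClusterer(), min_cluster_size=3)

# First batch: the point near 55 is alone in its cluster, so it becomes -1.
first_labels = wrapper.partial_fit(
    [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [55.0, 0.0]]
).labels_

# Later batch: that region has now seen enough points to count as a cluster.
second_labels = wrapper.partial_fit([[56.0, 0.0], [57.0, 0.0]]).labels_
```

Note that this shows exactly the issue described above: the point near 55 is an outlier in the first batch, but the same region is a regular cluster a batch later, and labels already handed out for earlier batches are not revised.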