I'm using both `relative_validity_` and the full `validity_index` function from `hdbscan.validity`. @lmcinnes, if they give different optimal parameters, is there a reason to prefer one over the other? Perhaps `validity_index`, because the other one is approximate?
My application is NLP clustering of embedding vectors, and one of the things I'm testing is different embeddings with different dimensionalities. Is it valid to use either of those metrics to compare across embeddings for the same dataset, or only across the hdbscan parameters themselves?
Thank you so much!