r - Where can I find a good set of benchmark clustering datasets with ground truth labels?
I am looking for a clustering dataset with "ground truth" labels for some known natural clustering, preferably with high dimensionality.
I found some candidates here (http://cs.joensuu.fi/sipu/datasets/), of which the Glass and Iris data sets have labels for the points. I also found code to generate Gaussian datasets (syndeca). The main reason is that I want to compare distance metrics and clustering methods. It's difficult to use external (extrinsic) evaluation criteria, as many of them are biased towards Euclidean distances; also, there are so many to choose from.
Thanks!
There are many data sets at the UCI Machine Learning Repository.
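If no real data set fits, a labeled synthetic one is easy to generate. Below is a minimal sketch in Python (not R, and using only the standard library) that draws points from isotropic Gaussians with known labels and scores a labeling against that ground truth with the adjusted Rand index. The names `make_gaussian_clusters`, `adjusted_rand_index` and `assign_to_nearest` are made up for this illustration, and the nearest-center assignment is only a stand-in for a real clustering method, so treat the numbers it prints as a toy comparison under those assumptions.

```python
import random
from collections import Counter
from math import comb


def make_gaussian_clusters(n_per_cluster, centers, sigma=1.0, seed=0):
    """Return (points, labels) drawn from isotropic Gaussians around `centers`."""
    rng = random.Random(seed)  # local generator so results are reproducible
    points, labels = [], []
    for label, center in enumerate(centers):
        for _ in range(n_per_cluster):
            points.append([rng.gauss(mu, sigma) for mu in center])
            labels.append(label)  # ground truth: index of the generating center
    return points, labels


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same points."""
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:  # fewer than two points: nothing to compare
        return 1.0
    # Pairs of points that share a cluster in both, in truth, and in prediction.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # both labelings are one cluster, or all singletons
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def assign_to_nearest(points, centers, distance):
    """Toy stand-in for a clustering method: label each point by its nearest center."""
    return [min(range(len(centers)), key=lambda k: distance(p, centers[k]))
            for p in points]


# Assumed setup for illustration: 3 overlapping clusters in 20 dimensions.
dim = 20
centers = [[0.0] * dim, [0.5] * dim, [-0.5] * dim]
points, truth = make_gaussian_clusters(100, centers, sigma=1.0, seed=42)

for name, distance in [("euclidean", euclidean), ("manhattan", manhattan)]:
    predicted = assign_to_nearest(points, centers, distance)
    print(name, round(adjusted_rand_index(truth, predicted), 3))
```

The index only looks at which points share a label, so it does not depend on the distance metric used to produce the labeling. Swapping `assign_to_nearest` for an actual clustering routine keeps the rest of the comparison unchanged.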