论文标题

基于密度估计的新索引用于聚类评估

A New Index for Clustering Evaluation Based on Density Estimation

论文作者

Liu, Gangli

论文摘要

引入了用于集群内部评估的新索引。该索引定义为两个子指标的混合物。第一个子指数$ i_a $称为模棱两可的索引;第二个子指数$ i_s $称为相似性索引。两个子指数的计算基于对数据分区的每个群集的密度估计。进行了一项实验以测试新指数的性能,并与其他六个内部聚类评估指数 - Calinski-Harabasz指数,Silhouette系数,Davies-Bouldin Index,CDBW,DBCV和Viasckde进行了比较。结果表明,新指数可显着改善其他内部聚类评估指数。

A new index for internal evaluation of clustering is introduced. The index is defined as a mixture of two sub-indices. The first sub-index $ I_a $ is called the Ambiguous Index; the second sub-index $ I_s $ is called the Similarity Index. Calculation of the two sub-indices is based on density estimation to each cluster of a partition of the data. An experiment is conducted to test the performance of the new index, and compared with six other internal clustering evaluation indices -- Calinski-Harabasz index, Silhouette coefficient, Davies-Bouldin index, CDbw, DBCV, and VIASCKDE, on a set of 145 datasets. The result shows the new index significantly improves other internal clustering evaluation indices.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源