Research on the Selection Method of Cluster Number of Cluster Members
DOI:10.16018/j.cnki.cn32-1650/n.202204010
Keywords: machine learning; cluster analysis; cluster ensemble; cluster number selection
Abstract:
The number of clusters used by the cluster members has a significant impact on the result of a cluster ensemble, yet no systematic study of this choice has been reported so far. This paper therefore systematically compares different cluster number selection methods in order to identify a better way of setting the cluster number of cluster members. First, five common methods for setting the number of clusters are analyzed and their performance is compared. The results show that the best cluster ensemble result is obtained when the number of clusters k equals the actual number of labels k*. To explore whether a better setting exists, six shorter intervals are then taken from the range k* to 2k* as selection ranges for the number of clusters, and the resulting cluster ensembles are compared with those obtained using k*. Experimental results on benchmark datasets show that the cluster ensemble performs best when the cluster number of the cluster members equals the number of classes in the ground-truth labels of the dataset.
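The comparison described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' procedure: the abstract names neither the base clustering algorithm, the consensus function, the evaluation index, nor the six sub-intervals. The sketch assumes k-means as the base clusterer, average-link agglomeration on a co-association matrix as the consensus function, the Rand index as the score, and the full range k* to 2k* as a stand-in for the unspecified intervals. The toy data and all function names are hypothetical.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, iters=20):
    """Tiny Lloyd's k-means on 2-D tuples; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                + (p[1] - centers[c][1]) ** 2)
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels


def co_association(members, n):
    """Fraction of cluster members that place points i and j together."""
    co = [[0.0] * n for _ in range(n)]
    for labels in members:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1.0 / len(members)
    return co


def consensus(co, k_final):
    """Average-link agglomeration on co-association, down to k_final clusters."""
    clusters = [[i] for i in range(len(co))]

    def similarity(pair):
        a, b = clusters[pair[0]], clusters[pair[1]]
        return sum(co[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k_final:
        a, b = max(combinations(range(len(clusters)), 2), key=similarity)
        clusters[a] += clusters[b]  # a < b, so deleting b keeps a valid
        del clusters[b]
    labels = [0] * len(co)
    for c, idxs in enumerate(clusters):
        for i in idxs:
            labels[i] = c
    return labels


def ensemble(points, k_choices, k_star, rng, n_members=10):
    """Build cluster members with k drawn from k_choices, then merge them."""
    members = [kmeans(points, rng.choice(k_choices), rng)
               for _ in range(n_members)]
    return consensus(co_association(members, len(points)), k_star)


def rand_index(a, b):
    """Share of point pairs on which two labelings agree (same/different)."""
    pairs = list(combinations(range(len(a)), 2))
    return sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs) / len(pairs)


# Hypothetical toy data: three Gaussian blobs, so k* = 3.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points, truth = [], []
for label, (cx, cy) in enumerate(blob_centers):
    for _ in range(10):
        points.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
        truth.append(label)
k_star = len(blob_centers)

# Two k-setting strategies: k fixed at k*, and k drawn from k*..2k*.
strategies = {
    "fixed k*": [k_star],
    "random in [k*, 2k*]": list(range(k_star, 2 * k_star + 1)),
}
scores = {name: rand_index(truth, ensemble(points, ks, k_star, rng))
          for name, ks in strategies.items()}
for name, score in scores.items():
    print(f"{name}: Rand index = {score:.3f}")
```

Scores on such a small toy set depend on the random seed and say nothing about the benchmark results reported in the paper; the sketch only shows where the choice of k enters the ensemble pipeline.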