Improved point center algorithm for K-Means clustering to increase software defect prediction

Riski Annisa, D. Rosiyadi, D. Riana
International Journal of Advances in Intelligent Informatics, pp. 328-339. Published 2020-11-06.
DOI: https://doi.org/10.26555/IJAIN.V6I3.484
Citations: 7

Abstract

K-means is a widely used and easy-to-use clustering algorithm. However, it is sensitive to randomly chosen initial centroids and may therefore fail to produce optimal results. This research aimed to improve the performance of k-means by applying a proposed algorithm called point center. The proposed algorithm overcame the random initial centroid values in k-means and was then applied to predict errors in software defect modules. Point center determines the initial centroid values used to optimize k-means; the selection of the X and Y variables then determines the members of each cluster center. Ten datasets were used for testing, nine of which were software defect prediction datasets. The proposed point center algorithm showed the lowest errors. Compared with simple k-means using randomly obtained centroid values, it improved performance on the software datasets by an average of 12.82% in cluster errors. The findings contribute to the development of clustering models for handling data, for example to predict defective software modules more accurately.
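The abstract does not spell out how point center chooses its initial centroids, so the following is only a minimal sketch of the general idea it relies on: running k-means from caller-supplied initial centroids instead of random ones. The initializer `sorted_slice_init` is a hypothetical stand-in (sort the points by x, then y, cut them into k contiguous slices, and use each slice mean); it is an assumption for illustration and not the authors' method. The helper names `kmeans`, `sse`, and `random_init` are likewise illustrative.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's k-means started from the given initial centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[lab])
               for p, lab in zip(points, labels))


def random_init(points, k, seed):
    """Baseline: pick k distinct data points at random as initial centroids."""
    return random.Random(seed).sample(points, k)


def sorted_slice_init(points, k):
    """Hypothetical deterministic initializer (NOT the paper's point center):
    sort by (x, y), cut into k contiguous slices, return each slice's mean.
    Assumes len(points) >= k."""
    ordered = sorted(points)
    size = len(ordered) // k
    centroids = []
    for j in range(k):
        end = (j + 1) * size if j < k - 1 else len(ordered)
        chunk = ordered[j * size:end]
        centroids.append(tuple(sum(col) / len(chunk) for col in zip(*chunk)))
    return centroids


# Toy (x, y) data; real defect datasets would use two chosen metric columns.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
det_centroids, det_labels = kmeans(points, sorted_slice_init(points, 2))
rnd_centroids, rnd_labels = kmeans(points, random_init(points, 2, seed=0))
print("deterministic init SSE:", sse(points, det_centroids, det_labels))
print("random init SSE:", sse(points, rnd_centroids, rnd_labels))
```

A deterministic initializer returns the same clustering on every run, whereas the random baseline can vary with the seed. On harder data than this toy set, that variation is what can leave simple k-means in a poor local optimum.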