基于规则的数据挖掘在心血管疾病诊断特征选择中的K-Fold交叉验证

Dwi Normawati, Dewi Pramudi Ismi
{"title":"基于规则的数据挖掘在心血管疾病诊断特征选择中的K-Fold交叉验证","authors":"Dwi Normawati, Dewi Pramudi Ismi","doi":"10.31763/simple.v1i2.3","DOIUrl":null,"url":null,"abstract":"Coronary heart disease is a disease that often causes human death, occurs when there is atherosclerosis blocking blood flow to the heart muscle in the coronary arteries. The doctor's referral method for diagnosing coronary heart disease is coronary angiography, but it is invasive, high risk and expensive. The purpose of this study is to analyze the effect of implementing the k-Fold Cross Validation (CV) dataset on the rule-based feature selection to diagnose coronary heart disease, using the Cleveland heart disease dataset. The research conducted a feature selection using a medical expert-based (MFS) and computer-based method, namely the Variable Precision Rough Set (VPRS), which is the development of the Rough Set theory. Evaluation of classification performance using the k-Fold method of 10-Fold, 5-Fold and 3-Fold. The results of the study are the number of attributes of the feature selection results are different in each Fold, both for the VPRS and MFS methods, for accuracy values obtained from the average accuracy resulting from 10-Fold, 5-Fold and 3-Fold. The result was the highest accuracy value in the VPRS method 76.34% with k = 5, while the MTF accuracy was 71.281% with k = 3. So, the k-fold implementation for this case is less effective, because the division of data is still structured, according to the order of records that apply in each fold, while the amount of testing data is too small and too structured. This affects the results of the accuracy because the testing rules are not thoroughly represented","PeriodicalId":115994,"journal":{"name":"Signal and Image Processing Letters","volume":"76 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"K-Fold Cross Validation for Selection of Cardiovascular Disease Diagnosis Features by Applying Rule-Based Datamining\",\"authors\":\"Dwi Normawati, Dewi Pramudi Ismi\",\"doi\":\"10.31763/simple.v1i2.3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Coronary heart disease is a disease that often causes human death, occurs when there is atherosclerosis blocking blood flow to the heart muscle in the coronary arteries. The doctor's referral method for diagnosing coronary heart disease is coronary angiography, but it is invasive, high risk and expensive. The purpose of this study is to analyze the effect of implementing the k-Fold Cross Validation (CV) dataset on the rule-based feature selection to diagnose coronary heart disease, using the Cleveland heart disease dataset. The research conducted a feature selection using a medical expert-based (MFS) and computer-based method, namely the Variable Precision Rough Set (VPRS), which is the development of the Rough Set theory. Evaluation of classification performance using the k-Fold method of 10-Fold, 5-Fold and 3-Fold. The results of the study are the number of attributes of the feature selection results are different in each Fold, both for the VPRS and MFS methods, for accuracy values obtained from the average accuracy resulting from 10-Fold, 5-Fold and 3-Fold. The result was the highest accuracy value in the VPRS method 76.34% with k = 5, while the MTF accuracy was 71.281% with k = 3. So, the k-fold implementation for this case is less effective, because the division of data is still structured, according to the order of records that apply in each fold, while the amount of testing data is too small and too structured. This affects the results of the accuracy because the testing rules are not thoroughly represented\",\"PeriodicalId\":115994,\"journal\":{\"name\":\"Signal and Image Processing Letters\",\"volume\":\"76 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Signal and Image Processing Letters\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.31763/simple.v1i2.3\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Signal and Image Processing Letters","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.31763/simple.v1i2.3","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

摘要

冠心病是一种经常导致人类死亡的疾病,发生在动脉粥样硬化阻塞冠状动脉中流向心脏肌肉的血液时。医生推荐的诊断冠心病的方法是冠状动脉造影,但它是有创的、高风险的、昂贵的。本研究的目的是利用克利夫兰心脏病数据集,分析实施k-Fold交叉验证(CV)数据集对基于规则的特征选择诊断冠心病的影响。本研究采用基于医学专家(MFS)和基于计算机的方法进行特征选择,即变精度粗糙集(VPRS),这是粗糙集理论的发展。使用10-Fold、5-Fold和3-Fold的k-Fold方法评价分类性能。研究结果表明,无论是VPRS方法还是MFS方法,从10-Fold、5-Fold和3-Fold的平均精度得到的精度值,在每个Fold中特征选择结果的属性数量都是不同的。结果表明,k = 5时,VPRS法的准确率最高,为76.34%;k = 3时,MTF法的准确率最高,为71.281%。因此,在这种情况下,k-fold实现的效果较差,因为数据的划分仍然是结构化的,根据每个折叠中应用的记录的顺序,而测试数据的数量太小且过于结构化。这影响了准确性的结果,因为测试规则没有完全表示出来
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
K-Fold Cross Validation for Selection of Cardiovascular Disease Diagnosis Features by Applying Rule-Based Datamining
Coronary heart disease is a disease that often causes human death, occurs when there is atherosclerosis blocking blood flow to the heart muscle in the coronary arteries. The doctor's referral method for diagnosing coronary heart disease is coronary angiography, but it is invasive, high risk and expensive. The purpose of this study is to analyze the effect of implementing the k-Fold Cross Validation (CV) dataset on the rule-based feature selection to diagnose coronary heart disease, using the Cleveland heart disease dataset. The research conducted a feature selection using a medical expert-based (MFS) and computer-based method, namely the Variable Precision Rough Set (VPRS), which is the development of the Rough Set theory. Evaluation of classification performance using the k-Fold method of 10-Fold, 5-Fold and 3-Fold. The results of the study are the number of attributes of the feature selection results are different in each Fold, both for the VPRS and MFS methods, for accuracy values obtained from the average accuracy resulting from 10-Fold, 5-Fold and 3-Fold. The result was the highest accuracy value in the VPRS method 76.34% with k = 5, while the MTF accuracy was 71.281% with k = 3. So, the k-fold implementation for this case is less effective, because the division of data is still structured, according to the order of records that apply in each fold, while the amount of testing data is too small and too structured. This affects the results of the accuracy because the testing rules are not thoroughly represented
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
IoT-Based Low-Cost Children Tracker System A Review on Stability Challenges and Probable Solution of Perovskite–Silicon Tandem Solar Cells Numerical Simulation of Highly Efficient Cs2TiI6 based Pb Free Perovskites Solar Cell with the Help of Optimized ETL and HTL Using SCAPS-1D Software Smart fan using room temperature sensor and human movement Electricity power monitoring based on internet of things
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1