A Feature Based Approach for Medical Databases

Ritu Chauhan, Harleen Kaur, Sukrati Sharma
Proceedings of the International Conference on Advances in Information Communication Technology & Computing
Published: 2016-08-12
DOI: 10.1145/2979779.2979873 (https://doi.org/10.1145/2979779.2979873)
Citations: 4

Abstract

Medical data mining is an emerging field that seeks to discover hidden knowledge in large datasets to support early diagnosis of disease. Large databases usually comprise numerous features, which may contain missing values, noise, and outliers, and such features can mislead future medical diagnosis. To deal with irrelevant and redundant features in large databases, proper data preprocessing techniques need to be applied. In past studies, data mining techniques such as feature selection have been applied effectively to deal with irrelevant, noisy, and redundant features. This paper describes the application of feature selection to pancreatic cancer patient data in order to conduct machine learning studies on collected patient records. We evaluated feature selection techniques, namely the Correlation-Based Filter Method (CFS) and Wrapper Subset Evaluation, using Naive Bayes and J48 (an implementation of C4.5) classifiers on medical databases, to analyze which data mining algorithms can effectively classify medical data for future medical diagnosis. Further, experiments were used to measure the effectiveness and efficiency of the feature selection algorithms. The experimental analysis proved beneficial for determining machine learning methods suited to the analysis of pancreatic cancer diagnosis.
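To make the filter approach named in the abstract more concrete, the following is a minimal sketch of correlation-based feature subset scoring. It uses the commonly cited CFS merit heuristic, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation of a k-feature subset. Several things here are assumptions and not taken from the paper: the use of absolute Pearson correlation as the correlation measure, the greedy forward search, the function names, and the tiny synthetic dataset (it is not patient data). The abstract does not say which correlation measure or search strategy the authors used.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(columns, target, subset):
    """CFS merit of a subset: rewards class correlation, penalizes redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(columns[f], target)) for f in subset) / k
    # Mean absolute feature-feature correlation (0.0 for a single feature).
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, target, tol=1e-12):
    """Greedy forward search (an assumed strategy): stop when merit stops improving."""
    selected, best = [], 0.0
    remaining = sorted(columns)
    while remaining:
        scores = {f: cfs_merit(columns, target, selected + [f]) for f in remaining}
        feat = max(remaining, key=scores.get)
        if scores[feat] <= best + tol:
            break
        selected.append(feat)
        remaining.remove(feat)
        best = scores[feat]
    return selected, best


# Synthetic toy data (hypothetical names, not from the paper's records).
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "marker_a": [1, 2, 1, 2, 5, 6, 5, 6],            # informative
    "marker_a_copy": [2, 4, 2, 4, 10, 12, 10, 12],   # fully redundant with marker_a
    "noise": [3, 1, 3, 1, 3, 1, 3, 1],               # uncorrelated with the class
}

selected, merit = cfs_forward_select(columns, target)
print(selected, round(merit, 4))
```

In this sketch the redundant copy and the noise column are both rejected, because neither raises the merit of the subset that already contains `marker_a`. A wrapper evaluation, by contrast, would score each candidate subset by the cross-validated accuracy of a specific classifier (such as Naive Bayes or J48 in the paper), which is typically more expensive but tailored to that classifier.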