{"title":"A Feature Based Approach for Medical Databases","authors":"Ritu Chauhan, Harleen Kaur, Sukrati Sharma","doi":"10.1145/2979779.2979873","DOIUrl":null,"url":null,"abstract":"Medical data mining is an emerging field that seeks to discover hidden knowledge in large datasets to support early diagnosis of disease. Large databases usually comprise numerous features, which may contain missing values, noise, and outliers; such features can mislead future medical diagnosis. To deal with irrelevant and redundant features in large databases, proper data preprocessing techniques need to be applied. In past studies, data mining techniques such as feature selection have been applied effectively to deal with irrelevant, noisy, and redundant features. This paper describes the application of data mining techniques, using feature selection, to records collected from pancreatic cancer patients in order to conduct machine learning studies. We evaluated feature selection techniques such as the Correlation-Based Filter Method (CFS) and Wrapper Subset Evaluation, using Naive Bayes and J48 (an implementation of C4.5) classifiers on medical databases, to identify data mining algorithms that can effectively classify medical data for future medical diagnosis. Further, experiments were used to measure the effectiveness and efficiency of the feature selection algorithms. The experimental analysis proved beneficial in determining which machine learning methods are effective for the analysis of pancreatic cancer diagnosis.","PeriodicalId":298730,"journal":{"name":"Proceedings of the International Conference on Advances in Information Communication Technology & Computing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the International Conference on Advances in Information Communication Technology & Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2979779.2979873","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Medical data mining is an emerging field that seeks to discover hidden knowledge in large datasets to support early diagnosis of disease. Large databases usually comprise numerous features, which may contain missing values, noise, and outliers; such features can mislead future medical diagnosis. To deal with irrelevant and redundant features in large databases, proper data preprocessing techniques need to be applied. In past studies, data mining techniques such as feature selection have been applied effectively to deal with irrelevant, noisy, and redundant features. This paper describes the application of data mining techniques, using feature selection, to records collected from pancreatic cancer patients in order to conduct machine learning studies. We evaluated feature selection techniques such as the Correlation-Based Filter Method (CFS) and Wrapper Subset Evaluation, using Naive Bayes and J48 (an implementation of C4.5) classifiers on medical databases, to identify data mining algorithms that can effectively classify medical data for future medical diagnosis. Further, experiments were used to measure the effectiveness and efficiency of the feature selection algorithms. The experimental analysis proved beneficial in determining which machine learning methods are effective for the analysis of pancreatic cancer diagnosis.
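
To make the correlation-based filter idea concrete, the following is a minimal sketch of CFS-style subset scoring. It is not the authors' implementation. It assumes the commonly cited merit heuristic k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff is the mean feature-feature correlation. As simplifying assumptions, it uses absolute Pearson correlation and a plain greedy forward search, and it runs on synthetic data rather than patient records. All function and variable names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(columns, target, subset):
    """Merit of a feature subset: high class correlation, low inter-feature correlation."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[i], target)) for i in subset) / k
    if k == 1:
        return r_cf
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, target):
    """Greedy forward search: add the best feature until the merit stops improving."""
    selected, best = [], 0.0
    while len(selected) < len(columns):
        candidates = [i for i in range(len(columns)) if i not in selected]
        merit, idx = max((cfs_merit(columns, target, selected + [i]), i) for i in candidates)
        if merit <= best:
            break
        selected.append(idx)
        best = merit
    return selected, best


# Synthetic data (an assumption for illustration only).
rng = random.Random(0)
n = 200
f0 = [rng.gauss(0, 1) for _ in range(n)]    # informative
f1 = [v + rng.gauss(0, 0.05) for v in f0]   # redundant near-copy of f0
f2 = [rng.gauss(0, 1) for _ in range(n)]    # informative, independent of f0
f3 = [rng.gauss(0, 1) for _ in range(n)]    # irrelevant noise
target = [1 if a + b > 0 else 0 for a, b in zip(f0, f2)]

selected, merit = cfs_forward_select([f0, f1, f2, f3], target)
print("selected feature indices:", selected, "merit:", round(merit, 3))
```

In this sketch the redundant copy and the noise feature lower the merit when added, so the search is expected to keep one of the two correlated features together with the independent informative one. A wrapper approach would instead score each candidate subset by the cross-validated accuracy of a classifier such as Naive Bayes or J48, which is typically more expensive to compute.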