A hybrid random forest-based feature selection model using mutual information and F-score for preterm birth classification
Authors: Himani S. Deshpande, Leena Ragha
Journal: International Journal of Medical Engineering and Informatics (JCR Q3, Medicine)
Published: 2023-01-01
DOI: https://doi.org/10.1504/ijmei.2023.127257
Every woman's body is unique, and certain features play a vital role in a healthy pregnancy; deciding manually which features should be monitored to prevent pregnancy complications is difficult. In this work we consider 21 physical features of 903 women of varied age groups, economic status and health conditions. A variation and information-based random forest (VIBRF) hybrid model using mutual information and F-score is applied to evaluate each feature, considering both the variation within the feature and the mutual information across features. We experimented with various classifiers and observed that Gaussian naive Bayes (GNB) shows the most significant improvement in prediction accuracy, from 31% with all features to 80% with our feature selection process. Although SVM reaches a prediction accuracy of 84%, the AUC of GNB improves drastically, by 10%. Because this is a medical application, achieving a higher AUC is important, and we therefore conclude that GNB performs better with the proposed model.
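To make the two scoring criteria concrete, here is a minimal sketch in plain Python (standard library only) that computes an ANOVA F-score and a binned mutual information estimate per feature and then merges the two rankings. It is not the authors' VIBRF implementation: the toy data, the equal-width binning, the helper names, and the rank-sum rule in `rank_features` are all assumptions made for illustration, since the abstract does not say how the two scores are combined or how the random forest is involved.

```python
import math
from collections import Counter


def f_score(values, labels):
    """One-way ANOVA F-statistic of one feature across the class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    between = sum(len(g) * (means[y] - grand_mean) ** 2 for y, g in groups.items())
    within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if within == 0:
        return math.inf if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def mutual_information(values, labels, bins=3):
    """Mutual information (nats) between an equal-width-binned feature and the labels."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant features
    binned = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(binned, labels))
    px, py = Counter(binned), Counter(labels)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def rank_features(columns, labels):
    """Order feature names by the sum of their F-score rank and MI rank.

    The rank-sum rule is an assumption for this sketch, not the paper's method.
    """
    names = list(columns)
    f = {name: f_score(columns[name], labels) for name in names}
    mi = {name: mutual_information(columns[name], labels) for name in names}
    f_rank = {name: r for r, name in enumerate(sorted(names, key=f.get, reverse=True))}
    mi_rank = {name: r for r, name in enumerate(sorted(names, key=mi.get, reverse=True))}
    return sorted(names, key=lambda name: f_rank[name] + mi_rank[name])


# Hypothetical toy data: 8 samples, binary label, three made-up features.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "noise": [5, 1, 9, 3, 5, 1, 9, 3],        # identical distribution in both classes
    "weak": [1, 2, 3, 4, 3, 4, 5, 6],         # overlapping class distributions
    "informative": [1, 2, 1, 2, 8, 9, 8, 9],  # cleanly separated classes
}

print(rank_features(columns, labels))
```

In a setting like the one described, the top-ranked features would then be passed to each classifier and compared on accuracy and AUC; the number of features to keep is a tuning choice that this sketch leaves open.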
About the journal:
IJMEI promotes an understanding of the structural/functional aspects of disease mechanisms and the application of technology towards the treatment/management of such diseases. It seeks to promote interdisciplinary collaboration between those interested in the theoretical and clinical aspects of medicine and to foster the application of computers and mathematics to problems arising from medical sciences. IJMEI includes authoritative review papers, the reporting of original research, and evaluation reports of new/existing techniques and devices. Each issue also contains a comprehensive information service. Topics covered include: hospital information/medical record systems, data protection/privacy; disease modelling/analysis, evidence-based clinical modelling/studies; computer-based patient/disease management systems; clinical trials/studies, outcome-based studies/analysis; electronic patient monitoring systems; nanotechnology in medicine, medical applications; tissue engineering, artificial organs, biomaterials design; healthcare standards, service standardisation; controlled medical terminology/vocabularies; nursing informatics, systems integration; healthcare/hospital management, economics; medical technology, intelligent instrumentation, telemedicine; medical/molecular imaging, disease management; bioinformatics, human genome studies/analysis; drug design.