Enhancing Classification Performance through FeatureBoostThyro: A Comparative Study of Machine Learning Algorithms and Feature Selection

D. Bhende, Gopal Sakarkar, Punam Khandar, Satyajit S. Uparkar, Arvind Bhave
International Journal of Online and Biomedical Engineering (iJOE). Journal article, published 2024-03-04. DOI: 10.3991/ijoe.v20i04.45413 (https://doi.org/10.3991/ijoe.v20i04.45413)
Citations: 0

Abstract

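Early-stage disease prediction is an important and challenging task in which machine learning techniques now play a significant role. Thyroid disease is a chronic endocrine disorder that affects approximately 42 million people in India. This paper investigates how classification performance can be improved with the novel 'FeatureBoostThyro' (FBT) model. The study evaluates five machine learning algorithms, namely stochastic gradient descent (SGD), k-nearest neighbors (KNN), logistic regression (LR), naive Bayes (NB), and support vector machine (SVM), in combination with a range of feature selection methods. It systematically examines how information gain, ReliefF, chi-square, Gini index, forward selection, backward selection, recursive feature elimination, and LASSO affect model performance across the chosen algorithms. The analysis shows notable variation in accuracy, precision, recall, and F1-score, which sheds light on the interplay between algorithm and feature selection method. A main contribution of the work is the FBT model, which the authors report consistently outperforms the other models across the feature selection methods, making it a promising tool for complex classification tasks. The findings contribute to a broader understanding of model selection and optimization in machine learning applications. The proposed model is evaluated on two datasets: a primary dataset acquired from Lata Mangeshkar Hospital in Nagpur and a secondary dataset obtained from UCI.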
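As a rough illustration of two ingredients named in the abstract, the sketch below ranks discrete features by information gain and computes accuracy, precision, recall, and F1-score for a binary prediction. It is not the FBT model, whose details the abstract does not give. The function names (`entropy`, `information_gain`, `binary_metrics`), the feature names, and the toy data are all hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data, not taken from the paper's datasets.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "feature_a": ["high", "high", "high", "low", "low", "low"],  # separates the classes
    "feature_b": ["x", "y", "x", "y", "x", "y"],                 # only weakly informative
}

# Rank features from most to least informative.
ranking = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
print(ranking)

# Score a hypothetical set of predictions against the true labels.
print(binary_metrics(labels, [1, 1, 0, 0, 0, 1]))
```

In a comparative study of this kind, a ranking step like this would typically be repeated for each feature selection method, with the top-ranked features passed to each classifier and the resulting metrics tabulated side by side.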