Review of Computer Engineering Research: Latest Articles


Machine learning algorithms-based decision support model for diabetes

Review of Computer Engineering Research

Pub Date : 2024-01-11 DOI: 10.18488/76.v11i1.3598

Karthick Kanagarathinam, R. Manikandan, T. S. Kumar

This research explores how machine learning (ML)-based risk prediction models can help healthcare professionals detect diabetes early. Diabetes affects millions of people worldwide. Significant advances in the biomedical sciences have generated vast volumes of data, including high-throughput genetic and diagnostic data drawn from extensive health records. Using an initial diabetes risk prediction dataset from the University of California Irvine (UCI) ML repository, our research focused on supervised learning techniques, which made up 85% of the methods employed; the remaining 15% were unsupervised learning approaches, specifically association rules. A key contribution of this study is the development of an optimal prediction model using supervised ML algorithms. The Boruta feature selection algorithm was used to identify pertinent features, and the resulting models were validated on a preprocessed dataset containing 10 attributes. The risk prediction models built with random forest, extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) achieved average accuracies of 98.13%, 97.37%, and 97.22%, respectively, as determined by 10-fold cross-validation with 15 repetitions. These models also achieved area under the ROC curve (AUC) values of 1, 0.99, and 0.99, respectively, which supports their robustness and efficacy in diabetes risk prediction.
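The evaluation protocol mentioned in the abstract, 10-fold cross-validation with 15 repetitions, can be sketched as follows. This is only an illustration in plain Python, not the authors' code: the helper names (`repeated_kfold_indices`, `mean_cv_accuracy`), the toy data, and the majority-class baseline are all hypothetical, and the splits here are not stratified by class, which the original study may or may not have done.

```python
import random

def repeated_kfold_indices(n_samples, n_splits=10, n_repeats=15, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repetition
        # Fold i takes every n_splits-th element, so fold sizes differ by at most 1.
        folds = [order[i::n_splits] for i in range(n_splits)]
        for i, test_idx in enumerate(folds):
            train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
            yield train_idx, test_idx

def mean_cv_accuracy(X, y, fit, predict, n_splits=10, n_repeats=15, seed=0):
    """Average held-out accuracy over all n_splits * n_repeats evaluations."""
    scores = []
    for train_idx, test_idx in repeated_kfold_indices(len(X), n_splits, n_repeats, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)

# Toy usage: a majority-class baseline stands in for a real classifier
# (hypothetical; the paper used random forest, XGBoost, and LightGBM).
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)

def predict_majority(model, x):
    return model  # always predicts the majority label seen in training

X = [[i] for i in range(100)]
y = [1] * 60 + [0] * 40
acc = mean_cv_accuracy(X, y, fit_majority, predict_majority)
print(f"mean accuracy over 150 folds: {acc:.3f}")
```

With 10 folds and 15 repetitions, each model is trained and scored 150 times, and the reported accuracy is the mean of those scores. Repeating the shuffle reduces the dependence of the estimate on any single partition of the data.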



Citations: 0

Software reliability prediction using ensemble learning with random hyperparameter optimization

Review of Computer Engineering Research

Pub Date : 2024-01-10 DOI: 10.18488/76.v11i1.3597

G. Habtemariam, Sudhir Kumar Mohapatra, H. Seid

The paper investigates software reliability prediction using ensemble learning with random hyperparameter optimization. Software reliability is a significant software quality problem that developers face, and it involves accurately predicting the next failure. In recent years, machine learning techniques and ensemble learning approaches have been applied to improve software reliability prediction. These approaches analyze historical data and build models that can accurately forecast when failures are likely to occur. The article proposes an ensemble learning regression model that uses Ridge, Bayesian Ridge, Support Vector Regressor (SVR), K-Nearest Neighbors (KNN), Regression Tree, Random Forest, Neural Network, and Decision Tree as base learners, with Ridge as the combiner model. The hyperparameters of each base learner are tuned automatically using a random search algorithm, which selects hyperparameter values and adjusts them to counter overfitting and underfitting. The base models are tuned to minimize bias and variance. Model performance is evaluated using standard error measures such as Mean Squared Error (MSE), Sum of Squared Error (SSE), and Normalized Root Mean Square Error (NRMSE). The proposed ensemble model is compared with existing models on benchmark datasets: the Iyer and Lee and the Musa datasets are used for the experiment. The data are preprocessed using standard methods such as logarithmic scaling, lagging, and linear interpolation. The statistical comparison shows that the proposed model performs better than the existing models.
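The error measures and the random hyperparameter search named in the abstract can be illustrated with a small sketch. It is written under several assumptions that are not taken from the paper: NRMSE is normalised by the range of the observed values (other conventions divide by the mean), a one-feature ridge model without intercept stands in for a base learner, and the search samples the penalty `alpha` log-uniformly from a hypothetical interval. All function names and data values are illustrative.

```python
import math
import random

def sse(y_true, y_pred):
    """Sum of squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

def mse(y_true, y_pred):
    """Mean squared error."""
    return sse(y_true, y_pred) / len(y_true)

def nrmse(y_true, y_pred):
    """RMSE divided by the range of y_true (assumed normalisation)."""
    return math.sqrt(mse(y_true, y_pred)) / (max(y_true) - min(y_true))

def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge slope for a single feature and no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def random_search_alpha(train, valid, n_trials=50, seed=0):
    """Sample alpha log-uniformly from [1e-3, 1e3]; keep the lowest validation MSE."""
    rng = random.Random(seed)
    (x_tr, y_tr), (x_va, y_va) = train, valid
    best = None
    for _ in range(n_trials):
        alpha = 10 ** rng.uniform(-3, 3)
        w = fit_ridge_1d(x_tr, y_tr, alpha)
        err = mse(y_va, [w * x for x in x_va])
        if best is None or err < best[1]:
            best = (alpha, err, w)
    return best  # (alpha, validation MSE, fitted slope)

# Error measures on made-up failure-time values.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
print(sse(y_true, y_pred), mse(y_true, y_pred), nrmse(y_true, y_pred))

# Random search on a made-up train/validation split.
train = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
valid = ([5.0, 6.0], [10.1, 11.9])
alpha, val_mse, slope = random_search_alpha(train, valid)
print(alpha, val_mse, slope)
```

Scoring each candidate on held-out data rather than on the training data is what lets the search trade off underfitting (penalty too large) against overfitting (penalty too small).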



Citations: 0

Pneumonia and tuberculosis detection with chest x-ray images and medical records using deep learning techniques

Review of Computer Engineering Research

Pub Date : 2023-11-28 DOI: 10.18488/76.v10i4.3533

Sudhir Kumar Mohapatra, Mesfin Abebe, Lidia Mekuanint, Srinivas Prasad, Prasanta Kumar Bala, Sunil Kumar Dhala

Pneumonia and tuberculosis are major public health problems worldwide. Both diseases affect the lungs and can be fatal if they are not diagnosed correctly and in time. Chest x-ray images are widely used to detect and diagnose pneumonia and tuberculosis, but detection from chest x-rays is difficult and requires experience because the two diseases share similar pathological features, and this similarity sometimes leads to misdiagnosis. Several researchers have used deep learning and machine learning techniques to address this misdiagnosis problem. However, these studies used only chest x-ray images to develop their pneumonia and tuberculosis detection models, and chest x-ray images alone do not necessarily lead to accurate disease detection and classification. In the traditional, manual approach, medical records are required to support and correctly interpret the chest x-ray images in the appropriate clinical context. This study develops a multi-input pneumonia and tuberculosis detection model that uses both chest x-ray images and medical records, following the clinical procedure. The study applied a Convolutional Neural Network to the chest x-ray image data and a Multilayer Perceptron to the medical record data. We implemented feature-level concatenation to join the output feature vectors of the Convolutional Neural Network and the Multilayer Perceptron in the disease detection model. For comparison, we also developed image-only and medical record-only models. The image-only model achieves an accuracy of 92.68%, the medical record-only model achieves 98.72%, and the combined model improves accuracy to 99.61%. Overall, the study shows that fusing chest x-ray images with medical records yields better accuracy and more closely resembles the clinical approach.
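Feature-level concatenation, as described in the abstract, joins the feature vector produced by the image branch with the one produced by the medical-record branch before a shared classification layer. The sketch below shows only that fusion step in plain Python. Everything specific in it is an assumption for illustration: the feature dimensions (4 and 3), the three output classes, and all numeric values are made up, and the two branch outputs are given as fixed lists rather than computed by a trained Convolutional Neural Network or Multilayer Perceptron.

```python
import math

def concat_features(image_features, record_features):
    """Feature-level fusion: place the two branch outputs side by side."""
    return list(image_features) + list(record_features)

def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical branch outputs for one patient.
image_features = [0.2, 0.7, 0.1, 0.5]   # stand-in for the CNN feature vector
record_features = [1.0, 0.0, 0.3]       # stand-in for the MLP feature vector

fused = concat_features(image_features, record_features)  # length 4 + 3 = 7

# Hypothetical classification head with three classes (an assumption;
# the abstract does not state the number of output classes).
weights = [
    [0.1, -0.2, 0.3, 0.0, 0.5, -0.1, 0.2],
    [-0.3, 0.4, 0.1, 0.2, -0.2, 0.3, 0.0],
    [0.2, 0.1, -0.4, 0.3, 0.1, 0.0, -0.5],
]
bias = [0.0, 0.1, -0.1]

probs = softmax(dense(fused, weights, bias))
print(len(fused), probs)
```

Because the classification head sees both sets of features at once, it can weigh image evidence against record evidence for each patient, which is the property the combined model relies on.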



Citations: 0
