
Turkish Journal of Engineering: Latest Articles

Effect of dimension reduction with PCA and machine learning algorithms on diabetes diagnosis performance
Pub Date : 2024-07-05 DOI: 10.31127/tuje.1413087
Yavuz Bahadir Koca, Elif Aktepe
Diabetes, a long-term metabolic disorder, causes persistently high blood sugar and presents a significant global health challenge. Early diagnosis is of vital importance in mitigating the effects of diabetes. This study aims to investigate diabetes diagnosis and risk prediction using a comprehensive diabetes dataset created in 2023. The dataset contains clinical and anthropometric data of patients. Data simplification was successfully applied to clean unnecessary information and reduce data dimensionality. Additionally, methods like Principal Component Analysis were applied to decrease the number of variables in the dataset. These analyses rendered the dataset more manageable and improved its performance. In this study, a dataset encompassing health data of a total of 100,000 individuals was utilized. This dataset consists of 8 input features and 1 output feature. The primary objective is to determine the algorithm that exhibits the best performance for diabetes diagnosis. There was no missing data during the data preprocessing stage, and the necessary transformations were carried out successfully. Nine different machine learning algorithms were applied to the dataset in this study. Each algorithm employed various modelling approaches to evaluate its performance in diagnosing diabetes. The results demonstrate that machine learning models are successful in predicting the presence of diabetes and the risk of developing it in healthy individuals. Particularly, the random forest model provided superior results across all performance metrics. This study provides significant findings that can shed light on future research in diabetes diagnosis and risk prediction. Dimensionality reduction techniques have proven to be valuable in data analysis and have highlighted the potential to facilitate diabetes diagnosis, thereby enhancing the quality of life for patients.
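The dimensionality-reduction step mentioned in the abstract can be illustrated with a minimal sketch of Principal Component Analysis. This is not the authors' implementation: the toy data, the function names, and the use of power iteration to find only the first component are all assumptions made for illustration, and the real dataset has 8 input features rather than 2.

```python
import math


def center(rows):
    """Subtract the per-column mean so every feature has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]


def first_component(cov, n_iter=500):
    """Approximate the leading eigenvector of cov by power iteration.

    Assumes cov is non-zero and the start vector is not orthogonal
    to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Reduce each row to a single score along the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]


# Hypothetical toy data: two perfectly correlated features (y = 2x).
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(data)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)
```

Here the two correlated columns collapse into one score per row, which is the sense in which PCA reduces the number of variables before a classifier is trained.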
Citations: 0
Detection of cotton leaf disease with machine learning model
Pub Date : 2024-04-18 DOI: 10.31127/tuje.1406755
Unain Hyder, Mir Rahib Hussain
This study aims to use a machine learning (ML) model to accurately classify four datasets of cotton crop leaves as either infected or healthy. Bacterial blight, Curly virus, Fusarium wilt, and healthy leaves were used as the datasets for the study. ML is a useful tool for detecting cotton leaf diseases and can help minimize the rate of disease. Without machine learning techniques, detecting these diseases is difficult and time-consuming. To address this problem, a machine learning model is proposed, and a confusion matrix is used to test its accuracy. Earlier researchers have used ML models to diagnose these diseases, but a drawback of that work was that the results given by the different ML models were not accurate. The target of the study was to identify diseases affecting the cotton plant in the early stages using traditional techniques. However, utilizing various image processing techniques and machine learning algorithms, including a convolutional neural network, proved to be helpful in diagnosing the diseases. This technological approach can simplify the detection of damaged leaves and reduce the effort farmers spend detecting those diseases. Cotton is a natural fiber produced on a large scale, and it is grown on 2.5% of overall agronomic land. The detection of cotton leaf diseases is crucial to maintain the crop's productivity and provide reliable earnings to farmers. A confusion matrix is an N × N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This technique has four parameters for testing the accuracy of the results, which are given in the study.
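The confusion matrix described in the abstract can be sketched in a few lines. The labels and predictions below are made-up sample values, not results from the paper, and reading the "four parameters" as true positives, false positives, false negatives, and true negatives is an assumption.

```python
def confusion_matrix(actual, predicted, labels):
    """Return an N x N matrix: rows are actual classes, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def per_class_counts(matrix, k):
    """Return (TP, FP, FN, TN) when class k is treated as the positive class."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # actual k, predicted something else
    fp = sum(row[k] for row in matrix) - tp  # predicted k, actually something else
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical sample: four target classes, six leaves.
labels = ["bacterial_blight", "curly_virus", "fusarium_wilt", "healthy"]
actual = ["healthy", "healthy", "curly_virus",
          "fusarium_wilt", "bacterial_blight", "healthy"]
predicted = ["healthy", "curly_virus", "curly_virus",
             "fusarium_wilt", "healthy", "healthy"]

cm = confusion_matrix(actual, predicted, labels)
healthy_counts = per_class_counts(cm, labels.index("healthy"))
acc = accuracy(cm)
```

With four target classes the matrix is 4 × 4, and the per-class counts are obtained by treating one class as positive and the other three as negative.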
Citations: 0
A lasso regression-based forecasting model for daily gasoline consumption: Türkiye Case
Pub Date : 2024-01-04 DOI: 10.31127/tuje.1354501
Ertugrul Ayyıldız, Mirac Murat
Gasoline is one of the most sought-after resources in a world where the need for energy is indispensable to human life and continuously increasing. A shortage of gasoline may negatively affect the economies of countries. Therefore, analyses and estimates of gasoline consumption are critical. Better forecasts of gasoline consumption can serve policymakers, managers, researchers, and other gasoline sector stakeholders. In parallel with the world economy, gasoline is among the most consumed energy sources in Turkey. Therefore, this study aims to forecast daily gasoline consumption in Turkey. For this purpose, a lasso regression-based forecasting methodology is proposed. The forecasting approach used for daily gasoline consumption consists of 3 main stages: i) cleaning the data, ii) extracting and selecting features, and iii) forecasting the future of the daily gasoline consumption time series via the proposed models. Ridge Regression is also used as a point of comparison for the proposed model's performance.
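The difference between lasso and ridge regression that motivates the comparison can be sketched with coordinate descent. This is an illustrative sketch only: the data are invented, the function names are hypothetical, the features are assumed to be centered (no intercept), and nothing here reproduces the authors' feature pipeline or results.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_coordinate_descent(X, y, alpha, penalty="lasso", n_iter=100):
    """Minimise (1/2n)||y - Xw||^2 plus an L1 (lasso) or (alpha/2) L2 (ridge) penalty."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0  # correlation of feature j with the residual excluding j
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            if penalty == "lasso":
                w[j] = soft_threshold(rho, alpha) / z
            else:
                w[j] = rho / (z + alpha)
    return w


# Hypothetical data: y depends strongly on feature 0 and weakly on feature 1.
X = [[1.0, 1.0], [-1.0, 1.0], [2.0, -1.0], [-2.0, -1.0]]
y = [3.0 * a + 0.2 * b for a, b in X]

lasso_w = fit_coordinate_descent(X, y, alpha=0.5, penalty="lasso")
ridge_w = fit_coordinate_descent(X, y, alpha=0.5, penalty="ridge")
```

On this toy data lasso sets the weak feature's coefficient exactly to zero, while ridge only shrinks it, which is why lasso doubles as a feature-selection step.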
Citations: 0