Enhanced the prediction approach of diabetes using an autoencoder with regularization and deep neural network
Authors: H. A. Ismael, Nabeel Al-A'araji, B. K. Shukur
Journal: Periodicals of Engineering and Natural Sciences
Publication date: 2023-01-17
Publication type: Journal Article
DOI: 10.21533/pen.v10i6.3394 (https://doi.org/10.21533/pen.v10i6.3394)
Citations: 0
Abstract
Diabetes mellitus is one of the most common and severe diseases worldwide. Accurate, early diagnosis is essential to avoid complications and is crucial to the medical care that patients receive. Achieving this requires a model that predicts diabetes. Many prediction models exist, but they suffer from problems such as poor prediction accuracy and high time complexity. Prediction depends heavily on the most informative features. In this paper, we therefore propose a new model, CAER-DNN, which combines an unsupervised technique for generating new, informative features with a deep neural network for prediction. The unsupervised technique, a complete autoencoder with regularization (CAER), is used to reconstruct the original features, yielding newly learned features. It concentrates training on the most important learned features and discards less important ones, which improves prediction performance. These important features are then used as input to the deep neural network to predict diabetes. We applied the model to two datasets: the Pima Indian and Mendeley diabetes datasets. With 10-fold cross-validation, the model achieved high performance on the Pima Indian dataset (F1-score 97.38%, accuracy, recall 97.25%, specificity 97.59%, precision 97.53%). With the holdout technique, it achieved high performance on the Mendeley diabetes dataset (F1-score 94.51%, accuracy 98.48%, recall 91.74%, balanced accuracy 98.21%, precision 98.21%). Compared with existing machine learning and deep learning techniques, our model performed better.
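The evaluation measures named in the abstract can be illustrated with a minimal Python sketch that derives them from binary predictions. This is not the authors' code. The names `ConfusionCounts`, `confusion_counts`, and `evaluation_measures` are hypothetical, the label convention (1 = diabetic, 0 = non-diabetic) is an assumption, and the balanced measure is assumed to be the mean of recall and specificity, which is the usual definition of balanced accuracy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    tn: int  # non-diabetic, predicted non-diabetic
    fn: int  # diabetic, predicted non-diabetic


def confusion_counts(y_true, y_pred):
    """Count the four outcomes, assuming labels 1 = diabetic, 0 = non-diabetic."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_measures(c):
    """Compute the measures listed in the abstract as fractions in [0, 1]."""
    recall = _ratio(c.tp, c.tp + c.fn)        # sensitivity
    specificity = _ratio(c.tn, c.tn + c.fp)
    precision = _ratio(c.tp, c.tp + c.fp)
    accuracy = _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)
    f1 = _ratio(2 * precision * recall, precision + recall)
    # Assumption: balanced accuracy is the mean of recall and specificity.
    balanced_accuracy = (recall + specificity) / 2
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the paper.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    for name, value in evaluation_measures(confusion_counts(y_true, y_pred)).items():
        print(f"{name}: {value:.4f}")
```

Under 10-fold cross-validation, a function like this would typically be applied to each held-out fold and the results averaged. Under the holdout technique, it would be applied once to the held-out test split.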