Pub Date: 2023-04-03 DOI: 10.35377/saucis...1223054
Hüseyin Güney
Machine learning (ML) is frequently used to build intelligent systems in many problem domains, including cybersecurity. For malicious network activity detection, ML-based intrusion detection systems (IDSs) are promising because they can classify attacks autonomously after a learning process. However, building one is challenging because of the vast number of methods available in the current literature, including ML classification algorithms and preprocessing techniques. To analyse the impact of preprocessing techniques on the ML algorithm, this study conducted extensive experiments using support vector machines (SVM) as both the classifier and the feature selection (FS) technique, several normalisation techniques, and a grid-search classifier optimisation algorithm. These methods were tested sequentially on three publicly available network intrusion datasets: NSL-KDD, UNSW-NB15, and CICIDS2017. The results were then analysed to investigate the impact of each model and to extract insights for building intelligent and efficient IDSs. The results showed that data preprocessing significantly improves classification performance and that log-scaling normalisation outperformed the other techniques on intrusion detection datasets. Additionally, the results suggested that the embedded SVM-FS is accurate and that classifier optimisation can improve the performance of classifier-dependent FS techniques. However, feature selection in classifier optimisation is a critical problem that must be addressed. In conclusion, this study provides insights for building ML-based network IDSs (NIDSs) by revealing important information about data preprocessing.
{"title":"Preprocessing Impact Analysis for Machine Learning-Based Network Intrusion Detection","authors":"Hüseyin Güney","doi":"10.35377/saucis...1223054","DOIUrl":"https://doi.org/10.35377/saucis...1223054","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2023-04-03","publicationTypes":"Journal Article"}
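To make the normalisation comparison in the abstract above more concrete, here is a minimal sketch contrasting plain min-max scaling with a log-scaling variant on a heavy-tailed feature. The exact log-scaling formula used in the study is not given in the abstract, so `log(1 + v)` followed by min-max rescaling is an assumption, and the function names and sample values are hypothetical.

```python
import math

def min_max(column):
    """Rescale values linearly into [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]

def log_scale(column):
    """Apply log(1 + v) to non-negative values, then rescale into [0, 1].

    Assumed form of log-scaling; the study may define it differently.
    """
    return min_max([math.log1p(v) for v in column])

# Hypothetical byte counts for five network flows (not taken from the datasets).
src_bytes = [0, 10, 100, 1000, 1_000_000]
print(min_max(src_bytes))
print(log_scale(src_bytes))
```

With plain min-max scaling, the single very large flow pushes the other values close to zero, whereas the log-scaled column spreads them across the interval. This is one plausible reason log-scaling can help on intrusion datasets with heavy-tailed features, though the abstract itself does not state the cause.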
Pub Date: 2023-03-30 DOI: 10.35377/saucis...1207742
B. Parlak
In a standard text classification (TC) study, preprocessing is one of the key components for improving performance. This study examines how preprocessing affects TC with respect to news text, text language, and feature selection. To this end, all possible combinations of commonly used preprocessing techniques are compared on one domain, news data, using two different news datasets. The contribution of each preprocessing technique to classification performance at multiple feature sizes, possible interactions among the techniques, and their dependency on the language are all evaluated. Experiments on public datasets reveal that choosing the best combination of preprocessing techniques, rather than applying all or none of them, can improve classification accuracy significantly.
{"title":"The Effects of Preprocessing on Turkish and English News Data","authors":"B. Parlak","doi":"10.35377/saucis...1207742","DOIUrl":"https://doi.org/10.35377/saucis...1207742","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2023-03-30","publicationTypes":"Journal Article"}
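Comparing all combinations of preprocessing techniques, as the abstract above describes, amounts to enumerating every on/off subset of the steps. The sketch below shows one way to do that. The three steps, the stop-word list, and all function names are hypothetical stand-ins, since the abstract does not list the techniques that were actually compared.

```python
from itertools import product

STOP_WORDS = {"the", "a", "of", "in"}  # tiny illustrative list

def lowercase(tokens):
    return [t.lower() for t in tokens]

def remove_stop_words(tokens):
    return [t for t in tokens if t.lower() not in STOP_WORDS]

def strip_punctuation(tokens):
    cleaned = [t.strip(".,;:!?") for t in tokens]
    return [t for t in cleaned if t]  # drop tokens that were only punctuation

# Assumed set of steps; the study's own list may differ.
STEPS = [
    ("lowercase", lowercase),
    ("stop_words", remove_stop_words),
    ("punctuation", strip_punctuation),
]

def all_pipelines():
    """Yield every on/off combination of STEPS (2**len(STEPS) pipelines)."""
    for mask in product([False, True], repeat=len(STEPS)):
        yield [step for step, enabled in zip(STEPS, mask) if enabled]

def run(pipeline, text):
    """Split on whitespace, then apply the enabled steps in order."""
    tokens = text.split()
    for _, step in pipeline:
        tokens = step(tokens)
    return tokens

for pipeline in all_pipelines():
    names = [name for name, _ in pipeline] or ["none"]
    print(names, run(pipeline, "The price of oil, in short."))
```

In an experiment of this kind, each pipeline's output would feed the same feature selection and classifier so that differences in accuracy can be attributed to the preprocessing combination alone.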
Pub Date: 2023-03-22 DOI: 10.35377/saucis...1259584
Deep learning is a powerful technique that has been applied to stroke detection using medical imaging. Stroke is a medical condition that occurs when the blood supply to the brain is interrupted, which can cause brain damage and other serious complications. Detecting stroke is important for minimizing damage and improving patient outcomes. One of the most common imaging modalities used for stroke detection is computed tomography (CT). CT provides detailed images of the brain and can be used to identify the presence and location of a stroke. Deep learning models, particularly convolutional neural networks (CNNs), have shown promise for stroke detection in CT images. These models can learn to automatically identify image patterns that are indicative of a stroke, such as the presence of an infarct or hemorrhage. Examples of deep learning models used for stroke detection in CT images are U-Net, which is commonly used for medical image segmentation, and CNNs trained to classify brain CT images as normal or abnormal. The purpose of this study is to identify the type of stroke, ischemic (occlusive) or hemorrhagic, from brain CT images taken without a contrast agent. Stroke images were collected and a dataset was constructed together with medical specialists. Deep learning classification models were evaluated with hyperparameter optimization techniques, and the results were segmented with an improved U-Net model to visualize the stroke in CT images. The classification models were compared, and VGG16 achieved 94% success. The U-Net model achieved 60% IoU and detected the differences between ischemia and hemorrhage.
{"title":"Ischemia and Hemorrhage detection in CT images with Hyper parameter optimization of classification models and Improved UNet Segmentation Model","authors":"","doi":"10.35377/saucis...1259584","DOIUrl":"https://doi.org/10.35377/saucis...1259584","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2023-03-22","publicationTypes":"Journal Article"}
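The IoU figure reported in the abstract above is the intersection-over-union between a predicted segmentation mask and the ground-truth mask. A minimal sketch of that metric for binary masks follows. The nested-list mask format, the function name, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details from the study.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two equally sized binary masks.

    Masks are nested lists of 0/1 values (rows of pixels).
    """
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0

# Hypothetical 2x3 masks: 2 overlapping pixels, 4 pixels in the union.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
print(iou(predicted, ground_truth))
```

An IoU of 0.6 therefore means that the overlap between predicted and annotated regions covers 60% of their combined area.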
Pub Date: 2023-03-17 DOI: 10.35377/saucis...1233094
Hilal Atici, H. Koçer, A. Sivrikaya, M. Dağlı
Urine sediment tests are important in diagnosing diseases of the urinary tract. The presence of cells such as red blood cells and white blood cells in a patient's urine is important for diagnosis, so cells need to be fully identified in clinical urinalysis. Because urinalysis by the human eye is subjective, time-consuming, and error-prone, methods have been developed to automate microscopic analysis with computer and software systems. In this study, the YOLO-v7 algorithm, which gives successful results in image processing, was used as the method and model. The dataset was created from microscopic images of urine sediment taken from the Biochemistry Laboratory of the Faculty of Medicine, Selcuk University. Segmentation and classification were carried out for seven classes that have clinical value for diagnosis: WBC, RBC, WBCC, Epithelial, Flat Epithelial, Mucs, and Bubbles. Experimental studies were carried out with the YOLO-v7 algorithm and the results are presented. The contributions of this study can be summarized as follows. (1) For the segmentation of cells in the images of the Urine Sediment dataset with the YOLO model, the Precision, Recall, mAP(0.5), and F1-score segmentation performance metrics were calculated as 0.384, 0.759, 0.432, and 0.510, respectively. (2) A computer-aided support system to assist physicians in segmenting urine cells is presented as a secondary tool. Classification accuracy for the WBC, RBC, WBCC, Epithelial, Flat Epithelial, Mucs, and Bubbles classes was calculated as 0.78, 0.94, 0.90, 0.57, 0.92, 0.68, and 0.97, respectively. A mean classification success of 0.822 was achieved across all classes. 
Thus, the YOLO-v7 model can be used by experts as a tool for recognizing cells in urine sediment. The results show that suitable deep learning models can be used to recognize the biometric properties of urinary sediment cells. With a model created using deep learning libraries, urine sediment cells can be classified easily, and many different cells can be identified given a dataset with a sufficient number of images.
{"title":"Analysis of Urine Sediment Images for Detection and Classification of Cells","authors":"Hilal Atici, H. Koçer, A. Sivrikaya, M. Dağlı","doi":"10.35377/saucis...1233094","DOIUrl":"https://doi.org/10.35377/saucis...1233094","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2023-03-17","publicationTypes":"Journal Article"}
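The metrics quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the F1-score from the reported precision and recall using the standard harmonic-mean formula, and averages the seven per-class accuracies. The variable and function names are hypothetical, and an unweighted mean over classes is an assumption, since the abstract does not say how the mean was taken.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values as reported in the abstract.
reported_precision = 0.384
reported_recall = 0.759
class_accuracy = {
    "WBC": 0.78,
    "RBC": 0.94,
    "WBCC": 0.90,
    "Epithelial": 0.57,
    "Flat Epithelial": 0.92,
    "Mucs": 0.68,
    "Bubbles": 0.97,
}

f1 = f1_score(reported_precision, reported_recall)
mean_accuracy = sum(class_accuracy.values()) / len(class_accuracy)

print(f"F1 from reported precision and recall: {f1:.3f}")
print(f"Unweighted mean of class accuracies:   {mean_accuracy:.4f}")
```

The recomputed F1 agrees with the reported 0.510, and the unweighted mean comes to roughly 0.8229, which is consistent with the reported 0.822 if that figure was truncated rather than rounded.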
Pub Date: 2023-02-24 DOI: 10.35377/saucis...1170902
Gozde YOLCU ÖZTEL, İ. Öztel
Correctly determining the driving area and pedestrians is crucial for intelligent vehicles to reduce the risk of fatal road accidents. However, these are challenging computer vision tasks, made difficult by varying weather, road conditions, and similar factors. This paper presents a vision-based road segmentation and pedestrian detection system. First, the roads are segmented using a deep learning-based consecutive triple filter size (CTFS) approach. Then, pedestrians on the segmented roads are detected using a transfer learning approach. The CTFS approach can create feature maps for both small and large features. The proposed system is a reliable, low-cost road segmentation and pedestrian detection system for intelligent vehicles.
{"title":"Deep Learning-based Road Segmentation & Pedestrian Detection System for Intelligent Vehicles","authors":"Gozde YOLCU ÖZTEL, İ. Öztel","doi":"10.35377/saucis...1170902","DOIUrl":"https://doi.org/10.35377/saucis...1170902","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2023-02-24","publicationTypes":"Journal Article"}
Pub Date: 2023-02-02 DOI: 10.35377/saucis...1210687
Ü. Yilmaz, Ümit Güler
Geostationary (GEO) satellites are commonly used in the communications market. Service providers uplink or downlink signals to the GEO satellite using dedicated antennas, with or without tracking capability. The satellite down-converts and amplifies the signal before sending it back to end users on Earth. Normally, the user sets and adjusts the uplink antenna to follow the movement of the GEO satellite as closely as possible. As long as there is no reduction in the link budget, the pointing is assumed to be successful. On the other hand, the input power of the satellite, together with the satellite's longitude versus latitude, can give a reasonable idea of the accuracy of the ground antenna pointing. In this study, ground station pointing performance is shown for two cases: one with and one without tracking capability.
{"title":"ON ORBIT DEMONSTRATION OF POINTING ACCURACY OF GROUND ANTENNAS (WITH AND WITHOUT TRACKING CAPABILITY) BY A FLYING GEO SATELLITE","authors":"Ü. Yilmaz, Ümit Güler","doi":"10.35377/saucis...1210687","DOIUrl":"https://doi.org/10.35377/saucis...1210687","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2023-02-02","publicationTypes":"Journal Article"}
Pub Date: 2023-01-19 DOI: 10.35377/saucis...1209519
Pınar Cihan
The increase in environmental problems such as climate change and air pollution caused by global warming has raised the popularity of electric vehicles (EVs) used in the smart grid environment. The growing number of EVs can affect the grid in terms of power loss and voltage bias by changing the existing demand profile. Effective prediction of EV energy demand ensures reliable and robust grid use and aids investment planning and resource allocation for charging infrastructure. In this study, the electricity demand in two different cities is modeled with the Support Vector Regression, Random Forest, Gaussian Process, and Multilayer Perceptron algorithms. The findings reveal that EV owners usually start charging their vehicles during the daytime, that the COVID-19 pandemic caused a serious decrease in EV energy demand, and that support vector regression (SVR) is the most successful in energy demand forecasting. Furthermore, the results indicate that the decrease in electricity demand during the COVID-19 pandemic reduced the prediction accuracy of the SVR model (decrease of 17.1% in training and 12.6% in test performance, P
{"title":"Time-series Forecasting of Energy Demand and Impact of the COVID-19 Pandemic on Model Performance in Electric Vehicles","authors":"Pınar Cihan","doi":"10.35377/saucis...1209519","DOIUrl":"https://doi.org/10.35377/saucis...1209519","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2023-01-19","publicationTypes":"Journal Article"}
Pub Date: 2023-01-02 DOI: 10.35377/saucis...1172027
I. Kervanci, Fatih Akay
Machine learning and deep learning algorithms produce very different results with different values of their hyperparameters. These parameters require optimization because no single setting suits all problems. In this paper, eight Long Short-Term Memory (LSTM) hyperparameters (go-backward, epoch, batch size, dropout, activation function, optimizer, learning rate, and number of layers) were examined on daily and hourly Bitcoin datasets. The effect of each parameter on the results for the daily dataset was evaluated and explained. The parameters were examined with the hparam feature of TensorBoard. Examining all combinations of parameters with hparam produced the best test mean squared error (MSE) values of 0.000043633 for the hourly dataset and 0.00073843 for the daily dataset. Both datasets produced better results with the tanh activation function. Finally, the daily dataset produces better results with a small learning rate and small dropout values, whereas the hourly dataset produces better results with a large learning rate and large dropout values.
{"title":"LSTM Hyperparameters optimization with Hparam parameters for Bitcoin Price Prediction","authors":"I. Kervanci, Fatih Akay","doi":"10.35377/saucis...1172027","DOIUrl":"https://doi.org/10.35377/saucis...1172027","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2023-01-02","publicationTypes":"Journal Article"}
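Examining all combinations of hyperparameters and ranking them by test MSE, as the abstract above describes, can be sketched without any deep learning library. The example below enumerates a small grid and selects the configuration with the lowest score. The search space values, the function names, and the toy scoring function are all hypothetical: in the study the score would come from training an LSTM and measuring test MSE, and the TensorBoard hparam tooling itself is not reproduced here.

```python
from itertools import product

def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical subset of the eight hyperparameters, with illustrative values.
SEARCH_SPACE = {
    "learning_rate": [0.001, 0.01],
    "dropout": [0.1, 0.5],
    "activation": ["tanh", "relu"],
}

def grid(space):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))

def best_config(space, evaluate):
    """Return the configuration with the lowest evaluation score."""
    return min(grid(space), key=evaluate)

def toy_evaluate(config):
    """Stand-in for 'train an LSTM and return its test MSE'."""
    predictions = [1.0 + config["dropout"], 2.0 + config["learning_rate"]]
    return mse([1.0, 2.0], predictions)

print(len(list(grid(SEARCH_SPACE))), "configurations")
print(best_config(SEARCH_SPACE, toy_evaluate))
```

The number of configurations is the product of the list lengths, so a full grid over eight hyperparameters grows quickly, which is why the choice of candidate values per parameter matters.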
Pub Date: 2022-12-22 DOI: 10.35377/saucis...1196934
Oğuzhan Katar, Ilhan Firat Kilincer
White blood cells (WBCs), which are part of the immune system, help the body fight infections and other diseases. Certain diseases can cause the body to produce fewer WBCs than it needs. For this reason, WBCs are of great importance in the field of medical imaging. Artificial intelligence-based computer systems can assist experts in the analysis of WBCs. In this study, an approach is proposed for the automatic classification of WBCs into five classes using pre-trained models. The ResNet-50, VGG-19, and MobileNet-V3-Small models were trained starting from ImageNet weights. A public dataset containing 16,633 images with an uneven class distribution was used for training, validation, and testing. ResNet-50 reached 98.79% accuracy and VGG-19 reached 98.19%, while MobileNet-V3-Small reached the highest accuracy at 98.86%. An examination of the MobileNet-V3-Small predictions shows that the model is not affected by class dominance and can correctly classify images of even the least represented class in the dataset. WBCs were classified with high accuracy using the proposed pre-trained deep learning models. Experts can effectively use the proposed approach when analyzing WBCs.
{"title":"Automatic Classification of White Blood Cells Using Pre-Trained Deep Models","authors":"Oğuzhan Katar, Ilhan Firat Kilincer","doi":"10.35377/saucis...1196934","DOIUrl":"https://doi.org/10.35377/saucis...1196934","journal":{"name":"Sakarya University Journal of Computer and Information Sciences"},"publicationDate":"2022-12-22","publicationTypes":"Journal Article"}
Pub Date : 2022-12-06 DOI: 10.35377/saucis...1210786
Fatma Akalın, M. Orhan, M. Buyukavci
Hodgkin-type lymphoma is a disease with unique histological, immunophenotypic, and clinical features. It accounts for nearly 30% of all lymphomas and is highly treatable. However, the treatment plan is specified only after the stage and risk status are determined, so it is important for doctors to determine the stage of the disease correctly. The data used for this decision include the patient's history, detailed physical examination, laboratory findings, imaging methods, and bone marrow biopsy results. Hybrid FDG-PET is another method used in clinical practice, for diagnosis, evaluation of treatment response, staging, and restaging. However, it is radiation-based and may therefore produce undesirable effects in the future. In this study, an artificial intelligence-based computer-assisted decision support system is developed to reduce the number of medical methods used and the radiation exposure. Data were obtained from the NCBI-GEO dataset. These data, which contain missing values, are evaluated in two ways. First, samples with missing values are deleted from the dataset, and the remaining data are used to train an artificial neural network with the “trainlm” function. To reduce the error of the estimates, the artificial neural network is then retrained with the artificial bee colony, particle swarm optimization, and invasive weed algorithms, respectively. Second, the same operations are repeated on the dataset containing missing values. As a result of the training, the best performance was obtained with the invasive weed and particle swarm optimization algorithms, with average error rates of 1,45547E+14 and 1,23103E+14, respectively.
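The two steps described in the abstract, deleting samples with missing values and then tuning model weights with a swarm optimizer, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the data are a hypothetical toy set, the "network" is reduced to a single linear neuron, the optimizer is a plain global-best particle swarm, and all names and hyperparameters (`drop_missing`, `pso`, inertia 0.7, acceleration coefficients 1.5) are chosen for illustration only.

```python
import random


def drop_missing(rows):
    """Listwise deletion: keep only rows that contain no None entries."""
    return [r for r in rows if all(v is not None for v in r)]


def mse(params, rows):
    """Mean squared error of the one-neuron linear model y = w * x + b."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)


def pso(loss, dim, n_particles=20, n_iters=100, seed=0,
        inertia=0.7, c_cog=1.5, c_soc=1.5):
    """Global-best particle swarm optimization; returns (best_position, history)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                # each particle's best position
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]                      # best loss after each iteration
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_cog * r1 * (pbest[i][d] - pos[i][d])
                             + c_soc * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, history


# Hypothetical toy data generated from y = 2x + 1, with two incomplete rows.
raw = [(0.0, 1.0), (1.0, 3.0), (2.0, None), (3.0, 7.0), (None, 9.0), (4.0, 9.0)]
clean = drop_missing(raw)                      # 4 complete rows remain
best, history = pso(lambda p: mse(p, clean), dim=2)
print(len(clean), history[0], history[-1])
```

Because the global best is replaced only when a particle finds a strictly lower loss, the recorded best error never increases from one iteration to the next. The second evaluation path in the abstract, training on data that still contain missing values, would need an imputation or masking step that this sketch does not cover.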
{"title":"A Decision Support System For Detecting Stage In Hodgkin Lymphoma Patients Using Artificial Neural Network and Optimization Algorithms","authors":"Fatma Akalın, M. Orhan, M. Buyukavci","doi":"10.35377/saucis...1210786","DOIUrl":"https://doi.org/10.35377/saucis...1210786","url":null,"abstract":"Hodgkin-type lymphoma is a disease with unique histological, immunophenotypic, and clinical features. This disease occurs in nearly 30% of all lymphomas. Its treatable is high. However, the treatment plan is specified after the stage and risk status are determined. For this reason, it is an important process for doctors to decide on the stage of the disease correctly. Some of the data used for this decision are the patient's history, detailed physical examination, laboratory findings, imaging methods and bone marrow biopsy results. Hybrid FDG-PET is the other method used in the medical world. This method is used in diagnosis, evaluation of response given to treatment, staging and restaging process. However, it is radiation-based. Therefore it has the possibility of producing undesirable results in the future. In this study, an artificial intelligence-based computer-assisted decision support system is done to reduce the number of used medical methods and radiation exposure. Data were obtained from the NCBI-GEO dataset. The evaluation of these data, which contains missing values, is handled in two ways. Firstly, samples with missing values in the initial evaluation are deleted from the dataset. Then, these data are trained with “trainlm” function in artificial neural network architecture. However, reducing the error value of the estimates is important. For this, the artificial neural network architecture is retrained with the artificial bee colony algorithm, particle swarm optimization algorithm and invasive weed algorithm, respectively. Secondly, the same operations are performed again on the dataset containing missing values. 
As a result of the training, the maximum performance was obtained for invasive weed and particle swarm optimization algorithms with 1,45547E+14 and 1,23103E+14 average error rates, respectively.","PeriodicalId":257636,"journal":{"name":"Sakarya University Journal of Computer and Information Sciences","volume":"171 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123497416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}