Sakarya University Journal of Computer and Information Sciences: Latest Articles, Page 2

Preprocessing Impact Analysis for Machine Learning-Based Network Intrusion Detection

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2023-04-03 DOI: 10.35377/saucis...1223054

Hüseyin Güney

Machine learning (ML) is frequently used to build intelligent systems in many problem domains, including cybersecurity. For detecting malicious network activity, ML-based intrusion detection systems (IDSs) are promising because they can classify attacks autonomously after a learning process. However, building them is challenging because of the vast number of methods available in the literature, including ML classification algorithms and preprocessing techniques. To analyse the impact of preprocessing techniques on the ML algorithm, this study conducted extensive experiments using support vector machines (SVM) as both the classifier and the feature selection (FS) technique, several normalisation techniques, and a grid-search classifier optimisation algorithm. These methods were tested sequentially on three publicly available network intrusion datasets: NSL-KDD, UNSW-NB15, and CICIDS2017. The results were then analysed to investigate the impact of each model and to extract insights for building intelligent and efficient IDSs. They showed that data preprocessing significantly improves classification performance and that log-scaling normalisation outperformed the other techniques on intrusion detection datasets. They also suggested that embedded SVM-FS is accurate and that classifier optimisation can improve the performance of classifier-dependent FS techniques. However, feature selection in classifier optimisation remains a critical problem that must be addressed. In conclusion, this study provides insights for building ML-based network IDSs (NIDSs) by revealing important information about data preprocessing.
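The abstract names log-scaling normalisation but does not give its formula. The sketch below assumes the common form log(1 + x) for non-negative features and compares it with plain min-max scaling; the function names and the sample column are hypothetical and are not taken from the paper.

```python
import math

def log_scale(column):
    """Compress a non-negative feature column with log(1 + x) (assumed form)."""
    return [math.log1p(value) for value in column]

def min_max(column):
    """Rescale a column to [0, 1]; a constant column maps to all zeros."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]

# Hypothetical heavy-tailed feature, e.g. bytes transferred per network flow.
bytes_sent = [0, 9, 99, 999_999]
scaled = log_scale(bytes_sent)

# Min-max alone squeezes small values towards 0; log-scaling first spreads them out.
raw_normalised = min_max(bytes_sent)
log_normalised = min_max(scaled)
```

On heavy-tailed traffic features like this one, the log transform keeps small and medium values distinguishable after rescaling, which is one plausible reason such a normalisation could help a margin-based classifier like SVM.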

Citations: 0

The Effects of Preprocessing on Turkish and English News Data

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2023-03-30 DOI: 10.35377/saucis...1207742

B. Parlak

In a standard text classification (TC) study, preprocessing is one of the key components for improving performance. This study examines how preprocessing affects TC with respect to news text, text language, and feature selection. To this end, all possible combinations of commonly used preprocessing techniques are compared on one domain, news data, using two different news datasets. In this way, the study evaluates the contribution of each preprocessing technique to classification performance at multiple feature sizes, possible interactions among the techniques, and their dependency on the language. Experimental studies on public datasets reveal that choosing the best combination of preprocessing techniques, rather than applying all of them or none, can improve classification accuracy significantly.
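Comparing all combinations of preprocessing techniques amounts to enumerating every on/off setting of each step. The sketch below illustrates that enumeration; the three steps are hypothetical stand-ins, since the abstract does not list the techniques that were actually compared.

```python
from itertools import product

# Hypothetical on/off preprocessing steps (not taken from the paper).
STEPS = {
    "lowercase": str.lower,
    "strip_digits": lambda text: "".join(ch for ch in text if not ch.isdigit()),
    "collapse_spaces": lambda text: " ".join(text.split()),
}

def all_pipelines(steps):
    """Yield every on/off combination of steps as a tuple of step names."""
    names = list(steps)
    for flags in product([False, True], repeat=len(names)):
        yield tuple(name for name, on in zip(names, flags) if on)

def apply_pipeline(text, pipeline, steps=STEPS):
    """Apply the named steps to the text in the given order."""
    for name in pipeline:
        text = steps[name](text)
    return text

# With n steps there are 2**n pipelines, including "none" and "all".
pipelines = list(all_pipelines(STEPS))
```

Each pipeline would then be scored with the same classifier and feature size, so that the best combination can be compared against the two baselines of using every step and using none.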

Citations: 2

Ischemia and Hemorrhage detection in CT images with Hyper parameter optimization of classification models and Improved UNet Segmentation Model

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2023-03-22 DOI: 10.35377/saucis...1259584

Deep learning is a powerful technique that has been applied to stroke detection using medical imaging. Stroke is a medical condition that occurs when the blood supply to the brain is interrupted, which can cause brain damage and other serious complications. Detecting stroke is important in order to minimize damage and improve patient outcomes. One of the most common imaging modalities used for stroke detection is computed tomography (CT). CT provides detailed images of the brain and can be used to identify the presence and location of a stroke. Deep learning models, particularly convolutional neural networks (CNNs), have shown promise for stroke detection using CT images. These models can learn to automatically identify patterns in the images that are indicative of a stroke, such as the presence of an infarct or hemorrhage. Examples of deep learning models used for stroke detection in CT images are U-Net, which is commonly used for medical image segmentation, and CNNs that have been trained to classify brain CT images as normal or abnormal. The purpose of this study is to identify the type of stroke, i.e. occlusive (ischemic) or hemorrhagic, from brain CT images taken without the administration of a contrast agent. Stroke images were collected and a dataset was constructed with medical specialists. Deep learning classification models were evaluated with hyperparameter optimization techniques, and the results were segmented with an improved U-Net model to visualize the stroke in the CT images. Among the classification models compared, VGG16 achieved a 94% success rate. The U-Net model achieved 60% IoU and detected the differences between ischemia and hemorrhage.
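The segmentation result above is reported as intersection over union (IoU). As a reminder of what that metric measures, here is a minimal sketch for two binary masks; the masks are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the abstract.

```python
def iou(predicted, target):
    """Intersection over union of two equally sized binary masks (nested lists)."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(predicted, target):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0

# Hypothetical 2x3 masks: 1 marks a pixel labelled as lesion.
predicted_mask = [[1, 1, 0],
                  [0, 1, 0]]
target_mask = [[1, 0, 0],
               [0, 1, 1]]
score = iou(predicted_mask, target_mask)  # 2 shared pixels out of 4 marked in either mask
```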

Citations: 0

Analysis of Urine Sediment Images for Detection and Classification of Cells

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2023-03-17 DOI: 10.35377/saucis...1233094

Hilal Atici, H. Koçer, A. Sivrikaya, M. Dağlı

Urine sediment tests are important in the diagnosis of diseases related to the urinary tract. The presence of cells such as red blood cells and white blood cells in a patient's urine is important for diagnosis, so cells need to be fully identified in clinical urinalysis. Because urinalysis by eye is subjective, time-consuming, and error-prone, methods have been developed to automate microscopic analysis with the help of computer and software systems. In this study, the YOLO-v7 algorithm, which gives successful results in image processing, was used as the method and model. The dataset was created from microscopic images of urine sediment taken from the Biochemistry Laboratory of the Faculty of Medicine, Selcuk University. Segmentation and classification were carried out for seven classes that have clinical value for diagnosis: WBC, RBC, WBCC, Epithelial, Flat Epithelial, Mucs, and Bubbles. Experimental studies were carried out with the YOLO-v7 algorithm and the results are presented. The contributions of this study can be summarized as follows. (1) For segmentation of cells in the images of the Urine Sediment dataset, the YOLO model achieved Precision, Recall, mAP(0.5), and F1-Score values of 0.384, 0.759, 0.432, and 0.510, respectively. (2) A computer-aided support system is presented as a secondary tool to assist physicians in segmenting urine cells. Classification accuracy for WBC, RBC, WBCC, Epithelial, Flat Epithelial, Mucs, and Bubbles cells was calculated as 0.78, 0.94, 0.90, 0.57, 0.92, 0.68, and 0.97, respectively. A mean classification success of 0.822 was achieved across all classes.
Thus, the Yolov7 model can be used by experts as a tool for recognizing cells in urine sediment. The results show that suitable deep learning models can be used to recognize the biometric properties of urinary sediment cells. With a model created using deep learning libraries, urine sediment cells can be classified easily, and many different cell types can be recognized if a dataset with a sufficient number of images is available.
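The reported figures can be cross-checked with two standard formulas: F1 as the harmonic mean of precision and recall, and the mean of the per-class accuracies. The sketch below uses only the numbers quoted in the abstract; the function and variable names are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Segmentation metrics quoted in the abstract.
precision, recall = 0.384, 0.759
f1 = f1_score(precision, recall)

# Per-class classification accuracies quoted in the abstract.
per_class_accuracy = {
    "WBC": 0.78,
    "RBC": 0.94,
    "WBCC": 0.90,
    "Epithelial": 0.57,
    "Flat Epithelial": 0.92,
    "Mucs": 0.68,
    "Bubbles": 0.97,
}
mean_accuracy = sum(per_class_accuracy.values()) / len(per_class_accuracy)
```

Both values agree with the abstract: the F1 computed this way rounds to 0.510, and the unweighted mean of the seven accuracies is about 0.8229, which matches the reported 0.822 if that figure was truncated rather than rounded.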

Citations: 0

Deep Learning-based Road Segmentation & Pedestrian Detection System for Intelligent Vehicles

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2023-02-24 DOI: 10.35377/saucis...1170902

Gozde YOLCU ÖZTEL, İ. Öztel

Correctly determining the driving area and pedestrians is crucial for intelligent vehicles to reduce the risk of fatal road accidents. However, these are challenging tasks in computer vision, made difficult by varying weather, road conditions, and similar factors. This paper presents a vision-based road segmentation and pedestrian detection system. First, the roads are segmented using a deep learning-based consecutive triple filter size (CTFS) approach. Then, pedestrians on the segmented roads are detected using a transfer learning approach. The CTFS approach can create feature maps for both small and large features. The proposed system is a reliable, low-cost road segmentation and pedestrian detection system for intelligent vehicles.

Citations: 0

ON ORBIT DEMONSTRATION OF POINTING ACCURACY OF GROUND ANTENNAS (WITH AND WITHOUT TRACKING CAPABILITY) BY A FLYING GEO SATELLITE

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2023-02-02 DOI: 10.35377/saucis...1210687

Ü. Yilmaz, Ümit Güler

Geostationary (GEO) satellites are commonly used in the communications market. Service providers uplink or downlink the signal to the GEO satellite using their dedicated antennas, with or without tracking capability. The satellite down-converts and amplifies the signal before sending it back to the end users on Earth. Normally, the user sets and adjusts the uplink antenna to follow the movement of the GEO satellite as closely as possible. As long as there is no reduction in the link budget, the pointing is assumed to be successful. On the other hand, the input power of the satellite, together with satellite longitude vs latitude, can give a reasonable idea of the accuracy of the ground antenna pointing. In the study, ground station pointing performance is shown for two different cases: one with tracking capability and one without.

Citations: 0

Time-series Forecasting of Energy Demand and Impact of the COVID-19 Pandemic on Model Performance in Electric Vehicles

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2023-01-19 DOI: 10.35377/saucis...1209519

Pınar Cihan

The increase in environmental problems such as climate change and air pollution caused by global warming has raised the popularity of electric vehicles (EVs) used in the smart grid environment. The increasing number of EVs can affect the grid in terms of power loss and voltage bias by changing the existing demand profile. Effective prediction of EV energy demand ensures reliability and robustness of grid use and aids investment planning and resource allocation for charging infrastructure. In this study, electricity demand in two different cities is modeled with the Support Vector Regression, Random Forest, Gaussian Process, and Multilayer Perceptron algorithms. The findings reveal that EV owners usually start charging their vehicles during the daytime, that the COVID-19 pandemic caused a serious decrease in EV energy demand, and that support vector regression (SVR) is more successful in energy demand forecasting. Furthermore, the results indicate that the decrease in electricity demand during the COVID-19 pandemic reduced the prediction accuracy of the SVR model (decrease of 17.1% in training and 12.6% in test performance, P

Citations: 0

LSTM Hyperparameters optimization with Hparam parameters for Bitcoin Price Prediction

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2023-01-02 DOI: 10.35377/saucis...1172027

I. Kervanci, Fatih Akay

Machine learning and deep learning algorithms produce very different results with different values of their hyperparameters. These parameters require optimization because the same values do not suit every problem. In this paper, eight hyperparameters of a Long Short-Term Memory (LSTM) network (go-backward, epoch, batch size, dropout, activation function, optimizer, learning rate, and number of layers) were examined on daily and hourly Bitcoin datasets. The effect of each parameter on the results for the daily dataset was evaluated and explained. The parameters were examined with the HParams feature of TensorBoard. Examining all combinations of parameters with HParams produced the best test Mean Square Error (MSE) values: 0.000043633 for the hourly dataset and 0.00073843 for the daily dataset. Both datasets produced better results with the tanh activation function. Finally, when the results are interpreted, the daily dataset produces better results with a small learning rate and small dropout values, whereas the hourly dataset produces better results with a large learning rate and large dropout values.
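Examining all combinations of hyperparameters is an exhaustive grid search. The sketch below shows the enumeration and selection step only; the search space values and the `toy_mse` objective are hypothetical placeholders standing in for training an LSTM and measuring its test MSE, and the code does not use TensorBoard.

```python
from itertools import product

# Hypothetical search space: the abstract names the hyperparameters, not their values.
search_space = {
    "activation": ["tanh", "relu"],
    "learning_rate": [0.001, 0.01],
    "dropout": [0.1, 0.5],
}

def grid(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))

def best_config(space, evaluate):
    """Return the configuration with the lowest score (e.g. test MSE)."""
    return min(grid(space), key=evaluate)

def toy_mse(config):
    # Placeholder objective, not a real model: lower is better.
    penalty = 0.0 if config["activation"] == "tanh" else 0.1
    return penalty + config["learning_rate"] + config["dropout"]

configs = list(grid(search_space))
best = best_config(search_space, toy_mse)
```

The number of runs grows multiplicatively with each added hyperparameter, which is why a full grid over eight hyperparameters is costly and why logging each run to a dashboard helps when comparing them.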

Citations: 0

Automatic Classification of White Blood Cells Using Pre-Trained Deep Models

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2022-12-22 DOI: 10.35377/saucis...1196934

Oğuzhan Katar, Ilhan Firat Kilincer

White blood cells (WBCs), which are part of the immune system, help the body fight infections and other diseases. Certain diseases can cause the body to produce fewer WBCs than it needs. For this reason, WBCs are of great importance in the field of medical imaging. Artificial intelligence-based computer systems can assist experts in the analysis of WBCs. In this study, an approach is proposed for the automatic classification of WBCs into five classes using a pre-trained model. ResNet-50, VGG-19, and MobileNet-V3-Small pre-trained models were trained with ImageNet weights. A public dataset containing 16,633 images with an uneven class distribution was used for training, validating, and testing the models. The ResNet-50 model reached 98.79% accuracy and the VGG-19 model reached 98.19%, while the MobileNet-V3-Small model reached the highest accuracy at 98.86%. Examination of the MobileNet-V3-Small predictions shows that the model is not affected by class dominance and correctly classifies images even from the least sampled class in the dataset. WBCs were classified with high accuracy using the proposed pre-trained deep learning models. Experts can effectively use the proposed approach when analyzing WBCs.
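With an uneven class distribution, overall accuracy can hide poor performance on rare classes, which is why the claim about class dominance is checked per class. The sketch below computes per-class recall; the labels and predictions are hypothetical and do not come from the paper's dataset.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

def accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical imbalanced test set: 8 samples of one class, 2 of another.
y_true = ["neutrophil"] * 8 + ["basophil"] * 2
# A degenerate model that always predicts the dominant class.
y_pred = ["neutrophil"] * 10

overall = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
```

Here overall accuracy is 0.8 even though the rare class is never recognised, so a model that is genuinely unaffected by class dominance should show high recall for every class, including the least sampled one.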

Citations: 2

A Decision Support System for Detecting Stage in Hodgkin Lymphoma Patients Using Artificial Neural Network and Optimization Algorithms

Sakarya University Journal of Computer and Information Sciences

Pub Date : 2022-12-06 DOI: 10.35377/saucis...1210786

Fatma Akalın, M. Orhan, M. Buyukavci

Hodgkin-type lymphoma is a disease with unique histological, immunophenotypic, and clinical features. It accounts for nearly 30% of all lymphomas and is highly treatable. However, the treatment plan is specified only after the stage and risk status are determined, so it is important that doctors determine the stage of the disease correctly. The data used for this decision include the patient's history, detailed physical examination, laboratory findings, imaging methods, and bone marrow biopsy results. Hybrid FDG-PET is another method used in the medical world, for diagnosis, evaluation of the response to treatment, staging, and restaging. However, it is radiation-based and may therefore produce undesirable results in the future. In this study, an artificial intelligence-based computer-assisted decision support system is developed to reduce the number of medical methods used and the radiation exposure. Data were obtained from the NCBI-GEO dataset. These data, which contain missing values, are evaluated in two ways. First, in the initial evaluation, samples with missing values are deleted from the dataset, and an artificial neural network is trained on the remaining data with the “trainlm” function. To reduce the error of the estimates further, the network is then retrained with the artificial bee colony algorithm, the particle swarm optimization algorithm, and the invasive weed algorithm, respectively. Second, the same operations are performed again on the dataset containing missing values. As a result of the training, the best performance was obtained with the invasive weed and particle swarm optimization algorithms, with average error rates of 1,45547E+14 and 1,23103E+14, respectively.
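The pipeline in the abstract (drop samples with missing values, then tune a neural network's weights with a population-based optimizer) can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation: it uses Python with NumPy, skips the “trainlm” stage (Levenberg-Marquardt training, as that function is known in MATLAB) and starts particle swarm optimization from random weights. The function names, the one-hidden-layer tanh architecture, and the PSO constants (inertia 0.7, acceleration coefficients 1.5) are all assumptions chosen for illustration.

```python
import numpy as np

def drop_missing(X, y):
    """Keep only the rows of X (and matching entries of y) without NaN values."""
    mask = ~np.isnan(X).any(axis=1)
    return X[mask], y[mask]

def n_weights(d, hidden):
    """Number of parameters in the hypothetical d -> hidden -> 1 network."""
    return d * hidden + hidden + hidden + 1

def mlp_mse(w, X, y, hidden):
    """Mean squared error of a one-hidden-layer tanh network with weights packed in w."""
    d = X.shape[1]
    i = 0
    W1 = w[i:i + d * hidden].reshape(d, hidden)
    i += d * hidden
    b1 = w[i:i + hidden]
    i += hidden
    W2 = w[i:i + hidden]
    i += hidden
    b2 = w[i]
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return float(np.mean((pred - y) ** 2))

def pso_minimize(f, dim, n_particles=30, iters=200, seed=0):
    """Basic global-best particle swarm optimization (assumed constants)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1.0, 1.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                       # personal best positions
    best_val = np.array([f(p) for p in pos])    # personal best values
    g = best_val.argmin()
    g_pos, g_val = best_pos[g].copy(), best_val[g]  # global best so far
    for _ in range(iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal and global bests.
        vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g_pos - pos)
        pos = pos + vel
        vals = np.array([f(p) for p in pos])
        improved = vals < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = vals[improved]
        g = best_val.argmin()
        if best_val[g] < g_val:
            g_pos, g_val = best_pos[g].copy(), best_val[g]
    return g_pos, float(g_val)
```

A call such as `pso_minimize(lambda w: mlp_mse(w, X, y, 4), n_weights(X.shape[1], 4))` on the cleaned data returns the best weight vector found and its error. The artificial bee colony and invasive weed algorithms mentioned in the abstract would slot into the same role as `pso_minimize`, each minimizing the same error function.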



Citations: 1