Pub Date: 2022-01-01 | Epub Date: 2021-09-09 | DOI: 10.1007/s11227-021-04040-8
Ramin Safa, Peyman Bayat, Leila Moghtader
Depression is among the most prevalent mental disorders and can lead to suicide. Because people tend to share their thoughts on social platforms, social data contain valuable information that can be used to identify users' psychological states. In this paper, we provide an automated approach to collecting and evaluating tweets based on self-reported statements and present a novel multimodal framework to predict depression symptoms from user profiles. We used n-gram language models, LIWC dictionaries, automatic image tagging, and bag-of-visual-words features. We applied correlation-based feature selection and nine different classifiers with standard evaluation metrics to assess the effectiveness of the method. Based on the analysis, tweets and bio-text alone achieved 91% and 83% accuracy, respectively, in predicting depressive symptoms, which seems to be an acceptable result. We also believe further performance improvements can be achieved by limiting the user domain or by incorporating clinical information.
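As a rough illustration of the word n-gram features mentioned above (a minimal sketch, not the authors' full pipeline, which also combines LIWC dictionaries and visual features):

```python
from collections import Counter

def ngram_features(text, n=2):
    """Count word n-grams in a tweet-like string.
    Hypothetical helper for illustration, not the paper's feature extractor."""
    tokens = text.lower().split()
    # Slide an n-token window over the token list and join each window.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

feats = ngram_features("i feel so tired i feel alone", n=2)
# Each bigram maps to its frequency; a classifier would consume these counts.
```

Counts like these would then feed the correlation-based feature selection and the nine classifiers the paper evaluates.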
{"title":"Automatic detection of depression symptoms in twitter using multimodal analysis.","authors":"Ramin Safa, Peyman Bayat, Leila Moghtader","doi":"10.1007/s11227-021-04040-8","DOIUrl":"10.1007/s11227-021-04040-8","url":null,"abstract":"<p><p>Depression is the most prevalent mental disorder that can lead to suicide. Due to the tendency of people to share their thoughts on social platforms, social data contain valuable information that can be used to identify user's psychological states. In this paper, we provide an automated approach to collect and evaluate tweets based on self-reported statements and present a novel multimodal framework to predict depression symptoms from user profiles. We used n-gram language models, LIWC dictionaries, automatic image tagging, and bag-of-visual-words. We consider the correlation-based feature selection and nine different classifiers with standard evaluation metrics to assess the effectiveness of the method. Based on the analysis, the tweets and bio-text alone showed 91% and 83% accuracy in predicting depressive symptoms, respectively, which seems to be an acceptable result. We also believe performance improvements can be achieved by limiting the user domain or presence of clinical information.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 4","pages":"4709-4744"},"PeriodicalIF":2.5,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8426595/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39414694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-01 | Epub Date: 2021-06-04 | DOI: 10.1007/s11227-021-03921-2
Fatma Kuncan, Yılmaz Kaya, Ramazan Tekin, Melih Kuncan
In recent years, many researchers have worked on different aspects of detecting, recognizing, and monitoring human activities. The automatic determination of human physical activities is often referred to as human activity recognition (HAR). One of the most important technologies for detecting and tracking the activity of the human body is sensor-based HAR, which has recently attracted attention in the computing field due to its wide use in daily life and is a rapidly growing area of research. Activity recognition (AR) is carried out by evaluating the signals obtained from various sensors placed on the human body. In this study, a new approach is proposed to extract features from sensor signals for HAR. The proposed approach is inspired by the Gray Level Co-Occurrence Matrix (GLCM) method, which is widely used in image processing, but unlike GLCM it is applied to one-dimensional signals. Two datasets, created from accelerometer, gyroscope, and magnetometer signals, were used to test the proposed approach. Haralick features were obtained from the co-occurrence matrix created after 1D-GLCM (one-dimensional gray level co-occurrence matrix) was applied to the signals. HAR was then carried out for different scenarios using these features. Success rates of 96.66% and 93.88% were obtained for the two datasets, respectively. The new approach proposed in this study provides high success rates for HAR applications, and it may also be applicable to the classification of other kinds of signals.
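The core of the 1D-GLCM idea can be sketched in a few lines: count how often one quantized signal level is followed by another at a fixed offset, then derive Haralick-style statistics (energy shown here) from the normalized matrix. The quantization, offset, and signal below are illustrative assumptions, not the paper's exact settings.

```python
def glcm_1d(signal, levels, offset=1):
    """One-dimensional co-occurrence matrix: entry [a][b] counts how often
    quantized level a is followed by level b at the given offset."""
    m = [[0] * levels for _ in range(levels)]
    for a, b in zip(signal, signal[offset:]):
        m[a][b] += 1
    return m

sig = [0, 1, 1, 2, 1, 0]          # already quantized into 3 levels
m = glcm_1d(sig, levels=3)

# One Haralick-style feature: energy (sum of squared normalized entries).
total = sum(sum(row) for row in m)
energy = sum((c / total) ** 2 for row in m for c in row)
```

In the study, such features computed over sensor windows are what the HAR classifiers consume.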
{"title":"A new approach for physical human activity recognition based on co-occurrence matrices.","authors":"Fatma Kuncan, Yılmaz Kaya, Ramazan Tekin, Melih Kuncan","doi":"10.1007/s11227-021-03921-2","DOIUrl":"https://doi.org/10.1007/s11227-021-03921-2","url":null,"abstract":"<p><p>In recent years, it has been observed that many researchers have been working on different areas of detection, recognition and monitoring of human activities. The automatic determination of human physical activities is often referred to as human activity recognition (HAR). One of the most important technology that detects and tracks the activity of the human body is sensor-based HAR technology. In recent days, sensor-based HAR attracts attention in the field of computers due to its wide use in daily life and is a rapidly growing field of research. Activity recognition (AR) application is carried out by evaluating the signals obtained from various sensors placed in the human body. In this study, a new approach is proposed to extract features from sensor signals using HAR. The proposed approach is inspired by the Gray Level Co-Occurrence Matrix (GLCM) method, which is widely used in image processing, but it is applied to one-dimensional signals, unlike GLCM. Two datasets were used to test the proposed approach. The datasets were created from the signals obtained from the accelerometer, gyro and magnetometer sensors. Heralick features were obtained from co-occurrence matrix created after 1D-GLCM (One (1) Dimensional-Gray Level Co-Occurrence Matrix) was applied to the signals. HAR operation has been carried out for different scenarios using these features. Success rates of 96.66 and 93.88% were obtained for two datasets, respectively. It has been observed that the new approach proposed within the scope of the study provides high success rates for HAR applications. 
It is thought that the proposed approach can be used in the classification of different signals.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 1","pages":"1048-1070"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s11227-021-03921-2","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39075829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-01 | Epub Date: 2022-02-01 | DOI: 10.1007/s11227-021-04238-w
Jia-Lang Xu, Ying-Lin Hsu
Agricultural exports are an important source of economic profit for many countries. Accurate month-on-month predictions of a country's agricultural exports are key to understanding its domestic use and export figures, and they facilitate advance planning of exports, imports, and domestic use, along with the resulting adjustments of production and marketing. This study proposes a novel method for predicting the rise and fall of agricultural exports, called agricultural exports time series-long short-term memory (AETS-LSTM). The method applies Jieba word segmentation and Word2Vec to train word vectors and uses TF-IDF and word clouds to learn news-related keywords, finally obtaining keyword vectors. This research explores whether the purchasing managers' index (PMI) of each industry can be used effectively by the AETS-LSTM model to predict the rise and fall of agricultural exports. The results show that adding keyword vectors to the PMI values of the finance and insurance industries has a notable impact, improving the prediction accuracy for the rise and fall of agricultural exports to 82.61%. The method achieves improved prediction for the chemical/biological/medical, transportation equipment, wholesale, finance and insurance, food and textiles, basic materials, education/professional, science/technical, information/communications/broadcasting, transportation and storage, retail, and electrical and machinery equipment categories; the electrical and optical categories also improve when keyword vectors are combined, while accuracy for the accommodation and food service and the construction and real estate industries remains unchanged. The proposed method therefore offers improved month-on-month prediction capacity for agricultural exports, allowing agribusiness operators and policy makers to evaluate and adjust domestic and foreign production and sales.
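A minimal TF-IDF sketch of the keyword-weighting step (illustrative only; the paper applies Jieba segmentation and Word2Vec to Chinese news text before this stage, and the tiny documents below are made up):

```python
import math

def tf_idf(docs):
    """Toy TF-IDF over pre-tokenized documents: term frequency within a
    document, scaled by log inverse document frequency across the corpus."""
    n = len(docs)
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    return [
        {t: (doc.count(t) / len(doc)) * math.log(n / df[t]) for t in set(doc)}
        for doc in docs
    ]

docs = [["export", "rise"], ["export", "fall"], ["pmi", "rise"]]
w = tf_idf(docs)
# Rare terms ("pmi", "fall") outweigh common ones ("export", "rise").
```

The highest-weighted terms per document are the "news-related keywords" that end up in the keyword vectors fed to the LSTM.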
{"title":"Analysis of agricultural exports based on deep learning and text mining.","authors":"Jia-Lang Xu, Ying-Lin Hsu","doi":"10.1007/s11227-021-04238-w","DOIUrl":"https://doi.org/10.1007/s11227-021-04238-w","url":null,"abstract":"<p><p>Agricultural exports are an important source of economic profit for many countries. Accurate predictions of a country's agricultural exports month on month are key to understanding a country's domestic use and export figures and facilitate advance planning of export, import, and domestic use figures and the resulting necessary adjustments of production and marketing. This study proposes a novel method for predicting the rise and fall of agricultural exports, called agricultural exports time series-long short-term memory (AETS-LSTM). The method applies Jieba word segmentation and Word2Vec to train word vectors and uses TF-IDF and word cloud to learn news-related keywords and finally obtain keyword vectors. This research explores whether the purchasing managers' index (PMI) of each industry can effectively use the AETS-LSTM model to predict the rise and fall of agricultural exports. Research results show that the inclusion of keyword vectors in the PMI values of the finance and insurance industries has a relative impact on the prediction of the rise and fall of agricultural exports, which can improve the prediction accuracy for the rise and fall of agricultural exports by 82.61%. 
The proposed method achieves improved prediction ability for the chemical/biological/medical, transportation equipment, wholesale, finance and insurance, food and textiles, basic materials, education/professional, science/technical, information/communications/broadcasting, transportation and storage, retail, and electrical and machinery equipment categories, while its performance for the electrical and optical categories shows improved prediction by combining keyword vectors, and its accuracy for the accommodation and food service, and construction and real estate industries remained unchanged. Therefore, the proposed method offers improved prediction capacity for agricultural exports month on month, allowing agribusiness operators and policy makers to evaluate and adjust domestic and foreign production and sales.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 8","pages":"10876-10892"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8804672/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39893695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-01 | Epub Date: 2021-08-03 | DOI: 10.1007/s11227-021-03972-5
Peng Su, Yuanyuan Chen, Mengmeng Lu
This study explores smart city information (SCI) processing technology based on the Internet of Things (IoT) and cloud computing, promoting the construction of smart cities toward effective sharing and interconnection. An SCI system is constructed to bridge the information islands that arise across the various domains of smart city construction. The smart environment monitoring, smart transportation, and smart epidemic prevention functions at the application layer of the SCI system are each designed separately. A multi-objective optimization algorithm for cloud computing virtual machine resource allocation (the CC-VMRA method) is proposed, and the application of IoT and cloud computing technology in the SCI system is further analysed and simulated for performance verification. The results show that the multi-objective optimization algorithm in the CC-VMRA method can greatly reduce the number of physical servers in the SCI system (fewer than 20), with a load variance no higher than 0.0024, enabling the server cluster to achieve better load balancing. In addition, the packet loss rate of the Zigbee protocol used by the IoT gateway in the SCI system is far below the 0.1% target, and the delay is less than 10 ms. The SCI system constructed in this study therefore shows low latency and high utilization and can provide an experimental reference for future smart city construction.
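To make the load-balance metric concrete, the sketch below assigns VM loads to servers with a simple greedy rule and reports the per-server load variance of the kind the paper bounds at 0.0024. The greedy rule and the loads are stand-in assumptions, not the paper's multi-objective CC-VMRA algorithm.

```python
def place_vms(vm_loads, n_servers):
    """Greedy longest-processing-time placement: assign each VM (largest
    first) to the currently least-loaded server, then report load variance."""
    servers = [0.0] * n_servers
    for load in sorted(vm_loads, reverse=True):
        servers[servers.index(min(servers))] += load
    mean = sum(servers) / n_servers
    variance = sum((s - mean) ** 2 for s in servers) / n_servers
    return servers, variance

servers, var = place_vms([0.5, 0.3, 0.3, 0.5, 0.2, 0.2], n_servers=2)
# A variance near zero means the cluster is well balanced.
```

A multi-objective allocator would trade this variance off against other goals, such as minimizing the number of active physical servers.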
{"title":"Smart city information processing under internet of things and cloud computing.","authors":"Peng Su, Yuanyuan Chen, Mengmeng Lu","doi":"10.1007/s11227-021-03972-5","DOIUrl":"https://doi.org/10.1007/s11227-021-03972-5","url":null,"abstract":"<p><p>This study is to explore the smart city information (SCI) processing technology based on the Internet of Things (IoT) and cloud computing, promoting the construction of smart cities in the direction of effective sharing and interconnection. In this study, a SCI system is constructed based on the information islands in the smart construction of various fields in smart cities. The smart environment monitoring, smart transportation, and smart epidemic prevention at the application layer of the SCI system are designed separately. A multi-objective optimization algorithm for cloud computing virtual machine resource allocation method (CC-VMRA method) is proposed, and the application of the IoT and cloud computing technology in the smart city information system is further analysed and simulated for the performance verification. The results show that the multi-objective optimization algorithm in the CC-VMRA method can greatly reduce the number of physical servers in the SCI system (less than 20), and the variance is not higher than 0.0024, which can enable the server cluster to achieve better load balancing effects. In addition, the packet loss rate of the Zigbee protocol used by the IoT gateway in the SCI system is far below the 0.1% indicator, and the delay is less than 10 ms. 
Therefore, the SCI system constructed by this study shows low latency and high utilization rate, which can provide experimental reference for the later construction of smart city.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 3","pages":"3676-3695"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s11227-021-03972-5","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39290910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-01 | Epub Date: 2021-09-20 | DOI: 10.1007/s11227-021-04076-w
Chhaihuoy Long, Eunhye Jo, Yunyoung Nam
Yoga is a form of exercise that benefits health by focusing on physical, mental, and spiritual connections. However, practicing yoga with incorrect postures can cause health problems such as muscle sprains and pain. In this study, we propose a yoga posture coaching system using an interactive display, based on a transfer learning technique. Fourteen different yoga postures were captured with an RGB camera, with eight participants performing each posture 10 times. Data augmentation was applied to oversample the training datasets and prevent over-fitting. Six transfer learning models (TL-VGG16-DA, TL-VGG19-DA, TL-MobileNet-DA, TL-MobileNetV2-DA, TL-InceptionV3-DA, and TL-DenseNet201-DA) were evaluated on the classification task to select the optimal model for the coaching system. The TL-MobileNet-DA model was selected as the optimal model, showing an overall accuracy of 98.43%, sensitivity of 98.30%, specificity of 99.88%, and a Matthews correlation coefficient of 0.9831. The resulting system recognizes a user's posture movement in real time against the selected yoga posture guidance and can coach the user to avoid incorrect postures.
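The Matthews correlation coefficient reported above (0.9831 for TL-MobileNet-DA) is computed directly from confusion-matrix counts; the counts below are made up for illustration, not the study's data.

```python
import math

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from true/false positive/negative
    counts; ranges from -1 (total disagreement) to +1 (perfect prediction)."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

mcc = matthews_cc(tp=98, tn=990, fp=2, fn=2)
```

Unlike plain accuracy, MCC stays informative when classes are imbalanced, which is why it is reported alongside sensitivity and specificity.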
{"title":"Development of a yoga posture coaching system using an interactive display based on transfer learning.","authors":"Chhaihuoy Long, Eunhye Jo, Yunyoung Nam","doi":"10.1007/s11227-021-04076-w","DOIUrl":"https://doi.org/10.1007/s11227-021-04076-w","url":null,"abstract":"<p><p>Yoga is a form of exercise that is beneficial for health, focusing on physical, mental, and spiritual connections. However, practicing yoga and adopting incorrect postures can cause health problems, such as muscle sprains and pain. In this study, we propose the development of a yoga posture coaching system using an interactive display, based on a transfer learning technique. The 14 different yoga postures were collected from an RGB camera, and eight participants were required to perform each yoga posture 10 times. Data augmentation was applied to oversample and prevent over-fitting of the training datasets. Six transfer learning models (TL-VGG16-DA, TL-VGG19-DA, TL-MobileNet-DA, TL-MobileNetV2-DA, TL-InceptionV3-DA, and TL-DenseNet201-DA) were exploited for classification tasks to select the optimal model for the yoga coaching system, based on evaluation metrics. As a result, the TL-MobileNet-DA model was selected as the optimal model, showing an overall accuracy of 98.43%, sensitivity of 98.30%, specificity of 99.88%, and Matthews correlation coefficient of 0.9831. 
The study presented a yoga posture coaching system that recognized the yoga posture movement of users, in real time, according to the selected yoga posture guidance and can coach them to avoid incorrect postures.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 4","pages":"5269-5284"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8451169/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39451307","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pairs trading is an effective statistical arbitrage strategy that exploits the spread between paired stocks in a stable cointegration relationship. Nevertheless, rapid market changes may break that relationship (a structural break), which can lead to severe losses in intraday trading. In this paper, we design a two-phase pairs trading strategy optimization framework, the structural break-aware pairs trading strategy (SAPT), by leveraging machine learning techniques. Phase one is a hybrid model that extracts frequency- and time-domain features to detect structural breaks. Phase two optimizes the pairs trading strategy with a novel reinforcement learning model that senses important risks, including structural breaks and market-closing risk. In addition, transaction costs are factored into a cost-aware objective to avoid a significant reduction in profitability. In large-scale experiments on real Taiwan stock market datasets, SAPT outperforms state-of-the-art strategies by at least 456% in profit and 934% in Sortino ratio.
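A rolling z-score of the spread is the textbook signal behind pairs trading, and a sustained drift in it is one naive symptom of a structural break. The sketch below is a simplified stand-in for SAPT's learned detector; the window size and spread values are illustrative.

```python
def spread_zscores(spread, window=3):
    """Rolling z-score of a pairs spread: how many standard deviations the
    latest value sits from its recent window mean. Large |z| is the classic
    entry signal; values that stay large can hint at a broken cointegration."""
    zs = []
    for i in range(window, len(spread) + 1):
        w = spread[i - window:i]
        mean = sum(w) / window
        sd = (sum((x - mean) ** 2 for x in w) / window) ** 0.5
        zs.append((w[-1] - mean) / sd if sd else 0.0)
    return zs

# A stable spread followed by a sudden jump pushes the final z-score up.
z = spread_zscores([0.1, 0.0, -0.1, 0.0, 2.5], window=3)
```

SAPT replaces this fixed-threshold heuristic with a hybrid frequency/time-domain model for detection and a reinforcement learner for the trading decisions.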
{"title":"Structural break-aware pairs trading strategy using deep reinforcement learning.","authors":"Jing-You Lu, Hsu-Chao Lai, Wen-Yueh Shih, Yi-Feng Chen, Shen-Hang Huang, Hao-Han Chang, Jun-Zhe Wang, Jiun-Long Huang, Tian-Shyr Dai","doi":"10.1007/s11227-021-04013-x","DOIUrl":"10.1007/s11227-021-04013-x","url":null,"abstract":"<p><p><i>Pairs trading</i> is an effective statistical arbitrage strategy considering the spread of paired stocks in a stable cointegration relationship. Nevertheless, rapid market changes may break the relationship (namely structural break), which further leads to tremendous loss in intraday trading. In this paper, we design a two-phase pairs trading strategy optimization framework, namely <i>structural break-aware pairs trading strategy</i> (<i>SAPT</i>), by leveraging machine learning techniques. Phase one is a hybrid model extracting frequency- and time-domain features to detect structural breaks. Phase two optimizes pairs trading strategy by sensing important risks, including structural breaks and market-closing risks, with a novel reinforcement learning model. In addition, the transaction cost is factored in a cost-aware objective to avoid significant reduction of profitability. 
Through large-scale experiments in real Taiwan stock market datasets, SAPT outperforms the state-of-the-art strategies by at least 456% and 934% in terms of profit and Sortino ratio, respectively.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 3","pages":"3843-3882"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369334/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39336202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-01 | Epub Date: 2022-01-16 | DOI: 10.1007/s11227-021-04241-1
Nilesh Vishwasrao Patil, C Rama Krishna, Krishan Kumar
A distributed denial of service (DDoS) attack is among the most destructive threats to internet-based systems and their resources. It halts a victim's services by flooding it with large numbers of network traces, so legitimate users experience delays when accessing internet-based systems and their resources; even a short delay in response can lead to massive financial loss. Numerous techniques have been proposed to protect internet-based systems from various kinds of DDoS attacks, yet the frequency and strength of attacks increase year after year. This paper proposes a novel Apache Kafka Streams-based distributed classification approach named KS-DDoS. First, we design distributed classification models on a Hadoop cluster using highly scalable machine learning algorithms, fetching data from the Hadoop Distributed File System (HDFS). Second, we deploy an efficient distributed classification model on the Kafka Streams cluster to classify incoming network traces into nine classes in real time. Further, the approach stores highly discriminative features with predicted outcomes in HDFS for creating or updating models with new sets of instances. We implemented a distributed processing framework-based experimental environment to design, deploy, and validate the proposed classification approach. The results show that KS-DDoS efficiently classifies incoming network traces with at least 80% classification accuracy.
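The per-record classification step of such a streaming topology can be mimicked in plain Python (a stand-in sketch; the actual system uses the Kafka Streams API, and the threshold "model" below is a made-up placeholder for the trained distributed model):

```python
def classify_stream(records, model):
    """Consume network-trace records one by one and yield (record, label)
    pairs, mimicking the per-record map step a streaming topology applies."""
    for rec in records:
        yield rec, model(rec)

# Hypothetical rule: flag traces with an abnormally high packets-per-second
# rate. The real KS-DDoS model distinguishes nine traffic classes.
model = lambda rec: "attack" if rec["pps"] > 1000 else "benign"

traces = [{"pps": 40}, {"pps": 5000}]
labels = [label for _, label in classify_stream(traces, model)]
```

The design point is that classification happens record by record as traffic arrives, while features and predictions are written back to HDFS for periodic model retraining.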
{"title":"KS-DDoS: Kafka streams-based classification approach for DDoS attacks.","authors":"Nilesh Vishwasrao Patil, C Rama Krishna, Krishan Kumar","doi":"10.1007/s11227-021-04241-1","DOIUrl":"https://doi.org/10.1007/s11227-021-04241-1","url":null,"abstract":"<p><p>A distributed denial of service (DDoS) attack is the most destructive threat for internet-based systems and their resources. It stops the execution of victims by transferring large numbers of network traces. Due to this, legitimate users experience a delay while accessing internet-based systems and their resources. Even a short delay in responses leads to a massive financial loss. Numerous techniques have been proposed to protect internet-based systems from various kinds of DDoS attacks. However, the frequency and strength of attacks are increasing year-after-year. This paper proposes a novel Apache Kafka Streams-based distributed classification approach named KS-DDoS. For this classification approach, firstly, we design distributed classification models on the Hadoop cluster using highly scalable machine learning algorithms by fetching data from Hadoop distributed files system (HDFS). Secondly, we deploy an efficient distributed classification model on the Kafka Stream cluster to classify incoming network traces into nine classes in real-time. Further, this distributed classification approach stores highly discriminative features with predicted outcomes into HDFS for creating/updating models using a new set of instances. We implemented a distributed processing framework-based experimental environment to design, deploy, and validate the proposed classification approach for DDoS attacks. 
The results show that the proposed distributed KS-DDoS classification approach efficiently classifies incoming network traces with at least 80% classification accuracy.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 6","pages":"8946-8976"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8761113/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39941274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-01 | Epub Date: 2022-01-07 | DOI: 10.1007/s11227-021-04253-x
Mohammad Najafimehr, Sajjad Zarifzadeh, Seyedakbar Mostafavi
Service availability plays a vital role in computer networks, and Distributed Denial of Service (DDoS) attacks are a growing threat to it each year. Machine learning (ML) is a promising approach widely used for DDoS detection and obtains satisfactory results for known attacks; however, such models are largely incapable of detecting unknown malicious traffic. This paper proposes a novel method combining supervised and unsupervised algorithms. First, a clustering algorithm separates anomalous traffic from normal data using several flow-based features. Then, using certain statistical measures, a classification algorithm labels the clusters. Employing a big data processing framework, we evaluate the proposed method by training on the CICIDS2017 dataset and testing on a different set of attacks provided in the more recent CICDDoS2019 dataset. The results demonstrate that the positive likelihood ratio (LR+) of our method is approximately 198% higher than that of conventional ML classification algorithms.
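The headline LR+ metric follows directly from its definition; the sensitivity and specificity values below are illustrative, not the paper's results.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity): how many times more likely a
    positive prediction is for genuine attack traffic than for benign traffic.
    Higher is better; LR+ = 1 means the detector is uninformative."""
    return sensitivity / (1.0 - specificity)

lr = positive_likelihood_ratio(sensitivity=0.90, specificity=0.95)
```

Because LR+ penalizes false alarms through the specificity term, it is a stricter yardstick than accuracy for intrusion detectors facing mostly benign traffic.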
{"title":"A hybrid machine learning approach for detecting unprecedented DDoS attacks.","authors":"Mohammad Najafimehr, Sajjad Zarifzadeh, Seyedakbar Mostafavi","doi":"10.1007/s11227-021-04253-x","DOIUrl":"https://doi.org/10.1007/s11227-021-04253-x","url":null,"abstract":"<p><p>Service availability plays a vital role on computer networks, against which Distributed Denial of Service (DDoS) attacks are an increasingly growing threat each year. Machine learning (ML) is a promising approach widely used for DDoS detection, which obtains satisfactory results for pre-known attacks. However, they are almost incapable of detecting unknown malicious traffic. This paper proposes a novel method combining both supervised and unsupervised algorithms. First, a clustering algorithm separates the anomalous traffic from the normal data using several flow-based features. Then, using certain statistical measures, a classification algorithm is used to label the clusters. Employing a big data processing framework, we evaluate the proposed method by training on the CICIDS2017 dataset and testing on a different set of attacks provided in the more up-to-date CICDDoS2019. The results demonstrate that the Positive Likelihood Ratio (LR+) of our method is approximately 198% higher than the ML classification algorithms.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 6","pages":"8106-8136"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8739683/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39688568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-01-01 | DOI: 10.1007/s11227-021-03943-w
Hojjat Emami
In this paper, a human-inspired optimization algorithm called stock exchange trading optimization (SETO) is introduced for solving numerical and engineering problems. The optimizer is inspired by the behavior of traders and by stock price changes in the stock market, where traders use various fundamental and technical analysis methods to maximize profit. SETO mathematically models the technical trading strategy of traders to perform optimization. It contains three main operators: rising, falling, and exchange. These operators navigate the search agents toward the global optimum. The proposed algorithm is compared with seven popular meta-heuristic optimizers on forty single-objective unconstrained numerical functions and four engineering design problems. The statistical results show that SETO provides competitive and promising performance compared with the counterpart algorithms across problems of different dimensions, especially 1000-dimensional problems.
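A toy reading of the three operators might look like the following (a hypothetical sketch minimizing the sphere function; the paper's actual update equations differ, and the move probabilities here are arbitrary assumptions):

```python
import random

def seto_sketch(f, dim, iters=200, n_agents=20, seed=1):
    """Toy population search with rising / falling / exchange moves, loosely
    mirroring SETO's three operators. Minimizes f over [-5, 5]^dim."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_agents)]
    for _ in range(iters):
        best = min(pop, key=f)                 # best "trader" this round
        for agent in pop:
            for d in range(dim):
                r = rng.random()
                if r < 0.5:    # rising: drift toward the best trader
                    agent[d] += rng.random() * (best[d] - agent[d])
                elif r < 0.9:  # falling: shrink the position (a crude "drop")
                    agent[d] -= rng.uniform(0, 0.1) * agent[d]
                else:          # exchange: copy a coordinate from a peer
                    agent[d] = rng.choice(pop)[d]
    return min(pop, key=f)

sphere = lambda x: sum(v * v for v in x)  # global optimum 0 at the origin
best = seto_sketch(sphere, dim=3)
```

The structure (attraction to the incumbent best, random perturbation, and information exchange between agents) is the common skeleton of such population-based meta-heuristics.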
{"title":"Stock exchange trading optimization algorithm: a human-inspired method for global optimization.","authors":"Hojjat Emami","doi":"10.1007/s11227-021-03943-w","DOIUrl":"https://doi.org/10.1007/s11227-021-03943-w","url":null,"abstract":"<p><p>In this paper, a human-inspired optimization algorithm called stock exchange trading optimization (SETO) is introduced for solving numerical and engineering problems. The optimizer is inspired by the behavior of traders and stock price changes in the stock market. Traders use various fundamental and technical analysis methods to gain maximum profit. SETO mathematically models the technical trading strategy of traders to perform optimization. It contains three main operators: rising, falling, and exchange. These operators navigate the search agents toward the global optimum. The proposed algorithm is compared with seven popular meta-heuristic optimizers on forty single-objective unconstrained numerical functions and four engineering design problems. The statistical results on the test problems show that SETO delivers competitive and promising performance compared with counterpart algorithms on optimization problems of different dimensions, especially 1000-dimensional problems. Out of 40 numerical functions, SETO achieved the global optimum on 36, and out of 4 engineering problems, it obtained the best results on 3.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 2","pages":"2125-2174"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s11227-021-03943-w","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10654294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
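The SETO abstract describes a population-based search driven by rising, falling, and exchange operators. The sketch below shows the general shape of such a metaheuristic on a toy objective; the specific update formulas for each operator are illustrative assumptions, not the paper's actual equations.

```python
import random

# Toy objective: the sphere function, minimized at the origin.
def sphere(x):
    return sum(v * v for v in x)

# Population-based sketch loosely following the three described operators.
# The operator formulas and probabilities here are assumptions for
# illustration only; they are not SETO's published update rules.
def seto_sketch(f, dim=5, pop_size=20, iters=200, bounds=(-5.0, 5.0)):
    lo, hi = bounds
    pop = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=f)
    for _ in range(iters):
        new_pop = []
        for x in pop:
            r = random.random()
            if r < 0.4:
                # "rising": drift each coordinate toward the current best
                cand = [xi + random.random() * (bi - xi) for xi, bi in zip(x, best)]
            elif r < 0.8:
                # "falling": small random perturbation, like a price drop
                cand = [xi + random.gauss(0.0, 0.1) for xi in x]
            else:
                # "exchange": splice coordinates with a random other trader
                other = random.choice(pop)
                cut = random.randrange(dim)
                cand = other[:cut] + x[cut:]
            cand = [min(hi, max(lo, v)) for v in cand]  # keep within bounds
            new_pop.append(min((x, cand), key=f))       # greedy selection
        pop = new_pop
        best = min(pop + [best], key=f)
    return best

random.seed(0)
best = seto_sketch(sphere)
print(sphere(best))  # small value near 0 for the sphere function
```

The greedy selection step makes each agent's objective value monotonically non-increasing, which is a common design choice in such optimizers.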
Pub Date : 2022-01-01 Epub Date: 2021-08-24 DOI: 10.1007/s11227-021-04020-y
Shanxi Li, Qingguo Zhou, Rui Zhou, Qingquan Lv
Malware has seriously threatened the safety of computer systems for a long time. Due to the rapid development of anti-detection technology, traditional detection methods based on static and dynamic analysis have limited effect. With its better predictive performance, AI-based malware detection has increasingly been used to deal with malware in recent years. However, due to the diversity of malware, it is difficult to extract features from it, which makes AI technology hard to apply to malware detection. To solve this problem, a malware classifier based on a graph convolutional network is designed to accommodate differences in malware characteristics. Specifically, the method first extracts the API call sequence from the malware code and generates a directed cycle graph, then uses the Markov chain and principal component analysis methods to extract the feature map of the graph, designs a classifier based on a graph convolutional network, and finally analyzes and compares the performance of the method. The results show that the method performs better in most detection tasks, with a highest accuracy of 98.32%; compared with existing methods, our model is superior in terms of FPR and accuracy. It is also stable in dealing with the development and growth of malware.
{"title":"Intelligent malware detection based on graph convolutional network.","authors":"Shanxi Li, Qingguo Zhou, Rui Zhou, Qingquan Lv","doi":"10.1007/s11227-021-04020-y","DOIUrl":"https://doi.org/10.1007/s11227-021-04020-y","url":null,"abstract":"<p><p>Malware has seriously threatened the safety of computer systems for a long time. Due to the rapid development of anti-detection technology, traditional detection methods based on static and dynamic analysis have limited effect. With its better predictive performance, AI-based malware detection has increasingly been used to deal with malware in recent years. However, due to the diversity of malware, it is difficult to extract features from it, which makes AI technology hard to apply to malware detection. To solve this problem, a malware classifier based on a graph convolutional network is designed to accommodate differences in malware characteristics. Specifically, the method first extracts the API call sequence from the malware code and generates a directed cycle graph, then uses the Markov chain and principal component analysis methods to extract the feature map of the graph, designs a classifier based on a graph convolutional network, and finally analyzes and compares the performance of the method. The results show that the method performs better in most detection tasks, with a highest accuracy of <math><mrow><mn>98.32</mn> <mo>%</mo></mrow> </math>; compared with existing methods, our model is superior in terms of FPR and accuracy. It is also stable in dealing with the development and growth of malware.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":"78 3","pages":"4182-4198"},"PeriodicalIF":3.3,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s11227-021-04020-y","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39364630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
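The malware-detection abstract describes turning an API call sequence into a graph whose edge weights come from a Markov chain. A minimal sketch of that stage is below: consecutive API calls become directed edges, and each row of the count matrix is normalized into transition probabilities. The API names and the tiny sequence are illustrative, not taken from the paper.

```python
import numpy as np

# Build a Markov transition matrix from an API call sequence.
# Each consecutive pair (a, b) in the sequence is a directed edge a -> b;
# rows are normalized to probabilities (rows with no outgoing calls stay 0).
def markov_matrix(calls, vocab):
    idx = {name: i for i, name in enumerate(vocab)}
    n = len(vocab)
    counts = np.zeros((n, n))
    for a, b in zip(calls, calls[1:]):
        counts[idx[a], idx[b]] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical Windows API vocabulary and call trace for illustration.
vocab = ["CreateFile", "WriteFile", "RegSetValue", "CloseHandle"]
seq = ["CreateFile", "WriteFile", "WriteFile", "CloseHandle"]
M = markov_matrix(seq, vocab)
print(M)
```

In the described pipeline, a matrix like this (after dimensionality reduction, e.g. PCA) would serve as the per-sample feature map fed to the graph convolutional classifier.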