Pub Date: 2021-10-30 DOI: 10.1142/s2424922x21430014
Muaadh A. Alsoufi, S. Razak, M. M. Siraj, B. Al-rimy, Abdul-Rahman Al-Ali, Maged Nasser, Salah Abdo
A Review of Anomaly Intrusion Detection Systems in IoT using Deep Learning Techniques. Advances in Data Science and Adaptive Analysis, pp. 2143001:1-2143001:21. Journal Article.
Pub Date: 2021-10-30 DOI: 10.1142/s2424922x21410023
Abhisht Joshi, Pranay Saggar, Rajat Jain, Moolchand Sharma, Deepak Gupta, Ashish Khanna
CatBoost - An Ensemble Machine Learning Model for Prediction and Classification of Student Academic Performance. Advances in Data Science and Adaptive Analysis, pp. 2141002:1-2141002:28. Journal Article.
Pub Date: 2021-10-30 DOI: 10.1142/s2424922x21420018
Wenjing Wang, S. C. Sandaran, R. Sabitha, K. Thilak
Student Behavior Simulation in English Online Education Based on Reinforcement Learning. Advances in Data Science and Adaptive Analysis, pp. 2142001:1-2142001:18. Journal Article.
Pub Date: 2021-10-30 DOI: 10.1142/s2424922x2150008x
Shabina Shaikh
Development of Automated Knowledge Management Model (AKMM). Advances in Data Science and Adaptive Analysis, pp. 2150008:1-2150008:13. Journal Article.
Pub Date: 2021-10-29 DOI: 10.1142/s2424922x21500078
D. Wilkins
GovNet is one of the most critical national infrastructures (CNIs) in Liberia on which critical e-government services depend. GovNet, short for the Government of Liberia’s (GoL’s) Network, is the conduit for connectivity and the gateway to the internet for all of the Government’s more than 107 Ministries, Agencies, and Commissions (MACs), as well as its e-government programs. Most of the MACs connected to GovNet run siloed ICT environments with little or no cybersecurity mechanisms in place. This paper reports an investigation conducted at six MACs that are members of GovNet. The investigation identified several cybersecurity deficiencies at those MACs, due to the absence of vital dimensions or prerequisites of cybersecurity readiness, including infrastructure (digital infrastructure), capacity (education and skills), and governance (legal and regulatory instruments). The investigation examines previous and extant literature, draws on interviews with GovNet stakeholders, and leverages the experience of the author, who is the immediate past Managing Director of LIBTELCO. Recommendations are made for the actions needed to remedy those deficiencies in GovNet, and the study’s contribution to the body of knowledge is indicated in the Conclusion.
Investigating the Cybersecurity Aspects of the Liberian Government's Network (GovNet) as a Critical National Infrastructure. Advances in Data Science and Adaptive Analysis, pp. 2150007:1-2150007:19. Journal Article.
Pub Date: 2021-09-24 DOI: 10.1142/s2424922x21410011
Durmuş Özkan Şahin, S. Akleylek, E. Kılıç
There has been a remarkable increase in mobile device usage in recent years. The Android operating system is by far the most preferred open-source mobile operating system around the world. In addition, the Android operating system is preferred in many Internet of Things (IoT) devices, which are used in many areas of daily life. Smart cities, smart environments, health, home automation, agriculture, and livestock are some of the usage areas, and health is one of the most frequent. Since the Android operating system is both widely used and open-source, the vast majority of malware released on the market is now designed for Android platforms. Therefore, devices using the Android operating system are under serious threat. In this study, a machine learning based system that detects malware on Android operating systems is proposed. Feature vectors are created from permissions, which have an important place in the security of the Android operating system. These feature vectors are given as input to the k-nearest neighbor (KNN) algorithm, one of the machine learning techniques, which classifies software as malicious or benign. In the KNN algorithm, the k value and the distance metric used to find the closest sample directly affect the classification performance. However, few permission-based studies examine the parameters of the KNN algorithm in detail. For this reason, the performance of the malware detection system is presented comparatively using five different k values and five different distance metrics on different data sets. When the results are examined, it is observed that higher classification performance is obtained when k is set to values such as 1 or 3 and metrics such as Euclidean and Minkowski are chosen instead of the Chebyshev distance metric.
On the Effect of k Values and Distance Metrics in KNN Algorithm for Android Malware Detection. Advances in Data Science and Adaptive Analysis, pp. 2141001:1-2141001:20. Journal Article.
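The abstract above compares k values and distance metrics for KNN on permission-based feature vectors. The following is a minimal sketch of that kind of comparison, not the authors' implementation: the four permissions, the toy training set, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def minkowski(a, b, p=2):
    """Minkowski distance; p=2 gives Euclidean, p=1 gives Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def chebyshev(a, b):
    """Chebyshev distance: the largest coordinate-wise difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, metric=minkowski):
    """Majority vote among the k training samples closest to query."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: metric(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical binary permission vectors:
# (INTERNET, SEND_SMS, READ_CONTACTS, CAMERA), 1 = permission requested.
train_X = [(1, 1, 1, 0), (1, 1, 0, 0), (1, 0, 0, 1), (0, 0, 0, 1)]
train_y = ["malware", "malware", "benign", "benign"]
query = (1, 1, 1, 1)

# Compare two k values and two metrics on the same query.
for k in (1, 3):
    for name, metric in (("euclidean", minkowski), ("chebyshev", chebyshev)):
        print(k, name, knn_predict(train_X, train_y, query, k=k, metric=metric))
```

On 0/1 feature vectors the Chebyshev distance can only be 0 (identical vectors) or 1 (any difference), so it cannot rank non-identical neighbors. This is an observation about the toy setup, not a finding taken from the paper.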
Pub Date: 2021-08-18 DOI: 10.1142/s2424922x21500054
Khyati Ahlawat, A. Chug, A. Singh
An uneven distribution of classes in a dataset biases any standard classifier toward the majority class. Instances of the minority class, which is usually the class of interest, are few in number and are generally ignored, and their correct classification is often overlooked when overall accuracy is calculated. Therefore, conventional machine learning approaches have been rigorously refined to address this class imbalance problem. The challenge of imbalanced classes is more prevalent in big data scenarios because of the high data volume. This study presents a sampling solution based on cluster computing for handling class imbalance problems in big data. The newly proposed approach, the hybrid sampling algorithm (HSA), is assessed using three popular classification algorithms, namely support vector machine, decision tree, and k-nearest neighbor, based on balanced accuracy and elapsed time. The experimental results are considered promising, with an efficiency gain of 42% in comparison to the traditional sampling solution, the synthetic minority oversampling technique (SMOTE). This work demonstrates the effectiveness of the distribution and clustering principle in imbalanced big data scenarios.
A Novel Hybrid Sampling Algorithm for Solving Class Imbalance Problem in Big Data. Advances in Data Science and Adaptive Analysis, pp. 2150005:1-2150005:18. Journal Article.
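The abstract above evaluates classifiers with balanced accuracy rather than overall accuracy. A minimal sketch of why that matters follows; it does not reproduce HSA or SMOTE, and the 95/5 class split and the always-majority predictor are made-up illustrations.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(y_pred[i] == cls for i in indices)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced data: 95 majority samples (0), 5 minority samples (1).
y_true = [0] * 95 + [1] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

The always-majority predictor looks strong on plain accuracy while missing every minority sample, which balanced accuracy exposes by averaging recall over the two classes.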
Pub Date: 2021-08-04 DOI: 10.1142/s2424922x21500066
Ç. Dinçkal
The first clinical case of the novel coronavirus COVID-19 (SARS-CoV-2) emerged in the city of Wuhan, China, in December 2019. It then spread to the entire world in a very short time and rapidly became a pandemic. Within this context, many studies have attempted to predict the consequences of the pandemic in certain countries. However, these studies have relied on parameters such as reproductive number, recovery rate, and mortality rate when forecasting. This study aims to forecast COVID-19 data in Turkey using a new technique that combines classical exponential smoothing and moving average. The proposed technique does not require the reproductive number, recovery rate, or mortality rate to be computed. Simulations are carried out to forecast the number of daily cases, active cases (that is, cases with no symptoms), daily tests, recovering patients, patients in the intensive care unit, daily intubated patients, and deaths, and the results are evaluated using the Mean Absolute Percentage Error (MAPE) criterion. It is shown that this technique captured the dynamic behavior of the system in Turkey and made exact predictions using a real-time dataset.
Exact Forecasting for COVID-19 Data: Case Study for Turkey. Advances in Data Science and Adaptive Analysis, pp. 2150006:1-2150006:17. Journal Article.
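The abstract above names three ingredients: exponential smoothing, a moving average, and the MAPE criterion. The sketch below shows the textbook form of each on a made-up series. How the paper combines the first two is not stated in the abstract, so they are shown separately; the smoothing factor `alpha`, the window size, and the data are assumptions.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def moving_average(series, window):
    """Trailing moving average; the output has len(series) - window + 1 points."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent; actual values must be nonzero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Made-up daily counts, for illustration only.
daily_cases = [100, 120, 90, 130, 110, 150]

print(exponential_smoothing(daily_cases, alpha=0.5))
print(moving_average(daily_cases, window=3))
print(mape([100, 200], [110, 180]))
```

A lower MAPE means the forecast tracks the observed series more closely; the metric is undefined when an observed value is zero, which matters for series such as daily deaths early in an outbreak.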
Pub Date: 2021-04-07 DOI: 10.1142/S2424922X21500042
Myongnam Jong, Yun-Ji Paek, Hyonil Kim, Cholsok Yu
Dempster’s combination rule may produce unreasonable results when combining conflicting evidence in Dempster–Shafer evidence theory. Therefore, analyzing the degree of conflict between the bodies of evidence is essential to evaluate the applicability of Dempster’s rule. A new probability function, called a supporting probability function, is proposed to describe the correlation between bodies of evidence, and a distance derived from it is proposed to measure the distance between them. Combining this distance with the classical conflict coefficient, a new method of evaluating the applicability of the classical Dempster’s combination rule is presented. A weighted average approach to combining conflicting evidence, based on the supporting probability distance between the bodies of evidence, is also proposed. Numerical examples are given to illustrate the interest of the proposed approach.
Analyzing the degree of conflict between bodies of evidence based on a new distance in data fusion. Advances in Data Science and Adaptive Analysis, pp. 2150004:1-2150004:17. Journal Article.
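The abstract above refers to the classical Dempster's combination rule and its conflict coefficient. The sketch below implements only that classical rule, not the supporting probability function or distance proposed in the paper, and runs it on the well-known Zadeh example of highly conflicting evidence. The function name and the dictionary representation of mass functions are assumptions.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with the classical Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    Returns the fused mass function and the conflict coefficient K.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to incompatible hypotheses counts as conflict.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    # Normalize by 1 - K so the fused masses sum to one.
    fused = {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}
    return fused, conflict


# Zadeh's example: two sources that almost completely disagree.
m1 = {frozenset("a"): 0.99, frozenset("b"): 0.01}
m2 = {frozenset("c"): 0.99, frozenset("b"): 0.01}

fused, conflict = dempster_combine(m1, m2)
print("conflict coefficient:", conflict)
print("fused masses:", fused)
```

Here the conflict coefficient is close to 1, yet after normalization essentially all mass lands on hypothesis `b`, which both sources considered very unlikely. This is the kind of unreasonable result that motivates measuring conflict before applying the rule.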
Pub Date: 2021-04-01 DOI: 10.1142/S2424922X21500029
Chun-Hsiang Huang, T. Hsiao
Cardiovascular diseases are the leading cause of death globally. To diagnose heart disease, automatic recognition of the T-wave in the ECG is necessary. Empirical mode decomposition (EMD) can be used to decompose nonlinear and nonstationary signals. However, using EMD to decompose ECG signals can lead to a mode mixing problem. This study proposes modulated EEMD (mEEMD) as a solution, which can solve mode mixing problems with almost no influence from noise. Furthermore, mEEMD has less problematic boundary effects and does not cause any phase shift. The sensitivities of T-wave onset and offset recognition are [Formula: see text] and [Formula: see text], respectively.
Toward T-Wave Recognition of ECG Signals Through Modulated Ensemble Empirical Mode Decomposition. Advances in Data Science and Adaptive Analysis, pp. 2150002:1-2150002:29. Journal Article.