Text mining derives information and patterns from textual data. Online social media platforms, which have recently attracted great interest, generate vast amounts of text about human behavior through user interactions. This data is generally ambiguous and unstructured, and it contains typing and grammatical errors that introduce lexical, syntactic, and semantic uncertainty, which can lead to incorrect pattern detection and analysis. Researchers employ various text mining techniques to support topic modeling, the detection of trending topics, the identification of hate speech, and the study of community growth in online social networks. This review paper compares the performance of ten machine learning classification techniques on a Twitter dataset for analyzing users' sentiments in posts related to airline usage. The techniques reviewed and compared are Gaussian Naive Bayes, Random Forest, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, Adaptive Boosting (AdaBoost), Optimized AdaBoost, Support Vector Machine (SVM), Optimized SVM, Logistic Regression, and Long Short-Term Memory (LSTM). The experimental results show that Optimized SVM performed better than the other classifiers, with a training accuracy of 99.73% and a testing accuracy of 89.74%. Optimized SVM uses the RBF kernel function, which yields a nonlinear decision boundary, to separate the dataset into distinct polarity classes. This, together with feature engineering using Forward Trigrams and Weighted TF-IDF, improved the classifier's train and test accuracy.
Compared with Random Forest, Optimized SVM shows marginal improvements of 0.09% in train accuracy and 1.73% in test accuracy; compared with LSTM, the improvements are 1.29% (train accuracy) and 3.63% (test accuracy). Likewise, Optimized SVM achieved more than 10% higher train accuracy than Gaussian Naive Bayes, Multinomial Naive Bayes, Multinomial Naive Bayes with Bagging, and Logistic Regression, and a similar improvement was observed over the ensemble models AdaBoost and Optimized AdaBoost. Optimized SVM also outperformed all the other classification models in terms of AUC-ROC train and test scores.
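The abstract does not define "Forward Trigrams" or "Weighted TF-IDF", so the following is only a minimal sketch of the general idea, assuming plain word n-grams up to length three and a textbook TF-IDF weighting (`idf = ln(N / df) + 1`). The function names and the sample tweets are hypothetical and are not taken from the paper; the resulting weights are the kind of feature vector that a classifier such as an RBF-kernel SVM would consume.

```python
import math
from collections import Counter


def word_ngrams(text, max_n=3):
    """Return all word n-grams of length 1..max_n from a lowercased text."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, max_n=3):
    """Compute a TF-IDF weight for every n-gram in every document.

    tf  = count of the n-gram in the document / total n-grams in the document
    idf = ln(N / df) + 1, where df is the number of documents containing it
    (an assumed weighting, not necessarily the one used in the paper)
    """
    grams_per_doc = [word_ngrams(d, max_n) for d in docs]
    doc_freq = Counter()
    for grams in grams_per_doc:
        doc_freq.update(set(grams))  # count each n-gram once per document
    n_docs = len(docs)
    weights = []
    for grams in grams_per_doc:
        counts = Counter(grams)
        total = len(grams)
        weights.append({
            g: (c / total) * (math.log(n_docs / doc_freq[g]) + 1.0)
            for g, c in counts.items()
        })
    return weights


# Hypothetical tweets, invented for illustration only.
tweets = [
    "flight was delayed again",
    "great crew and smooth flight",
    "flight was cancelled without notice",
]
weights = tfidf(tweets)
```

In this toy corpus the word "flight" occurs in every tweet, so it receives a lower weight than "delayed", which occurs in only one; trigrams such as "flight was delayed" become features alongside the single words.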
Title: Text Mining – A Comparative Review of Twitter Sentiments Analysis
Authors: Sandeep Kumar, Sushma Patil, Dewang Subil, Noureen Nasar, Sujatha Arun Kokatnoor, Balachandran Krishnan
Journal: Recent Advances in Computer Science and Communications
Pub Date: 2023-07-26 DOI: 10.2174/2666255816666230726140726 (https://doi.org/10.2174/2666255816666230726140726)
Pub Date: 2023-07-14 DOI: 10.2174/2666255816666230714111503
Mohamed Ezzat, H. Hefny, Ammar Mohmmed
Wi-Fi Direct technology enables users to share services in groups and supports service discovery at the data link layer before a P2P Group is created. It can be used as a collaborative application integrated into vehicles for multimedia transfer and group configuration in V2X communication. Compared to cellular networks, Wi-Fi Direct offers a high data transmission rate at a lower cost. However, there are numerous hurdles to using Wi-Fi Direct in vehicles: its coverage area is relatively small, disconnections may occur repeatedly, and the distance between vehicles changes often while they move, all of which degrade the quality of service delivery. Previous studies disregarded the motion and direction of moving objects. The main contribution of this paper is to use Wi-Fi Direct among vehicles to reduce reliance on the 5G network, thereby addressing these challenges. In particular, the paper introduces a set of scenarios based on different speeds, directions, and distances between vehicles. The state of the packets is monitored in each scenario to compute packet delay and loss. We also extend service discovery by providing the V2V IE with a set of services that reflect the user's interests, such as Web pages, SMS, audio links, and video links, using the Generic Advertisement Service (GAS), and we compare the traditional P2P IE with the new V2V IE. Furthermore, the paper introduces a stable Wi-Fi Direct Fuzzy C-Means (FCM) clustering method based on important parameters that affect group formation, such as the location, the destination, the direction, and the speed of the vehicle, and the user's Interests List. Based on the FCM results, there is still uncertainty in choosing the appropriate time to provide the services to the vehicles.
We propose a Type-2 Fuzzy Logic Handover (T2FLH) system to handle the uncertainty in dealing with the available services. Using simulation in OMNeT++, the proposed scenarios are compared under the FCM clustering method to obtain the best clusters. The results are then compared with the T2FLH system to extract the best scenarios. We conclude from the experimental results that Wi-Fi Direct can be used with vehicles at both low and high speeds. At low speeds, it works efficiently according to the OMNeT++ results. Therefore, Wi-Fi Direct can be used in vehicle stations and work sites that use limited-speed vehicles, such as Clarks machines, to issue safety alerts and provide them with information about the devices around them, bearing in mind that the speed of devices is limited in work areas. At high speeds, the results improve significantly when the proposed T2FLH system is used to better model uncertainty and imprecision. Relying on T2FLH has led to a decrease in the rate of P
Title: Multimedia Transfer over Wi-Fi Direct based on Fuzzy Clustering for Vehicular Communications
Journal: Recent Advances in Computer Science and Communications
https://doi.org/10.2174/2666255816666230714111503
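The Fuzzy C-Means (FCM) grouping step mentioned in the abstract above can be illustrated with a minimal sketch of the standard algorithm. This is not the authors' implementation: the abstract lists location, destination, direction, speed, and an interests list as clustering parameters, whereas this example assumes only two hypothetical features per vehicle (position along a road in metres and speed in m/s), and the function name, fuzzifier `m = 2`, and sample data are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Cluster points with standard fuzzy c-means.

    Returns (centers, memberships), where memberships[j][i] is the degree
    to which point j belongs to cluster i; each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Start from a random membership matrix whose rows sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        s = sum(row)
        u.append([v / s for v in row])

    centers = []
    power = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        # Update each center as a mean weighted by membership ** m.
        centers = []
        for i in range(n_clusters):
            w = [u[j][i] ** m for j in range(len(points))]
            total = sum(w)
            centers.append([
                sum(w[j] * points[j][d] for j in range(len(points))) / total
                for d in range(dim)
            ])
        # Update memberships from distances to the new centers.
        new_u = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(dd == 0.0 for dd in dists):
                # Point coincides with a center: assign it fully there.
                zeros = [1.0 if dd == 0.0 else 0.0 for dd in dists]
                z = sum(zeros)
                new_u.append([v / z for v in zeros])
            else:
                inv = [dd ** -power for dd in dists]
                s = sum(inv)
                new_u.append([v / s for v in inv])
        # Stop once no membership changes by more than tol.
        shift = max(abs(a - b)
                    for ra, rb in zip(u, new_u)
                    for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Hypothetical vehicles: (position along the road in m, speed in m/s).
vehicles = [(0, 10), (5, 11), (8, 9), (200, 30), (210, 29), (205, 31)]
centers, memberships = fuzzy_c_means(vehicles, n_clusters=2)
# Hard label per vehicle: the cluster with the highest membership.
labels = [row.index(max(row)) for row in memberships]
```

Unlike hard clustering, each vehicle keeps a graded membership in every group, which is the kind of soft assignment that a later decision stage could take as input. In practice the features would need scaling so that no single parameter dominates the distance.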