{"title":"Sarcasm detection using optimized bi-directional long short-term memory","authors":"Vidyullatha Sukhavasi, Venkatrama Phani kumar Sistla, Venkatesulu Dondeti","doi":"10.1007/s10115-024-02210-7","DOIUrl":null,"url":null,"abstract":"<p>In the current era, the number of social network users continues to increase day by day due to the vast usage of interactive social networking sites like Twitter, Facebook, Instagram, etc. On these sites, users generate posts, whereas the attitude of followers towards factor utilization like situation, sound, feeling, and so on can be analysed. But most people feel difficult to analyse feelings accurately, which is one of the most difficult problems in natural language processing. Some people expose their opinions with different sole meanings, and this sophisticated form of expressing sentiments through irony or mockery is termed sarcasm. The sarcastic comments, tweets or feedback can mislead data mining activities and may result in inaccurate predictions. Several existing models are used for sarcasm detection, but they have resulted in inaccuracy issues, huge time consumption, less training ability, high overfitting issues, etc. To overcome these limitations, an effective model is introduced in this research to detect sarcasm. Initially, the data are collected from publicly available sarcasmania and Generic sarcasm-Not sarcasm (Gen-Sarc-Notsarc) datasets. The collected data are pre-processed using stemming and stop word removal procedures. The features are extracted using the inverse filtering (IF) model through hash index creation, keyword matching and ranking. The optimal features are selected using adaptive search and rescue (ASAR) optimization algorithm. To enhance the accuracy of sarcasm detection, an optimized Bi-LSTM-based deep learning model is proposed by integrating Bi-directional long short-term memory (Bi-LSTM) with group teaching optimization (GTO). Also, the LSTM + GTO model is proposed to compare its performance with the Bi-LSTM + GTO model. The proposed models are compared with existing classifier approaches to prove the model’s superiority using PYTHON. The accuracy of 98.24% and 98.36% are attained for sarcasmania and Gen-Sarc-Notsarc datasets.</p>","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"15 1","pages":""},"PeriodicalIF":2.5000,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge and Information Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10115-024-02210-7","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
In the current era, the number of social network users continues to increase day by day due to the vast usage of interactive social networking sites like Twitter, Facebook, Instagram, etc. On these sites, users generate posts, whereas the attitude of followers towards factor utilization like situation, sound, feeling, and so on can be analysed. But most people feel difficult to analyse feelings accurately, which is one of the most difficult problems in natural language processing. Some people expose their opinions with different sole meanings, and this sophisticated form of expressing sentiments through irony or mockery is termed sarcasm. The sarcastic comments, tweets or feedback can mislead data mining activities and may result in inaccurate predictions. Several existing models are used for sarcasm detection, but they have resulted in inaccuracy issues, huge time consumption, less training ability, high overfitting issues, etc. To overcome these limitations, an effective model is introduced in this research to detect sarcasm. Initially, the data are collected from publicly available sarcasmania and Generic sarcasm-Not sarcasm (Gen-Sarc-Notsarc) datasets. The collected data are pre-processed using stemming and stop word removal procedures. The features are extracted using the inverse filtering (IF) model through hash index creation, keyword matching and ranking. The optimal features are selected using adaptive search and rescue (ASAR) optimization algorithm. To enhance the accuracy of sarcasm detection, an optimized Bi-LSTM-based deep learning model is proposed by integrating Bi-directional long short-term memory (Bi-LSTM) with group teaching optimization (GTO). Also, the LSTM + GTO model is proposed to compare its performance with the Bi-LSTM + GTO model. The proposed models are compared with existing classifier approaches to prove the model’s superiority using PYTHON. The accuracy of 98.24% and 98.36% are attained for sarcasmania and Gen-Sarc-Notsarc datasets.
期刊介绍:
Knowledge and Information Systems (KAIS) provides an international forum for researchers and professionals to share their knowledge and report new advances on all topics related to knowledge systems and advanced information systems. This monthly peer-reviewed archival journal publishes state-of-the-art research reports on emerging topics in KAIS, reviews of important techniques in related areas, and application papers of interest to a general readership.