Reishi Amitani, Kazuyuki Matsumoto, Minoru Yoshida, K. Kita
The current study aimed to investigate social media trends and propose an analysis method to explore the factors underpinning the buzz phenomenon on Twitter. As it is not always possible to determine the cause of the buzz phenomenon from tweet text alone, we limited the analysis to tweets with attached images and devised an analysis method using both text and images. We investigated whether there is a relationship between the features of a tweet's text and those of its attached images, and how that relationship is related to the number of likes and retweets (RTs) received, that is, to indicators of popularity. We trained a multi-task neural network that takes the features extracted from the images and text as input and outputs the number of likes and RTs; we then extracted feature vectors of the same dimension for the two inputs (images and text, respectively) from the middle layer. By calculating the distance between these feature vectors, we analyzed its relationship to the number of likes and RTs. The results revealed that the average vectors of BERT and Inception-ResNet-v2 served as predictors of the number of likes and RTs. We also found that tweet text with a low number of likes and RTs was short and simple.
{"title":"Prediction of Number of Likes and Retweets based on the Features of Tweet Text and Images","authors":"Reishi Amitani, Kazuyuki Matsumoto, Minoru Yoshida, K. Kita","doi":"10.1145/3508230.3508244","DOIUrl":"https://doi.org/10.1145/3508230.3508244","url":null,"abstract":"The current study aimed to investigate social media trends and propose an analysis method to explore the factors underpinning the buzz phenomenon on Twitter. As it is not always possible to determine the cause of the buzz phenomenon from the text content alone posted on Twitter, we limited the analysis to tweets with attached images and devised an analysis method using both text and images. We investigated whether there is a relationship between the features of both tweet text and its attached images, and how the relationship between these features is related to the number of likes and retweets (RTs) received—that is, indicators of popularity. We trained a multi-task neural network that takes the features extracted from the images and text as input, and then outputs the number of likes and RTs before extracting the feature vectors of the same dimension from the two inputs (images and text, respectively) from the middle layer. By calculating the distance between these feature vectors, we analyzed the relationship between the number of likes and RTs. The results revealed that the average vectors of BERT and inceptionresnetv2 served as predictors of the number of likes and RTs. 
We also found that tweet text with a low number of likes and RTs was short and simple.","PeriodicalId":252146,"journal":{"name":"Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122439910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
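The distance step described in the abstract above can be sketched as follows. The abstract does not say which distance measure was used, so cosine and Euclidean distance are both shown here as assumptions, and the function names are hypothetical. The vectors stand in for same-dimension text and image features taken from a middle layer.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Return the straight-line distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Either distance can then be compared against the like and RT counts of each tweet, for example by grouping tweets into popularity bands and inspecting the distance distribution per band.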
To solve the problems of describing and verifying semantic properties in model-driven development, process algebra is introduced on the basis of an extended typed category theory. A unified semantic description framework is established for the description and transformation of component-based software models, as well as for the maintenance and verification of semantic properties during model transformation. Category diagrams are used to describe the semantics of the architecture model, typed morphisms express the dependency relationships between component objects, and typed functors describe the mapping mechanism before and after model transformation. Application research shows that the framework closely follows the essence and process requirements of model-driven development, and provides a new guiding framework for understanding, learning, and promoting software development research based on model transformation.
{"title":"Architecture-Based Semantic Description Framework for Model Transformation","authors":"Jinkui Hou, Cong Xu, Yuyan Zhang","doi":"10.1145/3508230.3508241","DOIUrl":"https://doi.org/10.1145/3508230.3508241","url":null,"abstract":"In order to solve the problems of description and verification of semantic properties in model driven development, process algebra is introduced on the basis of extending typed category theory. A unified semantic description framework is established for the description and transformation of component-based software models, as well as the maintenance and verification of semantic properties in the process of model transformation. Category diagram is used to describe the semantics of architecture model, and typed morphism implies the dependency relationship between component objects, and typed functor is used to describe the mapping mechanism before and after model transformation. Application research shows that the framework well follows the essence and process requirements of model-driven development, and provides a new guidance framework for understanding, cognitive learning and promotion of software development research on the basis of model transformation.","PeriodicalId":252146,"journal":{"name":"Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133473861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chenyang Zhao, Peng Zhang, Jing Liu, Juan Wang, Jiyang Zhang
Analyzing netizens' emotional tendencies after emergencies is an important means for the government to understand netizens' mentality and guide public opinion. Constructing a scientific and reasonable domain emotion dictionary is an important part of accurate emotion analysis of Internet users. Currently, there are few sentiment dictionaries in the field of college education. This article proposes an improved SO-PMI method for constructing emotional dictionaries in the field of college education. TF-IDF is used to rank the importance of emotional seed words and to adjust the domain importance of the SO-PMI extended word set; the result is merged with a basic emotional dictionary, formed by combining the Dalian Polytechnic and HowNet emotional dictionaries, to produce an emotional dictionary for the field of college education. The rules for calculating the sentiment intensity of sentences are also revised according to whether a sentence is interrogative or exclamatory. The experimental results show that this method achieves good results on an actual Weibo comment data set.
{"title":"Research on Domain Emotion Dictionary Construction Method based on Improved SO-PMI Algorithm","authors":"Chenyang Zhao, Peng Zhang, Jing Liu, Juan Wang, Jiyang Zhang","doi":"10.1145/3508230.3508233","DOIUrl":"https://doi.org/10.1145/3508230.3508233","url":null,"abstract":"The analysis of netizens' emotional tendency after emergencies is an important means for the government to understand netizens' mentality and guide public opinion. Constructing a scientific and reasonable domain emotion dictionary is an important part of accurate emotion analysis of Internet users. Currently, there are few sentiment dictionaries in the field of college education. This article proposes an improved SO-PMI method for constructing emotional dictionaries in the field of college education. Use TF-IDF to sort the importance of emotional seed words, modify the field importance of the SO-PMI extended word set, and a basic emotional dictionary formed by combining Dalian Polytechnic and HowNet emotional dictionary, and finally formed an emotional dictionary in the field of college education. According to the judgment of interrogative sentences and exclamation sentences, the calculation rules of sentiment intensity of sentences are revised. 
The experimental results show that this method has achieved good results on the actual Weibo comment data set.","PeriodicalId":252146,"journal":{"name":"Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114171701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
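As a point of reference for the abstract above, the following is a minimal sketch of the standard SO-PMI score (the sum of PMI with positive seed words minus the sum of PMI with negative seed words), computed from document-level co-occurrence counts. It is the baseline formulation only, not the authors' improved variant: the TF-IDF ranking of seed words and the domain-importance adjustment are not shown. The function names and the choice to score unseen pairs as 0 are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(docs):
    """Count, per document, which words and which word pairs occur."""
    word_df = Counter()
    pair_df = Counter()
    for tokens in docs:
        vocab = set(tokens)
        word_df.update(vocab)
        pair_df.update(frozenset(p) for p in combinations(sorted(vocab), 2))
    return word_df, pair_df, len(docs)


def pmi(w1, w2, word_df, pair_df, n_docs):
    """Pointwise mutual information (base 2) from document frequencies."""
    joint = pair_df[frozenset((w1, w2))]
    if joint == 0 or word_df[w1] == 0 or word_df[w2] == 0:
        return 0.0  # assumption: unseen pairs count as "no association"
    return math.log2(joint * n_docs / (word_df[w1] * word_df[w2]))


def so_pmi(word, pos_seeds, neg_seeds, word_df, pair_df, n_docs):
    """Positive score suggests positive polarity, negative the opposite."""
    pos = sum(pmi(word, s, word_df, pair_df, n_docs) for s in pos_seeds)
    neg = sum(pmi(word, s, word_df, pair_df, n_docs) for s in neg_seeds)
    return pos - neg
```

A candidate word would typically be added to the positive or negative part of the dictionary depending on the sign of its score, often with a threshold around zero to filter out weak evidence.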
There is great scope for utilizing the increasing volume of content that users express on social media platforms such as Twitter. This study explores the application of Aspect-based Sentiment Analysis (ABSA) to tweets in order to retrieve fine-grained sentiment insights. The plant-based food domain is chosen as the area of focus. To the best of our knowledge, this is the first time an ABSA task has been carried out for this sector, which is distinct from standard food products because different and controversial aspects arise and opinions are polarized. The choice is relevant because these products can help in meeting the sustainable development goals and improve the welfare of millions of animals. Pre-trained BERT ("Bidirectional Encoder Representations from Transformers") is fine-tuned for this task and stands out because it was trained to learn from all the words in a sentence simultaneously using transformers. The aim was to develop methods that can be applied to real-life cases; lowering the dependency on labeled data and improving performance were therefore the key objectives. This research contributes to existing approaches to ABSA by proposing data processing techniques that adapt social media data for ABSA. The project presents a new method for the aspect category detection task (ACD) that does not rely on labeled data, using regular expressions (Regex) instead. For the aspect sentiment classification task (ASC), a semi-supervised learning technique is explored. Additionally, Part-of-Speech (POS) tags are incorporated into the predictions. The findings show that Regex is a solution that eliminates the dependency on labeled data for ACD. For ASC, fine-tuning BERT on a small subset of data was the most accurate method for lowering the dependency on aspect-level sentiment data.
{"title":"Aspect-Based Sentiment Analysis of Social Media Data With Pre-Trained Language Models","authors":"Anina Troya, Reshmi Gopalakrishna Pillai, Cristian Rodriguez Rivero, Zülküf Genç, S. Kayal, Dogu Araci","doi":"10.1145/3508230.3508232","DOIUrl":"https://doi.org/10.1145/3508230.3508232","url":null,"abstract":"There is a great scope in utilizing the increasing content expressed by users on social media platforms such as Twitter. This study explores the application of Aspect-based Sentiment Analysis (ABSA) of tweets to retrieve fine-grained sentiment insights. The Plant-based food domain is chosen as an area of focus. To the best of our knowledge this is the first time ABSA task is done for this sector and it is distinct from standard food products because different and controversial aspects arise and opinions are polarized. The choice is relevant because these products can help in meeting the sustainable development goals and improve the welfare of millions of animals. Pre-trained BERT,”Bidirectional Encoder Representations with transformers”, is fine-tuned for this task and stands out because it was trained to learn from all the words in the sentence simultaneously using transformers. The aim was to develop methods to be applied on real life cases, therefore lowering the dependency on labeled data and improving performance were the key objectives. This research contributes to existing approaches of ABSA by proposing data processing techniques to adapt social media data for ABSA. The scope of this project presents a new method for the aspect category detection task (ACD) which does not rely on labeled data by using regular expressions (Regex). For aspect the sentiment classification task (ASC) a semi-supervised learning technique is explored. Additionally Part-of-Speech (POS) tags are incorporated into the predictions. The findings show that Regex is a solution to eliminate the dependency on labeled data for ACD. 
For ASC fine-tuning BERT on a small subset of data was the most accurate method to lower the dependency on aspect level sentiment data.","PeriodicalId":252146,"journal":{"name":"Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125842278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
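The regex-based aspect category detection mentioned in the abstract above could look roughly like the sketch below. The aspect categories and patterns are hypothetical examples for the plant-based food domain; the abstract does not list the actual categories or expressions used.

```python
import re

# Hypothetical aspect categories and patterns, chosen for illustration only.
ASPECT_PATTERNS = {
    "taste": re.compile(r"\b(tast\w*|flavou?r\w*|delicious)\b", re.IGNORECASE),
    "price": re.compile(r"\b(pric\w*|expensive|cheap\w*|cost\w*)\b", re.IGNORECASE),
    "health": re.compile(r"\b(health\w*|protein|nutrit\w*)\b", re.IGNORECASE),
}


def detect_aspects(tweet):
    """Return the aspect categories whose pattern matches the tweet text."""
    return [name for name, pattern in ASPECT_PATTERNS.items()
            if pattern.search(tweet)]
```

Because the categories are assigned by pattern matching alone, no labeled training data is needed for this step; the trade-off is that coverage depends entirely on how carefully the patterns are written.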
Digital rumors have become an important issue because of the ease of, and innovations in, social networking technologies. These rumors become critical during a disaster, epidemic, or pandemic. Considering the classification power of conventional and deep learning techniques, we propose a hybrid learning technique that identifies rumors effectively. TF-IDF features are used to build a stack of multiple conventional learning techniques: logistic regression, Naïve Bayes, and random forest. Word-embedding features are used for the deep learning models: LSTM and LSTM-RNN. The combination of LSTM and RNN makes this study unique in the field of rumor detection. With the gated LSTM and RNN architectures, large series of rumor tweets can be handled efficiently. To aggregate the decisions, the labels produced by the deep learning models and by the stack of conventional learners are combined using majority-voting-based ensemble classification. To evaluate the performance of the proposed technique, we used the publicly available standard COVID-19 RUMOR dataset. The proposed technique obtains 99.02% accuracy, which shows its effectiveness. The dataset used and the ensemble model created for rumor identification distinguish our work from existing methods.
{"title":"Pandemic rumor identification on social networking sites: A case study of COVID-19","authors":"Mohsan Ali, Iqbal Murtza, A. Ejaz","doi":"10.1145/3508230.3508246","DOIUrl":"https://doi.org/10.1145/3508230.3508246","url":null,"abstract":"Digital Rumors, because of the ease and innovations in social networking technologies, has become an important issue. These rumors become a critical issue in a disaster, epidemic, or pandemic. Considering classification power of conventional and deep learning techniques, we propose a hybrid learning technique that identifies rumors effectively. For this, TF-IDF description has been used to build a stack of multiple conventional learning techniques; logistic regression, Naïve Bayes, and random forest. Whereas, word-embedding features have been used for purpose of deep learning; LSTM and LSTM-RNN. The combination of LSTM and RNN makes this study unique in the field of rumor detection. With LSTM and RNN gated architectures, huge series rumor tweets may be efficiently managed. To aggregate the decisions, the labels of deep learning and the stack of conventional learning have been combined using majority voting based ensemble classification. To evaluate the performance of the proposed technique, we used publically available standard COVID-19 RUMOR dataset. The proposed technique obtains 99.02% accuracy, which shows its effectiveness. 
The dataset utilized and the ensemble model created for rumor identification distinguish our work from existing methods.","PeriodicalId":252146,"journal":{"name":"Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114068780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
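The majority-voting aggregation described in the abstract above can be illustrated with a short sketch. It assumes that each model has already produced one label per tweet; the function names are hypothetical, and the classifiers themselves (the conventional stack, LSTM, and LSTM-RNN) are not shown.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(per_model_predictions):
    """Combine per-model label lists into one label per sample.

    per_model_predictions holds one list of labels per model,
    all of the same length (one entry per sample).
    """
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]
```

With an odd number of voters and binary labels (rumor or not), ties cannot occur, which is one reason three-member ensembles are a common choice.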
{"title":"Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval","authors":"","doi":"10.1145/3508230","DOIUrl":"https://doi.org/10.1145/3508230","url":null,"abstract":"","PeriodicalId":252146,"journal":{"name":"Proceedings of the 2021 5th International Conference on Natural Language Processing and Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121001970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}