Text Classification of Climate Change Tweets using Artificial Neural Networks, FastText Word Embeddings, and Latent Dirichlet Allocation
John Daves S. Baguio, Billy A. Lu, Christine F. Peña
2023 International Conference in Advances in Power, Signal, and Information Technology (APSIT), published 2023-06-09
DOI: 10.1109/APSIT58554.2023.10201782 (https://doi.org/10.1109/APSIT58554.2023.10201782)
Citations: 0
Abstract
Climate change discourse moves quickly on microblogging sites such as Twitter, where stances are divided: some users hold that climate change is man-made, while others deny its existence. This study classifies climate change tweets in a given labeled dataset using a text classification model that combines Artificial Neural Networks, FastText word embeddings, and Latent Dirichlet Allocation. The model also applies domain-specific preprocessing for climate change tweets and adds features by inserting the majority topic of each tweet between its words. The model improved the F1 scores of the two undersampled classes by 1% and 6%, respectively, while maintaining a good F1 score for the majority class. Overall, it increased the macro and weighted averages by 3% and 1%, respectively.
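
The topic-based feature described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the "majority topic" is the topic with the highest probability in a tweet's LDA topic distribution, and that "between each word" means one marker between every pair of adjacent tokens. The function names, the `topic_<id>` marker format, and the sample distribution are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def majority_topic(topic_probs: Sequence[Tuple[int, float]]) -> int:
    """Return the id of the topic with the highest probability.

    topic_probs is assumed to be a list of (topic_id, probability) pairs,
    such as a topic model might produce for a single tweet.
    """
    if not topic_probs:
        raise ValueError("topic_probs must not be empty")
    return max(topic_probs, key=lambda pair: pair[1])[0]


def interleave_topic(tokens: Sequence[str], topic_id: int) -> List[str]:
    """Insert a topic marker between every pair of adjacent tokens."""
    marker = f"topic_{topic_id}"  # hypothetical marker format
    augmented: List[str] = []
    for index, token in enumerate(tokens):
        if index > 0:
            augmented.append(marker)  # separator goes before every token but the first
        augmented.append(token)
    return augmented


# Made-up example data, not taken from the paper's dataset.
tweet_tokens = ["climate", "change", "is", "real"]
lda_output = [(0, 0.15), (1, 0.70), (2, 0.15)]

augmented_tokens = interleave_topic(tweet_tokens, majority_topic(lda_output))
print(" ".join(augmented_tokens))
# prints: climate topic_1 change topic_1 is topic_1 real
```

Under these assumptions, the augmented token sequence would be passed to the embedding step in place of the original tweet, so that every tweet carries a repeated signal of its dominant topic.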