Fake News Classification Based on Subjective Language
Caio Libânio Melo Jerônimo, L. Marinho, C. E. Campelo, Adriano Veloso, A. S. C. Melo
Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services, 2019-12-02
DOI: 10.1145/3366030.3366039 (https://doi.org/10.1145/3366030.3366039)
Citation count: 17
Abstract
While many works investigate the spread patterns of fake news in social networks, we focus on the textual content. Instead of relying on syntactic representations of documents (i.e., Bag of Words), as many works do, we seek more robust representations that may better differentiate fake from legitimate news. We propose to consider the subjectivity of news, under the assumption that the subjectivity levels of legitimate and fake news differ significantly. To compute the subjectivity level of a news article, we rely on a set of subjectivity lexicons built by Brazilian linguists. We then build a subjectivity feature vector for each news article by calculating the Word Mover's Distance (WMD) between the article and these lexicons, measured in the embedding space in which the article's words lie, and use these vectors to classify the documents. The results demonstrate that our method is more robust than classical text classification approaches, especially in scenarios where the training and test domains differ.
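
The feature construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes one distance per lexicon, it substitutes the relaxed Word Mover's Distance (a cheaper lower bound on WMD) for exact WMD so that no optimal-transport solver is needed, and the 2-D embeddings, lexicon names, and lexicon entries are all invented for the example.

```python
import math
from collections import Counter

# Toy 2-D word embeddings, invented for illustration only.
# A real setup would load pretrained embeddings instead.
EMBEDDINGS = {
    "government": (1.0, 0.0),
    "report": (0.9, 0.1),
    "announced": (0.8, 0.2),
    "shocking": (0.1, 0.9),
    "unbelievable": (0.0, 1.0),
    "think": (0.2, 0.8),
    "believe": (0.1, 0.8),
    "terrible": (0.0, 0.9),
    "awful": (0.1, 1.0),
}

# Hypothetical subjectivity lexicons (names and entries are assumptions).
LEXICONS = {
    "opinion": ["think", "believe"],
    "sentiment": ["terrible", "awful"],
}


def nbow(tokens):
    """Normalized bag-of-words weights over tokens that have an embedding."""
    counts = Counter(t for t in tokens if t in EMBEDDINGS)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(source, target):
    """Move each source word's mass entirely to its nearest target word."""
    return sum(
        weight * min(math.dist(EMBEDDINGS[w], EMBEDDINGS[v]) for v in target)
        for w, weight in source.items()
    )


def relaxed_wmd(tokens_a, tokens_b):
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    a, b = nbow(tokens_a), nbow(tokens_b)
    if not a or not b:
        raise ValueError("both texts need at least one word with an embedding")
    return max(one_sided_cost(a, b), one_sided_cost(b, a))


def subjectivity_features(tokens):
    """One distance per lexicon, in sorted lexicon-name order."""
    return [relaxed_wmd(tokens, LEXICONS[name]) for name in sorted(LEXICONS)]


factual = ["government", "report", "announced"]
opinionated = ["shocking", "unbelievable"]
print(subjectivity_features(factual))
print(subjectivity_features(opinionated))
```

With these toy vectors, the opinionated text lies closer to both lexicons than the factual one, so its feature values are smaller. The resulting vectors would then be passed to whatever classifier is used downstream; the abstract does not say which.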