Authors: Sujatha Arun Kokatnoor, Balachandran Krishnan
Journal: Journal of Advanced Research in Dynamical and Control Systems
Publication date: 2020-05-30
DOI: https://doi.org/10.5373/jardcs/v12sp5/20201788
Citations: 1
Identification of Misconceptions about Corona Outbreak Using Trigrams and Weighted TF-IDF Model
Misconceptions about issues such as health, diseases, politics, government policies, epidemics and pandemics have been a social problem for a number of years, particularly since the advent of social media, and they often spread faster than the truth. Social media platforms such as Twitter, now among the most prominent news outlets, remain a major source of information today, particularly of information distributed across the network. In this paper, the efficacy of a Misconception Detection System was tested on a Corona Pandemic Dataset extracted from Twitter posts. A trigram and weighted TF-IDF model followed by a supervised classifier was used to categorize the dataset into two classes: one containing misconceptions about the COVID-19 virus and the other comprising correct and authenticated information. Trigrams were more reliable because the functional words related to coronavirus appeared more frequently in the corpus created. The proposed system, combining trigrams with weighted TF-IDF, produced relevant, normalized scores that led to an efficient construction of the vector space model, and it yielded good performance results compared with traditional approaches based on Bag of Words and Count Vectorizer techniques, in which the vector space model is created from word counts alone. © 2020, Institute of Advanced Scientific Research, Inc. All rights reserved.
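The pipeline described in the abstract (word trigrams, TF-IDF weighting, then a supervised classifier) can be illustrated with a minimal sketch. The abstract does not specify the paper's weighting scheme or its classifier, so everything here is an assumption: the sketch uses a standard smoothed IDF with L2 normalization in place of the "weighted TF-IDF", and a nearest-centroid rule with cosine similarity as a stand-in classifier. All function names are hypothetical, and the toy sentences are invented for illustration rather than taken from the dataset.

```python
import math
from collections import Counter


def word_trigrams(text):
    """Lowercase, split on whitespace, and return word-level trigrams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def fit_idf(docs):
    """Smoothed inverse document frequency for every trigram in the corpus."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(word_trigrams(doc)))
    n_docs = len(docs)
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """TF-IDF weights over known trigrams, scaled to unit L2 length."""
    counts = Counter(t for t in word_trigrams(text) if t in idf)
    weights = {t: tf * idf[t] for t, tf in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
    return {t: w / norm for t, w in weights.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit_centroids(docs, labels, idf):
    """Average the TF-IDF vectors of each class (assumed stand-in classifier)."""
    sums, sizes = {}, Counter(labels)
    for doc, label in zip(docs, labels):
        bucket = sums.setdefault(label, Counter())
        for t, w in tfidf_vector(doc, idf).items():
            bucket[t] += w
    return {lab: {t: w / sizes[lab] for t, w in vec.items()} for lab, vec in sums.items()}


def classify(text, centroids, idf):
    """Return the label whose centroid is most similar to the text."""
    vec = tfidf_vector(text, idf)
    return max(centroids, key=lambda lab: cosine(vec, centroids[lab]))


# Invented toy sentences, for illustration only (not from the paper's dataset).
train_docs = [
    "drinking hot water cures the virus",
    "hot water cures the virus quickly",
    "wash your hands with soap often",
    "wash your hands with soap and water",
]
train_labels = ["misconception", "misconception", "correct", "correct"]

idf = fit_idf(train_docs)
centroids = fit_centroids(train_docs, train_labels, idf)
print(classify("they say hot water cures the virus", centroids, idf))
```

Unlike a count-only vector space, the IDF factor down-weights trigrams shared by many documents, and the L2 step keeps long and short posts on the same scale, which is one plausible reading of the "normalized score" the abstract mentions.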