TLV-Bandit: Bandit Method for Collecting Topic-related Local Tweets
Carina Miwa Yoshimura, H. Kitagawa
2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR)
Published: 2021-09-01
DOI: https://doi.org/10.1109/MIPR51284.2021.00016
Citations: 0
Abstract
Twitter hosts a large and diverse body of information, forming a corpus valuable to a wide range of institutions, from marketing firms to governments. Collecting tweets enables analyses such as public opinion surveys, marketing analysis, and targeted analysis of users who live in a specific area. Such tasks require the ability to capture tweets that relate to a specific topic and were sent from a specific area. However, doing this on a very large data source such as the Twitter stream using only the Twitter API is challenging because of API usage restrictions and the lack of geotags. In this work, we propose "TLV-Bandit", a method based on the bandit algorithm that collects topic-related tweets sent from a specific area, and we analyze its performance. The experimental results show that the proposed method collects the target tweets more efficiently than other methods with respect to three collection requirements: Locality (sent from the target area), Similarity (topic-related), and Volume (number of tweets).
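The abstract does not say how TLV-Bandit defines its arms, its reward, or its selection policy, so the following is only a generic epsilon-greedy sketch of how a bandit could steer tweet collection, not the authors' method. Everything in it is an assumption made for illustration: the arms are hypothetical tweet sources, the data is simulated rather than fetched from the Twitter API, and the reward is the number of tweets in a pulled batch that are both local and on topic, which is one simple way to fold Locality, Similarity, and Volume into a single number.

```python
import random


def epsilon_greedy_collect(arms, pull, rounds, epsilon=0.1, seed=0):
    """Generic epsilon-greedy loop (an assumed policy, not the paper's).

    Returns (mean reward per arm, number of pulls per arm).
    """
    rng = random.Random(seed)
    counts = {arm: 0 for arm in arms}
    means = {arm: 0.0 for arm in arms}
    for _ in range(rounds):
        untried = [arm for arm in arms if counts[arm] == 0]
        if untried:
            arm = untried[0]                      # try every arm once first
        elif rng.random() < epsilon:
            arm = rng.choice(arms)                # explore a random arm
        else:
            arm = max(arms, key=lambda a: means[a])  # exploit the best estimate
        reward = pull(arm, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return means, counts


# Hypothetical simulated sources (all values invented for illustration):
# (probability a tweet is local, probability it is on topic, tweets per pull)
SOURCES = {
    "source_a": (0.9, 0.8, 10),
    "source_b": (0.2, 0.9, 10),
    "source_c": (0.9, 0.1, 10),
}


def simulated_pull(arm, rng):
    """Assumed reward: count of tweets in the batch that are local AND on topic."""
    p_local, p_topic, volume = SOURCES[arm]
    return sum(
        1
        for _ in range(volume)
        if rng.random() < p_local and rng.random() < p_topic
    )


if __name__ == "__main__":
    means, counts = epsilon_greedy_collect(list(SOURCES), simulated_pull, rounds=500)
    for arm in SOURCES:
        print(arm, counts[arm], round(means[arm], 2))
```

In this simulation the loop should spend most of its pulls on the source whose batches contain the most local, on-topic tweets, while still sampling the others occasionally. A real collector would replace `simulated_pull` with API calls plus locality and topic estimators, which the abstract does not describe.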