Content-based clustering and visualization of social media text messages
S. A. Barnard, S. M. Chung, Vincent A. Schmidt
2017 International Conference on Data and Software Engineering (ICoDSE), published 2017-11-01
DOI: 10.1109/ICODSE.2017.8285856 (https://doi.org/10.1109/ICODSE.2017.8285856)
Citations: 3
Abstract
Although Twitter has existed for more than ten years, crisis management agencies and first responders still cannot make full use of the information this type of data provides during a crisis or natural disaster. This paper presents a tool that automatically clusters geotagged text data by content, rather than by time and location alone, and displays the clusters and their locations on a map. It provides at-a-glance information throughout the evolution of a crisis. For accurate clustering, we used the silhouette coefficient to determine the number of clusters automatically. To visualize the topics (i.e., frequent words) within each cluster, we used word clouds. Our experiments demonstrated that the tool scales well. First responders and official management personnel could easily use it to quickly determine when a crisis is occurring, where it is concentrated, and which resources to deploy to stabilize the situation.
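The idea of picking the number of clusters with the silhouette coefficient, then listing each cluster's frequent words as word-cloud input, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the clustering algorithm or the text features, so k-means, unit-length bag-of-words vectors, farthest-first initialization, the sample messages, and all function names here are assumptions chosen to keep the example small and deterministic.

```python
import math
from collections import Counter

# Hypothetical sample messages (not from the paper): three topics, two each.
MESSAGES = [
    "flood water rising downtown",
    "flood water street closed",
    "fire smoke near hills",
    "fire smoke evacuation hills",
    "power outage whole block",
    "power outage since morning",
]

def vectorize(messages):
    """Unit-length bag-of-words vectors (assumes no message is empty)."""
    tokens = [m.lower().split() for m in messages]
    vocab = sorted({w for doc in tokens for w in doc})
    vectors = []
    for doc in tokens:
        counts = Counter(doc)
        norm = math.sqrt(sum(c * c for c in counts.values()))
        vectors.append([counts[w] / norm for w in vocab])
    return vectors

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k, iterations=20):
    """Plain k-means with deterministic farthest-first initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(distance(v, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: distance(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(vectors, labels):
    """Mean of s(i) = (b - a) / max(a, b); returns -1.0 for a single cluster."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the point's own cluster
        a = sum(distance(vectors[i], vectors[j]) for j in own) / len(own)
        # b: mean distance to the nearest other cluster
        b = min(sum(distance(vectors[i], vectors[j]) for j in members) / len(members)
                for other, members in clusters.items() if other != label)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def choose_k(vectors, candidates):
    """Return (k, labels) for the candidate k with the highest mean silhouette."""
    results = [(k, kmeans(vectors, k)) for k in candidates]
    return max(results, key=lambda r: silhouette(vectors, r[1]))

def top_words(messages, labels, cluster, n=3):
    """Most frequent words in one cluster, usable as word-cloud input."""
    counts = Counter(w for m, label in zip(messages, labels)
                     if label == cluster for w in m.lower().split())
    return [w for w, _ in counts.most_common(n)]

if __name__ == "__main__":
    vectors = vectorize(MESSAGES)
    k, labels = choose_k(vectors, range(2, 5))
    print("chosen k:", k)
    for cluster in sorted(set(labels)):
        print(cluster, top_words(MESSAGES, labels, cluster))
```

On real geotagged messages one would likely replace the whitespace tokenizer with proper text cleaning and stop-word removal, and carry each message's coordinates alongside its cluster label so the clusters can be drawn on a map.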