Chia-Hsuan Chang, Michal Monselise, Christopher C Yang
What Are People Concerned About During the Pandemic? Detecting Evolving Topics about COVID-19 from Twitter
Journal of Healthcare Informatics Research, vol. 5, no. 1, pp. 70-97, 2021 (Epub 2021-01-17). DOI: https://doi.org/10.1007/s41666-020-00083-3
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7811869/pdf/
Citations: 0
Abstract
With the novel coronavirus (COVID-19) pandemic affecting the lives of citizens in over 200 countries, policy makers and clinicians need to understand public sentiment and track the spread of the disease. Social media is one valuable source of insight into public sentiment. This study aims to extract that insight by producing a weekly list of the most discussed COVID-19 topics on Twitter and monitoring how those topics evolve from week to week. We propose two topic mining methods that can handle a large-scale dataset, rolling online non-negative matrix factorization (Rolling-ONMF) and sliding online non-negative matrix factorization (Sliding-ONMF), and compare the insights produced by both techniques. Each algorithm produces 425 topics over the course of 17 weeks. However, topics that have not evolved from one week to the next beyond a certain evolution threshold are consolidated into a single topic. Because the topics produced by the Rolling-ONMF algorithm each week depend on the topics from the previous week, we find that the Sliding-ONMF algorithm produces more varied topics each week; however, the topics produced by the Rolling-ONMF algorithm contain keywords that appear more consistent with each other on manual review. We also observe that the Sliding-ONMF algorithm captures events with shorter time frames rather than ones that last for many months, while the Rolling-ONMF algorithm detects more general themes because its higher average evolution score leads to more topic consolidation. We also conducted a qualitative analysis and grouped the detected topics into themes. Several important themes are identified: government policy, economic crisis, COVID-19-related updates, COVID-19-related events, prevention, vaccines and treatments, and COVID-19 testing. These themes reflect the pandemic-related concerns expressed on social media.
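The consolidation step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the evolution score, so the code below assumes it is the cosine similarity between a topic's term-weight vector in one week and the same topic slot in the next week, and that a score at or above the threshold means the topic has not evolved and is merged. The function names (`cosine`, `consolidate`), the slot alignment across weeks, and the threshold value are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two term-weight vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def consolidate(weekly_topics, threshold):
    """Merge consecutive weekly topics that have not evolved.

    weekly_topics[t][k] is the term-weight vector of topic slot k in week t.
    ASSUMPTION: slot k in week t corresponds to slot k in week t + 1.
    ASSUMPTION: the evolution score is cosine similarity, and a score at or
    above `threshold` means the topic is unchanged and gets consolidated.

    Returns a list of (slot, first_week, last_week) tuples.
    """
    n_weeks = len(weekly_topics)
    n_slots = len(weekly_topics[0])
    merged = []
    for k in range(n_slots):
        start = 0
        for t in range(1, n_weeks):
            score = cosine(weekly_topics[t - 1][k], weekly_topics[t][k])
            if score < threshold:
                # The topic changed enough: close the current run, start a new one.
                merged.append((k, start, t - 1))
                start = t
        merged.append((k, start, n_weeks - 1))
    return merged


# Toy example: 3 weeks, 2 topic slots, 3-term vocabulary.
weeks = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 0.1, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
]
result = consolidate(weeks, threshold=0.8)
print(result)
```

In this toy input, slot 0 stays stable for the first two weeks and then changes, while slot 1 never changes, so six weekly topics collapse into three consolidated ones. At the scale reported in the abstract, 425 topics over 17 weeks corresponds to 25 topic slots per week before consolidation.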
About the journal:
Journal of Healthcare Informatics Research serves as a publication venue for innovative technical contributions highlighting analytics, systems, and human factors research in healthcare informatics. The journal is concerned with the application of computer science principles, information science principles, information technology, and communication technology to address problems in healthcare and everyday wellness, and it highlights cutting-edge technical contributions in computing-oriented healthcare informatics. It covers three major tracks: (1) analytics, which focuses on data analytics, knowledge discovery, and predictive modeling; (2) systems, which focuses on building healthcare informatics systems (e.g., architecture, framework, design, engineering, and application); and (3) human factors, which focuses on understanding users or context, interface design, health behavior, and user studies of healthcare informatics applications.
Topics include but are not limited to:
· healthcare software architecture, framework, design, and engineering
· electronic health records
· medical data mining
· predictive modeling
· medical information retrieval
· medical natural language processing
· healthcare information systems
· smart health and connected health
· social media analytics
· mobile healthcare
· medical signal processing
· human factors in healthcare
· usability studies in healthcare
· user-interface design for medical devices and healthcare software
· health service delivery
· health games
· security and privacy in healthcare
· medical recommender system
· healthcare workflow management
· disease profiling and personalized treatment
· visualization of medical data
· intelligent medical devices and sensors
· RFID solutions for healthcare
· healthcare decision analytics and support systems
· epidemiological surveillance systems and intervention modeling
· consumer and clinician health information needs, seeking, sharing, and use
· semantic Web, linked data, and ontology
· collaboration technologies for healthcare
· assistive and adaptive ubiquitous computing technologies
· statistics and quality of medical data
· healthcare delivery in developing countries
· health systems modeling and simulation
· computer-aided diagnosis