Twitter as research data: Tools, costs, skill sets, and lessons learned
Authors: Kaiping Chen, Zening Duan, Sijia Yang
Journal: Politics and the Life Sciences, 41(1), pages 114-130 (Journal Article)
Publication date: 2023-03-01
DOI: 10.1017/pls.2021.19 (https://doi.org/10.1017/pls.2021.19)
Citations: 12
Abstract
Scholars increasingly use Twitter data to study the life sciences and politics. However, Twitter data collection tools often pose challenges for scholars who are unfamiliar with their operation. Equally important, although many tools indicate that they offer representative samples of the full Twitter archive, little is known about whether the samples are indeed representative of the targeted population of tweets. This article evaluates such tools in terms of costs, training, and data quality as a means to introduce Twitter data as a research tool. Further, using an analysis of COVID-19 and moral foundations theory as an example, we compared the distributions of moral discussions from two commonly used tools for accessing Twitter data (Twitter's standard APIs and third-party access) to the ground truth, the Twitter full archive. Our results highlight the importance of assessing the comparability of data sources to improve confidence in findings based on Twitter data. We also review the major new features of Twitter's API version 2.
About the journal:
POLITICS AND THE LIFE SCIENCES is an interdisciplinary peer-reviewed journal with a global audience. PLS is owned and published by the ASSOCIATION FOR POLITICS AND THE LIFE SCIENCES, the APLS, which is both an American Political Science Association (APSA) Related Group and an American Institute of Biological Sciences (AIBS) Member Society. The PLS topic range is exceptionally broad: evolutionary and laboratory insights into political behavior, including political violence, from group conflict to war, terrorism, and torture; political analysis of life-sciences research, health policy, environmental policy, and biosecurity policy; and philosophical analysis of life-sciences problems, such as bioethical controversies.