Translating surveys to surveillance on social media: methodological challenges & solutions

Chao Yang, P. Srinivasan
Proceedings of the ... ACM Web Science Conference, volume 32 1, pages 4-12. Published 2014-06-23.
DOI: 10.1145/2615569.2615696 (https://doi.org/10.1145/2615569.2615696)
Citations: 8

Abstract

Passive surveillance of preferences, opinions and behaviors on social media is becoming increasingly common. The general goal is to make inferences from observations collected from the numerous posts publicly available in blogs, microblogs, and other social forums. A traditional approach to collecting observations is to query a random (or convenience) sample of individuals with surveys. A wide variety of well-respected survey instruments have been developed over many decades, especially in the social sciences. The question addressed here is: how does one "translate" a survey of interest into surveillance strategies on social media? Specifically, how does one find the posts that could be interpreted as valid responses to the survey? Developing a general methodology for translating a survey into social media surveillance might further the inclusion of social media research in traditional social science research. We propose a translation methodology using a well-reputed survey (the Satisfaction with Life Scale) as an example. A second methodological contribution, which goes beyond the survey translation focus, is a crowdsourcing approach that, we claim with reasonable confidence, finds close to all the relevant items in a dataset. This differs from the standard approach of asking workers to annotate all items in a small dataset. Our method supports more accurate evaluations (i.e., more precise recall calculations) as well as the development of larger training datasets. Finally, the resulting surveillance method derived from the life satisfaction survey achieves recall, precision and F scores between 0.59 and 0.65. This is considerably better than standard methods using lexicons (precision around 0.16) or classifiers (precision, recall and F scores between 0.32 and 0.38).
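To make the evaluation measures named in the abstract concrete, here is a minimal sketch of how precision, recall and an F score can be computed once the set of relevant posts is known. It is not the authors' code. The function name, the post IDs and the choice of the balanced F1 variant are all assumptions made for illustration; the abstract says only "F scores".

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for a set of retrieved items.

    retrieved: items the surveillance method flagged as survey responses
    relevant:  items judged truly relevant (e.g. by crowd workers)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)

    # Precision: share of flagged posts that are actually relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant posts that were flagged. This is only
    # trustworthy if `relevant` is close to complete.
    recall = true_positives / len(relevant) if relevant else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F1 (assumed variant): harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical post IDs, purely illustrative.
relevant = {1, 2, 3, 4, 5}
retrieved = {2, 3, 4, 6}
p, r, f = precision_recall_f1(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

The recall denominator is the full set of relevant items, which is why an annotation procedure that finds close to all relevant items matters: if relevant posts are missing from the reference set, recall is overestimated.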