Quantifying Self-Reported Adverse Drug Events on Twitter: Signal and Topic Analysis

Vassilis Plachouras, Jochen L. Leidner, Andrew G. Garrow
Published in: Proceedings of the 7th 2016 International Conference on Social Media & Society
Publication date: 2016-07-11
DOI: 10.1145/2930971.2930977 (https://doi.org/10.1145/2930971.2930977)
Citations: 22

Abstract

When a drug that is sold exhibits side effects, a well-functioning ecosystem of pharmaceutical drug suppliers includes responsive regulators and pharmaceutical companies. Existing systems for monitoring adverse drug events, such as the FDA Adverse Event Reporting System (FAERS) in the US, have shown limited effectiveness due to the lack of incentives for healthcare professionals and patients. While social media present opportunities to mine information about adverse events in near real-time, important questions must still be answered in order to understand their impact on pharmacovigilance. First, it is not known how many relevant social media posts occur per day on platforms like Twitter, i.e., whether there is "enough signal" for a post-market pharmacovigilance program based on Twitter mining. Second, it is not known what other topics users discuss in posts mentioning pharmaceutical drugs. In this paper, we outline how social media can be used as a human sensor for drug use monitoring. We introduce a large-scale, near real-time system for computational pharmacovigilance, and use it to estimate the order of magnitude of the daily volume of tweets that self-report pharmaceutical drug side effects. The processing pipeline comprises a set of cascaded filters followed by a supervised machine learning classifier. The cascaded filters quickly reduce the volume to a manageable sub-stream, from which a Support Vector Machine (SVM) based classifier identifies adverse events using a rich set of features that take into account surface-textual properties as well as domain knowledge about drugs, side effects and the Twitter medium. Using a dataset of 10,000 manually annotated tweets, an SVM classifier achieves F1=60.4% and AUC=0.894. The yield of the classifier for a drug universe comprising 2,600 keywords is 721 tweets per day. We also investigate what other topics are discussed in the posts mentioning pharmaceutical drugs. We conclude by suggesting an ecosystem where regulators and pharmaceutical companies utilize social media to obtain feedback about consequences of pharmaceutical drug use.
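The two-stage design described in the abstract (cheap cascaded filters first, a classifier on the surviving sub-stream second) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the keyword sets, the three filter functions, and the hand-set linear scorer that stands in for the trained SVM are hypothetical and are not taken from the paper.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical vocabularies; the paper's drug universe has about 2,600 keywords.
DRUG_KEYWORDS = {"ibuprofen", "metformin", "sertraline"}
SIDE_EFFECT_TERMS = {"nausea", "headache", "dizzy", "rash", "insomnia"}
FIRST_PERSON = {"i", "me", "my", "i'm"}

Filter = Callable[[str], bool]


def tokens(text: str) -> List[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,!?:;\"()").lower() for t in text.split()]


# Hypothetical filters, ordered from cheapest to most expensive.
def is_not_retweet(text: str) -> bool:
    return not text.startswith("RT ")


def has_no_link(text: str) -> bool:
    return "http://" not in text and "https://" not in text


def mentions_drug(text: str) -> bool:
    return any(t in DRUG_KEYWORDS for t in tokens(text))


def cascade(stream: Iterable[str], filters: List[Filter]) -> Iterator[str]:
    """Yield only tweets that pass every filter; all() stops at the first failure."""
    for tweet in stream:
        if all(f(tweet) for f in filters):
            yield tweet


def score(text: str) -> float:
    """Stand-in for an SVM decision function; the weights are invented."""
    toks = set(tokens(text))
    value = -1.0  # bias term
    if toks & SIDE_EFFECT_TERMS:
        value += 1.0
    if toks & FIRST_PERSON:
        value += 0.5
    return value


def is_adverse_event(text: str) -> bool:
    return score(text) > 0.0


raw_stream = [
    "RT ibuprofen gave me a headache",
    "Ibuprofen on sale https://example.com",
    "Started sertraline and now I feel dizzy all day",
    "Metformin is a common diabetes drug",
    "My cat is dizzy",
]

sub_stream = list(cascade(raw_stream, [is_not_retweet, has_no_link, mentions_drug]))
reports = [t for t in sub_stream if is_adverse_event(t)]
print(len(raw_stream), len(sub_stream), len(reports))
```

Counting `reports` per day over a live stream would give a yield figure of the kind the abstract quotes. In a real system the scorer would be replaced by a classifier trained on annotated tweets.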