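Application of a Language Model Tool for COVID-19 Vaccine Adverse Event Monitoring Using Web and Social Media Content: Algorithm Development and Validation Study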

IF 3.5 · JCR Q1 (Health Care Sciences & Services) · JMIR Infodemiology · Pub Date: 2024-12-20 · DOI: 10.2196/53424

Chathuri Daluwatte, Alena Khromava, Yuning Chen, Laurence Serradell, Anne-Laure Chabanon, Anthony Chan-Ou-Teung, Cliona Molony, Juhaeri Juhaeri


Citations: 0

Abstract


Background: Spontaneous pharmacovigilance reporting systems are the main data source for signal detection for vaccines. However, there is a large time lag between the occurrence of an adverse event (AE) and the availability for analysis. With global mass COVID-19 vaccination campaigns, social media, and web content, there is an opportunity for real-time, faster monitoring of AEs potentially related to COVID-19 vaccine use. Our work aims to detect AEs from social media to augment those from spontaneous reporting systems.

Objective: This study aims to monitor AEs shared in social media and online support groups using medical context-aware natural language processing language models.

Methods: We developed a language model-based web app to analyze social media, patient blogs, and forums (from 190 countries in 61 languages) around COVID-19 vaccine-related keywords. Following machine translation to English, lay language safety terms (ie, AEs) were observed using a PubMedBERT-based named-entity recognition model (precision=0.76 and recall=0.82) and mapped to Medical Dictionary for Regulatory Activities (MedDRA) terms using knowledge graphs. MedDRA is an internationally used terminology for medical conditions, medicines, and medical devices, developed and registered under the auspices of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use. Weekly and cumulative aggregated AE counts, proportions, and ratios were displayed via visual analytics, such as word clouds.
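The aggregation step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `LAY_TO_TERM` dictionary is a hypothetical stand-in for the knowledge-graph mapping to MedDRA terms, the sample terms are invented for illustration, and the function names are assumptions. The sketch also shows the F1 score implied by the reported precision and recall, which is plain arithmetic and is not a figure stated in the abstract.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical lay-term lookup standing in for the knowledge-graph mapping.
# The entries are illustrative and are not taken from the study.
LAY_TO_TERM = {
    "sore arm": "Injection site pain",
    "arm pain": "Injection site pain",
    "headache": "Headache",
    "tired": "Fatigue",
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def iso_week(day):
    """Return the (ISO year, ISO week) bucket for a date."""
    year, week, _ = day.isocalendar()
    return (year, week)


def weekly_counts(mentions):
    """Count mapped terms per ISO week.

    `mentions` is an iterable of (date, lay_term) pairs, assumed to be
    the output of an upstream entity-recognition step. Unmapped lay
    terms are skipped.
    """
    weekly = defaultdict(Counter)
    for day, lay_term in mentions:
        term = LAY_TO_TERM.get(lay_term.lower())
        if term is not None:
            weekly[iso_week(day)][term] += 1
    return dict(weekly)


def cumulative_counts(weekly):
    """Running totals per term, in chronological week order."""
    running = Counter()
    cumulative = {}
    for week in sorted(weekly):
        running.update(weekly[week])
        cumulative[week] = Counter(running)
    return cumulative


def proportions(counts):
    """Share of each term among all counted mentions."""
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}


if __name__ == "__main__":
    sample = [
        (date(2021, 3, 1), "sore arm"),
        (date(2021, 3, 2), "Headache"),
        (date(2021, 3, 9), "arm pain"),
    ]
    weekly = weekly_counts(sample)
    latest = max(weekly)
    print(proportions(cumulative_counts(weekly)[latest]))
    print(round(f1_score(0.76, 0.82), 3))
```

Bucketing by ISO week keeps weeks that straddle a year boundary intact. A real system would replace the dictionary lookup with the terminology mapping and would feed the per-week counters to the visual layer.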

Results: Most AEs were identified in 2021, with fewer in 2022. AEs observed using the web app were consistent with AEs communicated by health authorities shortly before or within the same period.

Conclusions: Monitoring the web and social media provides opportunities to observe AEs that may be related to the use of COVID-19 vaccines. The presented analysis demonstrates the ability to use web content and social media as a data source that could contribute to the early observation of AEs and enhance postmarketing surveillance. It could help to adjust signal detection strategies and communication with external stakeholders, contributing to increased confidence in vaccine safety monitoring.


Source journal

JMIR Infodemiology

CiteScore

4.80

Self-citation rate

0.00%

Articles published