Clinical Relatedness and Stability of vigiVec Semantic Vector Representations of Adverse Events and Drugs in Pharmacovigilance
Nils Erlanson, Joana Félix China, Henric Taavola, G Niklas Norén
Drug Safety. Published 2025-01-20. DOI: 10.1007/s40264-024-01509-2 (https://doi.org/10.1007/s40264-024-01509-2)
Abstract
Introduction: Individual case reports are essential to identify and assess previously unknown adverse effects of medicines. On these reports, information on adverse events (AEs) and drugs is encoded in hierarchical terminologies. Encoding differences may hinder the retrieval and analysis of clinically related reports relevant to a topic of interest. Recent studies have explored the use of data-driven semantic vector representations to support analysis of pharmacovigilance data.
Objective: This study aims to evaluate the stability and clinical relatedness of vigiVec, a semantic vector representation for codes of AEs and drugs.
Methods: vigiVec is a published adaptation to pharmacovigilance of the publicly available Word2Vec model, applied to structured data instead of free text. It provides vector representations for MedDRA® Preferred Terms and WHODrug Global active ingredients, learned from reporting patterns in VigiBase, the WHO global database of adverse event reports for medicines and vaccines. For this study, a 20-dimensional Skip-gram architecture with window size 250 was used. Our evaluation focused on nearest neighbors identified by the cosine similarity of vigiVec vector representations. Clinical relatedness was measured through term intruder detection, whereby a medical doctor was asked to identify a randomly selected term (the intruder) included among the four nearest neighbors of a specific AE or drug. Stability was measured as the average overlap in the ten nearest neighbors for each AE or drug across repeated fittings of vigiVec.
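The nearest-neighbor and stability measures described in the Methods can be illustrated with a minimal sketch. It assumes that each fit is a plain dictionary from code to vector and that stability is the mean fraction of shared k nearest neighbors between two fits; the function names, the toy two-dimensional vectors, and the placeholder codes "A" to "D" are hypothetical and are not taken from the vigiVec implementation, which may compute the overlap differently.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(vectors, code, k):
    """Return the k codes most similar to `code`, excluding the code itself."""
    query = vectors[code]
    others = [c for c in vectors if c != code]
    others.sort(key=lambda c: cosine(query, vectors[c]), reverse=True)
    return others[:k]


def stability(fit_a, fit_b, k):
    """Mean fraction of shared k nearest neighbors over codes present in both fits.

    Assumption: overlap is computed pairwise between two fits; the study
    may aggregate repeated fittings in another way.
    """
    shared = [c for c in fit_a if c in fit_b]
    overlaps = []
    for code in shared:
        nn_a = set(nearest_neighbors(fit_a, code, k))
        nn_b = set(nearest_neighbors(fit_b, code, k))
        overlaps.append(len(nn_a & nn_b) / k)
    return sum(overlaps) / len(overlaps)


# Toy fits with hypothetical codes; real vigiVec vectors have 20 dimensions.
fit_a = {"A": (1.0, 0.0), "B": (0.9, 0.1), "C": (0.0, 1.0), "D": (0.1, 0.9)}
# Same geometry with the axes swapped: neighbor sets are unchanged.
fit_b = {"A": (0.0, 1.0), "B": (0.1, 0.9), "C": (1.0, 0.0), "D": (0.9, 0.1)}
# A fit in which every code pairs up with a different neighbor.
fit_c = {"A": (1.0, 0.0), "B": (0.0, 1.0), "C": (0.9, 0.1), "D": (0.1, 0.9)}

print(nearest_neighbors(fit_a, "A", 2))
print(stability(fit_a, fit_b, 1), stability(fit_a, fit_c, 1))
```

Because cosine similarity ignores rotations and axis permutations of the whole space, two fits can have very different coordinates and still be perfectly stable in this sense, which is why the comparison is made on neighbor sets and not on the raw vectors.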
Results: Among the ten nearest neighbors, 1.8 AEs on average belonged to the same MedDRA High Level Term (HLT; e.g., coagulopathies), and 1.3 drugs belonged to the same Anatomical Therapeutic Chemical level 3 (ATC-3; e.g., opioids). In the intruder detection task, when neighbors and intruders were both chosen from the same HLT, the intruder detection rate was 46%. When selected from different HLTs, it was 79%. By random chance, we should expect 20% (1 in 5). Corresponding rates for drugs were 42% within the same ATC-3 group and 65% across different ATC-3 groups. The stability of nearest neighbors was 80% for AEs and 64% for drugs.
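The 20% chance level follows from showing five terms, four neighbors plus one intruder, of which exactly one is correct. A minimal sketch of how such a trial could be assembled and scored is given below; the helper names, the placeholder term labels, and the sampling scheme are assumptions made for illustration and do not describe the study's actual protocol, which also restricts neighbors and intruders to the same or to different HLT or ATC-3 groups.

```python
import random


def build_intruder_trial(neighbors, candidates, rng):
    """Mix one randomly chosen intruder into a list of nearest neighbors.

    `candidates` is the pool the intruder is drawn from; it should not
    contain the query term itself (assumption of this sketch).
    """
    pool = [c for c in candidates if c not in neighbors]
    intruder = rng.choice(pool)
    options = list(neighbors) + [intruder]
    rng.shuffle(options)  # hide the intruder's position from the assessor
    return options, intruder


def chance_rate(n_neighbors):
    """Probability of picking the intruder by guessing uniformly at random."""
    return 1 / (n_neighbors + 1)


def detection_rate(trials, answers):
    """Fraction of trials in which the assessor's answer was the intruder."""
    hits = sum(
        1 for (_, intruder), answer in zip(trials, answers) if answer == intruder
    )
    return hits / len(trials)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
neighbors = ["n1", "n2", "n3", "n4"]          # hypothetical neighbor labels
candidates = neighbors + ["x1", "x2", "x3"]   # hypothetical intruder pool
trials = [build_intruder_trial(neighbors, candidates, rng) for _ in range(4)]

# Simulated assessor: correct on the first two trials, wrong on the rest.
answers = [trials[0][1], trials[1][1], "n1", "n2"]
print(chance_rate(4), detection_rate(trials, answers))
```

Against this baseline, the reported detection rates of 46% and 79% for AEs and 42% and 65% for drugs are all well above what uniform guessing would give.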
Conclusion: Nearest neighbors identified with vigiVec are stable and show a high level of clinical relatedness. They are often from different parts of the existing hierarchies and complement these.
About the journal:
Drug Safety is the official journal of the International Society of Pharmacovigilance. The journal includes:
Overviews of contentious or emerging issues.
Comprehensive narrative reviews that provide an authoritative source of information on epidemiology, clinical features, prevention and management of adverse effects of individual drugs and drug classes.
In-depth benefit-risk assessment of adverse effect and efficacy data for a drug in a defined therapeutic area.
Systematic reviews (with or without meta-analyses) that collate empirical evidence to answer a specific research question, using explicit, systematic methods as outlined by the PRISMA statement.
Original research articles reporting the results of well-designed studies in disciplines such as pharmacoepidemiology, pharmacovigilance, pharmacology and toxicology, and pharmacogenomics.
Editorials and commentaries on topical issues.
Additional digital features (including animated abstracts, video abstracts, slide decks, audio slides, instructional videos, infographics, podcasts and animations) can be published with articles; these are designed to increase the visibility, readership and educational value of the journal’s content. In addition, articles published in Drug Safety may be accompanied by plain language summaries to help readers who have some knowledge of the area, but not in-depth expertise, understand important medical advances.