Patient-Related Metadata Reported in Sequencing Studies of SARS-CoV-2: Protocol for a Scoping Review and Bibliometric Analysis.

Karen O'Connor, Davy Weissenbacher, Amir Elyaderani, Ebbing Lautenbach, Matthew Scotch, Graciela Gonzalez-Hernandez
medRxiv : the preprint server for health sciences. Published 2024-03-05. DOI: 10.1101/2023.07.14.23292681 (https://doi.org/10.1101/2023.07.14.23292681)

Abstract

Background: There has been an unprecedented effort to sequence the SARS-CoV-2 virus and examine its molecular evolution. This has been facilitated by the availability of publicly accessible databases, the Global Initiative on Sharing All Influenza Data (GISAID) and GenBank, which collectively hold millions of SARS-CoV-2 sequence records. Genomic epidemiology, however, seeks to go beyond phylogenetic analysis by linking genetic information to patient characteristics and disease outcomes, enabling a comprehensive understanding of transmission dynamics and disease impact. While these repositories include fields for patient-related metadata on a given sequence, these demographic and clinical details are rarely filled in. The extent to which patient-related metadata is reported in published sequencing studies, and the quality of that reporting, remain largely unexplored.

Methods: The NIH's LitCovid collection will be used for automated classification of articles that report depositing SARS-CoV-2 sequences in public repositories, while an independent search will be conducted in PubMed for validation. Data extraction will be conducted using Covidence. The extracted data will be synthesized and summarized to quantify the availability of patient metadata in the published literature of SARS-CoV-2 sequencing studies. For the bibliometric analysis, relevant data points, such as author affiliations and citation metrics, will be extracted.
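As an illustration of the synthesis step, the following minimal sketch tallies the share of studies that report each patient metadata field. The records and the field names (age, sex, outcome) are hypothetical. They are not taken from the protocol or from any Covidence export format.

```python
from collections import Counter

# Hypothetical extraction records: one dict per included study.
# None marks a field the study did not report.
records = [
    {"study": "A", "age": "34", "sex": "F", "outcome": None},
    {"study": "B", "age": None, "sex": "M", "outcome": None},
    {"study": "C", "age": "61", "sex": "M", "outcome": "recovered"},
    {"study": "D", "age": None, "sex": None, "outcome": None},
]
FIELDS = ("age", "sex", "outcome")  # assumed field names, for illustration only


def availability(rows, fields):
    """Return the fraction of rows with a non-empty value for each field."""
    counts = Counter()
    for row in rows:
        for field in fields:
            if row.get(field) not in (None, ""):
                counts[field] += 1
    total = len(rows)
    return {field: (counts[field] / total if total else 0.0) for field in fields}


summary = availability(records, FIELDS)
for field, share in summary.items():
    print(f"{field}: {share:.0%} of studies")
```

A per-field fraction like this is only one possible summary. The review itself may stratify by study type or region, as the Discussion indicates.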

Discussion: This scoping review will report on the extent and types of patient-related metadata reported in genomic viral sequencing studies of SARS-CoV-2, identify gaps in this reporting, and make recommendations for improving the quality and consistency of reporting in this area. The bibliometric analysis will uncover trends and patterns in the reporting of patient-related metadata, including differences in reporting based on study types or geographic regions. Co-occurrence networks of author keywords will also be presented. The insights gained from this study may help improve the quality and consistency of reporting patient metadata, enhancing the utility of sequence metadata and facilitating future research on infectious diseases.
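A co-occurrence network of author keywords can be built by counting how often two keywords appear on the same article. The sketch below shows that counting step on made-up keyword lists. The data and the lowercase normalization are assumptions for illustration, not the authors' stated procedure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author-keyword lists, one per article.
articles = [
    ["SARS-CoV-2", "genomic epidemiology", "metadata"],
    ["SARS-CoV-2", "metadata", "GISAID"],
    ["genomic epidemiology", "SARS-CoV-2"],
]


def cooccurrence(keyword_lists):
    """Count how many articles mention each unordered keyword pair."""
    edges = Counter()
    for keywords in keyword_lists:
        # Normalize case, drop duplicates within an article, and sort so that
        # each pair is stored in a single canonical order.
        unique = sorted({k.strip().lower() for k in keywords})
        edges.update(combinations(unique, 2))
    return edges


edges = cooccurrence(articles)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

Each counted pair is an edge and its count is the edge weight, so the result can be passed to a graph or visualization tool of choice.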

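The host site also carries a machine-translated Chinese version of the abstract, rendered back into English below. Where it differs, the English original prevails.

Background: Since the start of the COVID-19 pandemic, genomic epidemiology has made an unprecedented effort to sequence the SARS-CoV-2 virus and examine its molecular evolution. This has been facilitated by the availability of the publicly accessible databases GISAID and GenBank, which together hold millions of SARS-CoV-2 sequence records. Genomic epidemiology, however, seeks to go beyond phylogenetic analysis by linking genetic information to patient demographics and disease outcomes, enabling a comprehensive understanding of transmission dynamics and disease impact. While these repositories include some patient-related information, such as the location of the infected host, the granularity of these data and the inclusion of demographic and clinical details are inconsistent. Moreover, the extent to which patient-related metadata is reported in published sequencing studies remains largely unexplored. It is therefore essential to assess the extent and quality of patient-related metadata reported in SARS-CoV-2 sequencing studies. In addition, the limited linkage between published articles and sequence repositories hinders the identification of relevant studies, and traditional keyword-based search strategies may miss relevant articles. To overcome these challenges, this study proposes using an automated classifier to identify relevant articles.

Objective: This study aims to conduct a systematic and comprehensive scoping review, together with a bibliometric analysis, to assess the reporting of patient-related metadata in SARS-CoV-2 sequencing studies.

Methods: The NIH's LitCovid collection will be used for machine learning classification, while an independent search will be conducted in PubMed. Data extraction will be conducted using Covidence, and the extracted data will be synthesized and summarized to quantify the availability of patient metadata in the published literature of SARS-CoV-2 sequencing studies. For the bibliometric analysis, relevant data points, such as author affiliations, journal information, and citation metrics, will be extracted.

Results: The study will report on the extent and types of patient-related metadata reported in genomic viral sequencing studies of SARS-CoV-2. The scoping review will identify gaps in the reporting of patient metadata and make recommendations for improving the quality and consistency of reporting in this area. The bibliometric analysis will uncover trends and patterns in the reporting of patient-related metadata, such as differences in reporting based on study types or geographic regions. Co-occurrence networks of author keywords will also be presented to highlight common themes and their associations with patient metadata reporting.

Conclusions: This study will help advance knowledge in the field of genomic epidemiology by providing a comprehensive overview of the reporting of patient-related metadata in SARS-CoV-2 sequencing studies. The insights gained from this study may help improve the quality and consistency of reporting patient metadata, enhancing the utility of sequence metadata and facilitating future research on infectious diseases. The findings may also inform the development of machine learning methods to automatically extract patient-related information from sequencing studies.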