Assessment of BRCA1 and BRCA2 Germline Variant Data From Patients With Breast Cancer in a Real-World Data Registry.

JCO Clinical Cancer Informatics (IF 3.3, Q2, Oncology) | Pub Date: 2024-05-01 | DOI: 10.1200/CCI.23.00251
Thales C Nepomuceno, Paulo Lyra, Jianbin Zhu, Fanchao Yi, Rachael H Martin, Daniel Lupu, Luke Peterson, Lauren C Peres, Anna Berry, Edwin S Iversen, Fergus J Couch, Qianxing Mo, Alvaro N Monteiro
Citations: 0

Abstract


Purpose: The emergence of large real-world clinical databases and tools to mine electronic medical records has allowed for an unprecedented look at large data sets with clinical and epidemiologic correlates. In clinical cancer genetics, real-world databases allow for the investigation of prevalence and effectiveness of prevention strategies and targeted treatments and for the identification of barriers to better outcomes. However, real-world data sets have inherent biases and problems (eg, selection bias, incomplete data, measurement error) that may hamper adequate analysis and affect statistical power.

Methods: Here, we leverage a real-world clinical data set from a large health network for patients with breast cancer tested for variants in BRCA1 and BRCA2 (N = 12,423). We conducted data cleaning and harmonization, cross-referenced with publicly available databases, performed variant reassessment and functional assays, and used functional data to inform a variant's clinical significance applying the American College of Medical Genetics and Genomics and the Association for Molecular Pathology guidelines.

Results: In the cohort, White and Black patients were over-represented, whereas non-White Hispanic and Asian patients were under-represented. Incorrect or missing variant designations were the most significant contributor to data loss. While manual curation corrected many incorrect designations, a sizable fraction of patient carriers remained with incorrect or missing variant designations. Despite the large number of patients with clinical significance not reported, original reported clinical significance assessments were accurate. Reassessment of variants in which clinical significance was not reported led to a marked improvement in data quality.

Conclusion: We identify the most common issues with BRCA1 and BRCA2 testing data entry and suggest approaches to minimize data loss and keep interpretation of clinical significance of variants up to date.
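
As a rough illustration of the data cleaning and harmonization step, the sketch below normalizes free-text variant designations and tallies how many records are usable, missing, or unrecognized. It is not the authors' pipeline: the record layout, the `harmonize` and `summarize` helpers, the sample records, and the deliberately simplified HGVS-like pattern are all assumptions made for this example, and real HGVS nomenclature covers far more cases than the pattern accepts.

```python
import re
from collections import Counter

# Simplified, illustrative pattern for HGVS-like coding DNA designations
# (substitutions, deletions, duplications, insertions). This is an assumption
# for the sketch; real HGVS nomenclature is considerably broader.
HGVS_C = re.compile(
    r"^c\.\d+(?:[+-]\d+)?(?:_\d+(?:[+-]\d+)?)?"
    r"(?:[ACGT]>[ACGT]|del[ACGT]*|dup[ACGT]*|ins[ACGT]+)$"
)


def harmonize(raw):
    """Return a normalized designation, or None if missing or unrecognized."""
    if raw is None:
        return None
    text = raw.strip().replace(" ", "")
    if not text:
        return None
    # Normalize one common entry variation: an upper-case "C." prefix.
    if text[:2] == "C.":
        text = "c." + text[2:]
    return text if HGVS_C.match(text) else None


def summarize(records):
    """Split records into cleaned rows and a count of each outcome."""
    counts = Counter()
    cleaned = []
    for rec in records:
        raw = rec.get("variant")
        norm = harmonize(raw)
        if norm is not None:
            counts["usable"] += 1
            cleaned.append({**rec, "variant": norm})
        elif raw is None or not raw.strip():
            counts["missing"] += 1
        else:
            counts["unrecognized"] += 1
    return cleaned, counts


# Invented sample records (hypothetical field names and patient IDs).
SAMPLE_RECORDS = [
    {"patient": "P1", "gene": "BRCA1", "variant": " c.181T>G "},
    {"patient": "P2", "gene": "BRCA2", "variant": "C.5946delT"},
    {"patient": "P3", "gene": "BRCA1", "variant": ""},
    {"patient": "P4", "gene": "BRCA2", "variant": "see report"},
]

if __name__ == "__main__":
    cleaned_rows, outcome_counts = summarize(SAMPLE_RECORDS)
    print(dict(outcome_counts))
    for row in cleaned_rows:
        print(row["patient"], row["gene"], row["variant"])
```

Counting missing and unrecognized designations separately mirrors the distinction the abstract draws between missing and incorrect entries. Rows that fail the pattern would still need manual curation rather than automatic rejection.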

Source journal metrics: CiteScore 6.20 | Self-citation rate 4.80% | Articles published: 190