Evaluation of literature searching tools for curation of mismatch repair gene variants in hereditary colon cancer
Varun Kaushik, John-Paul Plazzer, Finlay Macrae
Advanced genetics (Hoboken, N.J.), 2(1). Published 2021-02-18.
DOI: 10.1002/ggn2.10039
https://onlinelibrary.wiley.com/doi/10.1002/ggn2.10039
Citations: 2
Abstract
Pathogenic constitutional genomic variants in the mismatch repair (MMR) genes are the drivers of Lynch syndrome; optimal variant interpretation is required for the management of suspected and confirmed cases. The International Society for Hereditary Gastrointestinal Tumours (InSiGHT) provides expert classifications of MMR variants for the US National Human Genome Research Institute's (NHGRI) ClinGen initiative and interprets variants with discordant classifications and variants of uncertain significance (VUSs). Because extracting information related to variants is onerous, literature searching tools that harness artificial intelligence may aid in retrieving the information needed for optimal variant classification. In this study, we described the nature of discordance in a sample of 80 variants, drawn from a list of variants requiring updating by InSiGHT for ClinGen, by comparing their existing InSiGHT classifications with the various submissions for each variant in the US National Center for Biotechnology Information's (NCBI) ClinVar database. To identify the potential value of a literature searching tool in extracting information related to classification, all variants were searched for independently using a traditional method (Google Scholar) and a literature searching tool (Mastermind). Descriptive statistics were used to compare the number of articles before and after screening for relevance and the number of relevant articles unique to each method. An article was defined as relevant if it contained the variant in question as well as data informing variant interpretation. A total of 916 articles were returned by both methods, and Mastermind averaged four relevant articles per search compared to Google Scholar's three. Of the relevant Mastermind articles, 193/308 (62.7%) were unique to it, compared to 87/202 (43.0%) for Google Scholar. For 24 variants, either or both methods found no information. All 6/80 (20%) variants with pathogenic or likely pathogenic InSiGHT classifications have newer VUS assertions on ClinVar. Our study demonstrated that, for a sample of variants with varying discordant interpretations, Mastermind returned on average more relevant articles and more articles unique to it. Google Scholar nevertheless retrieved information that Mastermind did not, which supports the conclusion that Mastermind could play a complementary role in literature searching for classification. This work will aid InSiGHT in its role of classifying MMR variants.
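The comparison of relevant articles unique to each method can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and the article identifiers are hypothetical, and the sketch assumes that "unique" means "relevant and not returned by the other method". Only the four counts (308, 193, 202, 87) come from the abstract.

```python
from typing import Dict, Set


def compare_relevant(mastermind: Set[str], scholar: Set[str]) -> Dict[str, int]:
    """Count relevant articles found by one method only and by both."""
    return {
        "mastermind_only": len(mastermind - scholar),
        "scholar_only": len(scholar - mastermind),
        "shared": len(mastermind & scholar),
    }


def percent_unique(unique: int, relevant: int) -> float:
    """Share of a method's relevant articles that the other method missed."""
    if relevant <= 0:
        raise ValueError("relevant must be positive")
    return 100.0 * unique / relevant


# Hypothetical article identifiers for a single variant (not study data).
example = compare_relevant({"a1", "a2", "a3", "a4"}, {"a3", "a4", "a5"})

# Counts reported in the abstract: (relevant, unique) per method.
MASTERMIND = (308, 193)
SCHOLAR = (202, 87)

# Under the assumption above, subtracting the unique articles from each
# method's relevant total should give the same number of shared articles.
shared_from_mastermind = MASTERMIND[0] - MASTERMIND[1]
shared_from_scholar = SCHOLAR[0] - SCHOLAR[1]

if __name__ == "__main__":
    print(example)
    print(shared_from_mastermind, shared_from_scholar)
    print(f"{percent_unique(MASTERMIND[1], MASTERMIND[0]):.1f}%")
```

Both subtractions give 115 shared relevant articles, so the reported counts are internally consistent under this reading of "unique".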