An AI-based approach driven by genotypes and phenotypes to uplift the diagnostic yield of genetic diseases.

IF 3.8; Zone 2 (Biology); Q2 GENETICS & HEREDITY; Human Genetics; Pub Date: 2024-03-23; DOI: 10.1007/s00439-023-02638-x
S Zucca, G Nicora, F De Paoli, M G Carta, R Bellazzi, P Magni, E Rizzo, I Limongelli
Citations: 0

Abstract

Identifying disease-causing variants in the genomes of rare disease patients is a challenging problem. To address it, we describe a machine learning framework, which we call "Suggested Diagnosis", that prioritizes the genetic variants in an exome or genome by their probability of being disease-causing. The method leverages the standard guidelines for germline variant interpretation defined by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP), together with inheritance information, phenotypic similarity, and variant quality. Starting from (1) the VCF file containing the proband's variants, (2) the list of the proband's phenotypes encoded as Human Phenotype Ontology terms, and optionally (3) information about family members, "Suggested Diagnosis" ranks all variants according to their machine learning prediction. By placing causative variants in the very first positions of the prioritized list, the method significantly reduces the number of variants that geneticists need to evaluate. Most importantly, our approach proved to be among the top performers in the CAGI6 Rare Genome Project Challenge, where it ranked the true causative variant among the first positions and, uniquely among all challenge participants, increased the diagnostic yield by 12.5% by solving 2 undiagnosed cases.

Source journal
Human Genetics (Biology: Genetics)
CiteScore: 10.80
Self-citation rate: 3.80%
Articles published: 94
Review time: 1 month
About the journal: Human Genetics is a monthly journal publishing original and timely articles on all aspects of human genetics. The Journal particularly welcomes articles in the areas of Behavioral genetics, Bioinformatics, Cancer genetics and genomics, Cytogenetics, Developmental genetics, Disease association studies, Dysmorphology, ELSI (ethical, legal and social issues), Evolutionary genetics, Gene expression, Gene structure and organization, Genetics of complex diseases and epistatic interactions, Genetic epidemiology, Genome biology, Genome structure and organization, Genotype-phenotype relationships, Human Genomics, Immunogenetics and genomics, Linkage analysis and genetic mapping, Methods in Statistical Genetics, Molecular diagnostics, Mutation detection and analysis, Neurogenetics, Physical mapping and Population Genetics. Articles reporting animal models relevant to human biology or disease are also welcome. Preference will be given to those articles which address clinically relevant questions or which provide new insights into human biology. Unless reporting entirely novel and unusual aspects of a topic, clinical case reports, cytogenetic case reports, papers on descriptive population genetics, articles dealing with the frequency of polymorphisms or additional mutations within genes in which numerous lesions have already been described, and papers that report meta-analyses of previously published datasets will normally not be accepted. The Journal typically will not consider for publication manuscripts that report merely the isolation, map position, structure, and tissue expression profile of a gene of unknown function unless the gene is of particular interest or is a candidate gene involved in a human trait or disorder.
Latest articles from this journal
Biallelic germline DDX41 variants in a patient with bone dysplasia, ichthyosis, and dysmorphic features.
Genetic analysis of preaxial polydactyly: identification of novel variants and the role of ZRS duplications in a Chinese cohort of 102 cases.
The MorbidGenes panel: a monthly updated list of diagnostically relevant rare disease genes derived from diverse sources.
Polymorphic pseudogenes in the human genome - a comprehensive assessment.
Germline copy number variants and endometrial cancer risk.