A graph theoretical approach to experimental prioritization in genome-scale investigations.

Mammalian Genome. IF 2.7; Region 4 (Biology); JCR Q3 (Biochemistry & Molecular Biology). Pub Date: 2024-12-01. Epub Date: 2024-08-27. DOI: 10.1007/s00335-024-10066-z

Stephen K Grady, Kevin A Peterson, Stephen A Murray, Erich J Baker, Michael A Langston, Elissa J Chesler

Pages: 724-733. Publication type: Journal Article. Source platform: Semantic Scholar. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522061/pdf/

Citations: 0

Abstract


A graph theoretical approach to experimental prioritization in genome-scale investigations.

The goal of systems biology is to gain a network-level understanding of how gene interactions influence biological states and, ultimately, to inform our understanding of human disease. Given the scale and scope of systems biology studies, resource constraints often limit researchers when validating genome-wide phenomena and can lead to an incomplete understanding of the underlying mechanisms. Further, prioritization strategies are often biased towards known entities (e.g., previously studied genes or proteins with commercially available reagents) and are subject to other technical issues that limit experimental breadth. Here, heterogeneous biological information is modeled as an association graph, to which a high-performance minimum dominating set solver is applied to maximize coverage across the graph and thus increase the breadth of experimentation. First, we tested our model on retrieval of existing gene functional annotations and demonstrated that the minimum dominating set returns more diverse terms than other computational methods. Next, we used our heterogeneous network and minimum dominating set solver to help identify understudied genes to be interrogated by the International Mouse Phenotyping Consortium. Using an unbiased algorithmic strategy, poorly studied genes are prioritized from the thousands of genes yet to be characterized. This method is tunable and extensible, with the potential to incorporate additional user-defined prioritizing information. The minimum dominating set approach can be applied to any biological network to identify a tractable subset of features to test experimentally or to assist in prioritizing candidate genes associated with human disease.
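To make the central idea concrete: a dominating set of a graph is a set of vertices such that every vertex is either in the set or adjacent to a member of it, so testing only the selected vertices still "touches" the whole network. The sketch below is illustrative only. It uses a simple greedy approximation, not the high-performance solver the abstract refers to, and the gene names and edges are hypothetical placeholders rather than data from the paper.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def build_graph(edges: Iterable[Tuple[Hashable, Hashable]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def greedy_dominating_set(graph: Graph) -> Set[Hashable]:
    """Greedy approximation of a minimum dominating set.

    Repeatedly picks the vertex whose closed neighborhood (itself plus
    its neighbors) covers the most vertices that are not yet dominated.
    The result is always a dominating set, but not always a minimum one.
    """
    undominated = set(graph)
    chosen: Set[Hashable] = set()
    # Sort once so that ties are broken deterministically.
    ordered = sorted(graph, key=str)
    while undominated:
        best = max(ordered, key=lambda v: len(({v} | graph[v]) & undominated))
        chosen.add(best)
        undominated -= {best} | graph[best]
    return chosen


def is_dominating(graph: Graph, candidate: Set[Hashable]) -> bool:
    """Check that every vertex is in the candidate set or adjacent to it."""
    return all(v in candidate or (graph[v] & candidate) for v in graph)


# Hypothetical toy association graph (placeholder names, not real data).
edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneD", "geneE"),
    ("geneE", "geneF"),
    ("geneF", "geneG"),
]
toy_graph = build_graph(edges)
selected = greedy_dominating_set(toy_graph)
print(sorted(selected))
```

On this toy graph two vertices are enough to dominate all seven, which is the sense in which a small experimental panel can cover a much larger network. An exact minimum dominating set is NP-hard to compute in general, which is why a dedicated solver matters at genome scale.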


Source journal

Mammalian Genome (Biology: Biochemistry & Molecular Biology)

CiteScore: 4.00
Self-citation rate: 0.00%
Articles published: not listed
Review time: 6-12 weeks

About the journal: Mammalian Genome focuses on the experimental, theoretical and technical aspects of genetics, genomics, epigenetics and systems biology in mouse, human and other mammalian species, with an emphasis on the relationship between genotype and phenotype, elucidation of biological and disease pathways as well as experimental aspects of interventions, therapeutics, and precision medicine. The journal aims to publish high quality original papers that present novel findings in all areas of mammalian genetic research as well as review articles on areas of topical interest. The journal will also feature commentaries and editorials to inform readers of breakthrough discoveries as well as issues of research standards, policies and ethics.