A graph theoretical approach to experimental prioritization in genome-scale investigations.
Stephen K Grady, Kevin A Peterson, Stephen A Murray, Erich J Baker, Michael A Langston, Elissa J Chesler
Journal: Mammalian Genome, pages 724-733, published 2024-12-01 (Epub 2024-08-27)
DOI: 10.1007/s00335-024-10066-z (https://doi.org/10.1007/s00335-024-10066-z)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522061/pdf/
Impact factor: 2.7; JCR: Q3 (Biochemistry & Molecular Biology)
Citation count: 0
Abstract
The goal of systems biology is to gain a network-level understanding of how gene interactions influence biological states and, ultimately, to inform our understanding of human disease. Given the scale and scope of systems biology studies, resource constraints often limit researchers when validating genome-wide phenomena and can lead to an incomplete understanding of the underlying mechanisms. Further, prioritization strategies are often biased towards known entities (e.g., previously studied genes or proteins with commercially available reagents) and are subject to other technical issues that limit experimental breadth. Here, heterogeneous biological information is modeled as an association graph, to which a high-performance minimum dominating set solver is applied to maximize coverage across the graph and thus increase the breadth of experimentation. First, we tested our model on the retrieval of existing gene functional annotations and demonstrated that the minimum dominating set returns more diverse terms than other computational methods. Next, we used our heterogeneous network and minimum dominating set solver to help identify understudied genes to be interrogated by the International Mouse Phenotyping Consortium. Using an unbiased algorithmic strategy, poorly studied genes are prioritized from the thousands of genes that remain to be characterized. This method is tunable and extensible, with the potential to incorporate additional user-defined prioritizing information. The minimum dominating set approach can be applied to any biological network to identify a tractable subset of features to test experimentally or to assist in prioritizing candidate genes associated with human disease.
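To make the central idea concrete: a dominating set is a set of vertices such that every vertex in the graph is either in the set or adjacent to a member of it, and a minimum dominating set is the smallest such set. The sketch below is only an illustration under stated assumptions. It uses a simple greedy heuristic, which returns a dominating set but not necessarily a minimum one, and it is not the high-performance solver described in the abstract. The gene names, the toy graph, and the function names are all hypothetical.

```python
from typing import Dict, Set

# Undirected association graph as an adjacency map.
# Assumption: every neighbour also appears as a key, and edges are symmetric.
Graph = Dict[str, Set[str]]


def greedy_dominating_set(graph: Graph) -> Set[str]:
    """Return a dominating set via a greedy heuristic (not guaranteed minimum)."""
    undominated = set(graph)
    chosen: Set[str] = set()
    while undominated:
        # Pick the vertex whose closed neighbourhood covers the most vertices
        # that are still undominated; sorting makes tie-breaking deterministic.
        best = max(
            sorted(graph),
            key=lambda v: len(({v} | graph[v]) & undominated),
        )
        chosen.add(best)
        undominated -= {best} | graph[best]
    return chosen


def is_dominating(graph: Graph, candidate: Set[str]) -> bool:
    """Check that every vertex is in the candidate set or adjacent to it."""
    return all(v in candidate or bool(graph[v] & candidate) for v in graph)


# Hypothetical toy graph: g1 is a hub, and g4-g5-g6 form a chain off it.
toy_graph: Graph = {
    "g1": {"g2", "g3", "g4"},
    "g2": {"g1"},
    "g3": {"g1"},
    "g4": {"g1", "g5"},
    "g5": {"g4", "g6"},
    "g6": {"g5"},
}

selected = greedy_dominating_set(toy_graph)
print(sorted(selected))  # ['g1', 'g5']
print(is_dominating(toy_graph, selected))  # True
```

In this toy case, testing only g1 and g5 would touch every gene in the graph either directly or through one association, which is the sense in which a dominating set maximizes coverage for a small experimental budget. On real networks a greedy pass can return a larger set than an exact solver would.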
About the journal:
Mammalian Genome focuses on the experimental, theoretical and technical aspects of genetics, genomics, epigenetics and systems biology in mouse, human and other mammalian species, with an emphasis on the relationship between genotype and phenotype, the elucidation of biological and disease pathways, and experimental aspects of interventions, therapeutics, and precision medicine. The journal aims to publish high-quality original papers that present novel findings in all areas of mammalian genetic research, as well as review articles on areas of topical interest. The journal will also feature commentaries and editorials to inform readers of breakthrough discoveries as well as issues of research standards, policies and ethics.