GeneRHi-C: 3D GENomE Reconstruction from Hi-C data
Authors: Kimberly MacKay, M. Carlsson, A. Kusalik
Published in: Proceedings of the Tenth International Conference on Computational Systems-Biology and Bioinformatics (2019-12-04)
DOI: https://doi.org/10.1145/3365953.3365962
Citations: 1
Abstract
Background: Many computational methods have been developed that use the results of biological experiments (such as Hi-C) to infer the 3D organization of the genome. Formally, this is referred to as the 3D genome reconstruction problem (3D-GRP). Hi-C data is now being generated at increasingly high resolutions, and as resolution increases it has become computationally infeasible to predict a 3D genome organization with the majority of existing methods. None of the existing solution methods have used a non-procedural programming approach (such as integer programming), despite the established advantages and successful applications of such approaches for predicting high-resolution 3D structures of other biomolecules. Our objective was to develop a new solution to the 3D-GRP that uses non-procedural programming to realize the same advantages.

Results: In this paper, we present a three-step consensus method (called GeneRHi-C; pronounced "generic") for solving the 3D-GRP that combines new and existing techniques. (1) The dimensionality of the 3D-GRP is reduced by identifying a biologically plausible, ploidy-dependent subset of interactions from the Hi-C data. This is done by modelling the task as an optimization problem and solving it efficiently with an implementation in a non-procedural programming language. (2) A biological network (graph) is generated that represents the subset of interactions identified in the first step: genomic bins are represented as nodes, with weighted edges representing known and detected interactions. (3) The ForceAtlas 3D network layout algorithm is used to calculate (x, y, z) coordinates for each genomic region in the contact map. The resultant predicted genome organization represents the interactions of a population-averaged consensus structure. The overall workflow was tested with Hi-C data from Schizosaccharomyces pombe (fission yeast). The resulting 3D structure clearly recapitulated previously established features of fission yeast 3D genome organization.

Conclusion: Overall, GeneRHi-C demonstrates the power of non-procedural programming and graph-theoretic techniques for providing an efficient, generalizable solution to the 3D-GRP.

Project Homepage: https://github.com/kimmackay/GeneRHi-C
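The three-step workflow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and every choice below is an assumption made for illustration: the function names are hypothetical, the contact frequencies are invented, step 1 uses a greedy "keep each bin's strongest contact" rule as a stand-in for the optimization model, the "known" interactions are assumed to be links between linearly adjacent bins, and step 3 uses a generic spring and repulsion scheme rather than ForceAtlas.

```python
import math
import random


def select_interactions(contacts, k=1):
    """Step 1 (stand-in): keep each bin's k strongest contacts.
    A greedy heuristic, NOT the optimization model from the paper."""
    partners = {}
    for (i, j), freq in contacts.items():
        partners.setdefault(i, []).append((freq, j))
        partners.setdefault(j, []).append((freq, i))
    kept = {}
    for i, candidates in partners.items():
        for freq, j in sorted(candidates, reverse=True)[:k]:
            kept[(min(i, j), max(i, j))] = freq
    return kept


def build_graph(n_bins, selected, backbone_weight=1.0):
    """Step 2: bins are nodes; edges are backbone links between adjacent
    bins (assumed "known") plus the selected Hi-C ("detected") contacts."""
    edges = {(i, i + 1): backbone_weight for i in range(n_bins - 1)}
    for pair, freq in selected.items():
        edges[pair] = edges.get(pair, 0.0) + freq
    return edges


def layout_3d(n_bins, edges, iterations=200, step=0.02, seed=0):
    """Step 3 (stand-in): generic force-directed layout in 3D.
    All node pairs repel; nodes joined by an edge attract in
    proportion to the edge weight. This is not ForceAtlas."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n_bins)]
    for _ in range(iterations):
        force = [[0.0, 0.0, 0.0] for _ in range(n_bins)]
        for a in range(n_bins):
            for b in range(a + 1, n_bins):
                delta = [pos[a][d] - pos[b][d] for d in range(3)]
                dist = max(math.sqrt(sum(c * c for c in delta)), 1e-6)
                for d in range(3):
                    push = delta[d] / dist ** 3  # inverse-square repulsion
                    force[a][d] += push
                    force[b][d] -= push
        for (a, b), weight in edges.items():
            for d in range(3):
                pull = weight * (pos[b][d] - pos[a][d])
                force[a][d] += pull
                force[b][d] -= pull
        for a in range(n_bins):
            for d in range(3):
                # Clamp each move so close pairs cannot blow up the layout.
                pos[a][d] += max(-0.1, min(0.1, step * force[a][d]))
    return pos


# Invented contact frequencies between five genomic bins (toy data).
contacts = {(0, 3): 0.9, (0, 2): 0.2, (2, 4): 0.6, (1, 3): 0.4, (1, 4): 0.5}
selected = select_interactions(contacts)
edges = build_graph(5, selected)
coords = layout_3d(5, edges)
for bin_index, (x, y, z) in enumerate(coords):
    print(bin_index, round(x, 3), round(y, 3), round(z, 3))
```

On this toy input the selection step drops the two contacts that are not the strongest for either of their endpoints, so the graph passed to the layout has four backbone edges and three detected edges. A real pipeline would replace the greedy rule with a constraint or integer programming model and the layout with an actual ForceAtlas implementation.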