Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics: Latest Publications

Fast and Highly Scalable Bayesian MDP on a GPU Platform
He Zhou, S. Khatri, Jiang Hu, Frank Liu, C. Sze
By employing the Optimal Bayesian Robust (OBR) policy, Bayesian Markov Decision Process (BMDP) can be used to solve the Gene Regulatory Network (GRN) control problem. However, due to the "curse of dimensionality", the data storage limitation hinders the practical applicability of the BMDP. To overcome this impediment, we propose a novel Duplex Sparse Storage (DSS) scheme in this paper, and develop a BMDP solver with the DSS scheme on a heterogeneous GPU-based platform. The simulation results demonstrate that our approach achieves a 5x reduction in memory utilization with a 2.4% "decision difference" and an average speedup of 4.1x compared to the full matrix based storage scheme. Additionally, we present the tradeoff between the runtime and result accuracy for our DSS techniques versus the full matrix approach. We also compare our results with the well known Compressed Sparse Row (CSR) approach for reducing memory utilization, and discuss the benefits of DSS over CSR.
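The abstract above uses Compressed Sparse Row (CSR) as its baseline for reducing memory use. As a reminder of what that baseline looks like, here is a minimal pure-Python sketch of CSR storage and a CSR matrix-vector product. It is only an illustration of the standard CSR layout, not the authors' DSS scheme or their GPU solver; the function names `dense_to_csr` and `csr_matvec` and the small example matrix are hypothetical.

```python
def dense_to_csr(matrix):
    """Convert a dense row-major matrix (list of lists) to CSR arrays.

    Returns (values, col_indices, row_ptr), where row i occupies
    values[row_ptr[i]:row_ptr[i + 1]].
    """
    values, col_indices, row_ptr = [], [], [0]
    for row in matrix:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)        # keep only the non-zero entries
                col_indices.append(j)   # remember the column of each entry
        row_ptr.append(len(values))     # mark where the next row starts
    return values, col_indices, row_ptr


def csr_matvec(values, col_indices, row_ptr, vec):
    """Multiply a CSR matrix by a dense vector."""
    result = []
    for i in range(len(row_ptr) - 1):
        total = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            total += values[k] * vec[col_indices[k]]
        result.append(total)
    return result


# Hypothetical 3-state transition matrix with several zero entries.
P = [
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
    [0.25, 0.0, 0.75],
]
values, col_indices, row_ptr = dense_to_csr(P)
product = csr_matvec(values, col_indices, row_ptr, [1.0, 2.0, 3.0])
```

CSR stores one value and one column index per non-zero entry plus one row pointer per row, so the saving over a full matrix grows with the fraction of zero entries.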
DOI: https://doi.org/10.1145/3107411.3107440 (published 2017-08-20)
Citations: 4
Discovering Inconsistencies in PubMed Abstracts through Ontology-Based Information Extraction
Nisansa de Silva, D. Dou, Jingshan Huang
Searching for a cure for cancer is one of the most vital pursuits in modern medicine. In that respect, microRNA research plays a key role. Keeping track of the shifts and changes in established knowledge in the microRNA domain is very important. In this paper, we introduce an Ontology-Based Information Extraction method to detect occurrences of inconsistencies in microRNA research paper abstracts. We propose a method to first use the Ontology for MIcroRNA Targets (OMIT) to extract triples from the abstracts. Then we introduce a new algorithm to calculate the oppositeness of these candidate relationships. Finally, we present the discovered inconsistencies in an easy-to-read manner to be used by medical professionals. To the best of our knowledge, this study is the first ontology-based information extraction model introduced to find shifts in the established knowledge in the medical domain using research paper abstracts. We downloaded 36,877 abstracts from the PubMed database. From those, we found 102 inconsistencies relevant to the microRNA domain.
DOI: https://doi.org/10.1145/3107411.3107452 (published 2017-08-20)
Citations: 20
Varsimlab: A Docker-based Pipeline to Automatically Synthesize Short Reads with Genomic Aberrations
Abdelrahman Hosny, Fatima Zare, S. Nabavi
Individuals of a species have similar characteristics but they are rarely identical because of genomic variations. One of the important genomic variations is structural variation (SV), including copy number variation (CNV), which is a result of amplifications or deletions of genomic regions. It has been shown that SV plays an important role in phenotypic diversity and evolution. A genome also encompasses other aberrations such as Single Nucleotide Polymorphisms (SNPs) and small insertions and deletions (Indels). Although genetic variations contribute to our uniqueness, they can comprise critical developmental genes leading to gene dosage imbalances, new gene creation, and gene structure reshaping that ultimately may result in disease. Understanding the mechanisms of structural variation formation helps us better understand human phenotypic diversity, evolution and disease susceptibility. Computational tools have been developed for genomic variation detection using next-generation sequencing (NGS) data. However, with no prior knowledge about variants in real samples, the tools that are used for detection and analysis have been hindered by the lack of a gold standard benchmark. Some multi-variant simulators, such as SInC and SCNVSim, have been developed for whole genome sequencing (WGS) data. However, they are not easy to use and technical skills are required to run them. Moreover, those simulators only apply genomic variations to a reference file, and other software tools, such as the ART simulator, need to be used to generate the sequenced short reads. We have developed a user-friendly automated pipeline, VarSimLab, which offers an integrated web-based suite to simulate structural variations and also to generate WGS and WES short reads. It utilizes some of the existing tools and packages them into a standard Docker image; Docker is an open-source technology used to package applications and their dependencies into a standardized software container.
VarSimLab automates the process of simulating tumor genotypes such as SNPs, Indels, CNVs, transition/transversion, ploidy and tumor sub-clones, and of generating short reads. Thanks to the Docker technology, the pipeline is platform-independent and very easy for non-technical scientists to use from a web browser. VarSimLab is designed to grow as a full suite of integrated tools to analyze genomic aberrations.
DOI: https://doi.org/10.1145/3107411.3108188 (published 2017-08-20)
Citations: 0
SeqAnt: Cloud-Based Whole-Genome Annotation and Search
Alex V. Kotlar, Cristina E. Trevino, M. Zwick, D. Cutler, T. Wingo
Describing, prioritizing, and selecting alleles from large sequencing experiments remains technically challenging. SeqAnt (https://seqant.emory.edu) is the first online, cloud-based application that makes these tasks accessible for non-programmers, even for terabyte-sized experiments containing thousands of whole-genome samples. It rapidly describes the alleles found within submitted VCF files, and then indexes the results in a natural-language search engine, which enables users to locate alleles of interest in milliseconds using normal English phrases. Our results show that SeqAnt decreases processing time by orders of magnitude and that its search engine can be used to precisely identify alleles by phenotype, genomic structure, and population genetics characteristics.
DOI: https://doi.org/10.1145/3107411.3108231 (published 2017-08-20)
Citations: 2
Session details: Session 14: Integrative Methods for Genomic Data
M. Masseroli
DOI: https://doi.org/10.1145/3254557 (published 2017-08-20)
Citations: 0
Geometric Sampling Framework for Exploring Molecular Walker Energetics and Dynamics
Bruna Jacobson, Jon Christian L. David, Mitchell C. Malone, Kasra Manavi, S. Atlas, Lydia Tapia
The motor protein kinesin is a remarkable natural nanobot that moves cellular cargo by taking 8 nm steps along a microtubule molecular highway. Understanding kinesin's mechanism of operation continues to present considerable modeling challenges, primarily due to the millisecond timescale of its motion, which prohibits fully atomistic simulations. Here we describe the first phase of a physics-based approach that combines energetic information from all-atom modeling with a robotic framework to enable kinetic access to longer simulation timescales. Starting from experimental PDB structures, we have designed a computational model of the combined kinesin-microtubule system represented by the isosurface of an all-atom model. We use motion planning techniques originally developed for robotics to generate candidate conformations of the kinesin head with respect to the microtubule, considering all six degrees of freedom of the molecular walker's catalytic domain. This efficient sampling technique, combined with all-atom energy calculations of the kinesin-microtubule system, allows us to explore the configuration space in the vicinity of the kinesin binding site on the microtubule. We report initial results characterizing the energy landscape of the kinesin-microtubule system, setting the stage for an efficient, graph-based exploration of kinesin preferential binding and dynamics on the microtubule, including interactions with obstacles.
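The abstract above treats the kinesin head as a rigid body with six degrees of freedom (three translational, three rotational) relative to the microtubule. The sketch below shows one simple way to draw such candidate poses: a uniform translation inside a cube plus a uniformly distributed random rotation expressed as a unit quaternion. This is an illustrative baseline only. The abstract does not state which sampler the authors use, and the function name `sample_pose`, the cubic search region, and its 2 nm half-extent are assumptions made for the example.

```python
import math
import random


def sample_pose(rng, half_extent_nm):
    """Draw one random rigid-body pose with six degrees of freedom.

    Returns (translation, quaternion):
      translation: (x, y, z), uniform in a cube of the given half-extent
      quaternion:  unit quaternion, uniformly distributed over rotations
    """
    translation = tuple(
        rng.uniform(-half_extent_nm, half_extent_nm) for _ in range(3)
    )
    # Uniform random rotation from three uniform numbers in [0, 1).
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    quaternion = (
        math.sqrt(1.0 - u1) * math.sin(2.0 * math.pi * u2),
        math.sqrt(1.0 - u1) * math.cos(2.0 * math.pi * u2),
        math.sqrt(u1) * math.sin(2.0 * math.pi * u3),
        math.sqrt(u1) * math.cos(2.0 * math.pi * u3),
    )
    return translation, quaternion


rng = random.Random(0)  # fixed seed so the sketch is reproducible
poses = [sample_pose(rng, half_extent_nm=2.0) for _ in range(100)]
```

In a pipeline like the one described, each sampled pose would then be scored with an energy calculation; that step is outside the scope of this sketch.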
DOI: https://doi.org/10.1145/3107411.3107503 (published 2017-08-20)
Citations: 1
Stochastic Process Model and Its Applications to Analysis of Longitudinal Data
I. Zhbannikov, K. Arbeev
Longitudinal studies are widely used in medicine, biology, population health and other areas related to bioinformatics. A broad spectrum of methods for joint analysis of longitudinal and time-to-event (survival) data has been proposed in the last few decades. The stochastic process model (SPM) represents one possible framework for modelling the joint evolution of repeatedly measured variables and time-to-event outcomes typically observed in longitudinal studies. SPM is applicable for analyses of longitudinal data in many research areas such as demography and medicine and allows researchers to utilize the full potential of longitudinal data by evaluating dynamic mechanisms of changing physiological variables with time (age), allowing the study of differences, for example, in genotype-specific hazards. SPM allows incorporation of available knowledge about regularities of aging-related changes in the human body for addressing fundamental problems of changes in resilience and physiological norms. It permits evaluating mechanisms that indirectly affect longitudinal trajectories of physiological variables using data on mortality or onset of diseases. In this tutorial we explain the basic concepts of SPM, its current state and possible applications, and the corresponding software tools, and show practical examples of joint analysis of longitudinal and time-to-event data with this methodology.
DOI: https://doi.org/10.1145/3107411.3107496 (published 2017-08-20)
Citations: 0
TINTIN: Exploiting Target Features for Signaling Network Similarity Computation and Ranking
Huey-Eng Chua, S. Bhowmick, L. Tucker-Kellogg
Network similarity ranking attempts to rank a given set of networks based on their "similarity" to a reference network. State-of-the-art approaches tend to be general in the sense that they can be applied to networks in a variety of domains. Consequently, they are not designed to exploit domain-specific knowledge to find similar networks, although such knowledge may yield interesting insights that are unique to specific problems, paving the way to solutions that are more effective. We propose Tintin, which uses a novel target feature-based network similarity distance for ranking similar signaling networks. In contrast to state-of-the-art network similarity techniques, Tintin considers both topological and dynamic features in order to compute network similarity. Our empirical study on signaling networks from BioModels with real-world curated outcomes reveals that Tintin ranking is different from state-of-the-art approaches.
DOI: https://doi.org/10.1145/3107411.3107470 (published 2017-08-20)
Citations: 0
Constraints On Signaling Networks Logic Reveal Functional Subgraphs On Multiple Myeloma OMIC Data
Bertrand Miannay, S. Minvielle, O. Roux, F. Magrangeas, Carito Guziolowski
The integration of gene expression profiles (GEPs) and large-scale biological networks derived from pathway databases is a subject that is being widely explored. Existing methods are based on network distance measures among significantly measured species. Only a small number of them include the directionality and underlying logic existing in biological networks. In this study we approach the GEP-network integration problem by considering the network logic, but our approach does not require a prior selection of species according to their gene expression level. We start by modeling the biological network, representing its underlying logic using Logic Programming. This model points to reachable discrete network states that maximize a notion of harmony between the possible active or inactive states of the molecular species and the directionality of the pathway reactions according to their activator or inhibitor control role. Only then do we confront these network states with the GEP. From this analysis, independent graph components are derived, each of them related to a fixed and optimal assignment of active or inactive states. These components allow us to decompose a large-scale network into subgraphs, and their molecular species state assignments have different degrees of similarity when compared to the same GEP. We applied our method to study the set of possible states derived from a subgraph from the NCI-PID Pathway Interaction Database. This graph linked Multiple Myeloma (MM) genes to known receptors for this blood cancer.
基因表达谱(GEPs)与pathway数据库衍生的大规模生物网络的整合是一个正在被广泛探索的课题。现有的方法是基于显著测量物种之间的网络距离测量。其中只有一小部分包含生物网络中存在的方向性和底层逻辑。在本研究中,我们通过考虑网络逻辑来解决gep -网络整合问题,但我们的方法不需要根据基因表达水平预先选择物种。我们首先使用逻辑编程对生物网络进行建模,表示其底层逻辑。该模型指出了可达的网络离散状态,这些状态最大化了分子物种活性或非活性可能状态之间的和谐概念,以及根据其激活剂或抑制剂控制作用的途径反应的方向性。只有这样,我们才能用GEP来面对这些网络状态。从这个分析中,导出了独立的图组件,每个组件都与活动或非活动状态的固定和最佳分配有关。这些组件允许我们将大规模网络分解成子图,并且与相同的GEP相比,它们的分子物种状态分配具有不同程度的相似性。我们应用我们的方法来研究从NCI-PID路径交互数据库的子图中导出的可能状态集。这张图表将多发性骨髓瘤(MM)基因与这种血癌的已知受体联系起来。
DOI: https://doi.org/10.1145/3107411.3110411; Published: 2017-08-20
Citations: 2
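The "harmony" notion in the abstract of "Constraints On Signaling Networks Logic Reveal Functional Subgraphs On Multiple Myeloma OMIC Data" can be illustrated with a toy example. The sketch below is not the authors' Logic Programming model: it assumes that an edge is harmonious when the target's state equals the edge sign times the source's state, and it replaces logic programming with brute-force enumeration. The network `EDGES` and the function names are hypothetical.

```python
from itertools import product

# Hypothetical signed, directed network: (source, target, sign),
# with sign = +1 for an activator and -1 for an inhibitor.
EDGES = [("R", "A", +1), ("A", "B", +1), ("A", "C", -1)]

def harmony(state, edges):
    """Count edges whose target state agrees with its regulator (assumed definition).

    state maps each species to +1 (active) or -1 (inactive).
    """
    return sum(1 for src, dst, sign in edges if state[dst] == sign * state[src])

def best_states(edges):
    """Enumerate every assignment and keep those with maximal harmony."""
    species = sorted({n for src, dst, _ in edges for n in (src, dst)})
    scored = []
    for values in product((+1, -1), repeat=len(species)):
        state = dict(zip(species, values))
        scored.append((harmony(state, edges), state))
    top = max(score for score, _ in scored)
    return top, [state for score, state in scored if score == top]

top, states = best_states(EDGES)
```

Enumeration is exponential in the number of species, so it only suits toy networks; it is used here purely to make the scoring idea concrete.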
Evolving Conformation Paths to Model Protein Structural Transitions
Emmanuel Sapin, K. D. Jong, Amarda Shehu
Proteins are dynamic biomolecules. A structure-by-structure characterization of a protein's transition between two different functional structures is central to elucidating the role of dynamics in modulating protein function and designing therapeutic drugs. Characterizing transitions challenges both dry and wet laboratories. Some computational methods compute discrete representations of the energy landscape that organizes structures of a protein by their potential energies. The representations support queries for paths (series of structures) connecting start and goal structures of interest. Here we address the problem of modeling protein structural transitions under the umbrella of stochastic optimization and propose a novel evolutionary algorithm (EA). The EA evolves paths without reconstructing the energy landscape, addressing two competing optimization objectives, energetic cost and structural resolution. Rather than seek one path, the EA yields an ensemble of paths to represent a transition. Preliminary applications suggest the EA is effective while operating under a reasonable computational budget.
DOI: https://doi.org/10.1145/3107411.3107498; Published: 2017-08-20
Citations: 0
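The abstract of "Evolving Conformation Paths to Model Protein Structural Transitions" names two competing objectives, energetic cost and structural resolution, and an ensemble of paths rather than a single one. The sketch below shows one way such a trade-off could be handled, using a Pareto (non-dominated) filter. The objective definitions are assumptions made for illustration, not the paper's: cost is taken as the total uphill energy along a path, and resolution as the largest distance between consecutive structures. All names and data are hypothetical.

```python
import math

def energetic_cost(energies):
    """Assumed cost: total uphill energy climbed along the path."""
    return sum(max(0.0, b - a) for a, b in zip(energies, energies[1:]))

def structural_resolution(coords):
    """Assumed resolution: largest distance between consecutive structures
    (a smaller value means a finer-grained path)."""
    return max(math.dist(p, q) for p, q in zip(coords, coords[1:]))

def dominates(u, v):
    """u dominates v if it is no worse in every objective and better in one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def pareto_front(scores):
    """Return the indices of the non-dominated score tuples."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]

# Hypothetical paths: (2-D stand-ins for structures, energy of each structure).
paths = [
    ([(0, 0), (1, 0), (2, 0)], [0.0, 3.0, 1.0]),
    ([(0, 0), (2, 0)],         [0.0, 1.0]),
    ([(0, 0), (1, 1), (2, 0)], [0.0, 4.0, 1.0]),
]
scores = [(energetic_cost(e), structural_resolution(c)) for c, e in paths]
front = pareto_front(scores)
```

Keeping the whole non-dominated set, instead of collapsing both objectives into one weighted score, is what lets a method return an ensemble of paths that trade cost against resolution.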
Journal
Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics