
IEEE International Conference on Bioinformatics and Biomedicine workshops. IEEE International Conference on Bioinformatics and Biomedicine: latest articles

High performance computational biology and drug design on TianHe Supercomputers
Shaoliang Peng
Extremely powerful computers are needed to help scientists handle high-performance computational biology and drug design problems. BGI, the world's largest genomics institute, currently generates 6 TB of data each day. The European Bioinformatics Institute (EBI) in Hinxton currently stores 20 petabytes (1 petabyte is 10^15 bytes) of data and backups about genes, proteins, and small molecules. TianHe supercomputers can speed up computational biology and drug design processing. In 2013, 2014, and 2015, Tianhe-2 topped the TOP500 list of the fastest supercomputers in the world. Many well-known bioinformatics and drug design software packages (BWA, DOCK, SOAP3-dp, SOAPdenovo, SOAPsnp, etc.) are developed and run on TH-2.
DOI: https://doi.org/10.1109/BIBM.2016.7822480. Published 2016-12-01.
Citations: 1
Networks and models for the integrated analysis of multi omics data
Sun Kim
These days, genome-wide measurements of genetic and epigenetic events, a.k.a. omics data, are routinely produced; epigenetics refers to the control mechanisms of genetic events, as the prefix epi- means 'on' or 'upon'. As a result, a huge amount of omics data measured from different genetic and epigenetic events is available. For example, the amount of data at The Cancer Genome Atlas (TCGA) alone exceeded 2.5 petabytes as of October 2016. Unfortunately, the dimensionality of omics data is huge, typically tens to hundreds of thousands or even millions of features, while the number of samples is limited, typically a few to a few thousand. Thus mining genetic and epigenetic data measured in different phenotype conditions is a very challenging problem: small data sets in extremely high dimensions. Furthermore, all genetic and epigenetic events are inter-related. Thus it is necessary to perform integrated analysis of omics data sets of different types, which is even more challenging. To address these technical challenges, the bioinformatics community has used virtually all known network-based analysis techniques, including recently developed deep neural networks. My group has been working on network-based integrated analysis of omics data at three different levels. First, we have been investigating computational methods for associating different genetic and epigenetic events, which can be viewed as methods for defining edges in the network. Second, we have been developing methods for mining subnetworks along the phenotype and time dimensions. Third, we have recently begun to investigate the use of deep learning techniques for the integrated analysis of omics data.
An important goal of our research is to combine network analysis and deep learning techniques to construct models or draw maps of cancer cells at multiple levels, such as genomic mutations, gene activation/suppression, epigenetic events including DNA methylation, histone modifications, and miRNA interference, biological pathways, and finally the whole-cell level, including tumor heterogeneity and clonal evolution.
DOI: https://doi.org/10.1109/BIBM.2016.7822479. Published 2016-12-01.
Citations: 0
Deep-Learning: Investigating feed-forward deep Neural Networks for modeling high throughput chemical bioactivity data
Jun Huan
In recent years, research in Artificial Neural Networks (ANNs) has resurged, now under the Deep-Learning umbrella, and has grown extremely popular due to major breakthroughs in methodology and computing capabilities. Deep-Learning methods belong to the family of representation-learning algorithms that attempt to extract and organize discriminative information from the data. The recently reported success of DL techniques in crowd-sourced QSAR and predictive toxicology competitions has showcased these methods as powerful tools for drug-discovery and toxicology research. Nevertheless, reported applications of Deep Learning techniques for modeling complex bioactivity data for small molecules remain limited. In this talk I will present our recent work on optimizing the hyperparameters of feed-forward Deep Neural Nets (DNNs) and on evaluating the performance of these methods compared to shallow methods. In our study, arbitrarily but reasonably selected configurations of 48 DNNs, 24 Random Forests (RF), 20 SVMs, and 6 Naive Bayes (NB) classifiers were compared on 7 diverse bioactivity datasets assembled from the ChEMBL repository, with circular fingerprints as molecular descriptors. The non-parametric Wilcoxon paired signed-rank test was employed to compare the performance of DNNs to RF, SVM, and NB. Overall, DNNs with 2 hidden layers, 2,000 neurons per hidden layer, the ReLU activation function, and Dropout regularization achieved strong classification performance across all tested datasets. Our results demonstrate that DNNs are powerful techniques for modeling complex bioactivity data.
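The paired comparison mentioned in the abstract can be illustrated with a small sketch of the Wilcoxon signed-rank statistic. This is not the authors' code: the function name `wilcoxon_signed_rank` and the per-dataset scores are hypothetical, chosen only to show how paired results on 7 datasets would be ranked. It computes the rank sums only; turning them into a p-value would need a table or a statistics library.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, n) for paired samples x and y.

    Zero differences are dropped; tied absolute differences share
    the average of the ranks they occupy.
    """
    # Rounding guards against floating-point noise creating spurious non-ties.
    diffs = [round(a - b, 9) for a, b in zip(x, y)]
    diffs = [d for d in diffs if d != 0]

    # Indices of the differences, ordered by absolute value.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Hypothetical scores for two methods on 7 datasets (illustrative only).
dnn_scores = [0.91, 0.88, 0.93, 0.85, 0.90, 0.87, 0.92]
rf_scores = [0.89, 0.86, 0.90, 0.86, 0.87, 0.83, 0.91]
w_plus, w_minus, n = wilcoxon_signed_rank(dnn_scores, rf_scores)
```

A large gap between the two rank sums suggests that one method tends to outperform the other across datasets; with only 7 pairs, the test has limited power, so such a result should be read cautiously.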
DOI: https://doi.org/10.1109/BIBM.2016.7822478. Published 2016-12-01.
Citations: 0
Computational psychophysiology based research methodology for mental health
Bin Hu
Computational psychophysiology is a new direction that broadens the field of psychophysiology by allowing for the identification and integration of multimodal signals to test specific models of mental states and psychological processes. Additionally, such approaches allow for the extraction of multiple signals from large-scale multidimensional data, with a greater ability to differentiate signals embedded in background noise. Further, these approaches allow for a better understanding of the complex psychophysiological processes underlying brain disorders such as autism spectrum disorder, depression, and anxiety. Given the widely acknowledged limitations of psychiatric nosology and the limited treatment options available, new computational models may provide the basis for a multidimensional diagnostic system and potentially new treatment approaches.
DOI: https://doi.org/10.1109/BIBM.2016.7822474. Published 2016-12-01.
Citations: 0
Multi-omic approaches for liver cancer biomarker discovery
Habtom W. Resson
Omic technologies offer the opportunity to characterize liver cancer at various molecular levels. In particular, characterizing the association of biomolecules such as metabolites and glycoproteins with liver cancer is a promising strategy to discover clinically relevant biomarkers. Metabolites are molecular fingerprints of what cells do at a particular point in time; they can reveal early signs of cancers when the chances for cure are highest. Also, the analysis of protein glycosylation is relevant to liver pathology because of the major influence of this organ on the homeostasis of blood glycoproteins. This talk will focus on the application of multi-omic approaches to identify biomarkers for early detection of liver cancer in patients with liver cirrhosis. Specifically, I will present transcriptomic, proteomic, glycomic/glycoproteomic, and metabolomic (TPGM) studies we conducted by analysis of samples from HCC cases and cirrhotic controls using multiple omic platforms such as next generation sequencing, liquid chromatography-mass spectrometry (LC-MS), and gas chromatography-mass spectrometry (GC-MS). In addition to candidate biomarkers discovered by evaluating the changes in the levels of transcripts, proteins, glycans, and metabolites between HCC cases and cirrhotic controls, I will present network-based methods we developed for integrative analysis of multi-omic data to identify aberrant pathways/network activities and biomarkers for early detection of liver cancer.
DOI: https://doi.org/10.1109/BIBM.2016.7822481. Published 2016-12-01.
Citations: 0
An algorithmic-information calculus for reprogramming biological networks
H. Zenil
Despite extensive attempts to characterize systems and networks using metrics drawn from traditional statistics, Shannon entropy, and graph theory, understanding systems and networks well enough to reveal their causal mechanisms without making too many unjustified assumptions remains one of the greatest challenges in complexity science and in science in general, especially beyond traditional statistics and so-called machine learning. Knowing the causal mechanisms that govern a system allows not only the prediction of the system's behavior but also the manipulation and controlled reprogramming of the system. Here we introduce a formal interventional calculus based upon universal principles drawn from the theory of computability and algorithmic probability, thereby enabling better approaches to the question of causal discovery. By performing sequences of fully controlled perturbations, changes in the algorithmic content of a system can be classified by their effects, according to whether they shift the system towards or away from algorithmic randomness, thereby inducing a ranking of the system's elements. This spectral dimension unmasks an algorithmic separation between components, conditioned upon the perturbations, and endows us with a suite of powerful parameter-free algorithms to reprogram the system's underlying program. The predictive and explanatory power of these novel conceptual tools is introduced, and numerical experiments are illustrated on various types of networks. We show how the algorithmic content of a network is connected to its possible dynamics and how the instant variation of the sensitivity, depth, and number of attractors in a network is accessible through an analysis of its algorithmic information landscape. The results demonstrate how to unveil causal mechanisms to infer essential properties, including the dynamics of evolving networks.
We introduce measures and methods for system reprogrammability even with no, or limited, access to the system's kinetic equations or probability distributions. We expect this interventional calculus to be broadly applicable to predictive causal interventions, and we anticipate that it will be instrumental in the challenge of causal discovery in science from complex data.
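The perturbation-and-ranking idea can be sketched on a toy graph. The abstract's calculus relies on estimates grounded in algorithmic probability; the sketch below swaps in the zlib-compressed length of the adjacency matrix as a much cruder stand-in, so it only illustrates the workflow (perturb one element, measure the shift, rank). The function names `complexity` and `rank_edges_by_shift` are hypothetical and not taken from the source.

```python
import zlib


def complexity(n, edges):
    """Compressed size in bytes of the n x n adjacency matrix.

    Assumption: compressed length is used here only as a rough,
    illustrative proxy for algorithmic information content.
    """
    edge_set = {frozenset(e) for e in edges}
    bits = "".join(
        "1" if frozenset((i, j)) in edge_set else "0"
        for i in range(n)
        for j in range(n)
    )
    return len(zlib.compress(bits.encode("ascii"), 9))


def rank_edges_by_shift(n, edges):
    """Delete each edge in turn and rank edges by the resulting shift.

    A negative shift means that removing the edge made the graph more
    compressible (a move away from randomness); a positive shift means
    the opposite.
    """
    base = complexity(n, edges)
    shifts = []
    for edge in edges:
        remaining = [e for e in edges if e != edge]
        shifts.append((complexity(n, remaining) - base, edge))
    return sorted(shifts)


# Toy example: a 12-node ring with one extra chord between nodes 0 and 6.
ring_with_chord = [(i, (i + 1) % 12) for i in range(12)] + [(0, 6)]
ranking = rank_edges_by_shift(12, ring_with_chord)
```

On graphs this small, a general-purpose compressor is very coarse and many edges may tie, so the ranking should be read as a demonstration of the procedure, not as a meaningful complexity estimate.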
DOI: https://doi.org/10.1109/BIBM.2016.7822485. Published 2016-12-01.
Citations: 0
The overview of research progress of the relationship between HBP and inspection information
D. Zhu, Xueping Li, Zhaoxia Xu, Yiqin Wang, Jin Xu
DOI: https://doi.org/10.1109/BIBM.2016.7822712. Published 2016-01-01, pp. 1341-1345.
Citations: 0
Computational challenges in microbiome research
Mihai Pop
Millions of bacteria make our bodies their home. They help keep us healthy, and disruptions in the normal microbiota are believed to contribute to a number of diseases. Cost-effective sequencing technologies have made it possible to sequence the genomes of human-associated microbial communities, leading to the birth of a new scientific discipline - metagenomics. Analyzing the resulting data, however, poses significant computational challenges, in part due to the sheer size of the data-sets, and in part due to the fact that most of the existing computational framework has been established for single organisms. In my talk I will outline several analytical challenges posed by metagenomic applications, and will describe recent results from my lab in the development of tools for analyzing metagenomic data. In particular I will discuss insights from our analysis of diarrheal disease in developing countries, as well as the effective use of co-abundance approaches for linking together data from two large metagenomic studies.
DOI: https://doi.org/10.1109/BIBM.2015.7359645. Published 2015-11-09.
Citations: 0
Big data in biomedicine - An NIH perspective
P. Bourne
Biomedical research is becoming increasingly data-driven, analytical, and hence digital. In recognition of this evolution, NIH has established the Office for Data Science, with trans-NIH responsibility for maximizing the value of this digital enterprise. This effort brings together communities, policy changes, and new infrastructure to be applied to existing and new areas of research such as precision medicine. We will review these changes from the perspective of research advances that are underway and highlight how this community can further engage in these activities.
DOI: https://doi.org/10.1109/BIBM.2015.7359644 · Published: 2015-11-09 · p. 1
Citations: 1
Exploration of topological torsion fingerprints
Skoda Petr, Hoksza David
The screening of chemical libraries is an important step in identifying new leads in the drug discovery process. The sheer size of existing chemical libraries makes laboratory screening expensive. A solution is to incorporate virtual screening into the process to reduce the number of molecules that must be screened in the wet lab. In this paper, we explore several modifications of topological torsion fingerprints, one of the best-performing methods for molecular representation in virtual screening campaigns. The modifications include changing the path length, altering the atom descriptors, and introducing the so-called field version of the descriptors. With the field-based modification, our version of topological torsion fingerprints improves the area under the curve (AUC) by up to four percent. The new topological torsion fingerprint thus represents one of the best-performing molecular representations available today.
DOI: https://doi.org/10.1109/BIBM.2015.7359792 · Published: 2015-11-09 · pp. 822-828
Citations: 1