2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): Latest Papers

mAMBER: A CPU/MIC collaborated parallel framework for AMBER on Tianhe-2 supercomputer
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822595
Shaoliang Peng, Xiaoyu Zhang, Yutong Lu, Xiangke Liao, Kai Lu, Canqun Yang, Jie Liu, Weiliang Zhu, Dongqing Wei
Molecular dynamics (MD) is a computer simulation method for studying the physical movements of atoms and molecules that provides detailed microscopic sampling on the molecular scale. With continuous effort and improvement, MD simulation has gained popularity in materials science, biochemistry and biophysics, with various application areas and an expanding data scale. Assisted Model Building with Energy Refinement (AMBER) is one of the most widely used software packages for conducting MD simulations. However, the speed of AMBER MD simulations for systems with millions of atoms on the microsecond scale still needs to be improved. In this paper, we propose a parallel acceleration strategy for AMBER on the Tianhe-2 supercomputer. The parallel optimization of AMBER is carried out on three different levels: fine-grained OpenMP parallelism on a single MIC, single-node CPU/MIC collaborative parallel optimization, and multi-node multi-MIC collaborative parallel acceleration. With these three levels of parallel acceleration, we achieved a maximum speedup of 25–33 times compared with the original program. Source Code: https://github.com/tianhe2/mAMBER
Citations: 4
Mathematical and computational analysis of CRISPR Cas9 sgRNA off-target homologies
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822558
M. Zhou, Daisy Li, X. Huan, Joseph Manthey, E. Lioutikova, Hong Zhou
The true power of the genome editing mechanism known as the RNA-programmable CRISPR Cas9 endonuclease system lies in the fact that Cas9 can be guided to any locus complementary to a 20-nt RNA, the single guide RNA (sgRNA), to cleave double-stranded DNA, which allows the introduction of desired mutations. Unfortunately, sgRNA is prone to off-target homologous attachment, guiding Cas9 to cleave DNA sequences at unwanted sites. Using the human genome and Streptococcus pyogenes Cas9 (SpCas9) as the example, this article analyzes the probabilities of off-target sites for sgRNAs and finds that for large genomes such as the human genome, off-target sites are nearly inevitable in sgRNA selection. Based on the mathematical analysis, the double nicking approach appears to be currently the only feasible solution for ensuring genome editing specificity. An effective computational algorithm for off-target homology searching is also implemented to confirm the mathematical analysis.
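To give a feel for why off-target sites become hard to avoid in a large genome, here is a minimal back-of-the-envelope sketch. It is not the paper's analysis: it assumes a uniformly random genome with independent bases, ignores the PAM requirement, and counts both strands. The function name `expected_offtarget_sites` and the genome size used in the usage line are illustrative assumptions.

```python
from math import comb


def expected_offtarget_sites(genome_size: int,
                             guide_len: int = 20,
                             max_mismatches: int = 3) -> float:
    """Expected number of genomic sites within `max_mismatches`
    substitutions of a guide, under a uniform random-genome assumption."""
    # Number of distinct sequences at Hamming distance <= max_mismatches
    # from the guide: choose the mismatched positions, 3 alternatives each.
    neighbours = sum(comb(guide_len, i) * 3 ** i
                     for i in range(max_mismatches + 1))
    # Probability that a random guide_len-mer is one of those sequences.
    p_hit = neighbours / 4 ** guide_len
    # Both strands are scanned, so there are about 2 * genome_size windows.
    return 2 * genome_size * p_hit


# Illustrative usage with an assumed genome size of about 3.1 billion bases.
print(expected_offtarget_sites(3_100_000_000, max_mismatches=3))
```

Under these simplifying assumptions the expected count grows quickly with the number of tolerated mismatches, which matches the qualitative conclusion of the abstract.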
Citations: 3
3D tracking swimming fish school using a master view tracking first strategy
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822572
Shuohong Wang, Xiang Liu, Jingwen Zhao, Ye Liu, Y. Chen
3D motion data of fish schools are more valuable than 2D data for behavioral and other research. This paper proposes a master-view-tracking-first strategy based on a novel master-slave camera setup. Fish are first tracked in 2D in the master view after being extracted by an eye-focused Gaussian Mixture Model (E-GMM) detector. Then, after fish in the slave views are localized using an eye-focused Gabor (E-Gabor) detector, 3D trajectories are reconstructed by associating the 2D tracking results in the master view with the detection results in the slave views. Experiments on data sets with different fish densities demonstrate that the proposed method outperforms two state-of-the-art methods on five evaluation metrics.
Citations: 11
An evaluation of data replication for bioinformatics workflows on NoSQL systems
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822644
Iasmini Lima, Matheus Oliveira, Diego S. Kieckbusch, M. Holanda, M. E. Walter, Aleteia P. F. Araujo, M. Victorino, Waldeyr M. C. Silva, Sérgio Lifschitz
Many research projects in bioinformatics may be viewed as scientific workflows. Biologists often run the same workflow multiple times with different parameters in order to refine their data analysis. These executions generate a large volume of files in different formats, which need to be stored for future evaluation. New database models, such as NoSQL systems, can be considered for dealing with large volumes of data, particularly in distributed systems. This work presents an assessment of the impact of data replication, based on the execution of scientific workflows, for two NoSQL database management systems: Cassandra and MongoDB.
Citations: 3
Sparse singular value decomposition-based feature extraction for identifying differentially expressed genes
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822795
Jin-Xing Liu, Xiangzhen Kong, C. Zheng, J. Shang, Wei Zhang
Recently, feature extraction and dimensionality reduction have become fundamental tools for many data mining tasks, especially for processing high-dimensional data such as genome data. In this paper, a new feature extraction method based on sparse singular value decomposition (SSVD) is developed. The SSVD algorithm is applied to extract differentially expressed genes from two different genome datasets, both from The Cancer Genome Atlas (TCGA), and the extracted genes are then evaluated with tools based on Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. As a gene extraction method, SSVD is also compared with existing feature extraction methods such as independent component analysis, p-norm robust feature extraction and sparse principal component analysis. The GO analysis results show that the SSVD method outperforms the competing algorithms. The KEGG analysis results identify genes that participate in cancer pathways. The experiments indicate that SSVD is an effective feature selection method compared with the competing methods. The KEGG analysis results may provide a meaningful reference for further study by professionals in biomedical science.
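As a rough illustration of how a sparse SVD can select features, the sketch below computes one sparse singular layer by alternating a soft-thresholded update of the right singular vector with a normalized update of the left one. This is a generic rank-1 scheme and not necessarily the formulation used in the paper; the function names, the penalty `lam`, and the convention that genes are the columns of `X` are all assumptions made for the example.

```python
import numpy as np


def soft_threshold(x, lam):
    """Shrink each entry of x toward zero by lam (L1 proximal operator)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def sparse_rank1_svd(X, lam=0.5, n_iter=100, tol=1e-8):
    """One sparse singular layer (u, s, v) of X with a sparse v.

    Columns of X with a nonzero entry in v are the selected features.
    """
    X = np.asarray(X, dtype=float)
    # Start from the leading pair of the ordinary SVD.
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        # Sparse update of the right vector, then renormalize.
        v_new = soft_threshold(X.T @ u, lam)
        v_norm = np.linalg.norm(v_new)
        if v_norm == 0.0:
            raise ValueError("lam is too large: every loading was shrunk to zero")
        v_new = v_new / v_norm
        # Dense update of the left vector.
        u_new = X @ v_new
        u_new = u_new / np.linalg.norm(u_new)
        done = np.linalg.norm(v_new - v) < tol
        u, v = u_new, v_new
        if done:
            break
    return u, float(u @ X @ v), v
```

In a gene-expression setting one would typically rank or select the columns whose entries in `v` are nonzero; how the penalty is tuned is left open here.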
Citations: 3
Modular reconfiguration of metabolic brain networks in health and cancer: A resting-state PET study
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822665
Zhijun Yao, Bin Hu, Xuejiao Chen, Yuanwei Xie, Lei Fang
Recent studies suggest that cognitive impairments and memory difficulties in cancer survivors are associated with topological changes in brain networks, particularly functional and structural abnormalities. However, little is known about the modular reconfiguration of the metabolic brain network in this population. In this study, we recruited 78 pre-treatment cancer patients and 80 age- and gender-matched normal controls (NCs), and constructed metabolic brain networks derived from resting-state 18F-fluorodeoxyglucose positron emission tomography (FDG-PET) to assess alterations of the modularity pattern in cancer. The participation index (PI) and mutual information (MI) were calculated for the cancer and NC groups. Compared with the NC group, one module composed of the hippocampus, the amygdala, and frontal and temporal regions was absent in the cancer group. Moreover, cancer patients showed an abnormal topological pattern in their metabolic networks (i.e., increased local efficiency and reduced global efficiency). Although node-wise PI was positively correlated with normalized metabolic uptake in both groups, higher energy consumption was observed in the metabolic network of the cancer group, which might indicate a reduced capability for information processing. In addition, the between-group MI gradually increased over a range of thresholds. Our results suggest that the modular pattern of the metabolic brain network appears to reshape its organization in cancer, which might help uncover the neurobiological mechanisms underlying cancer-related cognitive dysfunction.
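For readers unfamiliar with the participation index, the sketch below computes the widely used participation coefficient, PI_i = 1 - sum over modules m of (k_im / k_i)^2, where k_i is the degree of node i and k_im is the part of that degree linking to module m. It is assumed, not confirmed by the abstract, that the paper follows this standard definition; the function name and the toy graph in the usage lines are illustrative.

```python
import numpy as np


def participation_index(adj, modules):
    """Participation coefficient of every node.

    adj     : symmetric (weighted or binary) adjacency matrix
    modules : module label of each node
    Returns 0 for nodes whose links stay inside one module and values
    closer to 1 for nodes whose links are spread across many modules.
    """
    adj = np.asarray(adj, dtype=float)
    modules = np.asarray(modules)
    degree = adj.sum(axis=1)
    pi = np.ones(len(adj))
    for m in np.unique(modules):
        # Degree of each node restricted to module m.
        k_im = adj[:, modules == m].sum(axis=1)
        ratio = np.divide(k_im, degree,
                          out=np.zeros_like(degree), where=degree > 0)
        pi -= ratio ** 2
    # Isolated nodes have no links to distribute; report 0 for them.
    pi[degree == 0] = 0.0
    return pi


# Toy graph: modules {0, 1} and {2, 3}, with edges 0-1, 2-3 and a bridge 1-2.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
print(participation_index(adj, [0, 0, 1, 1]))
```

In the toy graph only the two bridge nodes have links in both modules, so they are the only nodes with a nonzero index.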
Citations: 2
COLT: COnstrained Lineage Tree Generation from sequence data
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822500
Keke Chen, Venkata Sai Abhishek Gogu, Di Wu, Jiang Ning
Lineage analysis has been an important method for understanding the mutation patterns and the diversity of genes, such as antibodies. A mutation lineage is typically represented as a tree structure, describing the possible mutation paths. Generating lineage trees from sequence data imposes two unique challenges: (1) Types of constraints might be defined on top of sequence data and tree structures, which have to be appropriately formulated and maintained by the algorithms. (2) Enumerating all possible trees that satisfy constraints is typically computationally intractable. In this paper, we present a COnstrained Lineage Tree generation framework (COLT) that builds lineage trees from sequences, based on local and global constraints specified by domain experts and heuristics derived from the mutation processes. Our formal analysis and experimental results show that this framework can efficiently generate valid lineage trees, while strictly satisfying the constraints specified by domain experts.
Citations: 3
Weighted multiview learning for predicting drug-disease associations
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822603
S. N. Chandrasekaran, Jun Huan
The paradigm of drug discovery has moved from finding new drugs that exhibit therapeutic properties for a disease to reusing existing approved drugs for a newer disease. The association between a drug and a disease involves a complex network of targets and pathways. To provide new insights, there has been a constant need for sophisticated tools that have the potential to discover new associations from the underlying drug-disease interactions. In addition to computational tools, there has been an explosion of available data on drugs, diseases and their activity profiles. On the one hand, researchers have been using existing machine learning tools that have shown great promise in predicting associations; on the other hand, there has been a void in exploiting advanced machine learning frameworks to handle this kind of data integration. In this paper, we propose a learning framework called weighted multi-view learning, a variant of the multi-view learning framework. Whereas standard multi-view learning assumes that the views contribute equally to the prediction, our method learns a weight for each view, since we hypothesize that certain views might have better prediction capability than others.
Citations: 4
Multi-label classification for intelligent health risk prediction
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822657
Runzhi Li, Hongling Zhao, Yusong Lin, Andrew S. Maxwell, Chaoyang Zhang
A Multi-Label Problem Transformation Joint Classification (MLPTJC) method is developed to solve the multi-label classification problem of health and disease risk prediction based on physical examination records. We adopt a problem transformation method to transform the multi-label classification problem into a multi-class classification problem. We then propose a Joint Decomposition Subset Classifier method that reduces the infrequent label sets to deal with the imbalanced learning problem. Based on MLPTJC, existing cost-sensitive multi-class classification algorithms can be used to train the prediction models. We conduct experiments to evaluate the performance of the MLPTJC method. The Support Vector Machine (SVM) and Random Forest (RF) algorithms are used for multi-class classification learning. We use 10-fold cross-validation and metrics such as Average Accuracy, Precision, Recall and F-measure to evaluate performance. Real physical examination records were used, comprising 62 examination items and 110,300 anonymous patients, and eight types of diseases were predicted. The experimental results show that the MLPTJC method has better performance in terms of accuracy.
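The transformation from a multi-label to a multi-class problem can be sketched with the generic label-powerset idea, in which every distinct set of labels becomes one class. This is only an illustration of the general technique: the paper's Joint Decomposition Subset Classifier step for infrequent label sets is not reproduced, and the function name and disease labels below are hypothetical.

```python
def label_powerset(label_sets):
    """Map each distinct label set to a single integer class id.

    label_sets : iterable of label collections, one per record
    Returns (classes, mapping) where classes[i] is the class id of record i
    and mapping goes from frozenset of labels to class id.
    """
    mapping = {}
    classes = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in mapping:
            # Assign ids in order of first appearance.
            mapping[key] = len(mapping)
        classes.append(mapping[key])
    return classes, mapping


# Hypothetical records: each patient carries zero or more disease labels.
records = [{"hypertension"},
           {"diabetes", "hypertension"},
           set(),
           {"hypertension"}]
classes, mapping = label_powerset(records)
print(classes)
```

Any multi-class learner can then be trained on `classes`; rare label sets produce very small classes, which is the imbalance issue the abstract mentions.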
Citations: 11
Effects of propafenone on KCNH2-linked short QT syndrome: A modelling study
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822744
Cunjin Luo, Kuanquan Wang, Henggui Zhang
Genetically identified short QT syndrome (SQTS) is associated with an increased risk of arrhythmia and sudden death. This study investigated the potential effects of propafenone on KCNH2-linked short QT syndrome (SQT1) using the multi-scale, biophysically detailed model of the heart developed by ten Tusscher and Panfilov. Ion channel conductances were reduced to simulate the pharmacological effects of propafenone in healthy and SQT1 cells. Based on the experimental data of McPate et al., the effect of propafenone was modelled as a dose-dependent block of IKr. Action potential (AP) profiles and simulations at the 1D tissue level were analyzed to predict the effects of propafenone on SQT1. Both low and high doses of propafenone prolonged the action potential duration (APD) and the QT interval in SQT1 cells, which suggests that a high dose of propafenone is more effective in SQT1. In healthy cells, however, propafenone did not significantly alter the APD or QT interval at low dose, but markedly shortened them at high dose. Our simulation data show that propafenone has a dose-dependent anti-arrhythmic effect on SQT1 and a pro-arrhythmic effect on healthy cells. These computer simulations help to better understand the mechanisms responsible for the initiation or termination of arrhythmias in healthy individuals or SQT1 patients taking propafenone.
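The abstract does not give the block formulation. A minimal sketch, assuming the widely used simple pore block model in which the channel conductance is scaled by a Hill equation, illustrates what a dose-dependent IKr block can look like. The function names are hypothetical, and the `ic50`, `hill` and conductance values used below are placeholders, not the values from McPate et al.

```python
def blocked_fraction(dose, ic50, hill=1.0):
    """Fraction of channels blocked at a drug concentration (Hill equation).

    dose and ic50 must share the same concentration unit.
    """
    if dose < 0 or ic50 <= 0:
        raise ValueError("dose must be >= 0 and ic50 must be > 0")
    if dose == 0:
        return 0.0  # no drug, no block
    return 1.0 / (1.0 + (ic50 / dose) ** hill)


def scaled_conductance(g_kr, dose, ic50, hill=1.0):
    """Maximal IKr conductance remaining after the drug block is applied."""
    return g_kr * (1.0 - blocked_fraction(dose, ic50, hill))


# Placeholder numbers only: a dose equal to the IC50 halves the conductance.
g_control = 1.0
g_low = scaled_conductance(g_control, dose=1.0, ic50=1.0)
g_high = scaled_conductance(g_control, dose=3.0, ic50=1.0)
```

In a cell model, the scaled conductance would replace the control IKr conductance before the action potential is simulated, once per dose.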
Citations: 1