
2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM): Latest Papers

Accelerating protein-protein complex validation by GPU based funnel generation
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822510
Michael Zabejansky, H. Wolfson
A major challenge in protein-protein docking is distinguishing near-native complex predictions from decoys. It has been shown that near-native solutions are usually located at the bottom of deep, densely populated funnels in the binding energy plot of the complex. Checking whether the energy plot in the vicinity of a docking solution is “funnel-like” can therefore serve as a validation of that solution. Generating such densely sampled plots, however, is a major computational challenge. We have designed an accurate and highly efficient parallel algorithm for generating these energy plots and implemented it on a server with 4 GPU processors, each with 2880 cores. The algorithm achieved a speedup of about 150× over its serial counterpart while also producing better results. Although the algorithm proved very useful for validating near-native complex hypotheses, it still detects many funnels for decoy solutions, especially those with good shape complementarity.
Citations: 0
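As a rough illustration of the funnel test described in the abstract above, the sketch below scores how funnel-like a set of sampled poses is. It is not the authors' GPU algorithm: the helper `is_funnel_like`, the rank-correlation criterion, and both thresholds (`min_corr`, `max_best_rmsd`) are assumptions chosen purely for illustration.

```python
from statistics import mean


def _ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation between two equal-length sequences."""
    rx, ry = _ranks(xs), _ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def is_funnel_like(rmsds, energies, min_corr=0.6, max_best_rmsd=2.0):
    """Hypothetical criterion: energy rises with RMSD from the candidate pose,
    and the lowest-energy sample lies close to that pose."""
    best = min(range(len(energies)), key=lambda i: energies[i])
    return spearman(rmsds, energies) >= min_corr and rmsds[best] <= max_best_rmsd
```

Here `rmsds` holds the distance of each sampled pose from the docking solution under test and `energies` the corresponding binding energy scores. A real validation would need a far denser sample, which is the cost the paper's parallel algorithm addresses.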
Research on early risk predictive model and discriminative feature selection of cancer based on real-world routine physical examination data
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822746
Guixia Kang, Zhuang Ni
Most cancers show no obvious symptoms at early stages, and curative treatment is no longer an option by the time the cancer is diagnosed. Accurate prediction of early cancer risk has therefore become urgently necessary in medicine. In this paper, we use real-world routine physical examination data to identify the most discriminative features of cancer with the ReliefF algorithm and to build early risk predictive models of cancer with three machine learning (ML) algorithms. The physical examination data, each with a return visit one month later, come from the CiMing Health Checkup Center. From a data collection of 34 features and 2300 candidates, the ReliefF algorithm selects the top 30 features by weight value, denoted Sub(30). A 4-layer deep neural network (DNN) with 2 hidden layers based on the back-propagation (B-P) algorithm, a support vector machine (SVM) with a linear kernel, and a CART decision tree are proposed for predicting cancer risk, evaluated by 5-fold cross-validation. Predictive accuracy, AUC-ROC, sensitivity and specificity are used to assess the discriminative ability of the three proposed methods. The results show that, compared with the other two methods, SVM obtains a higher AUC and specificity, 0.926 and 95.27% respectively, while DNN achieves the best predictive accuracy (86%). Moreover, a fuzzy interval for the DNN threshold is proposed; with the revised threshold interval, the sensitivity, specificity and accuracy of DNN are 90.20%, 94.22% and 93.22%, respectively. The research indicates that applying ML methods together with risk feature selection to real-world routine physical examination data is meaningful and promising for cancer prediction.
Citations: 5
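The evaluation criteria named in the abstract above (sensitivity, specificity, accuracy) can be sketched as follows. The abstract does not define its fuzzy threshold interval, so the `low`/`high` parameters, and the choice to leave scores inside the interval undecided and exclude them from the metrics, are one assumed reading and not the authors' definition.

```python
def evaluate(scores, labels, low=0.5, high=0.5):
    """Compute sensitivity, specificity and accuracy for risk scores.

    Scores >= high are called positive and scores < low negative.
    Scores inside [low, high) are left undecided and excluded from the
    metrics (an assumed interpretation of a threshold interval).
    With low == high this reduces to an ordinary single threshold.
    """
    tp = tn = fp = fn = 0
    for score, label in zip(scores, labels):
        if low <= score < high:
            continue  # undecided case, not counted
        predicted = 1 if score >= high else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    decided = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / decided if decided else float("nan"),
        "decided": decided,
    }
```

Under this reading, widening the interval trades coverage (fewer decided cases) for higher metrics on the cases that remain, so any figures obtained this way should be reported together with the `decided` count.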
GIDAC: A prototype for bioimages annotation and clinical data integration
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822663
P. Vizza, P. Guzzi, P. Veltri, G. Cascini, R. Curia, Loredana Sisca
The analysis of bioimages together with the correlated clinical patient information makes it possible to investigate specific diseases and define the corresponding medical protocols. To reach a correct diagnosis and apply a precise therapy, bioimages must be collected and studied together with other relevant data such as laboratory results, medical annotations and patient history. Today, these data are managed by standalone systems inside hospital departments, which often provide no dedicated platform for integrating data and exchanging relevant clinical information across departments or across health structures. In addition, physicians cannot annotate or enrich images to trace temporal studies for a patient or across patients with similar diseases. In this contribution, we report the results of a research project called GIDAC (Gestione Integrata DAti Clinici), which aims to define a general-purpose framework for bioimage management and annotation, as well as clinical data viewing and integration, in a simple-to-use information system. The proposed framework does not replace any existing clinical information system; instead, it gathers and integrates data by means of an XML-based module. The novelty also lies in allowing annotations on DICOM images through a simple user interface, to keep track of changes within images as well as comparisons among patients. The system supports oncologists in managing DICOM images from different devices (e.g., echograph or PACS) and in extracting the information needed to query (annotate) images and study similar clinical cases.
Citations: 12
Evaluation of CD-HIT for constructing non-redundant databases
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822604
Qingyu Chen, Yu Wan, Yang Lei, J. Zobel, Karin M. Verspoor
CD-HIT is one of the most popular tools for reducing sequence redundancy and is considered the state-of-the-art method. It tries to minimise redundancy by reducing an input database to a set of representative sequences under a user-defined sequence identity threshold. We present a comprehensive assessment of the redundancy in the outputs of CD-HIT, exploring the impact of different identity thresholds and new evaluation data on the redundancy. We demonstrate that the relationship between threshold and redundancy is surprisingly weak. Applications of CD-HIT that set low identity thresholds may also suffer substantial degradation in both efficiency and accuracy.
Citations: 12
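The threshold-driven reduction to representative sequences described in the abstract above follows a greedy incremental scheme, which can be sketched in a few lines. This is a toy illustration, not CD-HIT itself: the `identity` function uses Python's `difflib` as a stand-in similarity measure, whereas CD-HIT relies on short-word filtering and sequence alignment, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Stand-in similarity in [0, 1]; not CD-HIT's alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(sequences, threshold=0.9):
    """Greedy incremental clustering: process the longest sequences first;
    each sequence joins the first representative it matches at or above the
    threshold, otherwise it becomes the representative of a new cluster."""
    clusters = {}  # representative sequence -> list of cluster members
    for seq in sorted(sequences, key=len, reverse=True):
        for rep in clusters:
            if identity(rep, seq) >= threshold:
                clusters[rep].append(seq)
                break
        else:
            clusters[seq] = [seq]
    return clusters
```

The keys of the returned dictionary form the non-redundant set. Because each sequence is compared only against representatives, two members of different clusters can still be more similar to each other than the threshold suggests, which is one way residual redundancy can arise.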
A novel identified temporal protein complexes strategy inspired by density-distance and brainstorming process
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822701
Xianjun Shen, Jin Zhou, Xingpeng Jiang, Xiaohua Hu, Tingting He, Jincai Yang, Dan Xie
Detecting protein complexes and functional modules in dynamic protein-protein interaction (PPI) networks plays a crucial role in understanding cellular organization and biological functions. In this article, we put forward a new strategy for identifying temporal protein complexes. First, a series of time-sequenced subnetworks is constructed by integrating time-course gene expression data into static protein interaction data. We then combine network topology and gene ontology information to define the distance between proteins in the PPI network. Cluster centers are found, and initial clusters formed, with a novel method based on the idea that cluster centers are usually nodes with a higher density than their neighbors and a relatively large distance from other cluster centers. Finally, inspired by the brainstorming discussion process, two ways of updating the initial clusters are introduced to achieve optimal results. After a filtering and merging procedure, experimental results demonstrate that the proposed strategy performs well in comparison with four other advanced algorithms: MCODE, FAG-EC, HC-PIN, and CNC.
Citations: 2
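The density-distance idea in the abstract above (centers have higher density than their neighbors and lie far from other centers) can be sketched with a generic density-peaks selection. This is an assumption-laden illustration: it takes a plain precomputed distance matrix rather than the paper's topology and gene-ontology distance, and the `cutoff` and `n_centers` parameters are hypothetical.

```python
def density_peaks(dist, cutoff, n_centers):
    """Pick cluster centers from a symmetric distance matrix.

    rho[i]   = number of other points closer than `cutoff` (local density)
    delta[i] = distance to the nearest point of strictly higher density,
               or the largest distance from i if no such point exists
    Centers are the points with the largest rho * delta.
    """
    n = len(dist)
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    ranked = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return sorted(ranked[:n_centers])
```

Each remaining node would then typically be assigned to the cluster of its nearest higher-density neighbor to form the initial clusters; the brainstorming-inspired update step from the paper is not reproduced here.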
Multi-norm constrained optimization methods for calling copy number variants in single cell sequencing data
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822511
Changsheng Zhang, Hongmin Cai, Jingying Huang, Bo Xu
The revolutionary invention of single-cell sequencing technology opens a new way to delineate intra-tumor heterogeneity and the evolution of single cells at the molecular level. Because single-cell sequencing requires a special genome amplification step to accumulate enough sample, substantial bias is introduced, which makes calling copy number variants challenging. Accurately modeling this process and effectively detecting copy number variations (CNVs) are the major roadblocks in single-cell sequencing data analysis. Recent work has shown that the underlying copy numbers are corrupted by noise that can be approximated by a negative binomial distribution. In this paper, we formulate a general mathematical model for copy number reconstruction from the read depth signal and present two specific variants, Poisson-CNV and NB-CNV, to cater for different read distributions. An efficient numerical solver based on the classical alternating direction minimization method is designed to solve the proposed models. Extensive experiments on both synthetic datasets and empirical single-cell sequencing datasets were conducted to compare the performance of the two models. The results show that the NB-CNV model achieves superior performance in calling CNVs from single-cell sequencing data.
Citations: 2
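To see why a negative binomial noise model can suit over-dispersed read depths better than a Poisson one, as the abstract above suggests, the two log-likelihoods can be compared directly. The sketch below covers only the noise models, not the paper's multi-norm optimization; the mean/dispersion parameterization (`mu`, `r`) and the function names are assumptions made for illustration.

```python
from math import lgamma, log


def poisson_logpmf(k, mu):
    """Log-probability of count k under a Poisson with mean mu."""
    return k * log(mu) - mu - lgamma(k + 1)


def nb_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu
    and dispersion r, so that the variance is mu + mu**2 / r."""
    return (
        lgamma(k + r) - lgamma(r) - lgamma(k + 1)
        + r * log(r / (r + mu))
        + k * log(mu / (r + mu))
    )


def log_likelihood(counts, logpmf, **params):
    """Sum of per-bin log-probabilities for a list of read counts."""
    return sum(logpmf(k, **params) for k in counts)
```

For example, for highly variable bin counts such as `[2, 35, 8, 60, 1, 14]` (mean 20), `log_likelihood(counts, nb_logpmf, mu=20.0, r=1.0)` is substantially higher than `log_likelihood(counts, poisson_logpmf, mu=20.0)`, because the Poisson model forces the variance to equal the mean.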
Unconstrained optimization in projection method for indefinite SVMs
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822585
Hao Jiang, W. Ching, Yushan Qiu, Xiaoqing Cheng
Positive semi-definiteness is a critical property in Support Vector Machine (SVM) methods, as it ensures that solutions can be found efficiently through convex quadratic programming. In this paper, we introduce a projection matrix on indefinite kernels to formulate a positive semi-definite one. The proposed model can be regarded as a generalized version of the spectrum methods (the denoising method and the flipping method), obtained by varying the parameter λ. In particular, our suggested optimal λ under the Bregman matrix divergence theory can be obtained using unconstrained optimization. Experimental results on four real-world data sets, ranging from glycan classification to cancer prediction, show that the proposed model achieves better or competitive performance compared with related indefinite kernel methods. This may suggest a new approach to motif extraction or cancer prediction.
Citations: 0
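For reference, the two spectrum methods named in the abstract above are commonly written in terms of the eigendecomposition of the indefinite kernel matrix. The notation below ($K$, $U$, $\mu_i$) is assumed here for illustration and is not taken from the paper; in particular, $\mu_i$ denotes an eigenvalue and is unrelated to the paper's parameter λ, whose exact role the abstract does not specify.

```latex
K = U \,\mathrm{diag}(\mu_1, \dots, \mu_n)\, U^{\top}, \qquad
K_{\text{denoise}} = U \,\mathrm{diag}\bigl(\max(0, \mu_1), \dots, \max(0, \mu_n)\bigr)\, U^{\top}, \qquad
K_{\text{flip}} = U \,\mathrm{diag}\bigl(|\mu_1|, \dots, |\mu_n|\bigr)\, U^{\top}.
```

Both transformed matrices have only non-negative eigenvalues, so either can be used where a positive semi-definite kernel is required.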
A representational analysis of a temporal indeterminancy display in clinical events
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822673
M. Madkour, Hsing-yi Song, Jingcheng Du, Cui Tao
This paper describes a proposal for representing temporal indeterminacy in events from clinical narratives using fuzzy set membership functions. The approach leverages both temporal and semantic information about events and has been validated using a representational analysis evaluation method. We demonstrate that graphs of membership functions can be used to represent the temporal approximation and granularity of events. We also show that this approach helps in constructing fine-grained timelines of clinical events and can be used to calculate accurate metrics for ordering events.
Citations: 1
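A fuzzy membership function of the kind mentioned in the abstract above can be sketched with a trapezoid, a common choice for vague time expressions. The abstract does not say which function shapes the authors use, so the trapezoidal form and the parameter names `a`, `b`, `c`, `d` are assumptions for illustration.

```python
def trapezoid(t, a, b, c, d):
    """Degree (0 to 1) to which time t belongs to a fuzzy interval.

    The interval has support [a, d] and core [b, c], with a <= b <= c <= d:
    membership rises linearly on [a, b], is 1 on [b, c], and falls on [c, d].
    """
    if t < a or t > d:
        return 0.0
    if b <= t <= c:
        return 1.0
    if t < b:
        return (t - a) / (b - a)
    return (d - t) / (d - c)
```

For instance, an expression like "about a week ago" might be encoded in days as `a=5, b=6, c=8, d=10`: day 7 is fully compatible with the expression, days 5.5 and 9 only partially, and day 11 not at all. The width of the core and slopes gives one way to express the granularity of the event's timing.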
Visual orchestration and autonomous execution of distributed and heterogeneous computational biology pipelines
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822615
Xin Mou, H. Jamil, R. Rinker
Data integration continues to baffle researchers even though substantial progress has been made. Although the emergence of technologies such as XML, web services, the semantic web and cloud computing has helped, a system in which biologists can comfortably articulate and develop new applications without technical assistance from a computing expert is yet to be realized. The distance between a friendly graphical interface that does little and a “traditional” system that is clunky yet powerful is, more often than not, deemed too great. The question that remains unanswered is whether a user can state her query over a set of complex, heterogeneous and distributed life sciences resources in an easy-to-use language and execute it without further help from a computer-savvy programmer. In this paper, we present a declarative meta-language, called VisFlow, for requirement specification, and a translator for mapping requirements into executable queries in a variant of SQL augmented with integration artifacts.
Citations: 4
Wide line detection with water flow
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822715
Yangyang Hu, Wenqiang Zhang, Hong Lu, Fufeng Li, Weifei Zhang
Line detection plays a vital role in visual analysis tasks such as Traditional Chinese Medicine (TCM) image analytics. However, most current methods ignore line thickness and perform poorly on lines of varying width. This paper proposes a novel line detection method based on water flow. Unlike most edge-based and region-based line detectors, the method treats the image, smoothed by a guided filter, as a geomorphological map and obtains the whole line response map by simply imitating the movement of water across it. The paper also proposes an adaptive parameter selection method that makes line detection more robust and accurate. Experimental results on tongue crack images demonstrate the effectiveness of the proposed method in comparison with existing line extraction methods.
Citations: 1