Title : The role of high performance, grid and cloud computing in high-throughput sequencing
Venue : 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Pub Date : 2016-12-15 DOI: 10.1109/BIBM.2016.7822643
G. Lightbody, Fiona Browne, Huiru Zheng, Valeriia Haberland, J. Blayney
We have reached the era of full-genome sequencing, with high-throughput sequencing technologies producing gigabases of reads in a day. To benefit fully from this profusion of data, high-performance tools and systems are needed to extract the information contained in the sequences. This paper provides an overview of the evolution of high-throughput sequencing and of the tools, infrastructure and data management being developed to support this key area of personalized medicine. The paper concludes with an outlook on the future of these technologies and their applications, and on how they might shape clinical governance.
Title : Disease-specific protein complex detection in the human protein interaction network with a supervised learning method
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822705
Ziwei Zhou, Yingyi Gui, Zhihao Yang, Xiaoxia Liu, Lei Wang, Yin Zhang, Hongfei Lin, Jian Wang
High-throughput experimental techniques have produced a large number of human protein-protein interactions (PPIs), making it possible to construct a large-scale human PPI network and to detect human protein complexes in it with computational approaches. However, most current complex detection methods are based on graph theory and cannot use information about known complexes. In this paper, we present a supervised learning method to detect protein complexes in a human PPI network. The method uses biological characteristics and network properties to construct a rich feature set, which is used to train a regression model for protein complex detection. In addition, disease-related PPIs are extracted from the biomedical literature and integrated into the original PPI network so that disease-specific protein complexes can be detected more effectively. Experimental results show that our method outperforms existing state-of-the-art methods. Furthermore, analysis of the breast-cancer-specific complexes detected with our method provides additional biological insights into breast cancer (e.g., some candidate susceptibility genes).
Title : Mining protein complexes based on topology potential from weighted dynamic PPI network
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822611
Xiu-juan Lei, Yuchen Zhang, Fang-Xiang Wu, A. Zhang
Identifying protein complexes is important for investigating the characteristics of biological processes. Most existing protein complex clustering algorithms run only on a static protein-protein interaction (PPI) network and ignore the dynamic characteristics of interactions. To address this problem, we propose a new clustering algorithm (TP-WDPIN) that uses the concept of topological potential to measure the importance of proteins when detecting seed proteins, and then mines protein complexes from a weighted dynamic PPI network. The algorithm exploits the core-attachment structure of complexes and splits low-density cores to increase core density and achieve better clustering results. Experimental results show that TP-WDPIN performs better than other algorithms on two PPI databases.
Title : Experiences on quantitative cardiac PET analysis
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822682
P. Vizza, P. Guzzi, P. Veltri, Annalisa Papa, G. Cascini, G. Sesti, E. Succurro
Quantitative analysis of PET images is a useful and essential practice for obtaining an objective measurement of a physiological process. By quantifying images, it makes it possible to study diseases, evaluate treatment response and compare patient data. The analysis consists of estimating the quantity of radionuclide tracer taken up by tissues. We focus on the quantitative analysis of dynamic PET studies to evaluate coronary artery disease and myocardial perfusion. We report experiences with quantitative cardiac PET analysis using a widely used commercial software package to evaluate viable myocardium with the Patlak method. We also report results obtained on PET images provided by clinical departments of the Magna Graecia University Medical School of Catanzaro.
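The Patlak method mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard Patlak graphical model for an irreversibly trapped tracer, in which tissue activity divided by plasma activity is linear in the running integral of plasma activity divided by plasma activity; the slope is the net influx rate Ki. The function names, the `t_star` cut-off and the synthetic curves are hypothetical and are not taken from the paper or from the commercial software it used.

```python
def cumulative_trapezoid(times, values):
    """Running trapezoidal integral of values over times, starting at 0."""
    total = [0.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total.append(total[-1] + 0.5 * dt * (values[i] + values[i - 1]))
    return total


def patlak_fit(times, plasma, tissue, t_star=0.0):
    """Least-squares line through the Patlak plot; returns (Ki, intercept)."""
    integral = cumulative_trapezoid(times, plasma)
    xs, ys = [], []
    for t, cp, ct, area in zip(times, plasma, tissue, integral):
        if t >= t_star and cp > 0:  # skip early frames and empty plasma samples
            xs.append(area / cp)
            ys.append(ct / cp)
    if len(xs) < 2:
        raise ValueError("need at least two usable frames")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ki = sxy / sxx
    return ki, mean_y - ki * mean_x


# Synthetic (made-up) curves generated from Ki = 0.05 and intercept = 0.3.
times = [0.0, 1.0, 2.0, 5.0, 10.0, 20.0, 30.0]
plasma = [10.0, 8.0, 6.0, 4.0, 3.0, 2.0, 1.5]
areas = cumulative_trapezoid(times, plasma)
tissue = [0.05 * a + 0.3 * cp for a, cp in zip(areas, plasma)]
ki, v0 = patlak_fit(times, plasma, tissue)
```

On real data the early frames are usually excluded through `t_star`, because the linear relationship only holds once the reversible compartments have equilibrated.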
Title : Epistasis detection using a permutation-based Gradient Boosting Machine
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822697
Kai Che, Xiaoyan Liu, Maozu Guo, Junwei Zhang, Lei Wang, Yin Zhang
Detecting single nucleotide polymorphism (SNP) epistasis helps in understanding disease susceptibility and in discovering the pathogenesis underlying complex diseases. In this paper, we propose an approach called permutation-based Gradient Boosting Machine (pGBM), which detects pure epistasis by estimating how the power of a GBM classifier is affected by permuting SNP pairs. pGBM is based on two permutation strategies and a gradient boosting machine model. To extend pGBM to detect pure epistasis on unbalanced datasets, the average AUC difference is chosen as the metric that quantifies the intensity of SNP interactions. Experimental results demonstrate that our method has a high success rate on both balanced and unbalanced simulated datasets and on a real dataset. In addition, pGBM shows great potential for detecting pure SNP epistasis and uncovering the pathogenesis of more complex diseases.
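The idea of scoring a SNP pair by the average AUC difference under permutation can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe its two permutation strategies, so the joint shuffle of a pair across samples used here is a guess, and the hand-written `model` function is a stand-in for a trained classifier, not a gradient boosting machine. All names and the toy data are hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_auc_drop(model, rows, labels, pair, repeats=20, seed=0):
    """Average AUC lost when the two SNP columns in `pair` are shuffled together."""
    rng = random.Random(seed)
    base = auc([model(r) for r in rows], labels)
    i, j = pair
    drops = []
    for _ in range(repeats):
        order = list(range(len(rows)))
        rng.shuffle(order)
        permuted = []
        for row, src in zip(rows, order):
            new_row = list(row)
            # Move the (i, j) genotype pair from another sample into this row.
            new_row[i], new_row[j] = rows[src][i], rows[src][j]
            permuted.append(new_row)
        drops.append(base - auc([model(r) for r in permuted], labels))
    return sum(drops) / repeats


# Toy data: four binary SNPs; the label is a pure interaction (XOR) of SNP 0 and 1.
rows = [[a, b, c, d] for a in (0, 1) for b in (0, 1)
        for c in (0, 1) for d in (0, 1)] * 4
labels = [r[0] ^ r[1] for r in rows]


def model(row):
    """Stand-in for a fitted classifier's score (hypothetical, not a GBM)."""
    return row[0] ^ row[1]


causal_drop = permutation_auc_drop(model, rows, labels, (0, 1))
noise_drop = permutation_auc_drop(model, rows, labels, (2, 3))
```

Shuffling the interacting pair destroys the signal the scorer relies on, so `causal_drop` is large, while shuffling the unrelated pair leaves the scores unchanged and `noise_drop` is zero.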
Title : Network based study for the anti-rheumatic mechanism of Tibetan medicated-bath therapy
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822802
Jian Yang, Tianhong Wang, Xiaona Shen, Xing Chen, Kehui Zhao, Jing Wang, Yi Zhang, Jing Zhao, Yang Ga
In clinical practice in the Tibetan area of China, the Traditional Tibetan Medicine formula Wuwei-Ganlu-Yaoyu-Keli (WGYK) is widely added to the warm water used in bath therapy to treat rheumatoid arthritis (RA). However, its mechanism of action is not yet well understood. In this study, based on gene expression data from microarray experiments, we apply network pharmacology approaches to further reveal the mechanism by which WGYK acts on RA from the perspective of protein-protein interactions and pathways. This study may facilitate our understanding of the anti-RA effect of WGYK.
Title : Towards constructing “Super Gene Sets” regulatory networks
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822534
J. Chen, Z. Yue, Michael T. Neylon, Thanh Nguyen, Nafisa Bulsara, Itika Arora, Timothy Ratliff
In this article, we describe a new computational framework to construct regulatory (r-type) PAG-PAG relationships for “Super Gene Sets”, that is, Pathways, Annotated lists, and Gene signatures (PAGs). To construct PAGs, we aggregate singleton PAGs (sPAGs) upstream or downstream of a common shared multi-gene PAG (mPAG). We then iteratively remove a member gene and calculate the Cohesion Coefficient (CoCo), which helps assess the degree of biological relevance beyond random chance, until the CoCo score reaches its maximal value at a specific level. Each aggregated mPAG (m'PAG) and its shared mPAG therefore form a distinct m'PAG-mPAG relationship. Our results suggest the following. First, the new m'PAGs have sufficiently high CoCo scores, suggesting high biological relevance, and have gene ontology (GO) annotations distinct from those of their regulated PAG targets; however, shared GO annotations are significantly enriched within each identified m'PAG-mPAG pair. Second, the new m'PAGs are relatively robust against data noise, according to noise characteristic simulations. Third, by applying our framework to real cancer microarray data, we demonstrate that it is effective in helping build multi-scale biomolecular systems models that are easy for biologists to interpret.
Title : Drug side effect prediction through linear neighborhoods and multiple data source integration
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822555
Wen Zhang, Yanlin Chen, Shikui Tu, Feng Liu, Qianlong Qu
Predicting drug side effects is a critical task in drug discovery and attracts great attention in both academia and industry. Although many machine learning methods have been proposed, great challenges arise with the boom in precision medicine. On the one hand, many methods assume that similar drugs may share the same side effects, but measuring drug-drug similarity appropriately is challenging. On the other hand, multi-source data provide diverse information for the analysis of side effects and should be integrated for high-accuracy prediction. In this paper, we tackle the side effect prediction problem through linear neighborhoods and multi-source data integration. In the feature space, linear neighborhoods are constructed to extract the drug-drug similarity, which we call “linear neighborhood similarity”. By transferring the similarity into the side effect space, known side effect information is propagated through the similarity-based graph. We thus propose the linear neighborhood similarity method (LNSM), which uses single-source data for side effect prediction. We further extend LNSM to multi-source data and propose two data integration methods, the similarity matrix integration method (LNSM-SMI) and the cost minimization integration method (LNSM-CMI), which integrate drug substructure, target, transporter, enzyme, pathway and indication data to improve prediction accuracy. The proposed methods are evaluated on benchmark datasets. LNSM produces satisfactory results on single-source data. The data integration methods (LNSM-SMI and LNSM-CMI) effectively integrate multi-source data and outperform other state-of-the-art side effect prediction methods in cross validation and in an independent test. The proposed methods are promising for drug side effect prediction.
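The propagation step described above, in which known side effects spread through a similarity-based graph, can be sketched with a generic label propagation update. This is an assumption-laden illustration rather than the authors' LNSM: the similarity matrix is supplied by hand instead of being derived from linear neighborhoods, and the update rule, the `alpha` parameter and all names are hypothetical.

```python
def propagate(similarity, known, alpha=0.5, iterations=100):
    """Iterate scores = alpha * similarity @ scores + (1 - alpha) * known.

    similarity: row-normalised drug-drug similarity matrix (list of lists).
    known: drug x side-effect matrix of known associations (1 = known).
    """
    n = len(similarity)
    m = len(known[0])
    scores = [list(row) for row in known]
    for _ in range(iterations):
        scores = [
            [
                alpha * sum(similarity[i][k] * scores[k][j] for k in range(n))
                + (1 - alpha) * known[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return scores


# Three hypothetical drugs and one side effect; drug 1 has no known label
# but is similar to drugs 0 and 2, which both have it.
similarity = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
known = [[1], [0], [1]]
scores = propagate(similarity, known)
```

Because each row of the similarity matrix sums to one and `alpha` is below one, the iteration converges; here the unlabeled drug ends with a score of about 1/3, inherited from its two labeled neighbours.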
Title : Extreme Large Margin Distribution Machine and its applications for biomedical datasets
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822751
Zhiyong Yang, Jingcheng Lu, Taohong Zhang
Classification methods have become increasingly popular for biomedical and bioinformatics data analysis. However, because data acquisition is difficult, sometimes only small-scale datasets can be obtained, which may lead to poor generalization performance. For SVM-like algorithms, large margin theory offers a way out of this dilemma. Recent studies on large margin theory show that, besides maximizing the minimum margin on a given training dataset, it is also necessary to optimize the margin distribution to boost overall generalization ability. Correspondingly, a novel SVM-like algorithm called the Large Margin Distribution Machine (LDM) realizes this idea by simultaneously maximizing the margin mean and minimizing the margin variance, and a series of applications has since been reported. Another well-known machine learning algorithm, the Extreme Learning Machine (ELM), shares a similar framework with SVM. We believe that ELM could also benefit from margin distribution optimization. With this in mind, we propose a novel algorithm called the Extreme Large Margin Distribution Machine (ELDM), which combines the advantages of ELM and LDM. We then propose an efficient extension of ELDM to multi-class classification under the one-vs-all scheme. Finally, experimental results on both benchmark datasets and biomedical classification datasets show the effectiveness of the proposed algorithm.
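The two quantities that margin distribution methods trade off, the margin mean and the margin variance, can be computed for a linear classifier as in the sketch below. It assumes labels in {-1, +1} and uses the plain population variance; the published LDM and ELDM formulations may normalise the variance differently, and the function name and toy data are hypothetical.

```python
def margin_stats(weights, bias, samples, labels):
    """Return (mean, variance) of the margins y * (w . x + b)."""
    margins = [
        y * (sum(w * x for w, x in zip(weights, xs)) + bias)
        for xs, y in zip(samples, labels)
    ]
    mean = sum(margins) / len(margins)
    variance = sum((g - mean) ** 2 for g in margins) / len(margins)
    return mean, variance


# Four toy points on a line, separated at the origin.
samples = [(1.0, 0.0), (2.0, 0.0), (-1.0, 0.0), (-2.0, 0.0)]
labels = [1, 1, -1, -1]
mean, variance = margin_stats((1.0, 0.0), 0.0, samples, labels)
```

For this toy classifier the margins are 1, 2, 1 and 2, so the mean is 1.5 and the variance is 0.25. A margin distribution method would prefer, among classifiers of comparable norm, the one with a larger mean and a smaller variance.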
Title : Development of a computer-aided system for an effective brain connectivity network
Pub Date : 2016-12-01 DOI: 10.1109/BIBM.2016.7822774
Yaoxin Nie, Linlin Zhu, Yipeng Su, Xudong Li, Zhendong Niu
Dynamic causal modeling (DCM) is currently one of the most widely used models for an effective brain connectivity network, but it has some disadvantages (e.g., researchers' selection of cerebral regions of interest [ROIs] is subjective, and computation requires substantial time). Statistical Parametric Mapping (SPM) is the most popular statistical data analysis software for brain function, but its settings are cumbersome, especially in the data preprocessing section. In response to these disadvantages of DCM and SPM, we designed and built a computer-aided system for an effective brain connectivity network, modularized the data preprocessing section of SPM, and explored the cerebral ROIs and a possible co-activation network based on our proposed approach. The co-activation network serves as a prior interconnection relationship and is used to assist in the selection of ROIs in similar cognitive experiments; thus, the DCM is prevented from testing meaningless noise connection modes, the number of DCM models is decreased, and the accuracy of the conclusions and the computational efficiency of the DCM are improved.