2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW): Latest Papers

Indoor signage detection based on saliency map and bipartite graph matching
Shuihua Wang, Yingli Tian
Object detection plays a very important role in many applications such as image retrieval, surveillance, robot navigation, wayfinding, etc. In this paper, we propose a novel approach to detect indoor signage to help blind people find their destinations in unfamiliar environments. Our method first extracts the attended areas by using a saliency map. Then the signage is detected in the attended areas by using bipartite graph matching. The proposed method can handle multiple signage detection. Experimental results on our collected indoor signage dataset demonstrate the effectiveness and efficiency of our proposed method. Furthermore, saliency maps could eliminate the interference information and improve the accuracy of the detection results.
DOI: 10.1109/BIBMW.2011.6112422 (https://doi.org/10.1109/BIBMW.2011.6112422); pages 518-525; published 2011-11-12
Citations: 17
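The matching step described in the signage-detection abstract above can be illustrated with a toy minimum-cost bipartite assignment. This is only a sketch under stated assumptions: the cost matrix, the function name `best_matching`, and the brute-force search are hypothetical choices made for illustration. The abstract does not say how the authors define matching costs or which matching algorithm they use.

```python
from itertools import permutations

def best_matching(cost):
    """Return (total_cost, assignment) for a square cost matrix.

    assignment[i] is the column (e.g. a template feature) matched to
    row i (e.g. a feature found inside a salient region). This is a
    brute-force search over all permutations, so it is only suitable
    for very small matrices.
    """
    n = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)

# Hypothetical dissimilarities between 3 detected features (rows)
# and 3 template features (columns); not data from the paper.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = best_matching(cost)
```

For realistic numbers of features, a polynomial-time method such as the Hungarian algorithm would normally replace the exhaustive search.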
Modular neural network model based foetal state classification
S. Jadhav, S. Nalbalwar, A. Ghatol
Cardiotocography (CTG) is a simultaneous recording of foetal heart rate (FHR) and uterine contractions (UC), and it is one of the most common diagnostic techniques for evaluating maternal and foetal well-being during pregnancy and before delivery. Assessment of the foetal state can be verified only after delivery using the foetal (newborn) outcome data. One of the most important features defining an abnormal foetal outcome is low birth weight. This paper proposes a multi-class classification algorithm using modular neural network (MNN) models. It tries to boost two conflicting main objectives of multi-class classifiers: a high overall correct classification rate and a high classification rate for each class. Using a cardiotocography database of normal, suspect and pathological cases, we trained MNN classifiers with 23 real-valued diagnostic features collected from a total of 2126 foetal CTG signal recordings from the UCI Machine Learning Repository. We used the classification in a detection process. The proposed methodology is presented and then tested on unseen UCI Cardiotocography test data sets. Experimental results are promising, paving the way for further research in that direction.
DOI: 10.1109/BIBMW.2011.6112501 (https://doi.org/10.1109/BIBMW.2011.6112501); pages 915-917; published 2011-11-12
Citations: 23
Improved compression ratio for model-based ECG compression using differential coding
Z. Passand, M. Azarnoosh
This article proposes a technique to improve the compression ratio of model-based ECG compression techniques. The proposed technique takes advantage of the quasi-periodic nature of ECG signals and uses differential coding to increase the compression ratio. It is shown that the proposed technique increases the compression ratio by a factor of about two compared to the conventional compression ratio for model-based ECG compression.
DOI: 10.1109/BIBMW.2011.6112502 (https://doi.org/10.1109/BIBMW.2011.6112502); pages 918-918; published 2011-11-12
Citations: 0
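The differential-coding idea in the ECG compression abstract above can be sketched as follows. The assumption here, not stated in the abstract, is that each heartbeat is summarized by a short vector of quantized integer model parameters; the function names and the sample values are hypothetical. Because consecutive beats of a quasi-periodic signal are similar, storing beat-to-beat differences yields small numbers that need fewer bits than the raw parameters.

```python
def delta_encode(beats):
    """Encode per-beat parameter vectors as the first beat plus differences."""
    if not beats:
        return []
    encoded = [list(beats[0])]
    for prev, cur in zip(beats, beats[1:]):
        encoded.append([c - p for c, p in zip(cur, prev)])
    return encoded

def delta_decode(encoded):
    """Invert delta_encode by cumulatively adding the differences."""
    beats = []
    for row in encoded:
        if not beats:
            beats.append(list(row))
        else:
            beats.append([p + d for p, d in zip(beats[-1], row)])
    return beats

# Hypothetical quantized model parameters for four consecutive beats.
beats = [
    [120, 45, 300],
    [121, 45, 298],
    [119, 46, 299],
    [120, 44, 301],
]
encoded = delta_encode(beats)
```

The actual gain depends on the quantization and on the entropy coder applied to the differences, neither of which the abstract specifies.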
CUDA-LR: CUDA-accelerated logistic regression analysis tool for gene-gene interaction for genome-wide association study
Sungyoung Lee, Min-Seok Kwon, Iksoo Huh, T. Park
In genome-wide association studies (GWAS), logistic regression (LR) has been most commonly used for finding an association between a disease phenotype and genetic variants such as single nucleotide polymorphisms (SNPs). Since the logistic regression model requires iterative algorithms to obtain the parameter estimates, its application to GWAS has been limited to the identification of individual SNPs. Thus, there have been limited applications of LR to multiple-SNP analysis, including gene-gene interaction analysis, in large-scale GWAS data. To overcome this computational burden, we developed a logistic regression analysis tool named CUDA-LR, based on the new programming architecture using the Graphics Processing Unit (GPU). CUDA-LR supports not only the simple model with a single SNP but also a more complex model with two SNPs including their interaction. In addition, CUDA-LR provides various parameters to gain more acceleration and perform specified analyses. In the comparison between our analysis and the other methods, CUDA-LR showed almost 700-fold acceleration and highly reliable results thanks to our GPU-specific optimization techniques. We believe that CUDA-LR is now a useful logistic regression analysis tool for interaction analysis of large-scale GWAS datasets.
DOI: 10.1109/BIBMW.2011.6112454 (https://doi.org/10.1109/BIBMW.2011.6112454); pages 691-695; published 2011-11-12
Citations: 3
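The two-SNP interaction model mentioned in the CUDA-LR abstract above, and the reason it needs an iterative fit, can be sketched on the CPU in plain Python. This is not CUDA-LR code: the genotype data are made up, the function names are hypothetical, and plain gradient ascent is used as a simple stand-in because the abstract does not say which iterative algorithm the tool uses.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def design_row(snp1, snp2):
    # Columns: intercept, SNP1, SNP2, SNP1 x SNP2 interaction.
    return [1.0, float(snp1), float(snp2), float(snp1 * snp2)]

def log_likelihood(beta, rows, labels):
    total = 0.0
    for x, y in zip(rows, labels):
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(rows, labels, steps=2000, lr=0.05):
    """Maximize the log-likelihood by gradient ascent (illustrative only)."""
    beta = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, labels):
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j, xi in enumerate(x):
                grad[j] += (y - p) * xi
        beta = [b + lr * g / len(rows) for b, g in zip(beta, grad)]
    return beta

# Hypothetical (genotype1, genotype2, case/control) samples, genotypes coded 0/1/2.
samples = [
    (0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1), (2, 1, 1), (1, 2, 1),
    (2, 2, 1), (0, 2, 0), (2, 0, 0), (1, 1, 0), (2, 2, 0), (0, 0, 1),
]
rows = [design_row(a, b) for a, b, _ in samples]
labels = [y for _, _, y in samples]
beta = fit_logistic(rows, labels)  # [intercept, snp1, snp2, interaction]
```

In a genome-wide scan this fit would be repeated for every SNP pair, which is the computational burden the abstract refers to; each pair is independent of the others, which is what makes the problem amenable to GPU parallelism.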
Weighted pooling high-throughput gene expression data sets to maximize the functional coherence of the top rank genes
Xiaodong Zhou, E. George
In a typical gene expression study with a high-throughput technique, such as microarray, a biologist usually focuses on the top genes ranked by P-value to establish gene functional relationships or networks, biological pathways, and the microbiological ramifications of the gene selection. With more datasets publicly available, researchers pool data from independent experiments, typically by pooling P-values with equal weight assigned to each dataset, aiming to extract more biological information from the pooled data. However, the quality of datasets may vary substantially, and assigning equal weights may not guarantee the optimal result. Applying the equal-weights approach to six independent datasets, we observe that the top-ranked genes of data pooled with this approach have less functional coherence than the single dataset that has the highest functional coherence. We propose a procedure based on enhanced simulated annealing (ESA) and literature semantic indexing cohesive (LSI-c) analysis to assign optimal weights to datasets so as to maximize the functional coherence of the top-ranked genes ordered by their pooled P-values. We observe significantly more functional coherence in optimally pooled data than in any single dataset or in data pooled with equal weights. Identification of top-ranked genes through our optimal procedure should improve the downstream analysis.
DOI: 10.1109/BIBMW.2011.6112550 (https://doi.org/10.1109/BIBMW.2011.6112550); pages 1033-1033; published 2011-11-12
Citations: 0
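To make the contrast between equal and unequal dataset weights in the abstract above concrete, here is a minimal sketch of weighted P-value pooling. The abstract does not name the pooling formula, so the weighted Z-method (weighted Stouffer) used here is an assumption, as are the function name `pool_p_values` and the example numbers. The ESA weight search and the LSI-c coherence score are not shown.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def pool_p_values(p_values, weights):
    """Combine one-sided P-values for one gene across datasets.

    Each P-value (strictly between 0 and 1) is converted to a Z-score,
    the Z-scores are averaged with the given dataset weights, and the
    result is converted back to a P-value.
    """
    if len(p_values) != len(weights):
        raise ValueError("need one weight per P-value")
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    pooled_z = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    return 1.0 - _STD_NORMAL.cdf(pooled_z)

# Hypothetical gene: strong evidence in dataset A, none in dataset B.
equal = pool_p_values([0.01, 0.5], [1.0, 1.0])
weighted = pool_p_values([0.01, 0.5], [3.0, 1.0])  # trust dataset A more
```

With these numbers the pooled P-value is smaller when the more informative dataset receives the larger weight, which is the effect a weight-optimization procedure would exploit when ranking genes.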
Viral quasispecies reconstruction from amplicon 454 pyrosequencing reads
Nicholas Mancuso, Bassam Tork, P. Skums, I. Măndoiu, A. Zelikovsky
We consider the quasispecies spectrum reconstruction problem in amplicon reads. The main contribution of this paper is several methods to reconstruct HCV quasispecies from simulated error-free amplicon reads. Our comparison with existing methods for quasispecies spectrum reconstruction, based on both shotgun and amplicon reads, shows significant advantages of the proposed technique. In most cases, even low coverage allows reconstruction of the majority of quasispecies and very accurate estimation of their frequencies in the simulated samples. The source code for all implemented algorithms is available at https://bitbucket.org/nmancuso/bioa/
DOI: 10.1109/BIBMW.2011.6112360 (https://doi.org/10.1109/BIBMW.2011.6112360); pages 94-101; published 2011-11-12
Citations: 19
Diagnosis based on decision tree and discrimination analysis for chronic hepatitis b in TCM
Xiaoyu Chen, Lizhuang Ma, Na Chu, Yiyang Hu
Accurate discriminants of the relationship between syndromes and syndrome information (symptoms and lab indicators) are much desired in medical diagnosis applications. Although discriminants have been applied widely, research on and application of a discriminant diagnosis model (DDT) are still lacking for the diagnosis of chronic hepatitis B in traditional Chinese medicine (TCM). In this paper, a new discriminant diagnosis model built from attribute selection, the C5.0 decision tree algorithm and discrimination analysis is proposed. It consists of two phases. The first is attribute selection, in which the critical attributes are filtered out of the original attributes. The second is the modeling phase, which acquires discriminants between syndromes of chronic hepatitis B and syndrome information in TCM. In our experiments, combinations of TCM clinical symptoms and lab indicators are selected from the original 247 symptoms to provide formulas for syndrome differentiation of chronic hepatitis B in TCM, and the model shows good prospects for application in TCM diagnosis.
DOI: 10.1109/BIBMW.2011.6112478 (https://doi.org/10.1109/BIBMW.2011.6112478); pages 817-822; published 2011-11-12
Citations: 4
An adaptive feature reduction algorithm for cancer classification using wavelet decomposition of serum proteomic and DNA microarray data
S. Rashid, G. M. Maruf
A significant challenge in DNA microarray and mass spectrometric data analysis can be attributed to the problem of having a large number of features with a small number of samples or patients in the data set. Particular care is required to deal with such a problem, as the low classification accuracy of a model brought about by the small number of features may depict a low predictive capability. To overcome the associated challenges, proper approaches for data preprocessing, feature reduction and identifying the optimal set of features are critical. In this paper, a novel technique is proposed for feature reduction and cancer classification, which is applicable to two different types of biological data. The proposed method has been implemented on surface-enhanced laser desorption/ionization time-of-flight mass spectrometric (SELDI-TOF-MS) and DNA microarray data sets. This technique is self-adaptive and independent of the type of data set. We have developed a two-step strategy for feature reduction consisting of (1) data preprocessing, which includes merging and t-testing, and (2) wavelet decomposition. For classification, a support vector machine (SVM) is proposed. Evaluating the performance of the proposed algorithm on the two types of datasets shows that the features selected by the proposed method consistently give excellent classification accuracy, sensitivity and specificity.
DOI: 10.1109/BIBMW.2011.6112391 (https://doi.org/10.1109/BIBMW.2011.6112391); pages 305-312; published 2011-11-12
Citations: 5
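The wavelet-decomposition step of the feature-reduction strategy in the abstract above can be illustrated with the simplest wavelet, the Haar wavelet. The choice of Haar, the function names, and the toy signal are assumptions for illustration; the abstract does not state which wavelet family or how many decomposition levels the authors use. Each level halves the number of features by keeping only the approximation coefficients.

```python
from math import sqrt

def haar_step(signal):
    """One level of the Haar transform: (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / sqrt(2) for a, b in pairs]
    detail = [(a - b) / sqrt(2) for a, b in pairs]
    return approx, detail

def reduce_features(signal, levels):
    """Keep only the approximation coefficients after `levels` Haar steps."""
    current = list(signal)
    for _ in range(levels):
        current, _detail = haar_step(current)
    return current

# Hypothetical 8-point intensity profile reduced to 2 features.
profile = [1, 1, 3, 3, 5, 5, 7, 7]
reduced = reduce_features(profile, levels=2)
```

The reduced vector would then be passed to a classifier such as an SVM, which is the classification stage the abstract describes.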
Prediction of Trans-regulators of Recombination Hotspots in Mouse Genome
Min Wu, C. Kwoh, T. Przytycka, Jing Li, Jie Zheng
The regulatory mechanism of recombination is a fundamental problem in genomics, with wide applications in genome-wide association studies, birth-defect diseases, molecular evolution, cancer research, etc. In mammalian genomes, recombination events cluster into short genomic regions called "recombination hotspots". Recently, a 13-mer motif enriched in hotspots was identified as a candidate cis-regulatory element of human recombination hotspots. Moreover, a zinc finger protein, PRDM9, binds to this motif and is associated with variation of the recombination phenotype in human and mouse genomes, and is thus a trans-acting regulator of recombination hotspots. However, this pair of cis- and trans-regulators covers only a fraction of hotspots, so other regulators of recombination hotspots remain to be discovered. In this paper, we propose an approach to predicting additional trans-regulators from DNA-binding proteins by comparing their enrichment of binding sites in hotspots. Applying this approach to newly mapped mouse hotspots genome-wide, we confirmed that PRDM9 is a major trans-regulator of hotspots. In addition, a list of top candidate trans-regulators of mouse hotspots is reported. Using GO analysis, we observed that the top genes are enriched for histone modification functions, highlighting the epigenetic regulatory mechanisms of recombination hotspots.
DOI: 10.1109/BIBM.2011.77 (https://doi.org/10.1109/BIBM.2011.77); pages 57-62; published 2011-11-12
Citations: 8
Robust analysis of related samples under the presence of population substructure
Sungkyoung Choi, Sungho Won
We propose a new method for genome-wide association analysis with a family-based design. The proposed method is robust against population substructure while being more efficient than traditional methods for related samples, such as the transmission disequilibrium test. The proposed method estimates the correlation matrix between individuals and then applies principal component analysis. To maximize the statistical power, we consider the additive polygenic model, and a best linear unbiased predictor is used as an offset. Simulation studies confirmed that the proposed method is always efficient. The method will be applied to the Framingham Heart Study.
{"title":"Robust analysis of related samples under the presence of population substructure","authors":"Sungkyoung Choi, Sungho Won","doi":"10.1109/BIBMW.2011.6112458","DOIUrl":"https://doi.org/10.1109/BIBMW.2011.6112458","url":null,"abstract":"We propose a new method for genome-wide association analysis with a family-based design. The proposed method is robust against population substructure while it is more efficient than the traditional method such as transmission disequilibrium test for related samples. The proposed method estimates the correlation matrix between individuals and then the principal component analysis is applied. To maximize the statistical power, we consider the additive polygenic model and a best linear unbiased predictor is used as offset. We confirmed that the proposed method is always efficient by simulation studies. The method will be applied to Framingham Heart study.","PeriodicalId":6345,"journal":{"name":"2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops (BIBMW)","volume":"11 1","pages":"714-720"},"PeriodicalIF":0.0,"publicationDate":"2011-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84950649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0