Binary Markov Random Fields and interpretable mass spectra discrimination.

IF 0.8 4区 数学 Q4 BIOCHEMISTRY & MOLECULAR BIOLOGY Statistical Applications in Genetics and Molecular Biology Pub Date : 2017-02-11 DOI:10.1515/sagmb-2016-0019
Ao Kong, Robert Azencott
{"title":"Binary Markov Random Fields and interpretable mass spectra discrimination.","authors":"Ao Kong, Robert Azencott","doi":"10.1515/sagmb-2016-0019","DOIUrl":null,"url":null,"abstract":"<p><p>For mass spectra acquired from cancer patients by MALDI or SELDI techniques, automated discrimination between cancer types or stages has often been implemented by machine learning algorithms. Nevertheless, these techniques typically lack interpretability in terms of biomarkers. In this paper, we propose a new mass spectra discrimination algorithm by parameterized Markov Random Fields to automatically generate interpretable classifiers with small groups of scored biomarkers. A dataset of 238 MALDI colorectal mass spectra and two datasets of 216 and 253 SELDI ovarian mass spectra respectively were used to test our approach. The results show that our approach reaches accuracies of 81% to 100% to discriminate between patients from different colorectal and ovarian cancer stages, and performs as well or better than previous studies on similar datasets. Moreover, our approach enables efficient planar-displays to visualize mass spectra discrimination and has good asymptotic performance for large datasets. Thus, our classifiers should facilitate the choice and planning of further experiments for biological interpretation of cancer discriminating signatures. In our experiments, the number of mass spectra for each colorectal cancer stage is roughly half of that for each ovarian cancer stage, so that we reach lower discrimination accuracy for colorectal cancer than for ovarian cancer.</p>","PeriodicalId":48980,"journal":{"name":"Statistical Applications in Genetics and Molecular Biology","volume":" ","pages":""},"PeriodicalIF":0.8000,"publicationDate":"2017-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Statistical Applications in Genetics and Molecular Biology","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1515/sagmb-2016-0019","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

For mass spectra acquired from cancer patients by MALDI or SELDI techniques, automated discrimination between cancer types or stages has often been implemented by machine learning algorithms. Nevertheless, these techniques typically lack interpretability in terms of biomarkers. In this paper, we propose a new mass spectra discrimination algorithm by parameterized Markov Random Fields to automatically generate interpretable classifiers with small groups of scored biomarkers. A dataset of 238 MALDI colorectal mass spectra and two datasets of 216 and 253 SELDI ovarian mass spectra respectively were used to test our approach. The results show that our approach reaches accuracies of 81% to 100% to discriminate between patients from different colorectal and ovarian cancer stages, and performs as well or better than previous studies on similar datasets. Moreover, our approach enables efficient planar-displays to visualize mass spectra discrimination and has good asymptotic performance for large datasets. Thus, our classifiers should facilitate the choice and planning of further experiments for biological interpretation of cancer discriminating signatures. In our experiments, the number of mass spectra for each colorectal cancer stage is roughly half of that for each ovarian cancer stage, so that we reach lower discrimination accuracy for colorectal cancer than for ovarian cancer.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
二元马尔可夫随机场和可解释质谱鉴别。
对于通过 MALDI 或 SELDI 技术从癌症患者身上获取的质谱,通常采用机器学习算法自动区分癌症类型或阶段。然而,这些技术通常缺乏生物标记的可解释性。在本文中,我们提出了一种新的质谱判别算法,该算法通过参数化马尔可夫随机场自动生成可解释的分类器,并对生物标记物进行分组评分。我们使用了一个包含 238 个 MALDI 大肠癌质谱的数据集和两个分别包含 216 个和 253 个 SELDI 卵巢癌质谱的数据集来测试我们的方法。结果表明,我们的方法在区分不同阶段的结直肠癌和卵巢癌患者方面达到了 81% 至 100% 的准确率,与之前类似数据集的研究结果相比,我们的方法表现更好甚至更好。此外,我们的方法还能以高效的平面显示方式可视化质谱鉴别,并对大型数据集具有良好的渐进性能。因此,我们的分类器有助于选择和规划进一步的实验,以对癌症鉴别特征进行生物学解释。在我们的实验中,每个结直肠癌阶段的质谱数量大约是每个卵巢癌阶段的一半,因此我们对结直肠癌的判别准确率低于卵巢癌。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Statistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology BIOCHEMISTRY & MOLECULAR BIOLOGY-MATHEMATICAL & COMPUTATIONAL BIOLOGY
自引率
11.10%
发文量
8
期刊介绍: Statistical Applications in Genetics and Molecular Biology seeks to publish significant research on the application of statistical ideas to problems arising from computational biology. The focus of the papers should be on the relevant statistical issues but should contain a succinct description of the relevant biological problem being considered. The range of topics is wide and will include topics such as linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarray data, molecular evolution and phylogenetic trees, DNA topology, and data base search strategies. Both original research and review articles will be warmly received.
期刊最新文献
When is the allele-sharing dissimilarity between two populations exceeded by the allele-sharing dissimilarity of a population with itself? Sparse latent factor regression models for genome-wide and epigenome-wide association studies Low variability in the underlying cellular landscape adversely affects the performance of interaction-based approaches for conducting cell-specific analyses of DNA methylation in bulk samples. AdaReg: data adaptive robust estimation in linear regression with application in GTEx gene expressions. Collocation based training of neural ordinary differential equations.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1