Characterization of Cancer Types by Applying Machine Learning Methods on Blood RNA-Sequencing Data

Cem Bugra Alkan, Z. Işik

Published in: 2019 3rd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), 2019-10-01
DOI: 10.1109/ISMSIT.2019.8932905 (https://doi.org/10.1109/ISMSIT.2019.8932905)
Citations: 2

Abstract

RNA-sequencing data measures the mRNA levels of genes in tissue or blood samples. Critical changes in the transcriptome can be observed more accurately with RNA-sequencing data, which helps in understanding the different behaviors of a disease. This study compares feature selection methods and machine learning algorithms for classifying cancer types from RNA-sequencing data of blood samples. Seven cancer types were compared with each other and with healthy samples. The correlation coefficient and information gain were applied as feature selection methods. The selected genes were provided as input to Support Vector Machine (SVM), Naïve Bayes (NB), and Random Forest (RF) classifiers. All classifiers were evaluated with 10-fold cross-validation. In the experiments, the models achieved higher than 85% accuracy in discriminating hepatobiliary, lung, and pancreatic cancer types. In terms of accuracy, RF and SVM were more successful than NB in many cases.
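
The two feature selection criteria and the 10-fold split named in the abstract can be illustrated with a small, standard-library-only sketch. This is not the authors' implementation: the gene names, the toy expression values, the median-split discretization used for information gain, and all helper names (`pearson_abs`, `information_gain`, `top_k`, `k_fold_indices`) are assumptions made for illustration only.

```python
import math
from collections import Counter


def pearson_abs(xs, ys):
    """Absolute Pearson correlation between one gene's values and numeric labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(xs, ys):
    """Entropy reduction after splitting a gene's values at the upper median.

    The median split is an assumption; other discretizations are possible.
    """
    threshold = sorted(xs)[len(xs) // 2]
    low = [y for x, y in zip(xs, ys) if x < threshold]
    high = [y for x, y in zip(xs, ys) if x >= threshold]
    n = len(ys)
    remainder = sum(len(p) / n * entropy(p) for p in (low, high) if p)
    return entropy(ys) - remainder


def top_k(expression, labels, score, k):
    """Names of the k genes with the highest score against the labels."""
    ranked = sorted(expression, key=lambda g: score(expression[g], labels),
                    reverse=True)
    return ranked[:k]


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; fold i tests every k-th sample from i."""
    for i in range(k):
        test = list(range(i, n_samples, k))
        train = [j for j in range(n_samples) if j % k != i]
        yield train, test


# Hypothetical toy data: three genes, six samples (0 = healthy, 1 = cancer).
expression = {
    "GENE_A": [1.0, 1.2, 0.9, 5.1, 4.8, 5.3],
    "GENE_B": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "GENE_C": [3.0, 3.5, 4.0, 3.8, 4.5, 4.2],
}
labels = [0, 0, 0, 1, 1, 1]

by_correlation = top_k(expression, labels, pearson_abs, 2)
by_info_gain = top_k(expression, labels, information_gain, 2)
folds = list(k_fold_indices(20, k=10))
```

In a full pipeline, the selected genes would be passed to the classifiers, and the feature ranking would normally be recomputed on the training portion of each fold so that the held-out samples do not influence which genes are chosen.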