MiningBreastCancer: Selection of Candidate Gene Associated with Breast Cancer via Comparison between Data Mining of TCGA and Text Mining of PubMed

Chou-Cheng Chen, Y. Kuo, Chi-Hui Chiang

Proceedings of the 2020 ACM International Conference on Intelligent Computing and its Emerging Applications
Published: 2020-12-12
DOI: 10.1145/3440943.3444718 (https://doi.org/10.1145/3440943.3444718)

Abstract

In 2016, 12,676 new cases of breast cancer were diagnosed among women in Taiwan. In 2018, the standardized death rate of breast cancer was 12.5 per 100,000 persons. Previous studies have integrated data mining and text mining to identify fusion genes, to find genetic factors for breast cancer, and to select single-gene feature sets for colon cancer discrimination. However, our study is the first to select genes with significantly different expression between normal breast tissue and breast cancer using TCGA data and biostatistics, and then to exclude known genes using PubMed abstracts and natural language processing. The top twenty genes for research potential selected by MiningBreastCancer are EML3, ABCB9, GRASP, KANK3, GPR146, ZNF623, CCDC9, ADCY4, DLL1, ADAM33, GRRP1, LRRN4CL, C14orf180, ABCD4, ABCC6P1, PEAR1, FAM43A, C20orf160, KIF21A and PPFIA3. Few studies of these genes exist, yet their expression differs significantly between breast cancer and normal tissue, across pathologic tumor and lymph node categories, or across pathologic metastasis categories. These results show that MiningBreastCancer can help scientists select genes with research potential. MiningBreastCancer is available at http://bio.yungyun.com.tw/MiningBreastCancer.aspx.
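
The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the statistical test or the text-mining rule, so Welch's t statistic, the cutoffs `t_cutoff` and `max_hits`, the function names, the gene labels and all numbers below are assumptions made up for illustration. The idea is to keep genes whose expression differs strongly between tumor and normal samples, then drop genes that already appear in many breast cancer abstracts.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def select_candidates(tumor, normal, pubmed_hits, t_cutoff=3.0, max_hits=5):
    """Return genes with |t| >= t_cutoff and at most max_hits abstracts,
    ordered from the largest to the smallest |t|.

    tumor / normal: dict mapping gene -> list of expression values.
    pubmed_hits: dict mapping gene -> number of abstracts mentioning the
    gene together with breast cancer (hypothetical pre-computed counts).
    """
    scored = []
    for gene in tumor:
        t = welch_t(tumor[gene], normal[gene])
        # Stage 1: differential expression; stage 2: exclude well-studied genes.
        if abs(t) >= t_cutoff and pubmed_hits.get(gene, 0) <= max_hits:
            scored.append((gene, t))
    scored.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return [gene for gene, _ in scored]


# Toy, invented expression values; not TCGA data.
tumor = {
    "GENE_A": [2.1, 2.3, 1.9, 2.2],
    "GENE_B": [8.0, 8.4, 7.9, 8.2],
    "GENE_C": [5.0, 5.2, 4.9, 5.1],
}
normal = {
    "GENE_A": [5.0, 5.2, 4.8, 5.1],
    "GENE_B": [4.0, 4.1, 3.9, 4.2],
    "GENE_C": [5.1, 4.9, 5.0, 5.2],
}
pubmed_hits = {"GENE_A": 1, "GENE_B": 2500, "GENE_C": 0}

candidates = select_candidates(tumor, normal, pubmed_hits)
print(candidates)
```

In this toy input, GENE_B is differentially expressed but heavily published, and GENE_C is rarely published but not differentially expressed, so only GENE_A passes both stages. A real analysis would also need multiple-testing correction across thousands of genes, which this sketch omits.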