支持向量机在生物医学文献分类中的应用。

Nalini Polavarapu, Shamkant B Navathe, Ramprasad Ramnarayanan, Abrar ul Haque, Saurav Sahay, Ying Liu
{"title":"支持向量机在生物医学文献分类中的应用。","authors":"Nalini Polavarapu,&nbsp;Shamkant B Navathe,&nbsp;Ramprasad Ramnarayanan,&nbsp;Abrar ul Haque,&nbsp;Saurav Sahay,&nbsp;Ying Liu","doi":"10.1109/csb.2005.36","DOIUrl":null,"url":null,"abstract":"<p><p>Specific topic search in the PubMed Database, one of the most important information resources for scientific community, presents a big challenge to the users. The researcher typically formulates boolean queries followed by scanning the retrieved records for relevance, which is very time consuming and error prone. We applied Support Vector Machines (SVM) for automatic retrieval of PubMed articles related to Human genome epidemiological research at CDC (Center for disease Control and Prevention). In this paper, we discuss various investigations into biomedical literature classification and analyze the effect of various issues related to the choice of keywords, training sets, kernel functions and parameters for the SVM technique. We report on the various factors above to show that SVM is a viable technique for automatic classification of biomedical literature into topics of interest such as epidemiology, cancer, birth defects etc. In all our experiments, we achieved high values of PPV, sensitivity and specificity.</p>","PeriodicalId":87417,"journal":{"name":"Proceedings. IEEE Computational Systems Bioinformatics Conference","volume":" ","pages":"366-74"},"PeriodicalIF":0.0000,"publicationDate":"2005-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/csb.2005.36","citationCount":"32","resultStr":"{\"title\":\"Investigation into biomedical literature classification using support vector machines.\",\"authors\":\"Nalini Polavarapu,&nbsp;Shamkant B Navathe,&nbsp;Ramprasad Ramnarayanan,&nbsp;Abrar ul Haque,&nbsp;Saurav Sahay,&nbsp;Ying Liu\",\"doi\":\"10.1109/csb.2005.36\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Specific topic search in the PubMed Database, one of the most important information resources for scientific community, presents a big challenge to the users. The researcher typically formulates boolean queries followed by scanning the retrieved records for relevance, which is very time consuming and error prone. We applied Support Vector Machines (SVM) for automatic retrieval of PubMed articles related to Human genome epidemiological research at CDC (Center for disease Control and Prevention). In this paper, we discuss various investigations into biomedical literature classification and analyze the effect of various issues related to the choice of keywords, training sets, kernel functions and parameters for the SVM technique. We report on the various factors above to show that SVM is a viable technique for automatic classification of biomedical literature into topics of interest such as epidemiology, cancer, birth defects etc. In all our experiments, we achieved high values of PPV, sensitivity and specificity.</p>\",\"PeriodicalId\":87417,\"journal\":{\"name\":\"Proceedings. IEEE Computational Systems Bioinformatics Conference\",\"volume\":\" \",\"pages\":\"366-74\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2005-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1109/csb.2005.36\",\"citationCount\":\"32\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings. IEEE Computational Systems Bioinformatics Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/csb.2005.36\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. IEEE Computational Systems Bioinformatics Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/csb.2005.36","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 32

摘要

PubMed数据库是科学界最重要的信息资源之一,在PubMed数据库中进行专题检索给用户带来了很大的挑战。研究人员通常制定布尔查询,然后扫描检索到的记录的相关性,这是非常耗时和容易出错的。应用支持向量机(SVM)自动检索美国疾病控制与预防中心(CDC)与人类基因组流行病学研究相关的PubMed文章。本文讨论了生物医学文献分类的各种研究,并分析了支持向量机技术中关键字、训练集、核函数和参数选择等问题对支持向量机分类的影响。我们报告了上述各种因素,以表明支持向量机是一种可行的技术,用于将生物医学文献自动分类为感兴趣的主题,如流行病学,癌症,出生缺陷等。在我们所有的实验中,我们都获得了很高的PPV值,灵敏度和特异性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Investigation into biomedical literature classification using support vector machines.

Specific topic search in the PubMed Database, one of the most important information resources for scientific community, presents a big challenge to the users. The researcher typically formulates boolean queries followed by scanning the retrieved records for relevance, which is very time consuming and error prone. We applied Support Vector Machines (SVM) for automatic retrieval of PubMed articles related to Human genome epidemiological research at CDC (Center for disease Control and Prevention). In this paper, we discuss various investigations into biomedical literature classification and analyze the effect of various issues related to the choice of keywords, training sets, kernel functions and parameters for the SVM technique. We report on the various factors above to show that SVM is a viable technique for automatic classification of biomedical literature into topics of interest such as epidemiology, cancer, birth defects etc. In all our experiments, we achieved high values of PPV, sensitivity and specificity.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Tree decomposition based fast search of RNA structures including pseudoknots in genomes. An algebraic geometry approach to protein structure determination from NMR data. A tree-decomposition approach to protein structure prediction. A pivoting algorithm for metabolic networks in the presence of thermodynamic constraints. A topological measurement for weighted protein interaction network.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1