特征选择方法对基于机器学习的印尼药用植物化合物与HIV-1蛋白酶对接预测的影响

Rahman Pujianto, Yohanes Gultom, A. Wibisono, Arry Yanuar, H. Suhartanto
{"title":"特征选择方法对基于机器学习的印尼药用植物化合物与HIV-1蛋白酶对接预测的影响","authors":"Rahman Pujianto, Yohanes Gultom, A. Wibisono, Arry Yanuar, H. Suhartanto","doi":"10.1109/ICACSIS47736.2019.8979672","DOIUrl":null,"url":null,"abstract":"This work evaluates usage feature selection methods to reduce the number of features required to predict docking results between Indonesian medicinal plant compounds and HIV protease. Two feature selection methods, Recursive Feature Elimination (RFE) and Wrapper Method (WM), are trained with a dataset of 7,330 samples and 667 features from PubChem Bioassay and DUD-E decoys. To evaluate the selected features, a dataset of 368 Indonesian herbal chemical compounds labeled by manually docking to PDB HIV-1 protease is used to benchmark the performance of linear SVM classifier using different sets of features. Our experiments show that a set of 471 features selected by RFE and 249 by WM achieve a reduction of classification time by 4.0 and 8.2 seconds respectively. Although the accuracy and sensitivity are also increased by 8% and 16%, no meaningful improvement observed for precision and specificity.","PeriodicalId":165090,"journal":{"name":"2019 International Conference on Advanced Computer Science and information Systems (ICACSIS)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The impact of feature selection methods on machine learning-based docking prediction of Indonesian medicinal plant compounds and HIV-1 protease\",\"authors\":\"Rahman Pujianto, Yohanes Gultom, A. Wibisono, Arry Yanuar, H. Suhartanto\",\"doi\":\"10.1109/ICACSIS47736.2019.8979672\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This work evaluates usage feature selection methods to reduce the number of features required to predict docking results between Indonesian medicinal plant compounds and HIV protease. Two feature selection methods, Recursive Feature Elimination (RFE) and Wrapper Method (WM), are trained with a dataset of 7,330 samples and 667 features from PubChem Bioassay and DUD-E decoys. To evaluate the selected features, a dataset of 368 Indonesian herbal chemical compounds labeled by manually docking to PDB HIV-1 protease is used to benchmark the performance of linear SVM classifier using different sets of features. Our experiments show that a set of 471 features selected by RFE and 249 by WM achieve a reduction of classification time by 4.0 and 8.2 seconds respectively. Although the accuracy and sensitivity are also increased by 8% and 16%, no meaningful improvement observed for precision and specificity.\",\"PeriodicalId\":165090,\"journal\":{\"name\":\"2019 International Conference on Advanced Computer Science and information Systems (ICACSIS)\",\"volume\":\"51 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Advanced Computer Science and information Systems (ICACSIS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICACSIS47736.2019.8979672\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Advanced Computer Science and information Systems (ICACSIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICACSIS47736.2019.8979672","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

这项工作评估了使用特征选择方法,以减少预测印度尼西亚药用植物化合物与HIV蛋白酶之间对接结果所需的特征数量。两种特征选择方法,递归特征消除(RFE)和包装方法(WM),使用来自PubChem Bioassay和ddu - e诱饵的7,330个样本和667个特征的数据集进行训练。为了评估所选择的特征,使用368个印度尼西亚草药化合物的数据集,通过人工对接PDB HIV-1蛋白酶进行标记,对使用不同特征集的线性支持向量机分类器的性能进行基准测试。我们的实验表明,RFE选择471个特征集,WM选择249个特征集,分类时间分别减少4.0秒和8.2秒。虽然准确度和灵敏度也分别提高了8%和16%,但在精密度和特异性方面没有明显的提高。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
The impact of feature selection methods on machine learning-based docking prediction of Indonesian medicinal plant compounds and HIV-1 protease
This work evaluates usage feature selection methods to reduce the number of features required to predict docking results between Indonesian medicinal plant compounds and HIV protease. Two feature selection methods, Recursive Feature Elimination (RFE) and Wrapper Method (WM), are trained with a dataset of 7,330 samples and 667 features from PubChem Bioassay and DUD-E decoys. To evaluate the selected features, a dataset of 368 Indonesian herbal chemical compounds labeled by manually docking to PDB HIV-1 protease is used to benchmark the performance of linear SVM classifier using different sets of features. Our experiments show that a set of 471 features selected by RFE and 249 by WM achieve a reduction of classification time by 4.0 and 8.2 seconds respectively. Although the accuracy and sensitivity are also increased by 8% and 16%, no meaningful improvement observed for precision and specificity.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Evaluation of Instructional and User Interface Design for MOOC: Short and Free FutureLearn Courses Evaluation and Recommendations for the Instructional Design and User Interface Design of Coursera MOOC Platform Adult Content Classification on Indonesian Tweets using LSTM Neural Network Development of the Online Collaborative Summarizing Feature on Student-Centered E-Learning Environment Discriminating Unknown Software Using Distance Model
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1