A machine learning classification model for cholesterol-lowering peptides

Jose Isagani B. Janairo
Journal: Artificial intelligence chemistry
Publication type: Journal Article
Published: 2023-11-14
DOI: 10.1016/j.aichem.2023.100026
Full text: https://www.sciencedirect.com/science/article/pii/S294974772300026X
Citations: 0

Abstract


Cholesterol-lowering peptides (CLPs) are bioactive biomolecules often derived from food proteins. These short peptides bind to bile acids, leading to decreased intestinal absorption of cholesterol. CLPs are promising bioceuticals that could be used to support interventions for the management of high cholesterol. Integrating machine learning (ML) into the screening and discovery workflow for CLPs can reduce trial and error, thereby accelerating the overall process and increasing its efficiency. In this study, a support vector machine model that can distinguish CLPs from non-CLPs is presented. The model was built on a diverse dataset of 1840 peptides with sequence lengths ranging from 4 to 7. The ML model needs only 8 features (VHSE scores), and feature importance analysis using Shapley and permutation-based methods found the most important features to be related to peptide polarity and hydrophobicity. The formulated ML classifier is reliable, as demonstrated by AUC >0.7 for a diverse test dataset and AUC >0.9 for a conservative validation dataset composed mainly of the top and bottom CLPs. Overall, the presented ML model offers incremental yet meaningful advances in the application of ML to understanding the nature of CLPs and to their discovery and development.
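
The abstract reports AUC values and a permutation-based feature importance analysis without showing how such quantities are computed. The following is a minimal illustrative sketch in plain Python, not the author's code: the data are synthetic random numbers rather than real VHSE scores, and `toy_score` is a hypothetical stand-in for the decision function of a trained classifier such as a support vector machine. It only shows what the two metrics measure for a model with 8 input features.

```python
import random


def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(score_fn, X, y, n_repeats=20, seed=0):
    """Return the baseline AUC and, per feature, the mean AUC drop after shuffling it."""
    rng = random.Random(seed)
    baseline = auc(y, [score_fn(row) for row in X])
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += baseline - auc(y, [score_fn(row) for row in shuffled])
        drops.append(total / n_repeats)
    return baseline, drops


# Synthetic stand-in data (assumption): 8 made-up features per "peptide",
# with only feature 0 influencing the label. These are NOT VHSE scores.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
y = [1 if row[0] + 0.5 * data_rng.gauss(0, 1) > 0 else 0 for row in X]


def toy_score(row):
    # Hypothetical scorer standing in for a trained classifier's decision function.
    return row[0]


baseline_auc, importances = permutation_importance(toy_score, X, y)
print(f"baseline AUC: {baseline_auc:.3f}")
for j, drop in enumerate(importances):
    print(f"feature {j}: mean AUC drop {drop:.3f}")
```

In this toy setup only feature 0 shows a non-zero AUC drop, because the scorer ignores the other seven features. With a real classifier, the same procedure would rank the input descriptors by how much shuffling each one degrades the AUC.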
