Unveiling Protein Corona Composition: Predicting with Resampling Embedding and Machine Learning

IF 5.6 | Zone 1 (Medicine) | Q1 Materials Science, Biomaterials | Regenerative Biomaterials | Pub Date: 2023-12-12 | DOI: 10.1093/rb/rbad082
Rong Liao, Yan Zhuang, Xiangfeng Li, Ke Chen, Xingming Wang, Cong Feng, Guangfu Yin, Xiangdong Zhu, Jiangli Lin, Xingdong Zhang
Citations: 0

Abstract
Biomaterials with surface nanostructures effectively enhance protein secretion and stimulate tissue regeneration. When nanoparticles (NPs) enter a living system, they quickly interact with proteins in body fluids, forming a protein corona (PC). Accurate prediction of PC composition is critical for analyzing the osteoinductivity of biomaterials and guiding the reverse design of NPs, but it remains a significant challenge. Although several machine learning (ML) models such as random forest (RF) have been used for PC prediction, they often fail to account for extreme values in the abundance region of PC adsorption and struggle to improve accuracy because the data distribution is imbalanced. In this study, resampling embedding was introduced to address the imbalanced distribution of PC data. Various ML models were evaluated, and an RF model was ultimately selected for prediction, yielding good coefficient of determination (R2) and root-mean-square error (RMSE) values. Ablation experiments showed that the proposed method achieved an R2 of 0.68, an improvement of approximately 10%, and an RMSE of 0.90, a reduction of approximately 10%. Furthermore, in a verification using label-free quantification of four NPs, hydroxyapatite (HA), titanium dioxide (TiO2), silicon dioxide (SiO2) and silver (Ag), the model achieved an R2 above 0.70 with Random Oversampling. Finally, feature analysis revealed that PC composition is most strongly influenced by incubation plasma concentration, polydispersity index (PDI) and surface modification.
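The abstract does not spell out how the resampling step works, so the following is only a minimal sketch of the general idea behind random oversampling for an imbalanced continuous target, together with the two metrics it reports. It is not the authors' implementation: the binning strategy, the bin count, the function names (`random_oversample`, `r2_score`, `rmse`) and the toy abundance values are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def random_oversample(rows, targets, n_bins=5, seed=0):
    """Hypothetical helper: bin the continuous target into equal-width bins,
    then duplicate randomly chosen samples in sparse bins until every
    non-empty bin is as large as the most populated one."""
    rng = random.Random(seed)
    lo, hi = min(targets), max(targets)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all targets are equal
    bins = defaultdict(list)
    for i, y in enumerate(targets):
        b = min(int((y - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        bins[b].append(i)
    largest = max(len(idx) for idx in bins.values())
    chosen = []
    for idx in bins.values():
        chosen.extend(idx)                                      # keep every original sample
        chosen.extend(rng.choices(idx, k=largest - len(idx)))   # add duplicates with replacement
    return [rows[i] for i in chosen], [targets[i] for i in chosen]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy data (made up): most abundances are low, two are extreme.
targets = [0.1, 0.2, 0.15, 0.3, 0.25, 0.2, 0.1, 4.8, 5.0]
rows = [[i] for i in range(len(targets))]  # placeholder feature vectors
X_bal, y_bal = random_oversample(rows, targets)
print(len(targets), "->", len(y_bal), "samples after oversampling")
```

In a real pipeline the balanced set would be fed to a regressor such as a random forest, and oversampling would normally be applied to the training split only, so that duplicated samples cannot leak into the data used to compute R2 and RMSE.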
Source journal: Regenerative Biomaterials (Materials Science, Biomaterials)
CiteScore: 7.90
Self-citation rate: 16.40%
Articles published: 92
Review time: 10 weeks
Journal description: Regenerative Biomaterials is an international, interdisciplinary, peer-reviewed journal publishing the latest advances in biomaterials and regenerative medicine. The journal provides a forum for original research papers, reviews, clinical case reports, and commentaries on topics relevant to the development of advanced regenerative biomaterials, covering novel regenerative technologies and therapeutic approaches for the regeneration and repair of damaged tissues and organs. The interactions of biomaterials with cells and tissue, especially stem cells, are a particular focus.