Impact of Multi-Factor Features on Protein Secondary Structure Prediction

IF 4.8 | Zone 2 (Biology) | Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY | Biomolecules | Pub Date: 2024-09-13 | DOI: 10.3390/biom14091155
Benzhi Dong, Zheng Liu, Dali Xu, Chang Hou, Na Niu, Guohua Wang
Citations: 0

Abstract

Protein secondary structure prediction (PSSP) plays a crucial role in elucidating protein functions and properties. The field has advanced considerably in recent years, and one important route to higher prediction accuracy is the use of multiple protein-related features, including amino acid sequences, position-specific scoring matrices (PSSM), amino acid properties, and secondary structure trend factors. However, current work lacks a comprehensive evaluation of how these factor features affect secondary structure prediction. This study quantitatively analyzes the impact of several major factors on secondary structure prediction models using four classes of more interpretable machine learning methods. It examines in detail the applicability of each factor to the different types of methods, how effectively each method exploits each factor, and the effect of multi-factor combinations. The experiments and analyses show that PSSM performs best in methods with strong capabilities for handling high-dimensional features and extracting complex features, whereas amino acid sequences, although performing poorly overall, perform relatively well in methods with strong linear processing capabilities. In addition, combining amino acid properties with trend factors significantly improved prediction performance. These results give future researchers empirical evidence for optimizing multi-factor feature combinations in protein secondary structure prediction models and thereby improving the performance of those models.
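As a rough illustration of what a "multi-factor feature combination" can look like in practice, the sketch below concatenates per-residue factors (a one-hot sequence encoding, a PSSM row, and one amino acid property) and then builds sliding-window inputs. It is not the authors' pipeline: the function names, the choice of the Kyte-Doolittle hydropathy index as the single property, the zero-padded window, and the all-zero placeholder PSSM are assumptions made for this example, and trend factors are omitted because the abstract does not define them.

```python
from typing import List, Optional, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index, standing in for "amino acid properties".
# Choosing this one property is an assumption made for illustration.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def one_hot(residue: str) -> List[float]:
    """20-dimensional one-hot encoding; unknown residues map to all zeros."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def residue_features(
    sequence: str,
    pssm: Optional[Sequence[Sequence[float]]] = None,
    use_sequence: bool = True,
    use_property: bool = True,
) -> List[List[float]]:
    """Concatenate the selected factors into one feature row per residue."""
    rows = []
    for i, residue in enumerate(sequence):
        row: List[float] = []
        if use_sequence:
            row.extend(one_hot(residue))
        if pssm is not None:
            # One PSSM row (20 scores) per residue, supplied by the caller.
            row.extend(float(v) for v in pssm[i])
        if use_property:
            row.append(HYDROPATHY.get(residue, 0.0))
        rows.append(row)
    return rows


def window_features(rows: List[List[float]], half_width: int = 1) -> List[List[float]]:
    """Join each residue's row with its neighbours' rows, zero-padding at the ends."""
    dim = len(rows[0]) if rows else 0
    pad = [0.0] * dim
    windows = []
    for i in range(len(rows)):
        window: List[float] = []
        for j in range(i - half_width, i + half_width + 1):
            window.extend(rows[j] if 0 <= j < len(rows) else pad)
        windows.append(window)
    return windows


if __name__ == "__main__":
    seq = "ACD"
    placeholder_pssm = [[0.0] * 20 for _ in seq]  # hypothetical values, not real profile data
    rows = residue_features(seq, placeholder_pssm)
    print(len(rows[0]), len(window_features(rows, half_width=1)[0]))
```

Switching individual factors on and off (for example `use_property=False` or `pssm=None`) gives a simple way to compare single-factor and multi-factor inputs with the same downstream classifier, which is the kind of comparison the abstract describes.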
Source journal
Biomolecules
Subject area: Biochemistry, Genetics and Molecular Biology-Molecular Biology
CiteScore: 9.40
Self-citation rate: 3.60%
Articles published: 1640
Review time: 18.28 days
About the journal: Biomolecules (ISSN 2218-273X) is an international, peer-reviewed open access journal focusing on biogenic substances and their biological functions, structures, interactions with other molecules, and their microenvironment as well as biological systems. Biomolecules publishes reviews, regular research papers and short communications. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced.