Proteins-Structure Function and Bioinformatics: Latest Articles

Engaging the Community: CASP Special Interest Groups.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-04-30 | DOI: 10.1002/prot.26833
Arne Elofsson, Rachael C Kretsch, Marcin Magnus, Gaetano T Montelione

The Critical Assessment of Structure Prediction (CASP) brings together a diverse group of scientists, from deep learning experts to NMR specialists, all aimed at developing accurate prediction algorithms that can effectively characterize the structural aspects of biomolecules relevant to their functions. Engagement within the CASP community has traditionally been limited to the prediction season and the conference, with limited discourse in the 1.5 years between CASP seasons. CASP special interest groups (SIGs) were established in 2023 to encourage continuous dialogue within the community. The online seminar series has drawn global participation from across disciplines and career stages. This has facilitated cross-disciplinary discussions fostering collaborations. The archives of these seminars have become a vital learning tool for newcomers to the field, lowering the barrier to entry.

Pages: 432-434. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12353253/pdf/
Citations: 0
Enhancing RNA 3D Structure Prediction in CASP16: Integrating Physics-Based Modeling With Machine Learning for Improved Predictions.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-06-09 | DOI: 10.1002/prot.26856
Sicheng Zhang, Jun Li, Yuanzhe Zhou, Shi-Jie Chen

During the 16th Critical Assessment of Structure Prediction (CASP16), the Vfold team participated in the two RNA categories: RNA Monomers and RNA Multimers. The Vfold RNA structure prediction method is hierarchical and hybrid, incorporating physics-based models (Vfold2D and VfoldMCPX) for 2D structure prediction, template-based and molecular dynamics simulation-based models (Vfold-Pipeline, IsRNA and RNAJP) for 3D structure prediction. Additionally, Vfold integrates knowledge from templates and the state-of-the-art machine learning model AlphaFold3 into our physics-based models. This integration enhances the prediction accuracy. Here we describe the Vfold approach in CASP16 using selected targets and show how the integration of traditional structure prediction methods with machine learning models can improve RNA structure prediction accuracy.

Pages: 239-248. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12354339/pdf/
Citations: 0
CASP16 Protein Monomer Structure Prediction Assessment.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-08-17 | DOI: 10.1002/prot.70031
Rongqing Yuan, Jing Zhang, Andriy Kryshtafovych, R Dustin Schaeffer, Jian Zhou, Qian Cong, Nick V Grishin

The assessment of monomer targets in the Critical Assessment of Structure Prediction Round 16 (CASP16) underscores that the problem of single-domain protein fold prediction is nearly solved-no target folds were incorrectly predicted across all Evaluation Units. However, challenges remain in accurately modeling truncated sequences, irregular secondary structures, and interaction-induced conformational changes. The release of AlphaFold3 (AF3) during CASP16, and its effective integration by many groups, demonstrated its superiority over AlphaFold2 (AF2), particularly in confidence estimation and model selection. Additional improvements in multiple sequence alignments (MSAs) and fragment-based prediction, that is, selecting the optimal fragment of the full sequence for modeling, also contributed to enhanced prediction accuracy. The top three groups-all from the Yang lab-consistently outperformed others across CASP16 monomer targets, reflecting their robust modeling pipelines and successful adoption of AF3. CASP16 also introduced three new challenges: Phase 0, in which stoichiometry was withheld; Phase 2, which supplied ~8000 MassiveFold models per target to test model selection strategies; and Model 6, which limited predictors to using MSAs provided by the organizers. While we evaluated group performance in these additional challenges, the insights gained were limited due to low participation and caveats in the design of experiments. We suggest improvements for the organization of these challenges and encourage broader engagement from the prediction community. The progress in monomer modeling from CASP15 to CASP16 was subtle, but more groups in CASP16 were able to outperform ColabFold, reflecting the community's improved ability in optimizing AF2 and the growing adoption of AF3. We anticipate that the recent release of the AF3 source code will stimulate future progress through user-driven optimization and innovations in model architecture. 
Finally, model ranking remains a persistent weakness across most groups, highlighting a critical area for future development.

Pages: 86-105. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750037/pdf/
Citations: 0
Model Quality Assessment for CASP16.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-08-22 | DOI: 10.1002/prot.70037
Alisia Fadini, Gabriel Studer, Randy J Read

The CASP16 evaluation of model accuracy (EMA) experiment assessed the ability of predictors to estimate the accuracy of predicted models, with a particular emphasis on multimeric assemblies. Expanding on the CASP15 framework, CASP16 introduced a new evaluation mode (QMODE3) focused on selecting high-quality models from large-scale AlphaFold2-derived model pools generated by MassiveFold. Three primary evaluation tasks were therefore conducted: QMODE1 assessed global structure accuracy, QMODE2 focused on the accuracy of interface residues, and QMODE3 tested model selection performance. Predictors were evaluated using a diverse set of OpenStructure-based metrics, and a novel penalty-based ranking scheme was developed for QMODE3 to handle score interdependence and varying prediction quality distributions. Additionally, we explored the accuracy and utility of predicted local confidence measures now made available on a per-atom basis by methods that invoke AlphaFold3. Results showed that methods incorporating AlphaFold3-derived features-particularly per-atom pLDDT-performed best in estimating local accuracy and in utility for experimental structure solution. For QMODE3, performance varied significantly across monomeric, homomeric, and heteromeric target categories and underscored the ongoing challenge of evaluating complex assemblies.
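As a rough illustration of what "model selection performance" can mean, the sketch below computes a simple top-1 selection loss: the gap in true quality between the best model in a pool and the model a predictor scored highest. This is a generic measure chosen for illustration; it is not the penalty-based ranking scheme described in the abstract, and the function name and example scores are hypothetical.

```python
def top1_loss(predicted_scores, true_scores):
    """Quality lost by trusting the predictor's top-ranked model.

    predicted_scores: the predictor's estimated quality for each model in the pool.
    true_scores: the quality of each model measured against the experimental structure.
    Returns 0.0 when the predictor's first choice is also the truly best model.
    """
    if not predicted_scores or len(predicted_scores) != len(true_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    # Index of the model the predictor likes best (first one wins on ties).
    picked = max(range(len(predicted_scores)), key=predicted_scores.__getitem__)
    return max(true_scores) - true_scores[picked]


if __name__ == "__main__":
    # Hypothetical pool of three models.
    print(top1_loss([0.9, 0.7, 0.8], [0.60, 0.85, 0.80]))
```

A lower value is better. Averaging this loss over targets gives one number per predictor, although it ignores how the quality scores are distributed within each pool.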

Pages: 302-313. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750031/pdf/
Citations: 0
Updates to the CASP Infrastructure in 2024.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-09-01 | DOI: 10.1002/prot.70042
Andriy Kryshtafovych, Maciej Milostan, Marc F Lensink, Sameer Velankar, Alexandre M J J Bonvin, John Moult, Krzysztof Fidelis

CASP (critical assessment of structure prediction) conducts community experiments to determine the state of the art in calculating macromolecular structures. The CASP data management system is continually evolving to address the changing needs of the experiments. For CASP16, we expanded the infrastructure to enable data handling of newly introduced categories and fully support pilot categories introduced in CASP15. This technical note also documents the integration of the CASP and CAPRI (Critical Assessment of PRedicted Interactions) systems.

Pages: 15-24. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12422709/pdf/
Citations: 0
Accurate Biomolecular Structure Prediction in CASP16 With Optimized Inputs to State-Of-The-Art Predictors.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-08-05 | DOI: 10.1002/prot.70030
Wenkai Wang, Yuxian Luo, Zhenling Peng, Jianyi Yang

Biomolecular structure prediction has reached an unprecedented level of accuracy, partly attributed to the use of advanced deep learning algorithms. We participated in the CASP16 experiments across the categories of protein domains, protein multimers, and RNA monomers, achieving official rankings of first, second, and fourth (top for server groups), respectively. We hypothesized that by leveraging state-of-the-art structure predictors such as AlphaFold2, AlphaFold3, trRosettaX2, and trRosettaRNA2, accurate structure predictions could be achieved through careful optimization of input information. For protein structure prediction, we enhanced the input sequences by removing intrinsically disordered regions, a simple yet effective approach that yielded accurate models for protein domains. However, fewer than 25% of the protein multimers were predicted with high quality. In RNA structure prediction, optimizing the secondary structure input for trRosettaRNA2 resulted in more accurate predictions than AlphaFold3. In summary, our prediction results in CASP16 indicate that protein domain structure prediction has achieved high accuracy. However, predicting protein multimers and RNA structures remains challenging, and we anticipate new advancements in these areas in the coming years.
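The input-trimming idea (removing intrinsically disordered regions before prediction) might be sketched as follows. Everything here is an assumption made for illustration: the per-residue disorder scores would come from some external disorder predictor, and the threshold, minimum run length, and function name are hypothetical rather than the authors' settings.

```python
def remove_disordered(sequence, disorder, threshold=0.5, min_run=10):
    """Drop stretches of at least `min_run` consecutive residues whose
    disorder score is >= `threshold`; shorter stretches are kept.

    sequence: amino-acid string.
    disorder: one score per residue (hypothetical predictor output).
    """
    if len(sequence) != len(disorder):
        raise ValueError("need exactly one disorder score per residue")
    keep = [True] * len(sequence)
    run_start = None
    # A trailing sentinel below any threshold closes a run that reaches the C-terminus.
    for i, score in enumerate(list(disorder) + [float("-inf")]):
        if score >= threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_run:
                keep[run_start:i] = [False] * (i - run_start)
            run_start = None
    return "".join(aa for aa, kept in zip(sequence, keep) if kept)


if __name__ == "__main__":
    scores = [0.9, 0.9, 0.9, 0.1, 0.1, 0.9, 0.1, 0.1, 0.9, 0.9]
    print(remove_disordered("ABCDEFGHIJ", scores, min_run=2))
```

Residue numbering changes after trimming, so a real pipeline would also need to keep a map back to the original positions.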

Pages: 142-153.
Citations: 0
Graph_RG: Dominating CASP16's Small Molecule Affinity Prediction Subcategory-A Pose-Free Framework for Billion-Scale Virtual Screening.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-06-20 | DOI: 10.1002/prot.70010
Haiping Zhang

Protein-ligand interaction prediction is pivotal in early-stage drug development, enabling large-scale virtual screening, drug optimization, and reverse target searching. In this work, we present Graph_RG, our top-performing model in the CASP16 small molecule track's protein-ligand affinity prediction category, achieving a N-weighted Kendall's Tau of 0.42-significantly outperforming other submissions (second-best: 0.36). Beyond accuracy, Graph_RG is noncomplex dependent, hence exhibits exceptional computational efficiency, operating > 100 000× faster than conformation-search dependent prediction methods, thus enabling billion- to 10-billion-scale screening on standard servers. We further discuss the potential improvements for Graph_RG, including dataset optimization, atomic vector representation enhancements, and model architecture upgrades. We also introduce the potential broader applications in large-scale drug screening, reverse target identification, and GPCR-specific drug discovery. We also point out the development of an interactive web platform hosting Graph_RG and its derivative models to enhance accessibility. By integrating community feedback and iterative model refinement, this initiative bridges the gap between AI-driven predictions and practical drug discovery, fostering advancements in both computational methodologies and biomedical applications.
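For readers unfamiliar with the ranking metric, the sketch below computes Kendall's Tau for one target and then combines several targets. The abstract does not define "N-weighted"; the code assumes it means an average over targets weighted by the number of ligands per target, which may differ from the official CASP16 definition. Function names and data are hypothetical.

```python
from itertools import combinations


def kendall_tau(predicted, measured):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    balance = 0
    for i, j in combinations(range(n), 2):
        product = (predicted[i] - predicted[j]) * (measured[i] - measured[j])
        if product > 0:
            balance += 1   # pair ordered the same way in both lists
        elif product < 0:
            balance -= 1   # pair ordered oppositely
    return balance / (n * (n - 1) / 2)


def n_weighted_tau(targets):
    """ASSUMPTION: weight each target's tau by its number of ligands.

    targets: list of (predicted, measured) pairs, one per protein target.
    """
    total = sum(len(predicted) for predicted, _ in targets)
    return sum(len(p) * kendall_tau(p, m) for p, m in targets) / total


if __name__ == "__main__":
    print(kendall_tau([1, 2, 3, 4], [1, 2, 4, 3]))
```

Because the metric depends only on rank order, it rewards getting the relative order of affinities right rather than their absolute values.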

Pages: 286-294.
Citations: 0
Structure Modeling Protocols for Protein Multimer and RNA in CASP16 With Enhanced MSAs, Model Ranking, and Deep Learning.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-08-01 | DOI: 10.1002/prot.70033
Yuki Kagaya, Tsukasa Nakamura, Jacob Verburgt, Anika Jain, Genki Terashi, Pranav Punuru, Emilia Tugolukova, Joon Hong Park, Anouka Saha, David Huang, Daisuke Kihara

We present the methods and results of our protein complex and RNA structure predictions at CASP16. Our approach integrated multiple state-of-the-art deep learning models with a consensus-based scoring method. To enhance the depth of multiple sequence alignments (MSAs), we employed a large metagenomic sequence database. Model ranking was performed with a state-of-the-art consensus ranking method, to which we added more scoring terms. These predictions were further refined manually based on literature evidence. For RNA, we adopted an ensemble approach that incorporated multiple state-of-the-art methods, centered around our NuFold framework. As a result, our KiharaLab group ranked first in protein complex prediction and third in RNA structure prediction. A detailed analysis of targets that significantly differed from those of other groups highlighted both the strengths of our MSA and scoring strategies, as well as areas requiring further improvement.
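The general idea behind consensus-based ranking can be sketched as below: each model is scored by its mean structural similarity to every other model in the pool, so models that agree with the majority rise to the top. This is a generic illustration only, not the KiharaLab scoring method, which the abstract says adds further scoring terms. The similarity matrix (for example, pairwise TM-scores) is assumed to be precomputed, and all names are hypothetical.

```python
def consensus_rank(similarity):
    """Rank models by mean similarity to the other models in the pool.

    similarity: square matrix (list of lists); similarity[i][j] is the
    structural similarity between models i and j, higher meaning more alike.
    Returns (order, scores), where order lists model indices best-first.
    """
    n = len(similarity)
    if n < 2 or any(len(row) != n for row in similarity):
        raise ValueError("need a square matrix for at least 2 models")
    scores = []
    for i in range(n):
        # Exclude the diagonal: a model's similarity to itself carries no information.
        others = [similarity[i][j] for j in range(n) if j != i]
        scores.append(sum(others) / len(others))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order, scores


if __name__ == "__main__":
    pool = [
        [1.0, 0.8, 0.2],
        [0.8, 1.0, 0.3],
        [0.2, 0.3, 1.0],
    ]
    print(consensus_rank(pool))
```

Consensus favors the majority conformation, so it can demote a correct model when most of the pool shares the same error. This is one reason to combine it with other scoring terms.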

Pages: 167-182. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12321240/pdf/
Citations: 0
The RNA-Puzzles Assessments of RNA-Only Targets in CASP16.
IF 2.8 | Zone 4, Biology | Q2 Biochemistry & Molecular Biology | Pub Date: 2026-01-01 | Epub Date: 2025-10-03 | DOI: 10.1002/prot.70052
Eric Westhof, Hao Sun, Fan Bu, Zhichao Miao

RNA-Puzzles was launched in 2011 as a collaborative effort dedicated to advancing and improving RNA 3D structure prediction. The automatic evaluation protocols for comparisons between prediction and experiment developed within RNA-Puzzles are applied to the 2024 CASP16 competition. The scores evaluate stereochemical parameters, Watson-Crick pairs, non-Watson-Crick pairs, and base stacking in addition to standard global parameters such as RMSD, TM-score, GDT, or lDDT. Several targets were particularly difficult owing to their size or multimerization. As noted in previous evaluations, although predictions that perform well on secondary structure may also achieve acceptable overall folds, they are insufficient to guarantee chemical precision or to correctly identify residues involved in non-Watson-Crick interactions. Both are essential for obtaining a valid three-dimensional architecture and for understanding the biological function of RNAs.
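
Of the global parameters named in the abstract, RMSD is the simplest to state explicitly. The sketch below is only an illustration of the metric, not the RNA-Puzzles evaluation pipeline: it assumes the predicted and experimental coordinates are already superimposed and matched atom by atom, and the function name `rmsd` is hypothetical.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes both structures are already superimposed; no fitting is done here.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between corresponding atoms.
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(pred, ref)
    )
    return math.sqrt(squared / len(pred))
```

A low RMSD only reflects the overall fold; as the abstract stresses, it does not by itself show that non-Watson-Crick pairs or stacking are reproduced.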

Pages 218-229. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750035/pdf/
Citations: 0
MassiveFold Data for CASP16-CAPRI: A Systematic Massive Sampling Experiment.
IF 2.8 Zone 4 Biology Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY Pub Date: 2026-01-01 Epub Date: 2025-08-28 DOI: 10.1002/prot.70040
Nessim Raouraoua, Marc F Lensink, Guillaume Brysbaert

Massive sampling with AlphaFold2 has become a widely used approach in protein structure prediction. Here we present the MassiveFold CASP16-CAPRI dataset, a systematic, large-scale sampling of both monomeric and multimeric protein targets. By exploiting maximal parallelization, we produced up to 8040 models per target and shared them with the community for collaborative selection and scoring. This collective effort minimizes redundant computation and environmental impact, while granting resource-limited groups - especially those focused on scoring - access to high quality structures. In our analysis, we define an interface-difficulty classification based on DockQ metrics, showing that massive sampling yields the greatest gains on most of the challenging interfaces. Crucially, this classification can be predicted from the median ipTM scores of a routine AF2 run, enabling users to selectively deploy massive sampling only when it is most needed. Combined with a reduction of the massive sampling from 8040 to 2475 predictions, such targeted strategies dramatically cut computation time and resource use with minimal loss of accuracy. Finally, we underscore the persistent challenge of choosing optimal models from massive sampling datasets, emphasizing the need for more robust scoring methods. The MassiveFold datasets, together with AlphaFold ranking scores and CASP and CAPRI assessment metrics, are publicly available at https://github.com/GBLille/CASP16-CAPRI_MassiveFold_Data to accelerate further progress in protein structure prediction and assembly modeling.
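
The targeted strategy described in the abstract, deciding from the median ipTM of a routine run whether massive sampling is worthwhile, could be sketched roughly as follows. The abstract gives no numeric threshold, so `IPTM_CUTOFF` is a purely hypothetical placeholder, as is the function name; the sketch also assumes that a low median ipTM indicates a difficult interface.

```python
from statistics import median

# Hypothetical cutoff: the abstract does not state a numeric threshold.
IPTM_CUTOFF = 0.6


def needs_massive_sampling(iptm_scores, cutoff=IPTM_CUTOFF):
    """Return True when the median ipTM of a routine run falls below the cutoff.

    Assumption: a low median ipTM flags a difficult interface, where the
    abstract reports the largest gains from massive sampling.
    """
    if not iptm_scores:
        raise ValueError("need at least one ipTM score")
    return median(iptm_scores) < cutoff
```

For example, `needs_massive_sampling([0.2, 0.3, 0.9])` evaluates the median 0.3 against the placeholder cutoff and returns `True`; a real cutoff would have to come from the paper's difficulty classification.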

Pages 425-431. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750025/pdf/
Citations: 0
Journal
Proteins-Structure Function and Bioinformatics