Pub Date: 2026-01-01 | Epub Date: 2025-04-30 | DOI: 10.1002/prot.26833
Arne Elofsson, Rachael C Kretsch, Marcin Magnus, Gaetano T Montelione
The Critical Assessment of Structure Prediction (CASP) brings together a diverse group of scientists, from deep learning experts to NMR specialists, all aiming to develop accurate prediction algorithms that can effectively characterize the structural aspects of biomolecules relevant to their functions. Engagement within the CASP community has traditionally been limited to the prediction season and the conference, with little discourse in the 1.5 years between CASP seasons. CASP special interest groups (SIGs) were established in 2023 to encourage continuous dialogue within the community. The online seminar series has drawn global participation from across disciplines and career stages, facilitating cross-disciplinary discussions that foster collaborations. The archives of these seminars have become a vital learning tool for newcomers to the field, lowering the barrier to entry.
Title: Engaging the Community: CASP Special Interest Groups. | Journal: Proteins-Structure Function and Bioinformatics, pp. 432-434 | Impact factor: 2.8 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12353253/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-06-09 | DOI: 10.1002/prot.26856
Sicheng Zhang, Jun Li, Yuanzhe Zhou, Shi-Jie Chen
During the 16th Critical Assessment of Structure Prediction (CASP16), the Vfold team participated in the two RNA categories: RNA Monomers and RNA Multimers. The Vfold RNA structure prediction method is hierarchical and hybrid, incorporating physics-based models (Vfold2D and VfoldMCPX) for 2D structure prediction, and template-based and molecular dynamics simulation-based models (Vfold-Pipeline, IsRNA and RNAJP) for 3D structure prediction. Additionally, Vfold integrates knowledge from templates and the state-of-the-art machine learning model AlphaFold3 into our physics-based models, which enhances the prediction accuracy. Here we describe the Vfold approach in CASP16 using selected targets and show how the integration of traditional structure prediction methods with machine learning models can improve RNA structure prediction accuracy.
Title: Enhancing RNA 3D Structure Prediction in CASP16: Integrating Physics-Based Modeling With Machine Learning for Improved Predictions. | Journal: Proteins-Structure Function and Bioinformatics, pp. 239-248 | Impact factor: 2.8 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12354339/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-08-17 | DOI: 10.1002/prot.70031
Rongqing Yuan, Jing Zhang, Andriy Kryshtafovych, R Dustin Schaeffer, Jian Zhou, Qian Cong, Nick V Grishin
The assessment of monomer targets in the Critical Assessment of Structure Prediction Round 16 (CASP16) underscores that the problem of single-domain protein fold prediction is nearly solved: no target folds were incorrectly predicted across all Evaluation Units. However, challenges remain in accurately modeling truncated sequences, irregular secondary structures, and interaction-induced conformational changes. The release of AlphaFold3 (AF3) during CASP16, and its effective integration by many groups, demonstrated its superiority over AlphaFold2 (AF2), particularly in confidence estimation and model selection. Additional improvements in multiple sequence alignments (MSAs) and fragment-based prediction, that is, selecting the optimal fragment of the full sequence for modeling, also contributed to enhanced prediction accuracy. The top three groups, all from the Yang lab, consistently outperformed others across CASP16 monomer targets, reflecting their robust modeling pipelines and successful adoption of AF3. CASP16 also introduced three new challenges: Phase 0, in which stoichiometry was withheld; Phase 2, which supplied ~8000 MassiveFold models per target to test model selection strategies; and Model 6, which limited predictors to using MSAs provided by the organizers. While we evaluated group performance in these additional challenges, the insights gained were limited due to low participation and caveats in the design of experiments. We suggest improvements for the organization of these challenges and encourage broader engagement from the prediction community. The progress in monomer modeling from CASP15 to CASP16 was subtle, but more groups in CASP16 were able to outperform ColabFold, reflecting the community's improved ability in optimizing AF2 and the growing adoption of AF3. We anticipate that the recent release of the AF3 source code will stimulate future progress through user-driven optimization and innovations in model architecture. Finally, model ranking remains a persistent weakness across most groups, highlighting a critical area for future development.
Title: CASP16 Protein Monomer Structure Prediction Assessment. | Journal: Proteins-Structure Function and Bioinformatics, pp. 86-105 | Impact factor: 2.8 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750037/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-08-22 | DOI: 10.1002/prot.70037
Alisia Fadini, Gabriel Studer, Randy J Read
The CASP16 evaluation of model accuracy (EMA) experiment assessed the ability of predictors to estimate the accuracy of predicted models, with a particular emphasis on multimeric assemblies. Expanding on the CASP15 framework, CASP16 introduced a new evaluation mode (QMODE3) focused on selecting high-quality models from large-scale AlphaFold2-derived model pools generated by MassiveFold. Three primary evaluation tasks were therefore conducted: QMODE1 assessed global structure accuracy, QMODE2 focused on the accuracy of interface residues, and QMODE3 tested model selection performance. Predictors were evaluated using a diverse set of OpenStructure-based metrics, and a novel penalty-based ranking scheme was developed for QMODE3 to handle score interdependence and varying prediction quality distributions. Additionally, we explored the accuracy and utility of predicted local confidence measures now made available on a per-atom basis by methods that invoke AlphaFold3. Results showed that methods incorporating AlphaFold3-derived features, particularly per-atom pLDDT, performed best in estimating local accuracy and in utility for experimental structure solution. For QMODE3, performance varied significantly across monomeric, homomeric, and heteromeric target categories and underscored the ongoing challenge of evaluating complex assemblies.
Title: Model Quality Assessment for CASP16. | Journal: Proteins-Structure Function and Bioinformatics, pp. 302-313 | Impact factor: 2.8 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750031/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-09-01 | DOI: 10.1002/prot.70042
Andriy Kryshtafovych, Maciej Milostan, Marc F Lensink, Sameer Velankar, Alexandre M J J Bonvin, John Moult, Krzysztof Fidelis
CASP (critical assessment of structure prediction) conducts community experiments to determine the state of the art in calculating macromolecular structures. The CASP data management system is continually evolving to address the changing needs of the experiments. For CASP16, we expanded the infrastructure to enable data handling of newly introduced categories and fully support pilot categories introduced in CASP15. This technical note also documents the integration of the CASP and CAPRI (Critical Assessment of PRedicted Interactions) systems.
Title: Updates to the CASP Infrastructure in 2024. | Journal: Proteins-Structure Function and Bioinformatics, pp. 15-24 | Impact factor: 2.8 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12422709/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-08-05 | DOI: 10.1002/prot.70030
Wenkai Wang, Yuxian Luo, Zhenling Peng, Jianyi Yang
Biomolecular structure prediction has reached an unprecedented level of accuracy, partly attributed to the use of advanced deep learning algorithms. We participated in the CASP16 experiments across the categories of protein domains, protein multimers, and RNA monomers, achieving official rankings of first, second, and fourth (top for server groups), respectively. We hypothesized that by leveraging state-of-the-art structure predictors such as AlphaFold2, AlphaFold3, trRosettaX2, and trRosettaRNA2, accurate structure predictions could be achieved through careful optimization of input information. For protein structure prediction, we enhanced the input sequences by removing intrinsically disordered regions, a simple yet effective approach that yielded accurate models for protein domains. However, fewer than 25% of the protein multimers were predicted with high quality. In RNA structure prediction, optimizing the secondary structure input for trRosettaRNA2 resulted in more accurate predictions than AlphaFold3. In summary, our prediction results in CASP16 indicate that protein domain structure prediction has achieved high accuracy. However, predicting protein multimers and RNA structures remains challenging, and we anticipate new advancements in these areas in the coming years.
Title: Accurate Biomolecular Structure Prediction in CASP16 With Optimized Inputs to State-Of-The-Art Predictors. | Journal: Proteins-Structure Function and Bioinformatics, pp. 142-153 | Impact factor: 2.8
Pub Date: 2026-01-01 | Epub Date: 2025-06-20 | DOI: 10.1002/prot.70010
Haiping Zhang
Protein-ligand interaction prediction is pivotal in early-stage drug development, enabling large-scale virtual screening, drug optimization, and reverse target searching. In this work, we present Graph_RG, our top-performing model in the CASP16 small molecule track's protein-ligand affinity prediction category, achieving an N-weighted Kendall's Tau of 0.42 and significantly outperforming other submissions (second-best: 0.36). Beyond accuracy, Graph_RG does not depend on a modeled protein-ligand complex and therefore exhibits exceptional computational efficiency, operating > 100 000× faster than conformation-search dependent prediction methods, thus enabling billion- to 10-billion-scale screening on standard servers. We further discuss potential improvements for Graph_RG, including dataset optimization, atomic vector representation enhancements, and model architecture upgrades. We also outline potential broader applications in large-scale drug screening, reverse target identification, and GPCR-specific drug discovery, and describe the development of an interactive web platform hosting Graph_RG and its derivative models to enhance accessibility. By integrating community feedback and iterative model refinement, this initiative bridges the gap between AI-driven predictions and practical drug discovery, fostering advancements in both computational methodologies and biomedical applications.
Title: Graph_RG: Dominating CASP16's Small Molecule Affinity Prediction Subcategory-A Pose-Free Framework for Billion-Scale Virtual Screening. | Journal: Proteins-Structure Function and Bioinformatics, pp. 286-294 | Impact factor: 2.8
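The Graph_RG abstract above reports an N-weighted Kendall's Tau but does not define it. The following is a minimal sketch of one plausible reading, assuming that a rank correlation is computed per target and then averaged with each target weighted by its number of ligands N. The function names `kendall_tau` and `n_weighted_tau` are hypothetical, tied pairs simply count as neither concordant nor discordant (tau-a), and the official CASP16 definition may differ.

```python
from itertools import combinations


def kendall_tau(predicted, measured):
    """Kendall's tau-a: (concordant - discordant) / number of ligand pairs."""
    pairs = list(combinations(range(len(predicted)), 2))
    if not pairs:
        raise ValueError("need at least two ligands")
    score = 0
    for i, j in pairs:
        # A pair is concordant when both rankings order it the same way.
        product = (predicted[i] - predicted[j]) * (measured[i] - measured[j])
        if product > 0:
            score += 1
        elif product < 0:
            score -= 1
        # product == 0 is a tie and contributes nothing (assumption).
    return score / len(pairs)


def n_weighted_tau(per_target):
    """Average per-target tau, weighting each target by its ligand count N.

    per_target is a list of (predicted_affinities, measured_affinities) tuples.
    """
    total = sum(len(pred) for pred, _ in per_target)
    weighted = sum(len(pred) * kendall_tau(pred, meas) for pred, meas in per_target)
    return weighted / total
```

Under this reading, a target with many ligands influences the final score more than a target with only a few, which keeps small targets from dominating the ranking.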
Pub Date: 2026-01-01 | Epub Date: 2025-08-01 | DOI: 10.1002/prot.70033
Yuki Kagaya, Tsukasa Nakamura, Jacob Verburgt, Anika Jain, Genki Terashi, Pranav Punuru, Emilia Tugolukova, Joon Hong Park, Anouka Saha, David Huang, Daisuke Kihara
We present the methods and results of our protein complex and RNA structure predictions at CASP16. Our approach integrated multiple state-of-the-art deep learning models with a consensus-based scoring method. To enhance the depth of multiple sequence alignments (MSAs), we employed a large metagenomic sequence database. Model ranking was performed with a state-of-the-art consensus ranking method, to which we added more scoring terms. These predictions were further refined manually based on literature evidence. For RNA, we adopted an ensemble approach that incorporated multiple state-of-the-art methods, centered around our NuFold framework. As a result, our KiharaLab group ranked first in protein complex prediction and third in RNA structure prediction. A detailed analysis of targets for which our results differed significantly from those of other groups highlighted both the strengths of our MSA and scoring strategies and areas requiring further improvement.
Title: Structure Modeling Protocols for Protein Multimer and RNA in CASP16 With Enhanced MSAs, Model Ranking, and Deep Learning. | Journal: Proteins-Structure Function and Bioinformatics, pp. 167-182 | Impact factor: 2.8 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12321240/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-10-03 | DOI: 10.1002/prot.70052
Eric Westhof, Hao Sun, Fan Bu, Zhichao Miao
RNA-Puzzles was launched in 2011 as a collaborative effort dedicated to advancing and improving RNA 3D structure prediction. The automatic evaluation protocols for comparisons between prediction and experiment developed within RNA-Puzzles are applied to the 2024 CASP16 competition. The scores evaluate stereochemical parameters, Watson-Crick pairs, non-Watson-Crick pairs, and base stacking in addition to standard global parameters such as RMSD, TM-score, GDT, or lDDT. Several targets were particularly difficult owing to their size or multimerization. As noted in previous evaluations, although predictions that perform well on secondary structure may also achieve acceptable overall folds, they are insufficient to guarantee chemical precision or to correctly identify residues involved in non-Watson-Crick interactions. Both are essential for obtaining a valid three-dimensional architecture and for understanding the biological function of RNAs.
Title: The RNA-Puzzles Assessments of RNA-Only Targets in CASP16. | Journal: Proteins-Structure Function and Bioinformatics, pp. 218-229 | Impact factor: 2.8 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750035/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-08-28 | DOI: 10.1002/prot.70040
Nessim Raouraoua, Marc F Lensink, Guillaume Brysbaert
Massive sampling with AlphaFold2 has become a widely used approach in protein structure prediction. Here we present the MassiveFold CASP16-CAPRI dataset, a systematic, large-scale sampling of both monomeric and multimeric protein targets. By exploiting maximal parallelization, we produced up to 8040 models per target and shared them with the community for collaborative selection and scoring. This collective effort minimizes redundant computation and environmental impact, while granting resource-limited groups, especially those focused on scoring, access to high-quality structures. In our analysis, we define an interface-difficulty classification based on DockQ metrics, showing that massive sampling yields the greatest gains on most of the challenging interfaces. Crucially, this classification can be predicted from the median ipTM scores of a routine AF2 run, enabling users to selectively deploy massive sampling only when it is most needed. Combined with a reduction of the massive sampling from 8040 to 2475 predictions, such targeted strategies dramatically cut computation time and resource use with minimal loss of accuracy. Finally, we underscore the persistent challenge of choosing optimal models from massive sampling datasets, emphasizing the need for more robust scoring methods. The MassiveFold datasets, together with AlphaFold ranking scores and CASP and CAPRI assessment metrics, are publicly available at https://github.com/GBLille/CASP16-CAPRI_MassiveFold_Data to accelerate further progress in protein structure prediction and assembly modeling.
Title: MassiveFold Data for CASP16-CAPRI: A Systematic Massive Sampling Experiment. | Journal: Proteins-Structure Function and Bioinformatics, pp. 425-431 | Impact factor: 2.8 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750025/pdf/
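The MassiveFold abstract above notes that interface difficulty can be predicted from the median ipTM scores of a routine AF2 run, so that massive sampling is deployed only when needed. That decision rule might be sketched as follows. The function name `needs_massive_sampling` and the 0.6 cutoff are hypothetical placeholders, not values taken from the paper; a real cutoff would have to come from the authors' DockQ-based classification.

```python
from statistics import median


def needs_massive_sampling(iptm_scores, threshold=0.6):
    """Flag a target for massive sampling when its routine-run ipTM is low.

    iptm_scores: ipTM values of the models from a routine AF2 run.
    threshold:   hypothetical cutoff below which the interface is treated as hard.
    """
    if not iptm_scores:
        raise ValueError("no ipTM scores provided")
    return median(iptm_scores) < threshold


# Example: one routine run with mostly low-confidence interfaces.
routine_run = [0.21, 0.35, 0.30, 0.72, 0.28]
print(needs_massive_sampling(routine_run))
```

Using the median rather than the maximum keeps a single overconfident model from masking an otherwise difficult interface.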