Pub Date: 2026-01-01 | Epub Date: 2025-12-11 | DOI: 10.1002/prot.70099
Thomas Mulvaney, Andriy Kryshtafovych, Maya Topf
Since CASP13, experimentalists have been encouraged to provide their cryo-EM data along with the derived atomic models to the CASP organizers to aid assessment. In CASP16, 38 cryo-EM datasets were provided for assessment, which represented most cryo-EM targets. The corresponding targets typically comprised a single derived atomic structure; however, that model may be only one of several valid conformations. Flexibility often manifests as low-resolution regions in a cryo-EM reconstruction, particularly in RNA but often also in protein complexes. We show that local resolution in the reconstruction correlates well with the root-mean-square fluctuations (RMSF) of residues of accurate CASP predictions. The correlation between the local resolution and pLDDT was less clear, especially when mobile domains were present. When the resolution allowed, assessment of features such as sidechains, using our variant of SMOC with local fragment alignment, indicated that even high-quality predictions have room for improvement; on the other hand, some predictions fitted the density better in specific regions, indicating modeling discrepancies in the target. In one extreme case, a submitted target had regions of low resolution that limited unambiguous model building. In such cases, part of the target structure is essentially a prediction itself, with implications for the assessment. Experimental data remain essential for model-free assessment of predictions and offer unique analyses such as comparisons to local resolution and thus flexibility.
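The comparison between local resolution and RMSF described in this abstract can be illustrated with a minimal sketch. It assumes the predicted models are already superposed on a common frame and are reduced to one coordinate per residue; the function names, the toy coordinates, and the local-resolution values are hypothetical and are not taken from the paper, whose actual pipeline is not described here.

```python
import math

def rmsf_per_residue(models):
    """Per-residue RMSF over an ensemble of models.

    models: list of models, each a list of (x, y, z) tuples, one per residue.
    Assumption: all models are already superposed on a common frame.
    """
    n_models = len(models)
    n_res = len(models[0])
    result = []
    for i in range(n_res):
        # Mean position of residue i across the ensemble.
        mean = [sum(m[i][d] for m in models) / n_models for d in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((m[i][d] - mean[d]) ** 2 for d in range(3)) for m in models
        ) / n_models
        result.append(math.sqrt(msd))
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

# Toy ensemble: two models, three residues (hypothetical coordinates in Å).
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0)],
]
# Hypothetical local resolution (Å) at each residue of the reconstruction.
local_resolution = [2.5, 3.5, 4.5]

rmsf = rmsf_per_residue(models)
r = pearson(rmsf, local_resolution)
```

In this toy case the residues that move most between models sit in the worst-resolved regions, so the correlation is high; real data would need a superposition step and a local-resolution map sampled at residue positions.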
Title: Cryo-EM Analysis in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 447-459. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750033/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-08-14 | DOI: 10.1002/prot.70035
Alisia Fadini, Recep Adiyaman, Shaima N Alhaddad, Behnosh Behzadi, Jianlin Cheng, Xinyue Cui, Nicholas S Edmunds, Lydia Freddolino, Ahmet G Genc, Fang Liang, Dong Liu, Jian Liu, Quancheng Liu, Liam J McGuffin, Pawan Neupane, Chunxiang Peng, David R Shortle, Meng Sun, Haodong Wang, Qiqige Wuyun, Guijun Zhang, Xuanfeng Zhao, Wei Zheng, Randy J Read
Model quality assessment (MQA) remains a critical component of structural bioinformatics for both structure predictors and experimentalists seeking to use predictions for downstream applications. In CASP16, the Evaluation of Model Accuracy (EMA) category featured both global and local quality estimation for multimeric assemblies (QMODE1 and QMODE2), as well as a novel QMODE3 challenge that required predictors to identify the best five models from thousands generated by MassiveFold. This paper presents detailed results from several leading CASP16 EMA methods, highlighting the strengths and limitations of the approaches.
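The QMODE3 task of picking five models out of a large pool reduces, at its simplest, to a top-k selection over predicted quality scores. The sketch below shows only that final selection step; the model names and scores are hypothetical, and how each EMA method produces its scores is outside the scope of this example.

```python
import heapq

def select_top_models(predicted_scores, k=5):
    """Return the k model names with the highest predicted quality score."""
    return heapq.nlargest(k, predicted_scores, key=predicted_scores.get)

# Hypothetical predicted global quality scores for a small model pool.
scores = {
    "model_a": 0.71, "model_b": 0.64, "model_c": 0.88, "model_d": 0.52,
    "model_e": 0.79, "model_f": 0.83, "model_g": 0.60,
}
top5 = select_top_models(scores)
```

`heapq.nlargest` avoids sorting the whole pool, which matters little for seven models but helps when the pool holds thousands.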
Title: Highlights of Model Quality Assessment in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 314-329. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750036/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-06-02 | DOI: 10.1002/prot.26850
Jian Liu, Pawan Neupane, Jianlin Cheng
With AlphaFold achieving high-accuracy tertiary structure prediction for most single-chain proteins (monomers), the next major challenge in protein structure prediction is to accurately model multichain protein complexes (multimers). We developed MULTICOM4, the latest version of the MULTICOM system, to improve protein complex structure prediction by integrating transformer-based AlphaFold2, diffusion model-based AlphaFold3, and our in-house techniques. These include protein complex stoichiometry prediction, diverse multiple sequence alignment (MSA) generation leveraging both sequence and structure comparison, modeling exception handling, and deep learning-based protein model quality assessment. MULTICOM4 was blindly evaluated in the 16th Critical Assessment of Techniques for Protein Structure Prediction (CASP16) in 2024. In Phase 0 of CASP16, where stoichiometry information was unavailable, MULTICOM predictors performed best, with MULTICOM_human achieving a TM-score of 0.752 and a DockQ score of 0.584 for top-ranked predictions on average. In Phase 1 of CASP16, with stoichiometry information provided, MULTICOM_human remained among the top predictors, attaining a TM-score of 0.797 and a DockQ score of 0.558 on average. The CASP16 results demonstrate that integrating complementary AlphaFold2 and AlphaFold3 with enhanced MSA inputs, comprehensive model ranking, exception handling, and accurate stoichiometry prediction can effectively improve protein complex structure prediction.
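For readers unfamiliar with the TM-score values quoted above, the following sketch evaluates the standard TM-score sum for one fixed superposition, using the usual length-dependent scale d0 = 1.24 (L - 15)^(1/3) - 1.8. It is a simplified illustration: the full TM-score maximizes this sum over superpositions, which is not done here, and the 0.5 Å floor on d0 for short chains is an assumption of this sketch. The function names are hypothetical.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (Å) used by TM-score.

    Assumption of this sketch: d0 is floored at 0.5 Å for short chains.
    """
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)

def tm_score_fixed_superposition(distances, target_length):
    """TM-score sum for a given superposition.

    distances: distances (Å) between aligned residue pairs.
    target_length: number of residues in the reference structure.
    The real TM-score is the maximum of this value over superpositions.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

length = 100
perfect = tm_score_fixed_superposition([0.0] * length, length)
# Every aligned pair exactly d0 apart contributes 1/2 to the sum.
half = tm_score_fixed_superposition([tm_d0(length)] * length, length)
```

Because each term is bounded by 1 and the sum is normalized by the reference length, unaligned residues lower the score, which is why TM-score penalizes incomplete models.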
Title: Improving AlphaFold2- and AlphaFold3-Based Protein Complex Structure Prediction With MULTICOM4 in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 131-141. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12354153/pdf/
Ligand binding prediction is a critical component of structure-based drug design and has gained prominence in the Critical Assessment of protein Structure Prediction (CASP) since its introduction in CASP15. In CASP16, the challenges expanded to include protein-ligand and nucleic acid-ligand binding predictions, alongside binding affinity ranking, posing greater computational and methodological demands. This study presents a prediction strategy combining template-based docking, multiple receptor conformations, and AI-driven scoring to address these challenges. For protein-ligand systems (L1000-L4000), we leveraged structural templates from the PDB, ligand similarity analysis, and tools like CoDock-Ligand and AutoDock Vina to predict binding poses. Key successes included accurate predictions for targets like SARS-CoV-2 Mpro (L4000) and Autotaxin (L3000), though challenges persisted with binding site flexibility and pose ranking. Ligand pose prediction achieved satisfactory results, with more than 66% of predictions having an RMSD below 3 Å. Nucleic acid-ligand predictions (e.g., ZTP riboswitch) yielded mixed results, highlighting limitations in RNA/DNA structural accuracy. Affinity prediction employed diverse methods, with machine learning-based SVR_Conjoint outperforming physics-based approaches (Kendall's Tau = 0.43). Our strategy demonstrated robustness in CASP16, yet underscored the need for advancements in handling conformational dynamics and scoring accuracy. This work provides a framework for future ligand binding prediction efforts in computational drug discovery.
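The two summary statistics quoted in this abstract, the fraction of poses under an RMSD cutoff and Kendall's tau for affinity ranking, can be computed as in the sketch below. It uses the simple tau-a variant with no correction for ties, which may differ from the variant used in the official assessment; all input values are hypothetical.

```python
def fraction_below(rmsds, cutoff=3.0):
    """Fraction of pose RMSD values (Å) strictly below the cutoff."""
    return sum(1 for r in rmsds if r < cutoff) / len(rmsds)

def kendall_tau_a(predicted, measured):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Assumption of this sketch: tied pairs count as neither concordant
    nor discordant, and no tie correction is applied.
    """
    n = len(predicted)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (predicted[i] - predicted[j]) * (measured[i] - measured[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical pose RMSDs (Å) and affinity rankings for four ligands.
pose_rmsds = [1.2, 2.9, 3.5, 0.8]
predicted_rank = [1, 2, 3, 4]
measured_rank = [1, 3, 2, 4]

success_rate = fraction_below(pose_rmsds)
tau = kendall_tau_a(predicted_rank, measured_rank)
```

With one swapped pair out of six, tau is 4/6; a tau of 0.43 therefore indicates a ranking that is clearly better than random but still misorders a substantial share of ligand pairs.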
Title: Ligand Binding Prediction on Pharmaceutical and Nucleic Acid Targets by the CoDock Group in CASP16. Authors: Ren Kong, Zunyun Jiang, Xufeng Lu, Liangxu Xie, Shan Chang. Journal: Proteins-Structure Function and Bioinformatics, pages 276-285. Pub Date: 2026-01-01 | DOI: 10.1002/prot.70032
Pub Date: 2026-01-01 | Epub Date: 2025-08-25 | DOI: 10.1002/prot.70044
Arne Elofsson
The CASP16 experiment provided the first opportunity to benchmark AlphaFold3. In contrast to AlphaFold2, AlphaFold3 can predict the structure of non-protein molecules. According to the benchmark presented by the developers, it is expected to perform slightly better than AlphaFold2 for proteins. In this study, we assess the performance of AlphaFold3 using both automatic server submissions (AF3-server) and manual predictions from the Elofsson group (Elofsson). All predictions were generated via the AlphaFold3 web server, with manual interventions applied to large targets and ligands. Compared to AlphaFold2-based methods, we found that AlphaFold3 performs slightly better for protein complexes. However, when massive sampling is applied to AlphaFold2, the difference disappears. It was also noted that, according to the official ranking from CASP, the AF3-server performs better than AlphaFold2 for easier targets, but not for harder targets. Furthermore, the performance of the AF3-server is comparable to the best methods when considering the top-ranked predictions, but slightly behind when examining the best among the five submitted models. There are targets where the top-ranked AF3-server model is worse than its lower-ranked models, indicating that an avenue for progress could be to develop better strategies for identifying the best of the generated models. When using AF3-server to predict the stoichiometry of larger protein complexes, the accuracy is limited, especially for heteromeric targets. When analyzing the predictions including nucleic acids, it was found that, in general, the accuracy is relatively low. However, the AF3-server performance was not far behind that of the top-ranked method. In summary, AF3-server offers a user-friendly tool that provides predictions comparable to state-of-the-art methods in all categories of CASP.
Title: AlphaFold3 at CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 154-166. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750027/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-10-06 | DOI: 10.1002/prot.70063
Keqiong Zhang, Qilong Wu, Sheng-You Huang
In the 15th Critical Assessment of Techniques for Structure Prediction (CASP15), the category of protein-ligand complexes was introduced to advance the development of protein-ligand structure prediction techniques. CASP16 further expanded this category by introducing four sets of pharmaceutical targets as super-targets. Each super-target consists of multiple protein-ligand complexes involving the same protein but different ligands. Given the outstanding performance of template-based methods in CASP15, we employed a template-guided ensemble docking strategy for ligand (LG) tasks in CASP16. MODELLER, AlphaFold3, and AlphaFold-Multimer were used to generate structural ensembles for each target protein. Then, we searched the Protein Data Bank (PDB) for reliable template complexes based on sequence identity, ligand similarity, and maximum common substructure (MCS) coverage score. If templates were identified, we used LSalign to perform ligand 3D alignment. For targets without a template, XDock and MDock were used to predict the binding poses. Finally, a knowledge-based scoring function, ITScore, was employed for energy evaluation. Our method performed well in the CASP16 LG tasks, ranking 4th out of 38 participating teams.
Title: Protein-Ligand Structure Prediction by Template-Guided Ensemble Docking Strategy. Journal: Proteins-Structure Function and Bioinformatics, pages 267-275.
Pub Date: 2026-01-01 | Epub Date: 2025-09-28 | DOI: 10.1002/prot.70060
Xavier Robin, Peter Škrinjar, Andrew M Waterhouse, Gabriel Studer, Gerardo Tauriello, Janani Durairaj, Torsten Schwede
Independent, blind assessment of structure prediction methods is essential for establishing state-of-the-art performance, identifying limitations, and guiding future developments. The Continuous Automated Model EvaluatiOn (CAMEO) platform provides weekly, automated benchmarking of structure prediction servers, complementing the biennial Critical Assessment of Structure Prediction (CASP) experiments.
Title: Beyond Single Chains: Benchmarking Macromolecular Complex Prediction Methods With the Continuous Automated Model EvaluatiOn (CAMEO). Journal: Proteins-Structure Function and Bioinformatics, pages 403-413. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750032/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-10-20 | DOI: 10.1002/prot.70066
Ryota Ashizawa, Sergei Kotelnikov, Omeir Khan, Stan Xiaogang Li, Ernest Glukhov, Xin Cao, Maria Lazou, Ayse Bekar-Cesaretli, Derara Hailegeorgis, Veranika Averkava, Yimin Zhu, George Jones, Hao Yu, Dmytro Kalitin, Darya Stepanenko, Kushal Koirala, Taras Patsahan, Dmitri Beglov, Mark Lukin, Diane Joseph-McCarthy, Carlos Simmerling, Alexander Tropsha, Evangelos Coutsias, Ken A Dill, Dzmitry Padhorny, Sandor Vajda, Dima Kozakov
In the CASP16 experiment, our team employed hybrid computational strategies to predict both protein-protein and protein-ligand complex structures. For protein-protein docking, we combined physics-based sampling (using ClusPro FFT docking and molecular dynamics) with AlphaFold (AF)-based sampling, followed by AF-based refinement. Our method produced numerous high-accuracy complex models, including cases where AF alone failed, underscoring the critical role of physics-based sampling alongside deep learning-based refinement. For protein-ligand docking, we integrated the ClusPro LigTBM template-based approach with a machine learning-based confidence model for rescoring. The method preserves conserved interaction fragments derived from homologous complexes, followed by local resampling using physics-based sampling and a diffusion model. Our template-based strategy achieved a mean lDDT-PLI of 0.69 across 233 targets, which was highly competitive. These results demonstrate that combining physics-based modeling with AI-driven refinement can significantly enhance the accuracy of both protein-protein and protein-ligand structure predictions.
Title: Modeling Protein-Protein and Protein-Ligand Interactions by the ClusPro Team in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 183-191. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750026/pdf/
Pub Date: 2026-01-01 | Epub Date: 2025-10-28 | DOI: 10.1002/prot.70065
Namita Dube, Theresa A Ramelot, Tiburon L Benavides, Yuanpeng J Huang, John Moult, Andriy Kryshtafovych, Gaetano T Montelione
The CASP16 Ensemble Prediction experiment assessed advances in methods for modeling proteins, nucleic acids, and their complexes in multiple conformational states. Targets included systems with experimental structures determined in two or three states, evaluated by direct comparison to experimental coordinates, as well as domain-linker-domain (D-L-D) targets assessed against statistical models generated from NMR and SAXS data. This paper focuses on the former class of multi-state targets. Ten ensembles were released as community challenges, including ligand-induced conformational changes, protein-DNA complexes, a trimeric protein, a stem-loop RNA, and multiple oligomeric states of a single RNA. For five targets, some groups produced reasonably accurate models of both reference states (best TM-score > 0.75). However, with the exception of one protein-ligand complex (T1214), where an apo structure was available as a template, predictors generally failed to capture key structural details distinguishing the states. Overall, accuracy was significantly lower than for single-state targets in other CASP experiments. The most successful approaches generated multiple AlphaFold2 models using enhanced multiple sequence alignments and sampling protocols, followed by model quality-based selection. Although the AlphaFold3 server performed well on several targets, individual groups outperformed it in specific cases. By contrast, predictions for one protein-DNA complex, three RNA targets, and multiple oligomeric RNA states consistently fell short (TM-score < 0.75). These results highlight both progress and persistent challenges in multi-state prediction. Despite recent advances, accurate modeling of conformational ensembles, particularly RNA and large multimeric assemblies, remains an important frontier for structural biology.
Title: Modeling Alternative Conformational States in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 330-347.
Pub Date: 2026-01-01; Epub Date: 2025-11-08; DOI: 10.1002/prot.70079
Rachael C Kretsch, Elisa Posani, Eugene F Baulin, Janusz M Bujnicki, Giovanni Bussi, Thomas E Cheatham, Shi-Jie Chen, Arne Elofsson, Masoud Amiri Farsani, Olivia N Fisher, M Michael Gromiha, Ayush Gupta, Michiaki Hamada, K Harini, Gang Hu, David Huang, Junichi Iwakiri, Anika Jain, Yuki Kagaya, Daisuke Kihara, Sebastian Kmiecik, Sowmya Ramaswamy Krishnan, Ikuo Kurisaki, Olivier Languin-Cattoën, Jun Li, Shanshan Li, Karim Malekzadeh, Tsukasa Nakamura, Wentao Ni, Chandran Nithin, Michael Z Palo, Joon Hong Park, Smita P Pilla, Simón Poblete, Fabrizio Pucci, Pranav Punuru, Anouka Saha, Kengo Sato, Ambuj Srivastava, Genki Terashi, Emilia Tugolukova, Jacob Verburgt, Qiqige Wuyun, Gül H Zerze, Kaiming Zhang, Sicheng Zhang, Wei Zheng, Yuanzhe Zhou, Wah Chiu, David A Case, Rhiju Das
Biomolecules rely on water and ions for stable folding, but these interactions are often transient, dynamic, or disordered and thus hidden from experiments and evaluation challenges that represent biomolecules as single, ordered structures. Here, we compare blindly predicted ensembles of water and ion structure to the cryo-EM densities observed around the Tetrahymena ribozyme at 2.2-2.3 Å resolution, collected through target R1260 in the CASP16 competition. Twenty-six groups participated in this solvation "cryo-ensemble" prediction challenge, submitting over 350 million atoms in total, offering the first opportunity to compare blind predictions of dynamic solvent shell ensembles to cryo-EM density. Predicted atomic ensembles were converted to density through local alignment and these densities were compared to the cryo-EM densities using Pearson correlation, Spearman correlation, mutual information, and precision-recall curves. These predictions show that an ensemble representation is able to capture information of transient or dynamic water and ions better than traditional atomic models, but there remains a large accuracy gap to the performance ceiling set by experimental uncertainty. Overall, molecular dynamics approaches best matched the cryo-EM density, with blind predictions from bussilab_plain_md, SoutheRNA, bussilab_replex, coogs2, and coogs3 outperforming the baseline molecular dynamics prediction. This study indicates that simulations of water and ions can be quantitatively evaluated with cryo-EM maps. We propose that further community-wide blind challenges can drive and evaluate progress in modeling water, ions, and other previously hidden components of biomolecular systems.
Blind Prediction of Complex Water and Ion Ensembles Around RNA in CASP16. Proteins-Structure Function and Bioinformatics, pages 381-402. Publication date: 2026-01-01. DOI: 10.1002/prot.70079.
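The abstract above names four ways of comparing a predicted density to a cryo-EM density: Pearson correlation, Spearman correlation, mutual information, and precision-recall curves. The following is a minimal sketch of those four measures, not the authors' pipeline. It assumes both densities have already been aligned, sampled on the same grid, and flattened into equal-length lists of voxel values. All function names, the equal-width binning used for mutual information, and the thresholds are illustrative assumptions.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of voxel values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


def mutual_information(x, y, bins=4):
    """Mutual information (nats) after equal-width binning (an assumption)."""
    def digitize(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0  # guard against constant input
        return [min(int((a - lo) / width), bins - 1) for a in v]

    bx, by = digitize(x), digitize(y)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in joint.items()
    )


def precision_recall(pred, ref, ref_threshold, pred_thresholds):
    """(threshold, precision, recall) for each threshold on the prediction.

    Voxels with reference density >= ref_threshold count as true signal.
    """
    truth = [r >= ref_threshold for r in ref]
    curve = []
    for cutoff in pred_thresholds:
        called = [p >= cutoff for p in pred]
        tp = sum(1 for c, t in zip(called, truth) if c and t)
        fp = sum(1 for c, t in zip(called, truth) if c and not t)
        fn = sum(1 for c, t in zip(called, truth) if not c and t)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        curve.append((cutoff, precision, recall))
    return curve


# Toy flattened densities (made-up values, for illustration only).
predicted = [0.0, 0.1, 0.4, 0.9, 0.2, 0.8]
reference = [0.0, 0.2, 0.8, 1.8, 0.4, 1.6]

print(pearson(predicted, reference))
print(spearman(predicted, reference))
print(mutual_information(predicted, reference))
print(precision_recall(predicted, reference, 1.0, [0.3, 0.85]))
```

Pearson correlation is sensitive to the linear scaling of density values, whereas Spearman correlation depends only on their ordering, which is one reason to report both. In practice a library such as NumPy or SciPy would replace these pure-Python loops for maps of realistic size.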