Pub Date: 2026-01-01; Epub Date: 2025-04-08; DOI: 10.1002/prot.26827
Alex Morehead, Jian Liu, Pawan Neupane, Nabin Giri, Jianlin Cheng
Predicting the structure of ligands bound to proteins is a foundational problem in modern biotechnology and drug discovery, yet little is known about how to combine the predictions of protein-ligand structure (poses) produced by the latest deep learning methods to identify the best poses and how to accurately estimate the binding affinity between a protein target and a list of ligand candidates. Further, a blind benchmarking and assessment of protein-ligand structure and binding affinity prediction is necessary to ensure it generalizes well to new settings. Towards this end, we introduce MULTICOM_ligand, a deep learning-based protein-ligand structure and binding affinity prediction ensemble featuring structural consensus ranking for unsupervised pose ranking and a new deep generative flow matching model for joint structure and binding affinity prediction. Notably, MULTICOM_ligand ranked among the top-5 ligand prediction methods in both protein-ligand structure prediction and binding affinity prediction in the 16th Critical Assessment of Techniques for Structure Prediction (CASP16), demonstrating its efficacy and utility for real-world drug discovery efforts. The source code for MULTICOM_ligand is freely available on GitHub.
Title: Protein-Ligand Structure and Affinity Prediction in CASP16 Using a Geometric Deep Learning Ensemble and Flow Matching. Journal: Proteins-Structure Function and Bioinformatics, pages 295-301. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12353915/pdf/
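The structural consensus ranking mentioned in the MULTICOM_ligand abstract can be illustrated with a minimal sketch. It assumes that all candidate poses share the same atom ordering and sit in the same receptor frame, and it scores each pose by its mean RMSD to every other pose. This is a generic consensus scheme written for illustration; the function names are hypothetical and it is not taken from the MULTICOM_ligand code.

```python
import math


def rmsd(pose_a, pose_b):
    """RMSD between two poses with identical atom ordering (no superposition)."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


def consensus_rank(poses):
    """Rank poses by mean RMSD to all other poses (lower means more consensus).

    Returns (order, scores): pose indices from most to least consensual,
    and the per-pose mean RMSD values.
    """
    n = len(poses)
    scores = []
    for i in range(n):
        others = [rmsd(poses[i], poses[j]) for j in range(n) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    order = sorted(range(n), key=lambda i: scores[i])
    return order, scores


if __name__ == "__main__":
    # Toy two-atom poses: the first two nearly agree, the third is an outlier.
    poses = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
        [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)],
    ]
    order, scores = consensus_rank(poses)
    print(order, [round(s, 2) for s in scores])
```

The ranking is unsupervised in the sense that it needs no trained scoring function, only agreement among the predicted poses.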
Pub Date: 2026-01-01; Epub Date: 2025-10-15; DOI: 10.1002/prot.70078
Luciano A Abriata, Matteo Dal Peraro
The 16th Critical Assessment of Structure Prediction (CASP16) benchmarked advancements in biomolecular modeling, particularly in the context of the AlphaFold 2 (AF2) and AlphaFold 3 (AF3) systems. Protein monomer and domain prediction is largely solved, with barely any room for further improvement at the backbone level except for very specific details, irregular secondary structures, and mutational effects, which remain challenging to predict. For protein assemblies, AF-based methods, especially when expertly guided or enhanced by servers such as those from the Yang, Zheng/Zhang, and Cheng labs, show progress, though complex topologies and, in particular, antibody-antigen interactions are still difficult. Notably, a priori knowledge of stoichiometry significantly aids assembly prediction. Protein-ligand co-folding with AF3 demonstrated strong potential for pose prediction, outperforming many participants and some dedicated docking tools in baseline tests, but several caveats hold, as discussed. Ligand affinity prediction is totally unreliable. Nucleic acid structure prediction lags considerably, relying heavily on 3D templates and expert human intervention, with even AF3 showing substantial limitations. Overall, on all fronts, AF3's modeling capabilities are at or close to the state of the art; additionally, it shows slight improvements over AF2 and provides more detailed confidence metrics. We guide users on tool selection, realistic accuracy expectations, and persistent challenges, emphasizing the critical role of confidence metrics in interpreting AI-generated models.
Title: Practical Outcomes From CASP16 for Users in Need of Biomolecular Structure Prediction. Journal: Proteins-Structure Function and Bioinformatics, pages 435-446. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750028/pdf/
Pub Date: 2026-01-01; Epub Date: 2025-10-31; DOI: 10.1002/prot.70068
Jing Zhang, Rongqing Yuan, Andriy Kryshtafovych, Jimin Pei, Rachael C Kretsch, R Dustin Schaeffer, Jian Zhou, Rhiju Das, Nick V Grishin, Qian Cong
The assessment of oligomer targets in the Critical Assessment of Structure Prediction Round 16 (CASP16) suggests that complex structure prediction remains an unsolved challenge. Even the leading groups can only predict slightly more than half of the targets to high accuracy. Most CASP16 groups relied on AlphaFold-Multimer (AFM) or AlphaFold3 (AF3) as their core modeling engines. By optimizing input MSAs, refining modeling constructs (using partial rather than full sequences), and employing massive model sampling and selection, top-performing groups were able to significantly outperform the default AFM/AF3 predictions. CASP16 also introduced two additional challenges: Phase 0, which required predictions without stoichiometry information, and Phase 2, which provided participants with thousands of models generated by MassiveFold (MF) to enable large-scale sampling for resource-limited groups. Across all phases, the MULTICOM series and Kiharalab emerged as top performers based on the quality of their best models. However, these groups did not have a strong advantage in model ranking, and thus their lead over other teams, such as Yang-Multimer and kozakovvajda, was less pronounced when evaluating only the first submitted models. Compared to CASP15, CASP16 showed moderate overall improvement, likely driven by the release of AF3 and the extensive model sampling employed by top groups. Several notable trends highlight frontiers for future development. First, the kozakovvajda group significantly outperformed others on antibody-antigen targets, achieving over a 60% success rate without relying on AFM or AF3 as their primary modeling framework, suggesting that alternative approaches may offer promising solutions for these difficult targets. Second, model ranking and selection continue to be major bottlenecks. 
The PEZYFoldings group demonstrated a notable advantage in selecting their best models as first models, suggesting that their pipeline for model ranking may offer important insights for the field. Finally, the Phase 0 experiment indicated moderate success in stoichiometry prediction; however, stoichiometry prediction remains challenging for high-order assemblies and targets that differ from available homologous templates. Overall, CASP16 demonstrated steady progress in multimer prediction while emphasizing the need for more effective model ranking strategies, improved stoichiometry prediction, and new modeling methods that extend beyond the current AF-based paradigm.
Title: Assessment of Protein Complex Predictions in CASP16: Are We Making Progress? Journal: Proteins-Structure Function and Bioinformatics, pages 106-130. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750043/pdf/
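The assessment above distinguishes between a group's first submitted model and the best of its submitted models, and treats the gap between them as a measure of how well a group ranks its own predictions. A minimal sketch of that comparison follows, assuming a higher-is-better score (for example DockQ or TM-score) and a hypothetical input layout in which each target maps to the scores of its models in submission order. The numbers in the usage example are made up for illustration.

```python
def ranking_gap(scores_by_target):
    """Compare first-model and best-model performance across targets.

    scores_by_target: dict mapping a target ID to a list of model scores
    in submission order (index 0 is the model the group ranked first).
    Returns (mean_first, mean_best, gap), where gap = mean_best - mean_first.
    A gap near zero means the group usually ranked its best model first.
    """
    if not scores_by_target:
        raise ValueError("no targets given")
    first = [scores[0] for scores in scores_by_target.values()]
    best = [max(scores) for scores in scores_by_target.values()]
    mean_first = sum(first) / len(first)
    mean_best = sum(best) / len(best)
    return mean_first, mean_best, mean_best - mean_first


if __name__ == "__main__":
    # Hypothetical scores for two targets, five models each.
    example = {
        "target_A": [0.50, 0.80, 0.45, 0.60, 0.30],
        "target_B": [0.60, 0.40, 0.55, 0.20, 0.35],
    }
    mean_first, mean_best, gap = ranking_gap(example)
    print(f"first={mean_first:.3f} best={mean_best:.3f} gap={gap:.3f}")
```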
Pub Date: 2026-01-01; Epub Date: 2025-12-11; DOI: 10.1002/prot.70099
Thomas Mulvaney, Andriy Kryshtafovych, Maya Topf
Since CASP13, experimentalists have been encouraged to provide their cryo-EM data along with the derived atomic models to the CASP organizers to aid assessment. In CASP16, 38 cryo-EM datasets were provided for assessment, which represented most cryo-EM targets. The corresponding targets typically comprised a single derived atomic structure; however, that model may be only one of several valid conformations. Flexibility often manifests as low-resolution regions in a cryo-EM reconstruction, particularly in RNA but often also in protein complexes. We show that local resolution in the reconstruction correlates well with the root-mean-square fluctuations (RMSF) of residues of accurate CASP predictions. The correlation between the local resolution and pLDDT was less clear, especially when mobile domains were present. When the resolution allowed, assessment of features such as sidechains, using our variant of SMOC with local fragment alignment, indicated that even high-quality predictions have room for improvement; on the other hand, some predictions fitted the density better in specific regions, indicating modeling discrepancies in the target. In one extreme case, a submitted target had regions of low resolution that limited unambiguous model building. In such cases, part of the target structure is essentially a prediction itself, with implications for the assessment. Experimental data remain essential for model-free assessment of predictions and offer unique analyses such as comparisons to local resolution and thus flexibility.
Title: Cryo-EM Analysis in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 447-459. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750033/pdf/
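The comparison of local resolution with per-residue RMSF described in the cryo-EM abstract can be sketched as follows. The sketch assumes that the predicted models are already superposed on a common frame and reduced to one coordinate per residue (for example the C-alpha atom), and that a local-resolution value per residue is available from the map. How the study actually derived these quantities is not stated in the abstract, so the function names and input layout here are hypothetical.

```python
import math


def per_residue_rmsf(models):
    """RMSF per residue over an ensemble of pre-superposed models.

    models: list of models, each a list of (x, y, z) tuples, one per residue.
    """
    n_models = len(models)
    n_residues = len(models[0])
    rmsf = []
    for i in range(n_residues):
        # Mean position of residue i across the ensemble.
        mean = [sum(m[i][d] for m in models) / n_models for d in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((m[i][d] - mean[d]) ** 2 for d in range(3)) for m in models
        ) / n_models
        rmsf.append(math.sqrt(msd))
    return rmsf


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Toy ensemble: residue 0 is rigid, residue 1 moves, residue 2 moves more.
    models = [
        [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    ]
    local_resolution = [2.5, 3.5, 4.5]  # hypothetical values in angstroms
    rmsf = per_residue_rmsf(models)
    print(rmsf, pearson(rmsf, local_resolution))
```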
Pub Date: 2026-01-01; Epub Date: 2025-08-14; DOI: 10.1002/prot.70035
Alisia Fadini, Recep Adiyaman, Shaima N Alhaddad, Behnosh Behzadi, Jianlin Cheng, Xinyue Cui, Nicholas S Edmunds, Lydia Freddolino, Ahmet G Genc, Fang Liang, Dong Liu, Jian Liu, Quancheng Liu, Liam J McGuffin, Pawan Neupane, Chunxiang Peng, David R Shortle, Meng Sun, Haodong Wang, Qiqige Wuyun, Guijun Zhang, Xuanfeng Zhao, Wei Zheng, Randy J Read
Model quality assessment (MQA) remains a critical component of structural bioinformatics for both structure predictors and experimentalists seeking to use predictions for downstream applications. In CASP16, the Evaluation of Model Accuracy (EMA) category featured both global and local quality estimation for multimeric assemblies (QMODE1 and QMODE2), as well as a novel QMODE3 challenge requiring predictors to identify the best five models from thousands generated by MassiveFold. This paper presents detailed results from several leading CASP16 EMA methods, highlighting the strengths and limitations of the approaches.
Title: Highlights of Model Quality Assessment in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 314-329. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750036/pdf/
Pub Date: 2026-01-01; Epub Date: 2025-06-02; DOI: 10.1002/prot.26850
Jian Liu, Pawan Neupane, Jianlin Cheng
With AlphaFold achieving high-accuracy tertiary structure prediction for most single-chain proteins (monomers), the next major challenge in protein structure prediction is to accurately model multichain protein complexes (multimers). We developed MULTICOM4, the latest version of the MULTICOM system, to improve protein complex structure prediction by integrating transformer-based AlphaFold2, diffusion model-based AlphaFold3, and our in-house techniques. These include protein complex stoichiometry prediction, diverse multiple sequence alignment (MSA) generation leveraging both sequence and structure comparison, modeling exception handling, and deep learning-based protein model quality assessment. MULTICOM4 was blindly evaluated in the 16th Critical Assessment of Techniques for Protein Structure Prediction (CASP16) in 2024. In Phase 0 of CASP16, where stoichiometry information was unavailable, MULTICOM predictors performed best, with MULTICOM_human achieving a TM-score of 0.752 and a DockQ score of 0.584 for top-ranked predictions on average. In Phase 1 of CASP16, with stoichiometry information provided, MULTICOM_human remained among the top predictors, attaining a TM-score of 0.797 and a DockQ score of 0.558 on average. The CASP16 results demonstrate that integrating complementary AlphaFold2 and AlphaFold3 with enhanced MSA inputs, comprehensive model ranking, exception handling, and accurate stoichiometry prediction can effectively improve protein complex structure prediction.
Title: Improving AlphaFold2- and AlphaFold3-Based Protein Complex Structure Prediction With MULTICOM4 in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 131-141. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12354153/pdf/
Ligand binding prediction is a critical component of structure-based drug design, gaining prominence in Critical Assessment of protein Structure Prediction (CASP) since its introduction in CASP15. In CASP16, the challenges expanded to include protein-ligand and nucleic acid-ligand binding predictions, alongside binding affinity ranking, posing greater computational and methodological demands. This study presents a sophisticated prediction strategy combining template-based docking, multiple receptor conformations, and AI-driven scoring to address these challenges. For protein-ligand systems (L1000-L4000), we leveraged structural templates from PDB, ligand similarity analysis, and tools like CoDock-Ligand and AutoDock Vina to predict binding poses. Key successes included accurate predictions for targets like SARS-CoV-2 Mpro (L4000) and Autotaxin (L3000), though challenges persisted with binding site flexibility and pose ranking. The prediction of ligand pose achieved satisfactory results, with more than 66% of the distribution having RMSD less than 3 Å. Nucleic acid-ligand predictions (e.g., ZTP riboswitch) yielded mixed results, highlighting limitations in RNA/DNA structural accuracy. Affinity prediction employed diverse methods, with machine learning-based SVR_Conjoint outperforming physics-based approaches (Kendall's Tau = 0.43). Our strategy demonstrated robustness in CASP16, yet underscored the need for advancements in handling conformational dynamics and scoring accuracy. This work provides a framework for future ligand binding prediction efforts in computational drug discovery.
Title: Ligand Binding Prediction on Pharmaceutical and Nucleic Acid Targets by the CoDock Group in CASP16. Authors: Ren Kong, Zunyun Jiang, Xufeng Lu, Liangxu Xie, Shan Chang. Journal: Proteins-Structure Function and Bioinformatics, pages 276-285. Pub Date: 2026-01-01; DOI: 10.1002/prot.70032
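The CoDock abstract reports two summary statistics: the fraction of predicted poses with RMSD below 3 Å and a Kendall's Tau for affinity ranking. A minimal sketch of both follows. The abstract does not say which Kendall variant was used, so this sketch assumes the simple tau-a form, which does not correct for ties. The function names are hypothetical and the values in the usage example are invented.

```python
def kendall_tau_a(predicted, experimental):
    """Kendall's tau-a between two equal-length lists of values.

    Counts concordant minus discordant pairs, divided by the total number
    of pairs. Tied pairs count as neither (no tie correction).
    """
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (predicted[i] - predicted[j]) * (experimental[i] - experimental[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def fraction_below(rmsds, cutoff=3.0):
    """Fraction of poses whose RMSD (in angstroms) is below the cutoff."""
    if not rmsds:
        raise ValueError("no RMSD values given")
    return sum(1 for r in rmsds if r < cutoff) / len(rmsds)


if __name__ == "__main__":
    # Hypothetical predicted vs. measured affinities for four ligands.
    print(kendall_tau_a([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]))
    # Hypothetical pose RMSDs in angstroms.
    print(fraction_below([1.0, 2.5, 3.5]))
```

For real data with ties, a tie-corrected variant such as tau-b is usually preferable.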
Pub Date: 2026-01-01; Epub Date: 2025-08-25; DOI: 10.1002/prot.70044
Arne Elofsson
The CASP16 experiment provided the first opportunity to benchmark AlphaFold3. In contrast to AlphaFold2, AlphaFold3 can predict the structure of non-protein molecules. According to the benchmark presented by the developers, it is expected to perform slightly better than AlphaFold2 for proteins. In this study, we assess the performance of AlphaFold3 using both automatic server submissions (AF3-server) and manual predictions from the Elofsson group (Elofsson). All predictions were generated via the AlphaFold3 web server, with manual interventions applied to large targets and ligands. Compared to AlphaFold2-based methods, we found that AlphaFold3 performs slightly better for protein complexes. However, when massive sampling is applied to AlphaFold2, the difference disappears. It was also noted that, according to the official ranking from CASP, the AF3-server performs better than AlphaFold2 for easier targets, but not for harder targets. Furthermore, the performance of the AF3-server is comparable to the best methods when considering the top-ranked predictions, but slightly behind when examining the best among the five submitted models. Here, there exist targets where the top-ranked AF3-server model is worse than lower-ranked models, indicating that an avenue for progress could be to develop better strategies for identifying the best of the generated models. When using AF3-server to predict the stoichiometry of larger protein complexes, the accuracy is limited, especially for heteromeric targets. When analyzing the predictions including nucleic acids, it was found that, in general, the accuracy is relatively low. However, the AF3-server performance was not far behind that of the top-ranked method. In summary, AF3-server offers a user-friendly tool that provides predictions comparable to state-of-the-art methods in all categories of CASP.
AlphaFold3 at CASP16. Arne Elofsson. Proteins: Structure, Function, and Bioinformatics, pp. 154-166, 2026-01-01. DOI: 10.1002/prot.70044. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750027/pdf/
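The difference between top-ranked and best-of-five performance described in the abstract above can be made concrete with a small sketch. It assumes each target has five quality scores (higher is better) listed in the order the method ranked its own models; the function name `ranking_gap`, the target names, and all score values are hypothetical and are not taken from the CASP16 data.

```python
from statistics import mean
from typing import Dict, List


def ranking_gap(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Compare first-ranked models with the best of all submitted models.

    `scores` maps a target name to quality scores (higher is better), ordered
    by the method's own ranking, so scores[t][0] is the model submitted first.
    """
    top1 = mean(s[0] for s in scores.values())
    best = mean(max(s) for s in scores.values())
    # Targets where a lower-ranked model beats the model submitted as rank 1.
    misranked = sum(1 for s in scores.values() if s[0] < max(s))
    return {
        "top1": top1,
        "best_of_n": best,
        "gap": best - top1,
        "misranked_targets": misranked,
    }


# Made-up scores for three hypothetical targets, five models each.
example = {
    "target_A": [0.80, 0.75, 0.70, 0.65, 0.60],
    "target_B": [0.40, 0.55, 0.35, 0.30, 0.25],
    "target_C": [0.50, 0.50, 0.45, 0.70, 0.40],
}
result = ranking_gap(example)
print(result)
```

A large `gap` together with many `misranked_targets` would suggest that good models are being generated but not selected, which is the situation the abstract points to as an opportunity for better model-selection strategies.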
Pub Date: 2026-01-01 | Epub Date: 2025-10-06 | DOI: 10.1002/prot.70063
Keqiong Zhang, Qilong Wu, Sheng-You Huang
In the 15th Critical Assessment of Techniques for Structure Prediction (CASP15), the category of protein-ligand complexes was introduced to advance the development of protein-ligand structure prediction techniques. CASP16 further expanded this category by introducing four sets of pharmaceutical targets as super-targets. Each super-target consists of multiple protein-ligand complexes involving the same protein but different ligands. Given the outstanding performance of template-based methods in CASP15, we employed a template-guided ensemble docking strategy for the ligand (LG) tasks in CASP16. MODELLER, AlphaFold3, and AlphaFold-Multimer were used to generate structural ensembles for each target protein. We then searched the Protein Data Bank (PDB) for reliable template complexes based on sequence identity, ligand similarity, and maximum common substructure (MCS) coverage score. If templates were identified, we used LSalign to perform ligand 3D alignment. For targets without a template, XDock and MDock were used to predict the binding poses. Finally, a knowledge-based scoring function, ITScore, was employed for energy evaluation. Our method performed well in the CASP16 LG tasks, ranking 4th out of 38 participating teams.
Protein-Ligand Structure Prediction by Template-Guided Ensemble Docking Strategy. Keqiong Zhang, Qilong Wu, Sheng-You Huang. Proteins: Structure, Function, and Bioinformatics, pp. 267-275, 2026-01-01. DOI: 10.1002/prot.70063.
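The template-or-docking decision described in the abstract above can be sketched as a simple filter over candidate templates. This is only an illustration under stated assumptions, not the authors' implementation: the class `TemplateCandidate`, the functions `pick_template` and `choose_route`, the three cutoff values, and the tie-breaking order are all hypothetical, and the similarity and coverage values are assumed to have been computed beforehand by external tools.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TemplateCandidate:
    """A PDB complex considered as a template (all fields are illustrative)."""
    template_id: str
    sequence_identity: float  # 0..1, target protein vs. template protein
    ligand_similarity: float  # 0..1, query ligand vs. template ligand
    mcs_coverage: float       # 0..1, share of the query ligand covered by the MCS


def pick_template(
    candidates: Sequence[TemplateCandidate],
    min_identity: float = 0.3,    # assumed cutoff, not from the paper
    min_similarity: float = 0.4,  # assumed cutoff, not from the paper
    min_coverage: float = 0.5,    # assumed cutoff, not from the paper
) -> Optional[TemplateCandidate]:
    """Return the best candidate that passes all three filters, or None."""
    passing = [
        c for c in candidates
        if c.sequence_identity >= min_identity
        and c.ligand_similarity >= min_similarity
        and c.mcs_coverage >= min_coverage
    ]
    if not passing:
        return None
    # Assumed priority: MCS coverage, then ligand similarity, then identity.
    return max(
        passing,
        key=lambda c: (c.mcs_coverage, c.ligand_similarity, c.sequence_identity),
    )


def choose_route(candidates: Sequence[TemplateCandidate]) -> str:
    """'template_alignment' when a reliable template exists, else 'docking'."""
    return "template_alignment" if pick_template(candidates) else "docking"


# Made-up candidates; the identifiers are placeholders, not real PDB entries.
candidates = [
    TemplateCandidate("TMPL1", 0.90, 0.80, 0.70),
    TemplateCandidate("TMPL2", 0.95, 0.20, 0.90),  # fails ligand similarity
]
print(pick_template(candidates).template_id, choose_route(candidates))
```

In the workflow the abstract describes, the `template_alignment` route would correspond to ligand 3D alignment with LSalign and the `docking` route to XDock and MDock, with ITScore applied afterwards for energy evaluation; none of those tools is called in this sketch.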
Pub Date: 2026-01-01 | Epub Date: 2025-09-28 | DOI: 10.1002/prot.70060
Xavier Robin, Peter Škrinjar, Andrew M Waterhouse, Gabriel Studer, Gerardo Tauriello, Janani Durairaj, Torsten Schwede
Independent, blind assessment of structure prediction methods is essential for establishing state-of-the-art performance, identifying limitations, and guiding future developments. The Continuous Automated Model EvaluatiOn (CAMEO) platform provides weekly, automated benchmarking of structure prediction servers, complementing the biennial Critical Assessment of Structure Prediction (CASP) experiments.
Beyond Single Chains: Benchmarking Macromolecular Complex Prediction Methods With the Continuous Automated Model EvaluatiOn (CAMEO). Xavier Robin, Peter Škrinjar, Andrew M Waterhouse, Gabriel Studer, Gerardo Tauriello, Janani Durairaj, Torsten Schwede. Proteins: Structure, Function, and Bioinformatics, pp. 403-413, 2026-01-01. DOI: 10.1002/prot.70060. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750032/pdf/