Pub Date: 2026-01-01; Epub Date: 2025-09-04; DOI: 10.1002/prot.70043
Rachael C Kretsch, Reinhard Albrecht, Ebbe S Andersen, Hsuan-Ai Chen, Wah Chiu, Rhiju Das, Jeanine G Gezelle, Marcus D Hartmann, Claudia Höbartner, Yimin Hu, Shekhar Jadhav, Philip E Johnson, Christopher P Jones, Deepak Koirala, Emil L Kristoffersen, Eric Largy, Anna Lewicka, Cameron D Mackereth, Marco Marcia, Michela Nigro, Manju Ojha, Joseph A Piccirilli, Phoebe A Rice, Heewhan Shin, Anna-Lena Steckelberg, Zhaoming Su, Yoshita Srivastava, Liu Wang, Yuan Wu, Jiahao Xie, Nikolaj H Zwergius, John Moult, Andriy Kryshtafovych
Accurate biomolecular structure prediction enables the prediction of mutational effects, the speculation of function based on predicted structural homology, the analysis of ligand binding modes, experimental model building, and many other applications. Such algorithms to predict essential functional and structural features remain out of reach for biomolecular complexes containing nucleic acids. Here, we report a quantitative and qualitative evaluation of nucleic acid structures for the CASP16 blind prediction challenge by 12 of the experimental groups who provided nucleic acid targets. Blind predictions accurately model secondary structure and some aspects of tertiary structure, including reasonable global folds for some complex RNAs; however, predictions often lack accuracy in the regions of highest functional importance. All models have inaccuracies in non-canonical regions where, for example, the nucleic-acid backbone bends, deviating from an A-form helix geometry, or a base forms a non-standard hydrogen bond (not a Watson-Crick base pair). These bends and non-canonical interactions are integral to forming functionally important regions such as RNA enzymatic active sites. Additionally, the modeling of conserved and functional interfaces between nucleic acids and ligands, proteins, or other nucleic acids remains poor. For some targets, the experimental structures may not represent the only structure the biomolecular complex occupies in solution or in its functional life cycle, posing a future challenge for the community.
Functional Relevance of CASP16 Nucleic Acid Predictions as Evaluated by Structure Providers. Proteins: Structure, Function, and Bioinformatics, pp. 51-78. Published 2026-01-01. DOI: 10.1002/prot.70043. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12412911/pdf/
We report the results of the "MIEnsembles-Server" and "Zheng" groups for structure ensemble prediction in CASP16; both groups used the EnsembleFold pipeline. First, multiple sequence alignments (MSAs) were generated using DeepMSA2 for protein targets and rMSA for RNA targets. These MSAs were processed by newly developed deep learning methods (D-I-TASSER2 for protein monomer structure prediction, DMFold2 for protein complex structure prediction, ExFold for RNA structure prediction, and DeepProtNA for protein-nucleic acid complex structure prediction) to yield diverse structural decoys. The generated decoys were clustered into representative models corresponding to distinct conformational states using the structural clustering tool MolClust. Protein monomer targets underwent additional refinement via replica-exchange Monte Carlo (REMC) simulations with D-I-TASSER2, and the refined decoys were re-clustered with MolClust to finalize the ensemble predictions. For the 19 ensemble targets in CASP16, the final EnsembleFold models achieved an average TM-score of 0.657, an improvement of 10.2% over the baseline AlphaFold3 program. Notably, EnsembleFold performed particularly well on hybrid protein/nucleic-acid targets, which contributed to its efficacy in ensemble prediction tasks. Analysis of the resulting structural ensembles highlighted three insights: (i) models derived from distinct DeepMSA2-generated MSAs typically represent different conformational states for ensemble targets; (ii) REMC simulations significantly enhance model diversity, facilitating the identification of alternative conformations; (iii) the structural clustering approach effectively identifies and selects accurate representative models for each conformational state. We also discuss potential improvements in Quality Assessment (QA) scoring methods that could further enhance the reliability and accuracy of ensemble predictions.
Qiqige Wuyun, Quancheng Liu, Wentao Ni, Chunxiang Peng, Ziying Zhang, Xiaogen Zhou, Gang Hu, Lydia Freddolino, Wei Zheng. Alternative Conformation Prediction Using Deep Learning With Multi-MSA Strategy and Structural Clustering in CASP16. Proteins: Structure, Function, and Bioinformatics, pp. 348-361. Published 2026-01-01. DOI: 10.1002/prot.70059.
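The EnsembleFold abstract above describes clustering decoys into representative models, one per conformational state, but does not say how MolClust does this. The following is only a generic sketch of greedy neighbour-count clustering, not MolClust's actual algorithm. It assumes a precomputed symmetric distance matrix (for example 1 minus the pairwise TM-score); the function name `leader_cluster`, the cutoff, and the toy matrix are all hypothetical.

```python
from typing import List, Sequence


def leader_cluster(dist: Sequence[Sequence[float]], cutoff: float) -> List[List[int]]:
    """Greedy clustering of decoys from a symmetric pairwise-distance matrix.

    Generic illustration only (NOT MolClust): repeatedly pick the unassigned
    decoy with the most unassigned neighbours within `cutoff`, make it the
    cluster representative, and remove its whole neighbourhood.
    Each returned cluster lists decoy indices, representative first.
    """
    unassigned = list(range(len(dist)))
    clusters: List[List[int]] = []
    while unassigned:
        # Neighbours of every remaining decoy, counted among remaining decoys.
        neighbours = {
            i: [j for j in unassigned if dist[i][j] <= cutoff] for i in unassigned
        }
        # Ties are broken in favour of the lowest decoy index.
        centre = max(unassigned, key=lambda i: len(neighbours[i]))
        members = [j for j in neighbours[centre] if j != centre]
        clusters.append([centre] + members)
        assigned = set(clusters[-1])
        unassigned = [i for i in unassigned if i not in assigned]
    return clusters


# Hypothetical distances (e.g. 1 - TM-score) for five decoys in two states.
toy_dist = [
    [0.0, 0.1, 0.2, 0.8, 0.9],
    [0.1, 0.0, 0.1, 0.8, 0.8],
    [0.2, 0.1, 0.0, 0.9, 0.8],
    [0.8, 0.8, 0.9, 0.0, 0.1],
    [0.9, 0.8, 0.8, 0.1, 0.0],
]
toy_clusters = leader_cluster(toy_dist, cutoff=0.15)
```

With this toy matrix, decoy 1 becomes the representative of the first state and decoy 3 of the second. In a real pipeline the cutoff would need tuning per target, since too loose a cutoff merges distinct conformational states.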
Pub Date: 2026-01-01; Epub Date: 2025-10-16; DOI: 10.1002/prot.70062
Allen C McBride, Feng Yu, Edward H Cheng, Aulane Mpouli, Aimee C Soe, Michal Hammel, Gaetano T Montelione, Terrence G Oas, Susan E Tsutakawa, Bruce R Donald
In CASP16, we assessed the ability of computational methods to predict the distribution of relative orientations of two domains tethered by a flexible linker. The range of interdomain distances and orientations (poses) of such domain-linker-domain (D-L-D) proteins can play an important role in protein function, allostery, aggregation, and the thermodynamics of binding. The CASP16 Conformational Ensembles Experiment included two challenges to predict the interdomain pose distribution of a Staphylococcal protein A (SpA) D-L-D construct, called ZLBT-C, in which two of SpA's five nearly identical domains are connected by either (1) a six-residue wild-type (WT) linker (kadnkf), or (2) an all-glycine (Gly6) linker. The wild-type linker has a highly conserved sequence and is thought to contribute to the energetic barrier for binding with host antibodies. Ground truth was provided by nuclear magnetic resonance (NMR) residual dipolar coupling (RDC) data on the WT protein and small-angle X-ray scattering (SAXS) data on both proteins in solution. Twenty-five predictor groups submitted 35 sets of predicted conformational distributions, in the form of population-weighted finite ensembles of discrete structures. Unlike traditional CASP assessments, which compare predicted atomic models to experimental atomic models, the accuracy of these predictions was assessed by back-calculating NMR RDCs and SAXS curves from each ensemble of atomic models and comparing the results to the respective experimental data. Accuracy was also assessed by using kernelization to compare ensembles to the continuous orientational distributions optimally fit to experimental data. In our assessment, predictions spanned a wide range of accuracy, but none were close fits to the combined NMR and SAXS data. In addition, none recapitulated the difference between the WT and Gly6 proteins seen in the SAXS data. These results and our analysis highlighted the strengths, weaknesses, and complementarity of NMR RDC and SAXS analysis.
Predicting Pose Distribution of Protein Domains Connected by Flexible Linkers Is an Unsolved Problem. Proteins: Structure, Function, and Bioinformatics, pp. 362-380. Published 2026-01-01. DOI: 10.1002/prot.70062. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12648682/pdf/
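The assessment above scores population-weighted ensembles by back-calculating observables and comparing them with experiment. The abstract does not state which agreement metric was used, so the sketch below is an assumption-laden illustration: it averages per-model observables with the submitted populations and reports the commonly used RDC Q-factor, rms(calc - exp) / rms(exp). The per-model values are taken as given (back-calculation itself, which needs an alignment tensor, is out of scope), and all names and numbers are hypothetical.

```python
import math
from typing import List, Sequence


def ensemble_average(per_model: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Population-weighted average of back-calculated observables.

    per_model[m][k] is observable k (e.g. one RDC) computed from model m;
    weights[m] is that model's population. Weights are renormalised here.
    """
    total = sum(weights)
    n_obs = len(per_model[0])
    return [
        sum(w * model[k] for w, model in zip(weights, per_model)) / total
        for k in range(n_obs)
    ]


def q_factor(calc: Sequence[float], exp: Sequence[float]) -> float:
    """Q = rms(calc - exp) / rms(exp); 0 means perfect agreement.

    Assumed metric for illustration; the abstract does not name one.
    """
    num = sum((c - e) ** 2 for c, e in zip(calc, exp))
    den = sum(e ** 2 for e in exp)
    return math.sqrt(num / den)


# Hypothetical two-model ensemble with two RDCs (Hz) per model.
toy_models = [[10.0, -4.0], [2.0, 4.0]]
toy_weights = [0.75, 0.25]
toy_avg = ensemble_average(toy_models, toy_weights)   # [8.0, -2.0]
toy_q = q_factor(toy_avg, [6.0, -2.0])                # sqrt(4 / 40)
```

Simple linear averaging is itself a simplification for RDCs of flexible two-domain systems, where molecular alignment can depend on conformation.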
Pub Date: 2026-01-01; Epub Date: 2025-10-30; DOI: 10.1002/prot.70072
Rachael C Kretsch, Alissa M Hummer, Shujun He, Rongqing Yuan, Jing Zhang, Thomas Karagianes, Qian Cong, Andriy Kryshtafovych, Rhiju Das
Consistently accurate 3D nucleic acid structure prediction would facilitate studies of the diverse RNA and DNA molecules underlying life. In CASP16, blind predictions for 42 targets canvassing a full array of nucleic acid functions, from dopamine binding by DNA to formation of elaborate RNA nanocages, were submitted by 65 groups from 46 different labs worldwide. In contrast to concurrent protein structure predictions, performance on nucleic acids was generally poor, with no predictions of previously unseen natural RNA structures achieving TM-scores above 0.8. Even though automated server performance has improved, all top-performing groups were human expert predictors: Vfold, GuangzhouRNA-human, and KiharaLab. Good performance on one template-free modeling target (OLE RNA) and accurate global secondary structure prediction suggested that structural information can be extracted from multiple sequence alignments. However, 3D accuracy generally appeared to depend on the availability of closely related 3D structure templates, and predictions still did not achieve consistent recovery of pseudoknots, singlet Watson-Crick-Franklin pairs, non-canonical pairs, or tertiary motifs like A-minor interactions. For the first time, blind predictions of nucleic acid interactions with small molecules, proteins, and other nucleic acids could be assessed in CASP16. As with nucleic acid monomers, prediction accuracy for nucleic acid complexes was generally poor unless 3D templates were available. Accounting for template availability, there has not been a notable increase in nucleic acid modeling accuracy between previous blind challenges and CASP16.
Assessment of Nucleic Acid Structure Prediction in CASP16. Proteins: Structure, Function, and Bioinformatics, pp. 192-217. Published 2026-01-01. DOI: 10.1002/prot.70072.
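The assessment above contrasts accurate global secondary structure with inconsistent recovery of pseudoknots. As a rough illustration of what such checks involve (not the assessors' actual metrics, which the abstract does not specify), the sketch below computes a base-pair F1 score between predicted and reference pair lists and flags pairs that cross one another, the defining feature of a pseudoknot. The function names and toy pair lists are hypothetical.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[int, int]


def normalise(pairs: Iterable[Pair]) -> Set[Pair]:
    """Store each base pair as (lower index, higher index)."""
    return {(min(i, j), max(i, j)) for i, j in pairs}


def pair_f1(predicted: Iterable[Pair], reference: Iterable[Pair]) -> float:
    """Harmonic mean of precision and recall over exact base-pair matches."""
    pred, ref = normalise(predicted), normalise(reference)
    true_pos = len(pred & ref)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(ref)
    return 2 * precision * recall / (precision + recall)


def crossing_pairs(pairs: Iterable[Pair]) -> Set[Pair]:
    """Pairs involved in at least one crossing i < k < j < l."""
    ordered = sorted(normalise(pairs))
    crossing: Set[Pair] = set()
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                crossing.update({(i, j), (k, l)})
    return crossing


# Hypothetical reference with two helices that cross (a pseudoknot), and a
# prediction that recovers only the first helix plus one extra pair.
toy_reference = [(1, 10), (2, 9), (5, 14), (6, 13)]
toy_predicted = [(1, 10), (2, 9), (3, 8)]
toy_f1 = pair_f1(toy_predicted, toy_reference)
```

Here the prediction has a respectable pair-level score but contains no crossing pairs at all, which is how a model can look reasonable at the secondary structure level while missing the pseudoknot entirely.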
Pub Date: 2026-01-01; Epub Date: 2025-11-03; DOI: 10.1002/prot.70076
Andriy Kryshtafovych, Torsten Schwede, Maya Topf, Krzysztof Fidelis, John Moult
CASP16 is the most recent in a series of community experiments to rigorously assess the state of the art in areas of computational structural biology. The field has advanced enormously in recent years: in early CASPs, the assessments centered around whether the methods were at all useful. Now they mostly focus on how near we are to not needing experiments. In most areas, deep learning methods dominate, particularly AlphaFold variants and associated technology. In this round, there is no significant change in overall agreement between calculated monomer protein structures and their experimental counterparts, not because of method deficiencies but because, for most proteins, agreement is likely as high as can be obtained given experimental uncertainty. For protein complexes, huge gains in accuracy were made in the previous CASP, but there still appears to be room for further improvement. In contrast to these encouraging results, for RNA structures, the deep learning methods are notably unsuccessful at present and are not superior to traditional approaches. Both approaches still produce very poor results in the absence of structural homology. For macromolecular ensembles, the small CASP target set limits conclusions, but generally, in the absence of structural templates, results tend to be poor and detailed structures of alternative conformations are usually of relatively low accuracy. For organic ligand-protein structures and affinities (important for aspects of drug design), deep learning methods are substantially more successful than traditional ones on the relatively easy CASP target set, though the results often fall short of experimental accuracy. In the less glamorous but essential area of methods for estimating the accuracy, previous results found reliable accuracy estimates at the amino acid level. The present CASP results show that the best methods are also largely effective in selecting models of protein complexes with high interface accuracy. 
Will upcoming method improvements overcome the remaining barriers to reaching experimental accuracy in all categories? We will have to wait until the next CASP to find out, but there are two promising trends. One is the combination of traditional physics-inspired methods and deep learning, and the other is the expected increase in training data, especially for ligand-protein complexes.
Progress and Bottlenecks for Deep Learning in Computational Structure Biology: CASP Round XVI. Proteins: Structure, Function, and Bioinformatics, pp. 5-14. Published 2026-01-01. DOI: 10.1002/prot.70076. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12703882/pdf/
Pub Date: 2026-01-01; Epub Date: 2025-08-08; DOI: 10.1002/prot.70034
Bowen Xiao, Yaohuang Shi, Lin Huang
RNA three-dimensional structures are critical for their roles in gene expression and regulation. However, predicting RNA structures remains challenging due to complex tertiary interactions, ion dependency, molecular flexibility, and the limited availability of known 3D structures. To address these challenges, our team (GuangzhouRNA-human) employed a hybrid strategy combining computational tools with expert refinement in the CASP16 RNA structure prediction challenge, achieving second place based on the sum Z-score. Our approach integrates multiple techniques through modular workflows, including template-based modeling for targets with homologous templates and ab initio prediction using deep learning tools (e.g., AlphaFold3 and DeepFoldRNA) for novel sequences. Additionally, we incorporate experimental constraints and iterative optimization to enhance prediction accuracy. For targets shorter than 200 nucleotides (nt) with homologous templates, our method demonstrated exceptional performance, achieving 75% of predictions with root-mean-square deviations (RMSD) below 5 Å, and all predictions falling under 10 Å. Furthermore, our strategy demonstrated promising results for targets without homologous templates, such as R1209, through comprehensive literature reviews and structural selection. Despite these advances, RNA structure prediction continues to face challenges, particularly in predicting complex topologies like pseudoknots and coaxial stacking. Future improvements in integrating computational tools with expert knowledge are essential to enhance the accuracy and applicability of RNA tertiary structure prediction.
Enhancing RNA 3D Structure Prediction: A Hybrid Approach Combining Expert Knowledge and Computational Tools in CASP16. Proteins: Structure, Function, and Bioinformatics, pp. 230-238. Published 2026-01-01. DOI: 10.1002/prot.70034.
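The abstract above reports a ranking "based on the sum Z-score" without giving the formula. The sketch below shows one plausible reading, under loudly stated assumptions: each group's score on a target is converted to a Z-score across groups, negative Z-scores are floored at zero so a single bad model is not over-penalised, and the floored values are summed over targets. The floor, the single-metric input, and the group and target names are all assumptions for illustration, not the CASP16 assessors' procedure.

```python
from statistics import mean, pstdev
from typing import Dict


def z_scores(scores: Dict[str, float]) -> Dict[str, float]:
    """Z-score of each group's score on one target (population std. dev.)."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        return {group: 0.0 for group in scores}
    return {group: (s - mu) / sigma for group, s in scores.items()}


def sum_z(per_target: Dict[str, Dict[str, float]],
          floor: float = 0.0) -> Dict[str, float]:
    """Sum of per-target Z-scores per group, each floored at `floor`.

    ASSUMPTION: the floor value and the use of one metric per target are
    illustrative choices; the source does not specify them.
    """
    totals: Dict[str, float] = {}
    for target_scores in per_target.values():
        for group, z in z_scores(target_scores).items():
            totals[group] = totals.get(group, 0.0) + max(z, floor)
    return totals


# Hypothetical scores (higher is better) for three groups on two targets.
toy_scores = {
    "target_1": {"group_A": 0.9, "group_B": 0.5, "group_C": 0.1},
    "target_2": {"group_A": 0.6, "group_B": 0.8, "group_C": 0.4},
}
toy_totals = sum_z(toy_scores)
toy_ranking = sorted(toy_totals, key=toy_totals.get, reverse=True)
```

In this toy case groups A and B each win one target by the same margin and end up tied, while group C, below average on both, accumulates nothing.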
Pub Date: 2026-01-01; Epub Date: 2025-10-04; DOI: 10.1002/prot.70061
Michael K Gilson, Jerome Eberhardt, Peter Škrinjar, Janani Durairaj, Xavier Robin, Andriy Kryshtafovych
The protein-ligand component of the 16th Critical Assessment of Structure Prediction (CASP16) challenged participants to predict both binding poses and affinities of small molecules to protein targets, with a focus on drug-like compounds from pharmaceutical discovery projects. Thirty research groups submitted predictions for 229 protein-ligand pose targets and 140 affinity targets across five protein systems. Among the submitted predictions, template-based pose-prediction methods did particularly well, with the best groups achieving mean LDDT-PLI values of 0.69 (scale of 0-1 with 1 best). For comparison, we also ran a set of automated baseline pose-prediction methods, including ones using deep neural networks. Of these, AlphaFold 3 did particularly well, with a mean LDDT-PLI of 0.8, thus outscoring the best CASP16 predictor. The CASP affinity predictions showed modest correlation with experimental data (maximum Kendall's τ = 0.42), well below the theoretical maximum possible given experimental uncertainty (~0.73). As seen in prior challenges, providing experimental structures did not improve affinity predictions in the second stage of the challenge, suggesting that the scoring functions used here are a key limiting factor. Overall, the accuracy achieved by CASP participants is similar to that observed in the prior Drug Design Data Resource (D3R) blinded prediction challenges. The present results highlight the progress and persistent challenges in computational protein-ligand modeling and provide valuable benchmarks for the field of computer-aided drug design.
Title: Assessment of Pharmaceutical Protein-Ligand Pose and Affinity Predictions in CASP16. Journal: Proteins-Structure Function and Bioinformatics, pages 249-266. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12750038/pdf/
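The CASP16 protein-ligand abstract above reports affinity-ranking quality as Kendall's τ. As a minimal sketch of what that statistic measures, the following computes the simple tau-a variant (tied pairs count as neither concordant nor discordant) by direct pair counting. The function name `kendall_tau` and the affinity values are hypothetical and for illustration only; they are not CASP16 data, and the assessors' exact variant and tie handling are not stated in the abstract.

```python
from itertools import combinations


def kendall_tau(predicted, experimental):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Assumption: ties are counted as neither concordant nor discordant.
    """
    if len(predicted) != len(experimental):
        raise ValueError("inputs must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two compounds")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # A pair is concordant when both rankings order it the same way.
        product = (predicted[i] - predicted[j]) * (experimental[i] - experimental[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical binding affinities (e.g. pKd), NOT taken from CASP16.
experimental_affinity = [5.1, 6.3, 7.0, 7.8, 8.4]
predicted_affinity = [5.5, 5.9, 7.4, 7.1, 8.0]

tau = kendall_tau(predicted_affinity, experimental_affinity)
print(f"Kendall's tau = {tau:.2f}")
```

A value of 1 means the predicted ranking matches the experimental ranking exactly, 0 means no rank agreement, and -1 means a fully reversed ranking. Because experimental affinities carry their own uncertainty, even a perfect method would not be expected to reach 1, which is why the abstract compares the best observed value against a theoretical maximum of about 0.73.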
Leishmaniasis, caused by Leishmania donovani, remains a major neglected tropical disease (NTD) with limited therapeutic options and the absence of a universally effective vaccine. Multi-epitope vaccines offer a promising strategy for combating this intracellular parasite by stimulating a robust and specific immune response. In this study, an immunoinformatics-driven, in silico reverse vaccinology approach was utilized to design a multi-epitope vaccine targeting key surface-exposed proteins of L. donovani, namely C-type lectin, Proteophosphoglycan (PPG4), Hydrophilic Acylated Surface Protein (HASP), Legume-like Lectin (LLL), and Kinetoplastid Membrane Protein (KMP-11). These proteins are implicated in essential processes such as parasite survival, immune modulation, and host-pathogen interactions, making them prime candidates for vaccine development. A comprehensive analysis was conducted to identify and screen B-cell and T-cell (MHC-I and MHC-II) epitopes for immunogenicity, antigenicity, and population coverage. Multi-epitope vaccines, incorporating individual proteins or chimeric constructs, were developed with IFN-gamma as an adjuvant. The vaccine constructs were prioritized based on factors such as IC50 values and immunogenic potential. Subsequently, the selected epitopes were analyzed for physicochemical properties, and secondary and tertiary structural predictions were made and validated. Molecular docking simulations were employed to examine the interaction of the vaccine constructs with immune receptors, ensuring optimal immune system activation. Based on the molecular docking score, the vaccine candidates were screened, and the top four constructs (vaccines based on C-type lectin, LLL, PPG, and the chimeric vaccine; -1048.9, -1025.8, -1291.8, and -852.1 kcal/mol, respectively) were processed through immunogenic simulation. This in silico analysis indicates that lectins are highly effective vaccine candidates. Further, the top two constructs, selected on the basis of the immunogenic simulations, underwent molecular dynamics simulations. Finally, the constructs were computationally cloned into the pET28a vector. This study underscores the potential of multi-epitope vaccines as a cost-effective and efficient strategy for addressing L. donovani infections, providing a foundation for subsequent experimental validation and clinical trial development.
Title: Multi-Epitope Vaccine Design Against Leishmania donovani: An Immunoinformatic Based In Silico Approach. Authors: Aviral Kaushik, Manisha Pritam, Sumit Govil, Radhey Shyam Kaushal. Journal: Proteins-Structure Function and Bioinformatics. Pub Date: 2025-12-30 DOI: 10.1002/prot.70102
Title: Navigating the Pre- and Post-AlphaFold Divide: CAPRI 8th Evaluation Meeting, February 12-14, Grenoble, FR. Authors: Alexandre M J J Bonvin, Marc F Lensink. Journal: Proteins-Structure Function and Bioinformatics. Pub Date: 2025-12-22 DOI: 10.1002/prot.70058
Herbacetin (HC) is a naturally occurring flavonoid compound with a dual antiviral mechanism. It inhibits the polyamine biosynthetic pathway and targets the methyltransferase (MTase) enzyme of both the dengue virus (DENV) and chikungunya virus (CHIKV). However, the detailed inhibition mechanism of DENV-3 non-structural protein (NS5) MTase by HC remains unclear. This study provides structural insights into the inhibition mechanism of HC by analyzing the crystal structure of DENV-3 NS5 MTase complexed with HC and S-adenosyl-L-homocysteine. Structural analysis revealed that HC binds to the Cap 0-RNA site near the GTP binding site in the DENV-3 NS5 MTase. Additionally, a fluorescence polarization assay demonstrated that HC inhibits GTP binding with an inhibition constant (Ki) of ~0.43 μM. This is one of the first studies to identify an inhibitor that targets the conserved RNA-binding region of NS5 MTase, suggesting its potential as a highly effective scaffold for broad-spectrum antiviral agents against orthoflaviviruses.
Title: Mechanistic Insights Into the Inhibition of Dengue Virus NS5 Methyltransferase by Herbacetin. Authors: Mandar Bhutkar, Shalja Verma, Vishakha Singh, Pravindra Kumar, Shailly Tomar. Journal: Proteins-Structure Function and Bioinformatics. Pub Date: 2025-12-22 DOI: 10.1002/prot.70108