Pub Date: 2026-01-18 DOI: 10.1021/acs.jcim.5c03021
Miriam Gulman, Jordan Chill, and Dan Thomas Major*
Understanding protein–peptide interactions is essential for uncovering cellular signaling mechanisms and advancing therapeutic development, as these interactions play central roles in numerous biological processes. Gaining structural insight into such complexes is crucial, yet traditional methods like nuclear magnetic resonance (NMR) and X-ray crystallography are often time-consuming and experimentally demanding. Computational approaches, including physics-based docking and deep-learning (DL) structure predictors such as AlphaFold3, Boltz-2, and Chai-1, offer powerful alternatives. Accurately modeling flexible peptides that bind to shallow, surface-exposed regions remains difficult for physics-based methods, and although multiple sequence alignment-driven DL models can achieve excellent performance in well-behaved systems, they too can struggle when the peptide adopts noncanonical conformations or when sequence identity is low. In such cases, distance restraints are often required to guide the docking toward accurate and biologically meaningful solutions, yet acquiring multiple high-quality restraints is often difficult. To address these limitations of physics-based and DL approaches, we developed a restraint scoring function that integrates evolutionary conservation, spatial proximity, and geometric distribution to assess the informativeness of restraint sets. This enables a more accurate evaluation of docking inputs and overcomes the shortcomings of relying solely on restraint count. Building on this framework, we introduce a minimal-restraint docking strategy capable of identifying optimized subsets of restraints that lead to high-quality structural models. We evaluate a comprehensive set of protein–peptide systems, including 43 SH3 domain complexes, 8 WW domain complexes, and 19 medium-difficulty cases from the PepPCBench benchmark. 
Our approach shows that model quality improves as the restraint score increases, supporting restraint score as a simple, interpretable indicator of docking success. We further identify clear, domain-specific restraint-score thresholds for the SH3 and WW systems that enable accurate model selection. Together, these results offer a scalable and efficient strategy for structure prediction in data-limited contexts and lay the groundwork for restraint-informed modeling with quantifiable confidence, as well as a powerful foundation for data-efficient machine learning-based peptide–protein docking.
Title: Restraint Quality, Not Quantity, Predicts Peptide–Protein Docking Outcomes. Journal of Chemical Information and Modeling, vol. 66, no. 3, pp. 1727–1741. Journal Article.
Pub Date: 2026-01-17 DOI: 10.1021/acs.jcim.5c02204
Hsu-Chun Tsai, Shi Zhang, Tai-Sung Lee, Timothy J Giese, Charles Lin, James Xu, Yinhui Yi, Darrin M York, Abir Ganguly, Albert C Pan
Relative binding free energy (RBFE) calculations, widely used to predict the potencies of congeneric small molecules binding to a protein receptor, can greatly increase the efficiency of the hit-to-lead and lead optimization stages of the drug discovery process. Traditional RBFE methods, however, cannot be easily applied to small molecules lacking a common core or binding mode, precluding their use in a challenging but crucial component of many drug discovery campaigns. In principle, an absolute binding free energy (ABFE) method can be applied to such molecules, but ABFE often suffers from high computational cost and poor statistical convergence due to the large amount of additional sampling required when compared to RBFE. Here, we introduce core-hopping binding free energy (CBFE) calculations, a computationally efficient framework for the accurate determination of relative binding free energies between small molecules with different cores, leveraging several recently developed techniques such as Alchemical Enhanced Sampling (ACES) with optimized transformation pathways and flexible λ-spacing, as well as λ-dependent Boresch restraints. We benchmark the performance of CBFE across 4 protein systems consisting of 56 small molecules, and find that the results are consistent with RBFE for a congeneric series of ligands and offer considerable improvement in computational cost and precision relative to ABFE results for a series of small molecules with diverse cores and binding modes. All CBFE-related developments are fully implemented in the GPU-accelerated AMBER free energy module (pmemd.cuda) and are available as part of the latest official AMBER release.
Title: A Relative Binding Free Energy Framework for Structurally Dissimilar Molecules. Journal of Chemical Information and Modeling. Journal Article.
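For orientation, the quantity that RBFE-style methods (and, per the abstract above, the CBFE framework) aim to estimate follows from the standard alchemical thermodynamic cycle. The symbols below are generic textbook notation chosen for illustration, not notation taken from the paper: A and B are two ligands, and each transformation A → B is carried out once in the protein complex and once in solvent.

```latex
\Delta\Delta G_{\mathrm{bind}}(A \to B)
  = \Delta G_{\mathrm{bind}}(B) - \Delta G_{\mathrm{bind}}(A)
  = \Delta G_{\mathrm{complex}}(A \to B) - \Delta G_{\mathrm{solvent}}(A \to B)
```

Because only the difference between the two alchemical legs is needed, the cost of fully decoupling each ligand (as in ABFE) is avoided, which is consistent with the efficiency argument made in the abstract.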
Pharmaceutical pollution in aquatic environments poses a significant ecological threat due to the accumulation of bioactive compounds from human and veterinary sources. In support of the EU Green Deal's Chemicals Strategy for Sustainability, this study presents a computational framework for predicting two key environmental risk indicators in fish: bioconcentration and ecotoxicity. Bioconcentration, quantified by the bioconcentration factor (BCF), reflects a chemical's tendency to accumulate in organisms, while ecotoxicity is assessed via the median lethal concentration (LC50) over defined exposure periods. We developed two high-performing machine learning (ML) models, achieving ROC AUC scores of 94.60% for bioconcentration and 96.06% for ecotoxicity, validated across both internal and external data sets. To expand the scope of risk evaluation, we incorporated metabolite prediction using the SyGMa tool, selected after benchmarking multiple alternatives. This enables the assessment of both parent compounds and their potentially toxic metabolites. Model interpretability was enhanced through molecular fingerprint analysis, which identified structural features associated with toxicity and accumulation, informing the early stages of drug design. To support practical implementation, we introduced G.AI.A (https://gaiatox.eu/), an intuitive web platform that allows users to input Simplified Molecular Input Line Entry System (SMILES) strings for rapid prediction of environmental risk end points. The application domain of G.AI.A lies in predictive toxicology, enabling researchers and regulatory bodies to assess the toxicological profiles of small organic compounds, excluding those containing heavy metals, by analyzing their chemical structures. The platform supports batch processing and offers interactive visualizations, facilitating compound screening and early stage environmental risk assessment. 
By integrating predictive modeling with interpretability and usability, our framework advances green-by-design pharmaceutical development and contributes to sustainable chemical management.
Title: G.AI.A: An Integrated Machine-Learning Platform for Predicting Bioaccumulation and Ecotoxicity of Pharmaceuticals. Authors: Evangelos Tsoukas, Michail Papadourakis, Eleni Chontzopoulou, Spyridon Vythoulkas, Christos Didachos, Dionisis Cavouras, Panagiotis Zoumpoulakis, Minos-Timotheos Matsoukas. Journal of Chemical Information and Modeling. Pub Date: 2026-01-16. DOI: 10.1021/acs.jcim.5c02286. Journal Article.
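The ROC AUC figures quoted in the G.AI.A abstract can be read as the probability that a randomly chosen positive compound receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that definition in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions and have no connection to the G.AI.A models or data.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three positives, three negatives.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(roc_auc(toy_labels, toy_scores))
```

The pairwise form is quadratic in the number of compounds, so it is only meant to make the definition concrete; rank-based implementations are preferable for large data sets.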
Pub Date: 2026-01-16 DOI: 10.1021/acs.jcim.5c01979
Samanvitha Kunigal Vijaya Shankar, Christopher P Ewels, Yann Claveau
We present a Python package, DynoPore, to study liquids confined in cylindrical and slit-like geometries. Structural analysis functions such as density profiles and radial distribution functions are included to facilitate the understanding of the environment and local structure of liquid molecules within the confined systems. For dynamics, DynoPore includes region-resolved mean-squared displacement and lifetime functions to investigate molecular motion in different regions of the pore. For ionic systems, DynoPore also offers Nernst–Einstein and Einstein–Helfand conductivity analysis functions. By combining these structural and dynamical analysis tools in a single, user-friendly framework, DynoPore delivers a convenient and comprehensive package to analyze confined liquids.
Title: DynoPore─A Package to Analyze Molecular Dynamics Trajectories of Confined Liquids. Journal of Chemical Information and Modeling. Journal Article.
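To make the Nernst–Einstein analysis mentioned in the DynoPore abstract concrete, the sketch below evaluates the textbook Nernst–Einstein expression, σ = e² / (k_B T V) · Σ N_i z_i² D_i, in SI units. It is not DynoPore's API: the function name, argument layout, and the example system are assumptions made purely for illustration.

```python
E_CHARGE = 1.602176634e-19  # elementary charge, C
K_B = 1.380649e-23          # Boltzmann constant, J/K


def nernst_einstein_conductivity(species, volume_m3, temperature_k):
    """Ionic conductivity (S/m) from self-diffusion coefficients.

    species: iterable of (count, charge_number, diffusion_m2_per_s) tuples,
             one tuple per ionic species.
    volume_m3: volume of the region containing the ions, in m^3.
    temperature_k: temperature in kelvin.
    Ion-ion correlations are neglected, as in the Nernst-Einstein picture.
    """
    weighted_sum = sum(n * z ** 2 * d for n, z, d in species)
    return E_CHARGE ** 2 * weighted_sum / (K_B * temperature_k * volume_m3)


# Hypothetical example: 100 monovalent cations and 100 monovalent anions,
# both with D = 1e-9 m^2/s, in a (5 nm)^3 box at 300 K.
example_species = [(100, +1, 1.0e-9), (100, -1, 1.0e-9)]
example_sigma = nernst_einstein_conductivity(example_species, (5e-9) ** 3, 300.0)
print(f"{example_sigma:.2f} S/m")
```

The Einstein–Helfand route named in the abstract instead uses the collective charge displacement and therefore retains ion–ion correlations; it is not shown here.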
Pub Date: 2026-01-16 DOI: 10.1021/acs.jcim.5c02720
Ty Balduf, Philip A Gerken, Mee Y Shelley, Mark A Watson, M Chandler Bennett, Mats Svensson, Abba E Leffler, Art Bochevarov
A multistep computational workflow that accurately assigns organic drug-like molecules to one of three atropisomer classes on the basis of computed barrier heights has been developed. The workflow identifies rotatable bonds and applies progressively more accurate types of calculations to the eligible rotational degrees of freedom. An initial energy scan with a force field (OPLS4) is followed by a similar scan that uses an energy function driven by a neural network model (QRNN-TB) trained on density functional theory (DFT) energies. The maxima corresponding to the potentially stereogenic rotatable bonds identified at this point are further processed by applying a transition state search at the QRNN-TB level of theory. Finally, ωB97X-D3/def2-TZVP(-f) DFT energies are computed for all located extrema. The accuracy of the predicted rotational barriers was benchmarked against ωB97M-V/cc-pVTZ and DLPNO-CCSD(T)/def2-TZVPP energies with excellent correlations. The automated protocol classifies organic molecules into atropisomeric classes with a greater than 90% success rate when applied to a test set of 65 molecules containing rotationally restricted torsions (68 torsions in total). We anticipate that the balance of speed and accuracy in this method will make it conducive to production use in drug discovery programs.
Title: Prediction of Atropisomerism for Drug-like Molecules. Journal of Chemical Information and Modeling. Journal Article.
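The link between a computed rotational barrier and an atropisomer class, as described in the abstract above, can be illustrated with the Eyring equation, which converts a barrier into a rate constant and hence a half-life. The sketch below is an assumption-laden illustration, not the paper's workflow: the class boundaries of 20 and 30 kcal/mol are commonly cited values and may differ from the thresholds the authors use, the transmission coefficient is taken as 1, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23            # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34     # Planck constant, J s
R_KCAL = 1.98720425864083e-3  # gas constant, kcal/(mol K)


def rotation_half_life_s(barrier_kcal_mol, temperature_k=298.15):
    """Half-life (s) of a first-order process with the given free-energy barrier.

    Uses the Eyring equation k = (k_B T / h) * exp(-dG / (R T)) with a
    transmission coefficient of 1, then t_1/2 = ln(2) / k.
    """
    rate = (K_B * temperature_k / H_PLANCK) * math.exp(
        -barrier_kcal_mol / (R_KCAL * temperature_k)
    )
    return math.log(2) / rate


def atropisomer_class(barrier_kcal_mol, low=20.0, high=30.0):
    """Assign class 1, 2, or 3 from a barrier in kcal/mol.

    The default boundaries are assumed, commonly cited values; they are
    parameters so that other conventions can be substituted.
    """
    if barrier_kcal_mol < low:
        return 1
    if barrier_kcal_mol <= high:
        return 2
    return 3


for barrier in (15.0, 25.0, 32.0):
    print(barrier, atropisomer_class(barrier), f"{rotation_half_life_s(barrier):.3g} s")
```

Because the half-life depends exponentially on the barrier, an error of about 1.4 kcal/mol at room temperature changes the predicted half-life by roughly a factor of ten, which is why the abstract's emphasis on benchmarked barrier accuracy matters for classification.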
Pub Date: 2026-01-15 DOI: 10.1021/acs.jcim.5c02396
Yi Zhang, Yuru Li*, Zhicheng Jin, Ye Tian, and Chen Su
Single-cell multiomics technologies provide unprecedented opportunities to dissect cellular heterogeneity by capturing multidimensional information on complex cellular states and regulatory networks. However, challenges such as high dimensionality, extreme data sparsity, and modality-specific discrepancies hinder the accuracy, interpretability, and scalability of the existing integration methods. Existing integration paradigms, including horizontal, vertical, and diagonal strategies, are further limited by their inability to fully capture nonlinear biological relationships, their reliance on high-quality data, and their substantial computational demands. Here, we present scII (Dual-Threshold Adaptive Integration of Single-Cell Multiomics Data Driven by Imputation), an adaptive framework designed to integrate gene expression (scRNA-seq) and chromatin accessibility (scATAC-seq) data. Our approach is built on several key conceptual innovations: (i) scRNA-seq–guided signal imputation to enhance information integrity in scATAC-seq; (ii) a multilayer perceptron with the Maxout activation function to improve the modeling of complex nonlinear relationships and mitigate the vanishing gradient problem; (iii) a dynamic dual-threshold adaptive selection mechanism that jointly evaluates cross-modality feature similarity and classification reliability to select high-quality cells; and (iv) Bayesian Information Criterion (BIC)-based optimization to dynamically determine the number of Gaussian Mixture Model components according to data distribution, thereby eliminating reliance on manually preset parameters. Extensive experiments on multiple real-world and simulated data sets demonstrate that scII not only enables efficient integration of unpaired scRNA-seq and scATAC-seq data but also achieves accurate transfer of cell-type annotations, allowing high-precision cell-type prediction for scATAC-seq.
Title: scII: Dual-Threshold Adaptive Integration of Single-Cell Multiomics Data Driven by Imputation. Journal of Chemical Information and Modeling, vol. 66, no. 3, pp. 1445–1456. Journal Article. Open-access PDF: https://pubs.acs.org/doi/pdf/10.1021/acs.jcim.5c02396
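Two of the ingredients named in the scII abstract, the Maxout activation and BIC-based selection of the number of Gaussian mixture components, have standard textbook definitions, reproduced here for reference. The notation is generic and assumed for illustration (input x, k affine pieces per Maxout unit, p free model parameters, n data points, maximized likelihood L̂, candidate component counts K); it is not taken from the paper.

```latex
% Maxout unit i with k affine pieces
h_i(\mathbf{x}) = \max_{j \in \{1,\dots,k\}} \left( \mathbf{w}_{ij}^{\top} \mathbf{x} + b_{ij} \right)

% Bayesian Information Criterion and component selection
\mathrm{BIC}(K) = p_K \ln n - 2 \ln \hat{L}_K ,
\qquad
K^{*} = \operatorname*{arg\,min}_{K} \ \mathrm{BIC}(K)
```

Since a Maxout unit is piecewise linear, its gradient does not saturate the way sigmoid-type activations do, which is the usual argument behind the vanishing-gradient claim; the BIC penalty term p_K ln n is what discourages adding mixture components that do not improve the fit enough.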
Pub Date: 2026-01-14 DOI: 10.1021/acs.jcim.5c02521
Guodong Hu*, Jin Qian, Chengfei Cai, and Jianzhong Chen*
Type III CRISPR systems provide adaptive immunity against invasion of foreign nucleic acids by generating cyclic oligoadenylate (cAₙ) second messengers, which activate effector proteins containing CRISPR-associated Rossmann fold (CARF) domains. The apo form of CARF adopts a closed state, distinct from its cA₄-bound open-state conformation. To investigate the conformational transition, we performed multiple types of molecular dynamics (MD) simulations, revealing a unidirectional conformational shift toward the closed state. This transition was hindered by reduced flexibility in cA₄-binding residues. Notably, the conformational change primarily occurs between the two monomers, with minimal structural rearrangement within individual monomers. Comparative analysis showed that while the number of hydrogen bonds and contacts between CARF and cA₄ decreases in the closed state, intermonomer interactions are strengthened. Binding free-energy calculations between the two chains of CARF further confirmed higher affinity in the closed state. Our findings support an energy-driven conformational change model, providing insights for optimizing CRISPR-based genetic manipulation tools.
Title: Conformational Transition of the CARF Domain Driven by Binding Free Energy. Journal of Chemical Information and Modeling, vol. 66, no. 2, pp. 1179–1189. Journal Article.
Pub Date: 2026-01-14 DOI: 10.1021/acs.jcim.5c01858
Davide Provasi, Kirill Konovalov, Nicholas Riina, Olivia Cullen, and Marta Filizola*
G Protein-Coupled Receptors (GPCRs) are important targets for drug discovery owing to their ability to respond to a broad range of stimuli and their involvement in numerous pathologies. Although traditional ligand-based and structure-based approaches have facilitated the development of effective therapeutics for many GPCRs, these approaches often fall short when applied to receptors with limited ligand or structural data. This limitation highlights the critical need for advanced strategies capable of accurately predicting ligand bioactivity across the entire GPCR family, especially for understudied receptor subtypes. In this study, we introduce BOLD-GPCRs (BERT-Optimized Ligand Discovery for GPCRs), a deep learning framework designed to enhance the prediction of ligand bioactivity across class A GPCRs. Accessible via a user-friendly web interface, BOLD-GPCRs employs transfer learning and leverages curated data sets of known class A GPCR ligands, receptor sequences, and signaling-relevant mutations. By integrating dense neural network classifiers with transformer-based protein language models, BOLD-GPCRs captures complex relationships between receptor sequence/function and ligand activity. Our results demonstrate that BOLD-GPCRs achieves robust predictive performance for both ligand bioactivity and mutational effects across a broad range of class A GPCRs, underscoring its potential as a valuable tool for ligand discovery, especially for poorly characterized receptors.
Title: BOLD-GPCRs: A Transformer-Powered App for Predicting Ligand Bioactivity and Mutational Effects across Class A GPCRs. Journal of Chemical Information and Modeling, vol. 66, no. 2, pp. 855–866. Journal Article.
Pub Date : 2026-01-14 DOI: 10.1021/acs.jcim.5c02628
Katarzyna J. Zator, Maria Chiara Storer, and Christopher A. Hunter*
Atomic surface site interaction points (AIPs), which were previously used to predict association constants for synthetic host–guest systems, have been extended to protein–ligand complexes. AIP descriptions of protein binding sites were obtained by combining a library of precomputed AIP descriptors for all protein functional groups with a graph-based substructure matching algorithm. The corresponding AIP descriptions of ligands were obtained directly by footprinting the molecular electrostatic potential surface calculated using density functional theory. These AIP descriptions were projected onto X-ray crystal structures of protein–ligand complexes to identify pairs of AIPs that were sufficiently close in space to constitute an intermolecular interaction. The overall free energy of binding was calculated by summing the contributions of each AIP contact and the associated desolvation. Application to the 94 complexes involving uncharged ligands in the CASF benchmark data set showed that the method achieves a Pearson correlation coefficient of 0.76 and an RMSD of 11 kJ mol⁻¹ for absolute free energies of binding.
Prediction of Protein–Ligand Binding Affinities Using Atomic Surface Site Interaction Points. Journal of Chemical Information and Modeling, 66 (2), 1097–1105. Published 2026-01-14. PDF: https://pubs.acs.org/doi/pdf/10.1021/acs.jcim.5c02628
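The contact-and-sum scheme described in the abstract above (pair up AIPs that are close in space, then add a contribution per contact plus desolvation) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the `AIP` class, the greedy one-to-one pairing rule, the distance cutoff, and the toy `pair_energy` and `desolvation_energy` callables are all hypothetical placeholders.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class AIP:
    """Hypothetical interaction point: a position plus one scalar descriptor."""
    xyz: Tuple[float, float, float]
    value: float


def find_contacts(protein: List[AIP], ligand: List[AIP],
                  cutoff: float) -> List[Tuple[int, int]]:
    """Pair protein and ligand AIPs lying within `cutoff` of each other.

    Assumption: each AIP takes part in at most one contact, and the
    closest candidate pairs are assigned first (greedy matching).
    """
    candidates = []
    for i, p in enumerate(protein):
        for j, l in enumerate(ligand):
            d = math.dist(p.xyz, l.xyz)
            if d <= cutoff:
                candidates.append((d, i, j))
    candidates.sort()  # shortest distances first

    used_p, used_l, contacts = set(), set(), []
    for _, i, j in candidates:
        if i not in used_p and j not in used_l:
            used_p.add(i)
            used_l.add(j)
            contacts.append((i, j))
    return contacts


def binding_free_energy(protein: List[AIP], ligand: List[AIP], cutoff: float,
                        pair_energy: Callable[[AIP, AIP], float],
                        desolvation_energy: Callable[[AIP], float]) -> float:
    """Sum the per-contact term and the desolvation of both partners."""
    total = 0.0
    for i, j in find_contacts(protein, ligand, cutoff):
        p, l = protein[i], ligand[j]
        total += pair_energy(p, l) + desolvation_energy(p) + desolvation_energy(l)
    return total


# Toy usage with made-up coordinates and made-up energy terms.
protein_aips = [AIP((0.0, 0.0, 0.0), 2.0), AIP((5.0, 0.0, 0.0), -1.0)]
ligand_aips = [AIP((0.0, 0.0, 1.0), -3.0), AIP((10.0, 0.0, 0.0), 1.0)]
energy = binding_free_energy(
    protein_aips, ligand_aips, cutoff=1.5,
    pair_energy=lambda p, l: p.value * l.value,        # placeholder form
    desolvation_energy=lambda a: 0.5 * abs(a.value),   # placeholder form
)
print(energy)
```

Passing the energy terms in as callables keeps the pairing logic separate from the parameterization, so a real contact energy or solvation model could be swapped in without touching the matching step.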
Pub Date : 2026-01-14 DOI: 10.1021/acs.jcim.5c02363
Antonio M Bosch-Fernández, Willy Menacho, Rubén Pérez, Horacio V Guzman
Numerous respiratory viruses are transmitted via airborne microdroplets that frequently adhere to fomites. Understanding the behavior of these phenomenologically rich bio-material interfaces remains an open issue. Here, we tackle the complex interplay between glycans and protein conformational dynamics during adsorption onto polarizable surfaces, focusing on the potential of glycans as molecular interaction modulators. We employ molecular dynamics simulations to dissect the interactions of the Receptor Binding Domain (RBD) glycoproteins from different SARS-CoV-2 variants of concern (VoC), in both open and closed conformations, with polarizable planar interfaces. Advanced analysis using 2D space reveals distinct adsorption mechanisms depending on the initial loci of the glycan within the protein wall. Hydrophobic surfaces facilitate stable adsorption for both RBD conformations. Conversely, hydrophilic surfaces exhibit reduced adsorption, particularly for the closed-RBD, where glycans predominantly form hydrogen bonds. Glycans significantly modulate closed-RBD adsorption, either enhancing it by permanent tethering or impeding it depending on the initial conformation and protein mutations (Omicron). Results for the individual RBDs are consistent with scaled-up simulations for the complete spike ectodomain glycoprotein. Our findings unveil novel glycan-mediated adsorption phenomena and provide fundamental insights into glycoprotein-surface interactions, paving the way for understanding glycan roles in glycoprotein-fomite adsorption, protein aggregation, and recognition at polarizable biological interfaces.
Glycans Modulate the Adsorption of RBD Glycoproteins on Polarizable Surfaces. Journal of Chemical Information and Modeling, 261 (1). Published 2026-01-14. https://doi.org/10.1021/acs.jcim.5c02363