Pub Date: 2026-02-05 DOI: 10.1021/acs.jcim.5c02636
Filipe E P Rodrigues, Tamis Darbre, Miguel Machuqueiro
Transfection, the process of delivering genetic material into eukaryotic cells, is crucial in biotechnology and the development of treatments. Naked nucleic acids face challenges such as enzymatic degradation, poor pharmacokinetics, and immunogenicity, which can be mitigated by delivery systems such as liposomes, cationic polymers, and dendrimers that protect and enhance uptake. Peptide dendrimers, in particular, show promise as nucleic acid carriers due to their lower cytotoxicity and immunogenicity, though their mechanisms, efficiency, and optimization remain to be clarified. Here, we characterized the configurational, conformational, and protonation landscapes of different peptide dendrimers in complex with siRNA. We found that nucleic acids modulate dendrimer structure, with electrostatic interactions strengthened at low pH through enhanced protonation of the N-termini. Although experimental data show that the more hydrophobic dendrimer examined displays the highest apparent affinity for siRNA, its reduced number of lysine residues results in weaker overall binding due to diminished charge density. This higher observed affinity is likely linked to increased aggregation propensity. In contrast, the dendrimer sequence with branching residues of inverted chirality, which performs worse, shows the lowest propensity for aggregation. Our work suggests that chirality has only a negligible effect on the dendrimer-siRNA binding modes, and that such differences are subtle, particularly at the monomeric level. Overall, this work provides mechanistic insight into dendrimer-siRNA interactions and outlines potential strategies to refine dendrimer design for improved nucleic acid delivery.
{"title":"Constant-pH Molecular Dynamics of Cationic Peptide Dendrimers Binding to siRNA.","authors":"Filipe E P Rodrigues, Tamis Darbre, Miguel Machuqueiro","doi":"10.1021/acs.jcim.5c02636","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02636","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.3,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146117231","RegionNum":2,"RegionCategory":"Chemistry"}
Pub Date: 2026-02-04 DOI: 10.1021/acs.jcim.5c02790
Xiaomin Wu, Miao He, Yousi Lin
Global optimization of bimetallic and monometallic cluster structures remains computationally challenging, particularly due to the rapid increase in homotops with system size and compositional complexity. To address this issue, we present a Collaborative Differential Evolution (CDE) algorithm featuring a multisubpopulation collaborative architecture specifically designed for efficient structure prediction of diverse nanocluster systems. The framework integrates three functionally specialized subpopulations for exploration, exploitation, and balance along with adaptive operations tailored for metallic nanoclusters. This algorithm is implemented as a user-friendly online C++ toolkit. We demonstrate the versatility and robustness of our approach through comprehensive structural optimization across three distinct case studies: Pt–Pd and Cu–Au bimetallic clusters, as well as monometallic Pt clusters. The CDE algorithm consistently achieves 50–100% faster convergence and superior stability compared to conventional methods across all tested systems, establishing itself as a robust and generalizable tool for accelerating the discovery of stable configurations in diverse cluster materials.
{"title":"C++ Toolkit for Bimetallic Cluster Structure Optimization Using Collaborative Differential Evolution","authors":"Xiaomin Wu, Miao He, Yousi Lin","doi":"10.1021/acs.jcim.5c02790","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02790","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.6,"publicationDate":"2026-02-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146111105","RegionNum":2,"RegionCategory":"Chemistry"}
Pub Date: 2026-02-04 DOI: 10.1021/acs.jcim.5c03035
Franz Görlich, Julija Zavadlav
Coarse-grained (CG) modeling enables molecular simulations to reach time and length scales inaccessible to fully atomistic methods. For classical CG models, the choice of mapping, that is, how atoms are grouped into CG sites, is a major determinant of accuracy and transferability. At the same time, the emergence of machine learning potentials (MLPs) offers new opportunities to build CG models that can in principle learn the true potential of mean force for any mapping. In this work, we systematically investigate how the choice of mapping influences the representations learned by equivariant MLPs by studying liquid hexane, amino acids, and polyalanine. We find that when the length scales of bonded and nonbonded interactions overlap, unphysical bond permutations can occur. We also demonstrate that correctly encoding species and maintaining stereochemistry are crucial, as neglecting either introduces unphysical symmetries. Our findings provide practical guidance for selecting CG mappings compatible with modern architectures and guide the development of transferable CG models.
{"title":"Mapping Still Matters: Coarse-Graining with Machine Learning Potentials","authors":"Franz Görlich, Julija Zavadlav","doi":"10.1021/acs.jcim.5c03035","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c03035","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.6,"publicationDate":"2026-02-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146111138","RegionNum":2,"RegionCategory":"Chemistry"}
Pub Date: 2026-02-04 DOI: 10.1021/acs.jcim.5c02871
Yujing Zhao, Juntao Wang, Yuxin Song, Qilei Liu, Jiaqi Lin
Ionizable lipids are fundamental to the efficacy of lipid nanoparticles (LNPs) in pivotal areas including mRNA vaccines. Their development, however, is hindered by intricate structure–property relationships and limited experimental data. To address these challenges, this study proposed a small-data-driven framework that pioneered the use of Kolmogorov–Arnold networks (KANs), a symbolic regression-based machine learning (ML) approach, to accelerate the discovery of novel siloxane-based ionizable lipids. Using only 36 training samples, the resulting KAN model demonstrated high predictive accuracy for mRNA delivery efficiency (Q²_cv = 0.710), outperforming conventional ML models by an average absolute improvement of 0.627 in cross-validation and yielding explicit mathematical formulas. Combined with virtual screening and umbrella sampling simulations, the framework identified three candidate lipids with superior predicted performance. Molecular dynamics simulations validated that the optimal candidate achieved stronger binding affinity to the endosomal membrane, as evidenced by a 187% reduction (from −1.048 to −3.011 kcal/mol) in the binding free energy minimum compared to the best experimental control. This result aligns with the delivery efficiency predicted by the KAN model. Overall, the proposed framework establishes a data-efficient paradigm for ML-guided ionizable lipid design, bridging symbolic regression with molecular dynamics validation for next-generation LNP therapeutics.
{"title":"Accelerating Siloxane-Based Ionizable Lipid Design for LNPs with Data-Efficient Kolmogorov–Arnold Networks","authors":"Yujing Zhao, Juntao Wang, Yuxin Song, Qilei Liu, Jiaqi Lin","doi":"10.1021/acs.jcim.5c02871","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02871","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.6,"publicationDate":"2026-02-04","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146111085","RegionNum":2,"RegionCategory":"Chemistry"}
Pub Date: 2026-02-03 DOI: 10.1021/acs.jcim.5c02712
Omer Tayfuroglu, Seda Keskin
Machine-learned potentials (MLPs) have emerged as transformative tools for modeling metal–organic frameworks (MOFs), bridging the accuracy of quantum mechanics with the efficiency required for large-scale molecular simulations. By learning the potential energy surface directly from quantum-mechanical reference data, MLPs enable a unified description of the complex nature of MOFs and their interactions with guest molecules across multiple length and time scales. Recent developments have demonstrated the capability of MLPs to model intrinsic MOF properties such as lattice dynamics, thermal expansion, and mechanical response, as well as to describe adsorption thermodynamics, diffusion, and cooperative host–guest behavior in flexible frameworks. Developing reliable and transferable MLPs for MOFs remains a significant challenge due to the vast chemical and structural diversity of MOFs and the complexity of sampling guest-framework configurations. The lack of openly shared, standardized, and user-friendly MLP implementations also limits their broader adoption. This review focuses on the current progress in MLP-based modeling of MOFs, highlighting methodological advances, data-generation strategies, and active-learning protocols, while outlining the key challenges and future directions for developing transferable, accessible, and universal MLPs for the predictive design and discovery of MOFs.
{"title":"Transforming MOF Modeling with Machine-Learned Potentials: Progress and Perspectives","authors":"Omer Tayfuroglu, Seda Keskin","doi":"10.1021/acs.jcim.5c02712","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02712","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.6,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146111112","RegionNum":2,"RegionCategory":"Chemistry"}
Pub Date: 2026-02-03 DOI: 10.1021/acs.jcim.5c02698
Antonios P. Sarikas, Konstantinos Gkagkas, George E. Froudakis
Because of their ultrahigh porosity and tunable chemistry, metal–organic frameworks (MOFs) have emerged as leading candidates for gas adsorption applications. Nevertheless, their combinatorial nature induces a vast chemical space, challenging traditional exploration methods. In recent years, machine learning (ML) predictive models have enabled large-scale screening, but they are typically developed for a single adsorption property. This entails that for a new property one must train a model from scratch, a process that requires large amounts of labeled data that are not always available. In our previous work, we demonstrated that combining the potential energy surface (a 3D energy image of the material) with a convolutional neural network improves sample efficiency compared to conventional ML approaches. Here we extend this framework by introducing multitask and transfer learning to foster generalization across gases and conditions, even in data-scarce scenarios. To this end, we developed RetNeXt, a multitask model pretrained on 3.2 million publicly available adsorption-related data points, which can be readily adapted to new domains and adsorption tasks. RetNeXt outperforms conventional single-task transfer approaches and achieves up to a 100-fold increase in sample efficiency compared to training from scratch. As such, it can serve as a foundation for future advances in the data-driven adsorption modeling of MOFs.
{"title":"RetNeXt: A Pretrained Model for Transfer Learning Across the MOF Adsorption Space","authors":"Antonios P. Sarikas, Konstantinos Gkagkas, George E. Froudakis","doi":"10.1021/acs.jcim.5c02698","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02698","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.6,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146111137","RegionNum":2,"RegionCategory":"Chemistry"}
Metabolic pathway design is a fundamental aspect of metabolic engineering, playing a crucial role in the microbial synthesis of high-value compounds. While metabolic engineers recognize the prevalence of branching reactions (side reactions that divert metabolic flux toward nontarget compounds), current automated pathway design tools often focus primarily on linear pathway optimization. This focus may lead to incomplete efficiency assessments and suboptimal pathway selection due to unaccounted metabolic complexity. To address this gap, we introduce a novel metabolic pathway design method, EA-MNE (Evolutionary Algorithm-based Metabolic Network Evaluation). Within the EA-MNE method, we propose a new approach for expanding linear pathways into metabolic networks and two new evaluation criteria: (1) the number of effective branching reactions, which assesses the extent of branching impacts, and (2) the network theoretical yield, which precisely quantifies yield losses caused by branching reactions. Additionally, we integrate four key criteria (the number of effective branching reactions, network theoretical yield, network toxicity, and Gibbs free energy) for metabolic pathway design. This integrated approach provides a systematic solution for addressing branching reaction challenges, significantly improving both the accuracy of pathway evaluation and the synthetic efficiency of microbial systems.
{"title":"A Novel Metabolic Pathway Design Method Based on Evolutionary Algorithms and Metabolic Network Evaluation","authors":"Xin Zhao, Xueying Sun, Tao Zhang, Shuxin Cui, Yahui Cao, Bingzhi Li, Heng Song, Shuo Zheng","doi":"10.1021/acs.jcim.5c02219","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02219","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.6,"publicationDate":"2026-02-03","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146111087","RegionNum":2,"RegionCategory":"Chemistry"}
Pub Date: 2026-02-02 DOI: 10.1021/acs.jcim.5c02748
Yingjun Ma, MingXu Luo, Liyu Yan, Yuanyuan Ma
Exploring potential microbe-drug associations (MDAs) not only facilitates drug discovery and clinical treatment but also contributes to a deeper understanding of microbial mechanisms. However, most MDA discoveries rely on biological experiments, which are time-consuming and costly. Therefore, developing an effective computational model to predict novel MDAs is of great importance. In this study, we propose a Variational Bayesian Multi-Kernel Adaptive Deep Fusion (VBMKADF) model for MDA prediction. We first integrate multiomics data to construct drug molecular graphs and a microbe hypergraph. Then, we perform multilayer graph convolution and hypergraph convolution to extract multilevel similarities of drugs and microbes, respectively. An attention mechanism is subsequently introduced to adaptively fuse these multilevel similarities, which are then incorporated into the Bayesian logistic matrix factorization framework to guide the generation of latent variable distributions. Additionally, we develop a variational Expectation-Maximization algorithm for adaptive inference of model hyperparameters and latent variables, which also guides the training of the deep learning model. Experimental results on two benchmark data sets across three scenarios show that, compared to other state-of-the-art methods, VBMKADF achieves higher AUPR, AUC, and F1 scores in both balanced and highly imbalanced settings. Moreover, case studies further confirm that VBMKADF can serve as an effective tool for MDA prediction.
{"title":"Variational Bayesian Multi-Kernel Adaptive Deep Fusion for Microbe-Related Drug Prediction","authors":"Yingjun Ma, MingXu Luo, Liyu Yan, Yuanyuan Ma","doi":"10.1021/acs.jcim.5c02748","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02748","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.6,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146097964","RegionNum":2,"RegionCategory":"Chemistry"}
Pub Date: 2026-02-02 DOI: 10.1021/acs.jcim.5c02566
ByungUk Park, Reid C. Van Lehn
Predicting protein–membrane interactions is a formidable challenge due to the subtle physicochemical features that distinguish membrane-binding regions of a protein surface as well as the scarcity of experimentally resolved membrane-bound protein conformations. Here, we present MaSIF-PMP, a geometric deep learning model that leverages molecular surface fingerprints to predict interfacial binding sites (IBSs) of peripheral membrane proteins (PMPs). MaSIF-PMP integrates geometric and chemical surface features to produce spatially resolved IBS predictions. Compared to existing models, MaSIF-PMP achieves superior performance for IBS classification, while feature ablation studies and transfer learning analyses reveal distinct determinants governing protein–membrane versus protein–protein interactions. We further show that molecular dynamics (MD) simulations can validate model predictions, refine IBS labels, and capture composition-dependent membrane binding patterns. These results establish MaSIF-PMP as an effective framework for IBS prediction and highlight the potential of incorporating conformational dynamics from MD to improve both the model accuracy and biological interpretability.
{"title":"Decoding Protein–Membrane Binding Interfaces from Surface-Fingerprint-Based Geometric Deep Learning and Molecular Dynamics Simulations","authors":"ByungUk Park, Reid C. Van Lehn","doi":"10.1021/acs.jcim.5c02566","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02566","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling"},"PeriodicalIF":5.6,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"146097963","RegionNum":2,"RegionCategory":"Chemistry"}
Identifying transition states (TSs), the high-energy configurations that molecules pass through during chemical reactions, is essential for understanding and designing chemical processes. However, accurately and efficiently identifying these states remains one of the most challenging problems in computational chemistry. In this work, we introduce a new generative AI approach that improves the quality of initial guesses for TS structures. Our method can be combined with a variety of existing techniques, including both machine-learning models and fast, approximate quantum methods, to refine their predictions and bring them closer to chemically accurate results. Applied to TS guesses from a state-of-the-art machine-learning model, our approach reduces the median structural error to 0.077 Å and lowers the median absolute error in reaction barrier heights to 0.40 kcal mol⁻¹. When starting from a widely used tight-binding approximation, it increases the success rate of locating valid TSs by 41% and speeds up high-level quantum optimization by a factor of 3. By making TS searches more accurate, robust, and efficient, this method could accelerate reaction mechanism discovery and support the development of new materials, catalysts, and pharmaceuticals.
{"title":"Adaptive Transition-State Refinement with Learned Equilibrium Flows","authors":"Samir Darouich,Vinh Tong,Tanja Bien,Johannes Kästner,Mathias Niepert","doi":"10.1021/acs.jcim.5c02902","DOIUrl":"https://doi.org/10.1021/acs.jcim.5c02902","url":null,"PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":"3 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2026-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146097965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}